Wednesday, November 28, 2007

NSF FIND Working Meeting

I'm writing this while attending an NSF FIND (Future Internet Design) working meeting (Nov 27-28). Part of the FIND program is that there are 3 meetings per year where all the FIND PIs are supposed to get together to talk about the new network that is the goal of the FIND program. This is my first such meeting; I missed the last one. It seems to me that this idea of having PI meetings tied to grant proposals is an NSF experiment worth examining, and there are some discussions going on here that may be of broader interest, so I thought I'd report. (If this reads as stream of consciousness, that's because I'm generally typing as things go on.)

One positive for me is the chance to catch up with various networking people who I don't see so regularly. The socializing aspect -- just talking to people, finding out what they're working on, making sure they know Harvard is hiring, all this is good. (Jennifer Rexford and I chatted during a break, and when I sat down after the break, she had already sent me a paper to look at. Which I need to go ask her some questions about... I also finally caught up on some overdue researchy discussion with Dev Shah of MIT; yes, I have to go to DC to meet up with someone from MIT...) In this sense, it's like going to a networking conference, and obviously interacting with peers is one of the pleasures of a conference.

The invited presentations aren't doing much for me, however. They're not highly technical talks -- more high-level talks, connecting network research to network users. (We're hearing two talks right now about issues in the practice of emergency networks in disasters.) It's not that the talks are bad, but it's a bit hard to see their purpose. This is a room full of networking people -- they all have well-developed individual research agendas and know what they think the problems are. I don't think the talks are adding much new. (Let's just say I'm not the only one tapping away at the keyboard when I should be listening to the talk. So again, it's just like a conference.)

Besides invited presentations, there are breakout sessions. I'm sitting in on the "network management" breakout session, which is far and away the largest -- about 25-30 people. It's actually kind of like a conference panel, only a bit more chaotic with so many people. While the discussion is high-level, and I'm not sure we're headed anyplace specific, it's a pretty interesting discussion that highlights the scope of the problems of network management. (So many requirements, so many heterogeneous user classes, so many ill-defined goals, so many ill-defined or at least hard-to-measure quantities, etc.) Interestingly, the group as a whole seems quite convinced of the importance of what I call theory -- modeling, control, algorithms. (The pro-theory comments seemed to begin with Jennifer Rexford and Anja Feldmann, two influential and decidedly pro-theory network researchers.) This gives me a good excuse to chime in and further emphasize the importance of theory to network management, which I manage to do repeatedly.

Ellen Zegura is giving a talk about GENI and the GENI Science Council. The GENI Science Council is supposed to come up with a "research plan" for GENI, an experimental facility/infrastructure for the development of the next generation Internet. (Network people rightfully complain it's hard to do experimental research on the current Internet at a large scale, so the plan is to design a research infrastructure for these sorts of problems.) There's a notable shortage of theorists on the council, but they seem to have noticed this and added Joan Feigenbaum and Michael Kearns recently. (From my perspective, if you're going to design a whole new network infrastructure, you'd want some theorists to help make sure it's done right.) They're still putting together the research agenda that goes along with the planned facility, and they're still looking for input. Essentially, it seems like they're looking for research examples and experiments that would utilize or benefit from a GENI-scale system, so that they can explain clearly why building a GENI-scale system is a good idea. So if you have ideas for experiments that you'd like to do on a network-wide scale, you might start looking at what GENI's up to. Overall, GENI is such a huge project, and still amorphous enough at this point, that people are a bit confused. Or maybe that's just me. (I did have to get up at 4 am to catch the early plane down here.)

I talked with Ellen later, and got clarification of the GENI state. I had thought this was well on its way to becoming a possibly multi-hundred-million dollar new NSF project, but apparently it's much less far along than I had thought. If GENI is going to happen, it apparently needs to regain some momentum. And GENI seems a key component of the FIND program; it's hard to develop next-generation network architectures if you don't have a network to try them out on.

Before dinner they have a poster session. It's well done for a poster session -- they have food and a cash bar, and food and drink always seem to be requirements for a poster session where people actually stay and look around. There's a nice mix of work; my favorite is an idea multiple groups seem to be working on: that there shouldn't be just one network architecture, but multiple network architectures running over the same network substrate. My interpretation is that you have routers/other infrastructure that can run multiple architectures efficiently in parallel, so running side by side you might have our standard Internet alongside a network with much stronger (or weaker!) quality of service guarantees, or another network where connections are based on virtual circuits, and so on. This makes sense if you believe that a one-size-fits-all Internet is no longer the best idea.

Day 2 is mostly having the breakout sessions coalesce their ideas into short presentations. Pablo Rodriguez, known for example for his work on Avalanche/network coding at Microsoft, gave a nice talk about the economics of P2P and ISPs; he's left Microsoft to work for Telefonica Barcelona, and has developed some insight from the ISP point of view.

Overall, I ended up with mixed feelings about this working meeting. It's useful to get people together to find out what everyone is doing and to talk about high-level issues. Conferences already serve some of this purpose, though. Some of the value I got from this meeting derives from the fact that I haven't been to a networking conference for a while (I didn't feel like traveling to Japan for SIGCOMM this year...). High-level issues don't necessarily get discussed at conferences, but it's also hard to get 50+ people together to talk about high-level issues and reach any sort of agreement that wasn't already obvious. I'm skeptical that 3 such meetings are needed each year. However, the plan seems to be for the next meeting to focus on graduate students, which may be a good idea -- getting graduate students together to meet and talk like this could be very worthwhile.

Going to the meeting will pay off, however, if it leads to any new collaborations. If any FIND people are reading this, and you want to add some theory to your work, send me an e-mail. (Of course, you could also go down the hall and talk to the theorist(s) in your locale, which I'd really strongly recommend, unless I'd be interested, in which case try me first!)

Monday, November 26, 2007

Introducing New Scientific Ideas, and D-Wave

Apropos of my last post on Digital Fountain is Scott Aaronson's current report on D-Wave researchers giving talks about what's behind the hype of the quantum computers they claim to be building. For those who don't know the story, the short summary is that D-Wave Systems has claimed non-trivial advances in the development of quantum computers, and many (most? all?) academic quantum scientists aren't buying it, based on the very limited evidence they've seen.

Why does this story remind me of Digital Fountain? I recall the early, early days of going around giving talks about Tornado codes and their potential uses in networks; in the beginning, the greatest skeptics were the coding theorists. At one talk in particular, I remember, a very senior coding theorist insisted repeatedly that highly optimized Reed-Solomon codes could do anything we were claiming. The rest of the talk was a bit more difficult to get through after that argument. (I like to think that by the end of the talk he thought there might be something new there...)

So is this to say that I think the academic crowd is being too hard on D-Wave? Absolutely not. It sounds like they're doing their job -- trying to figure out what's going on and asking tough questions. The point of my parallel is that the coding theorists had the right to be skeptical. (One reason was a huge language barrier: me not knowing what a "decibel" was, and them not really getting why linear vs. quadratic complexity was such a big deal. The divide between coding theorists and CS theorists has shrunk an awful lot this past decade.) It was our job to convince them that we had the goods. And we provided evidence with equations, papers, talks, and demos. While they took some convincing, the coding theorists listened to the evidence, and rapidly came around.

Here, the story seems to be (my interpretation!) that D-Wave is presenting painfully unconvincing demos and releasing not nearly enough information for experts to see if they have any new ideas. Then they're compounding this by excessive PR overstating what might eventually be possible with quantum computing. (There seem to be some arguments about whether they themselves are doing the overstating, or uninformed journalists are, in which case D-Wave is just failing to correct the press, but whatever.) Their excuse for this is that they're a private company, so they have to keep their secrets. Understood! But they shouldn't then be surprised when knowledgeable academics studying the issue voice their opinion that the emperor has no clothes. I'm quite sure the scientific community has enough open-minded people that if D-Wave starts producing convincing evidence, or gives enough details on what they're doing so people can verify that the path they're taking is productive, they'll be understood and praised. Until then, the feedback they're getting is rightly telling them they have more to do.

Sunday, November 25, 2007

What's Up, Digital Fountain?

Every now and again, I get asked about Digital Fountain, the startup that initially grew out of work I was involved with on Tornado codes, a flavor of low-density parity-check codes. (The current codes being used are apparently based on Amin Shokrollahi's improved variant, Raptor codes -- the page has a link to the Trans. on Inf. Theory paper.) Unfortunately, I've been disconnected from Digital Fountain since the tech bust of 2000 or so, so I'm no longer privy to their internal workings. But the company indeed lives on. Every once in a while, I check for news, and I recently saw this positive piece on ZDNet, complete with a recent talk/demo from the fall Demo conference. Digital Fountain is making a push to get into IPTV.
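For readers who haven't seen how these codes work, here's a toy sketch of the basic fountain-code idea: each encoded symbol is the XOR of a subset of the source blocks, and a receiver recovers the data by "peeling" degree-one symbols. (This is purely illustrative -- the uniform degree choice below is not the carefully tuned degree distribution that makes Tornado/Raptor codes efficient, and real implementations work over packets, not small integers.)

```python
import random

def encode(blocks, n_symbols, seed=0):
    """Generate encoded symbols from k source blocks: each symbol is the
    XOR of a random nonempty subset of the blocks. (Toy uniform subset
    choice; real Tornado/Raptor codes use tuned degree distributions so
    that decoding succeeds with only slightly more than k symbols.)"""
    rng = random.Random(seed)
    k = len(blocks)
    symbols = []
    for _ in range(n_symbols):
        degree = rng.randint(1, k)
        idx = set(rng.sample(range(k), degree))
        val = 0
        for i in idx:
            val ^= blocks[i]
        symbols.append((idx, val))
    return symbols

def peel_decode(k, symbols):
    """Peeling decoder: a degree-one symbol directly reveals a source
    block; XOR that block out of every other symbol that contains it and
    repeat. The work is roughly linear in the total number of
    (block index, symbol) pairs -- the complexity win over quadratic-time
    decoding mentioned above."""
    syms = [[set(idx), val] for idx, val in symbols]
    recovered = {}  # block index -> recovered value
    changed = True
    while changed and len(recovered) < k:
        changed = False
        for sym in syms:
            # Subtract out (XOR away) any blocks we already know.
            for i in sym[0] & recovered.keys():
                sym[1] ^= recovered[i]
            sym[0] -= recovered.keys()
            # A degree-one symbol reveals a new block.
            if len(sym[0]) == 1:
                (i,) = sym[0]
                recovered[i] = sym[1]
                changed = True
    return recovered
```

With a well-chosen degree distribution, the receiver can decode from any sufficiently large set of symbols, regardless of which ones arrive -- the "fountain" property that makes these codes attractive for bulk data distribution.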

So maybe someday I'll be watching IPTV using a Digital-Fountain enabled product.

Wednesday, November 21, 2007

Page Limits on Journal Papers

I know of at least two journals that I publish in that have page limits on the length of papers. I am completely puzzled by this. A journal paper should be as long as it needs to be to derive and explain the results the author intends to convey, since the paper should be the final record on that piece of research. Some papers might need 20 pages; others might be too long even at 10. A hard rule of say 12 pages maximum just doesn't make sense.

I can imagine two reasons for instituting page limits on journal papers: paper costs, and reviewer time. Neither seems particularly compelling to me. Are there other reasons I'm missing? (Notice that both of these are more compelling in the case of conferences, where I would agree page limits make more sense. And even in that setting, paper costs are rapidly disappearing as a reason, and at some point the question of page limits should probably be reconsidered.)

I do admit that there are many papers where the author should be forced to edit things down further, and most authors have a tendency to write too much rather than too little. But it seems the right answer to this is to have good reviewers and a good editor firmly tell an author that the paper is not yet ready and needs further editing. A page limit doesn't adequately solve this problem and raises new, more unpleasant ones.

Anyone want to defend this practice? Or at least suggest reasons for it I haven't thought of?

Sunday, November 18, 2007

David Parkes talk Monday, 4pm

David Parkes will be giving a talk on computational mechanism design in Maxwell-Dworkin G115 at 4pm on the 19th. This is what at Harvard we call the "tenure talk" -- a talk giving an overview of his research and research area as he comes up for tenure. If you're a theorist in the area and want to hear a perhaps slightly different point of view on mechanism design and work at the intersection of economics/CS (David proves things, but he comes from the AI side), you should come. Heck, if you're local, in any area, and interested in the economics/CS interface, you should come. It should be a very nice talk, and it would be great to have the room full to bursting as a sign of support for David.

Friday, November 16, 2007

Should IP issues affect our research?

Since I've done some legal consulting, I've become interested in intellectual property (IP) issues, and the question of whether IP issues should affect my research agenda has recently occurred to me.

Let me explain. As an algorithmist, a major interest for me is designing algorithms that people will actually use -- perhaps not in the original form that I describe them in, of course, but that the ideas make it out into the real world. And while it is not my aim that every project I'm involved in should have a clear potential practical impact, it is the case that this goal guides my overall research agenda quite significantly.

IP issues have the potential to limit the practical impact of an algorithm I design. For example, suppose someone has a patent on SuperHashSchemeX (SHSX). (If you want something more specific, suppose Bloom filters were patented.) I might come up with an improvement to SHSX, or a better analysis of SHSX, but if the patent is written well, the patent owner may have rights controlling who can use SHSX, including my improvement. There may really be no hope that anyone else will actually be able to make use of my ideas until the patent runs out, or unless the user is willing to buy a license. Indeed, unless I go through the process of obtaining a patent myself, it's possible that my work might simply add value for the initial patent-holder, and I've actually rewarded those who are keeping my work from being used.

Perhaps this wouldn't bother me if my theory blood ran pure, but I like being an "applied theorist". In my mind, this affects the calculus I use in thinking about what I should work on.

Should it? Has anyone actually run into this problem?

Monday, November 12, 2007

An Experiment in Viral Marketing : Buying Tickets Online

(Warning: the following post is highly crass and commercial. If that offends you in any way, please don't read on.)

Thanks to some friends who are working in Web marketing, I have a chance to try a viral marketing experiment right here on the blog. I have been given some discount codes for a new ticket site -- ticketheroes.com -- which offers tickets for sports/concerts/other events. Anyone can use them, at least until 1/1/08, so you can use them, or tell a friend to use them, or have a friend tell a friend, or whatever. According to that whole six degrees of separation thing, it's possible that everyone in the world could learn my discount codes, which by the way are:

MMBLOG1 [$5 off any ticket purchase]
MMBLOG2 [5% off any ticket purchase]
[Yes, you can only use one of them...]

To be clear, I am officially acting as a consultant for the SEM company that is trying to drive traffic to TicketHeroes. I am not getting a direct cut back from sales using these codes. (Though if this experiment is hugely successful, perhaps I'll try to negotiate one!) But I will get back data -- I'll get to see how many sales used the codes. (Don't worry, I won't get your credit card number.) I find this all very interesting. Can this little blog, focusing primarily on academic topics, help sell things? Really? I kind of doubt it, but hey, I remember thinking around the year 1999 that it would be a really impressive thing if Google became a 1 billion dollar company, so clearly I don't know as much as I should about marketing.

I've done my own checks and the site's prices look to be slightly less than or the same as other similar sites, even without the discount. So this should be a win for anyone looking for tickets, although of course you shouldn't trust me, you should compare.

Next post, we'll return to less crass, commercial topics, like how to re-finance your mortgage.

Thursday, November 08, 2007

Service and the NSF

I recently returned from an NSF (National Science Foundation) review panel. (Of course, I'm not allowed to say when, or for what.) On the way back, at the airport, I ran into a more senior colleague (from an entirely different field -- not EECS) coming back on the same flight. It came up that he had never served on an NSF review panel. He admittedly seemed a bit sheepish about it, but offered the excuse that there were more important things to do, and I will give him the benefit of the doubt and assume that he meant other service activities, not research and such. Essentially, his argument seemed to be that NSF panels were a waste of his time.

Having served on a half dozen or so NSF review panels, I have some sympathy for this point of view. In particular, being anti-travel, I'm not sure why we have to fly in for a two-day meeting to make decisions; it's expensive for the NSF and time-consuming for me. (It's not clear to me that decisions are any better because of the face-to-face meeting. Indeed, arguably, they can be worse, depending on the group interaction. But government process is government process, so unlike for PCs, not having the face-to-face meeting is not currently an option...)

However, despite the time, I've tried to make myself available when the calls for NSF panels go out, because I've always figured that if the NSF is paying my summer salary (which, in large part, they do), they have the right to ask for some of my time. Indeed, my gut reaction (without working through the arguments) is that it's objectionable that they don't actually require people who have received funding to serve on a review panel at some point during the term of their grant, though I suppose the implementation of that rule could become unpleasant. In short, my moral interpretation is that by taking their money, I'm part of their system, so when they come calling, I should respond, even without an explicit quid pro quo.

Even if one does not subscribe to this implicit understanding, I think it's important for us individually and as a community (including non-academics who don't get NSF funding directly) to do our service for the NSF, particularly in the way of review panels. In general, citizens should be keeping an eye out on how the government uses their money, and in this specific case, as scientists, we should be paying especially close attention to how the government is distributing research money to scientists in our own and related fields, and we should be attempting to influence the process to move in the right directions as much as possible. The panels give us some opportunity to influence the process, and these otherwise near-useless face-to-face meetings give us some opportunity to influence the NSF administrators who shape and make the decisions that affect us all.

So, with all due respect to my senior colleague, and any of you other slackers out there, get over yourselves, and go serve on an NSF panel every other year or so.

Monday, November 05, 2007

Graduate Students, Socialize!

The discussion that in my mind connected breadth requirements to social networks reminded me of my time as a graduate student at Berkeley, where there were lots of informal social opportunities for graduate students to meet and talk. The CS grad student association had a weekly bagel/donut hour that was always well attended (mmmm...donuts....); the theory students arranged a Friday afternoon seminar (I believe called TGIF) where theory students would present something (a practice talk, some ongoing research, a paper they liked), no faculty allowed; there was a regular Friday pre-weekend grad student get-together that would be at a semi-randomly chosen local bar (usually upwards of a dozen or so people would show, with a high percentage of theorists, who didn't have to spend the weekend coding or running jobs); a grad student poker game broke out once a month or so (again, with a high percentage of theorists, who'd add complexity to the games).

Besides helping create a more pleasant graduate student atmosphere and experience, these activities let graduate students get to know more about each other and what they were doing. And I believe expanding your own work-related social network pays dividends in the long run.

I don't know what the current state of graduate student life is like these days, but if you don't think there's enough of this sort of stuff where you are, I encourage you, take action! (Don't wait for the department to do it; you'll do it better anyhow.) Try to get a grad student poker game going. Or a biweekly pub night. Or start an informal seminar. Or an open problems session. Or whatever it is you want to do, where graduate students can just hang out together, without faculty, with it being about fun, instead of or combined with work. Besides having more fun, you'll open up more opportunities for the serendipitous events in your work-life that lead to exciting projects.

Friday, November 02, 2007

Breadth Requirements

In our department we're looking at the class requirements for Ph.D. students, and in particular breadth requirements.

I'll state clearly that my opinion is that breadth requirements are a good thing. Breadth requirements for Ph.D. students are like vegetables for kids -- they don't always like them, but they're good for them. When I was at Berkeley the breadth requirements were reasonably onerous and I think they were very positive for me (even if, as contemporaries will suggest, it seemed like I slept through most of them). I especially think it's good for theoretical people to have a background in many areas of computer science outside theory, and classes are arguably the best way to get that background. And certainly I would argue that people in systems need some theory in their background.

I'd enjoy hearing arguments from the other side. You can argue that breadth requirements in general are a bad idea, or specifically that theory people shouldn't have to take classes in other CS areas (they should be allowed to just do all theory!), or that systems people shouldn't have to take a theory class. Any EE readers or readers from other areas should chime in with arguments based on their experiences as well!

Wednesday, October 31, 2007

New Book : Algorithmic Game Theory

I recently received my "desk copy" of the new book Algorithmic Game Theory, edited by Nisan, Roughgarden, Tardos, and Vazirani.

While I haven't read it cover-to-cover yet, I'm very impressed by the book. It's taken a large area with a fairly short history, and broken it up into reasonable-sized chunks each written by an expert, with most chunks covering a new and active research area. For example, Michael Kearns writes about Graphical Games, Christos Papadimitriou explains the complexity of finding Nash equilibria, Jon Kleinberg discusses cascading behavior in networks and the corresponding economic issues, Ramesh Johari and David Parkes and Joan Feigenbaum and so many others have chapters on their specialties, and so on. Overall I count 45 contributors! The result is a solid tome that really combines breadth and depth to create a resource that I assume works well for people working in the area and is certainly useful for an outsider trying to look in and see what's going on. There are also exercises in some chapters; it could certainly be used as a textbook.

I'd like to see other books of this form, built up by a coalition of experts to cover emerging areas. You do lose something with this approach -- for example, many concepts are defined multiple times in various chapters, and while the authors have made an effort to point out relations among chapters, you don't really have the sense of coherence you get from most textbooks or research books written by a single author. On the other hand, you do get a much broader coverage of topics than you'd probably see from a single-author textbook, and I assume it was easier to spread the workload among authors. It's not clear to me that any single author (or group of 3-4) could have put something like this together in any reasonable amount of time. Kudos to the editors (and authors).

What other topics could benefit from a treatment like this?

Monday, October 29, 2007

Consulting

I enjoy consulting, for several reasons:
  1. It fulfills a need to be more directly relevant to the real world.
  2. It gives me a chance to work with different sets of people.
  3. It allows me to learn new things and exercise skills that I don't necessarily use as a professor.
  4. It pays well.
I’ve consulted for research labs, startups, and larger companies, doing research, development, and expert witness work. I’ve been consulting since graduate school, and it’s an activity I’d recommend for most graduate students, once they’re comfortable in their research. I view the flexibility of being able to do research while still being able to take on consulting projects as one of the major benefits of being a professor.

Although I do know several algorithmists who consult at times, my impression is that the consulting culture is surprisingly small in computer science theory, even among algorithmists, and that it is much more prevalent in networking and EE theory. (The exceptional case I can recall is around the time of Akamai, when, once it seemed like a sure bet, a lot of theorists hopped on the bandwagon for at least a short stint.) I generally hear people talk about consulting much more at networking and EE conferences than I do at theory conferences. Maybe it's just impolite to talk about it. Or maybe theorists don't consult much because most don't have the skills people are looking for in consultants. Or maybe most theorists just aren't that interested in consulting, and that's part of why they ended up being theorists.

Do you consult? (I'm particularly curious -- do complexity theorists consult?) If not, why not? Are my impressions about theory+consulting just way off base?

Sunday, October 28, 2007

Academic jobs this season?

I’ve already mentioned Harvard is doing a junior faculty search – although not specifically for a theorist. Any other faculty hiring announcements, either broad-based or theory specific? Or any new postdoc announcements worth repeating here?

Monday, October 22, 2007

Service to the Academic Community: Goals?

Except for the rare award (like the ACM-SIGACT Distinguished Service Award or the Aaron D. Wyner Distinguished Service Award), our academic community does not appear to especially value service to the community. Research dominates our mindset. At the same time, the community on the whole seems pleasantly conscientious about giving back. Somehow, conferences get run, workshops get organized, journal papers get reviewed, the NSF panels meet and make their bizarre decisions, and so on.

Strangely, I can't think of any time when a "philosophy of service" has been explained to me -- in graduate school or as a new faculty member, by the CS theory community or by the institution that employs me. Thinking about this, I find it rather odd. Surely, someone should be suggesting what an appropriate level of time and effort to put into service is, and where those efforts might best be applied.

(There is some advice I've seen in books for new faculty members. For example, Robert Boice's book Advice for New Faculty Members suggests that new faculty essentially spend as little time as possible on service activities, and make early service opportunities as self-benefiting as possible. I roughly agree, but we should make this part of a well-reasoned philosophy of service -- the community should protect its youngest members and help them flourish.)

I think this topic is ripe for community discussion, and some resulting general guidelines. People should have some guidelines as to what's expected -- what's usual and what's exceptional community service. (And to be clear, here I mean service to the scientific community at large; university service, for example, is separate.)

I'll start with two basic, over-arching questions. (I have more specific ones in mind, but I'll save that for the next post.)
  1. How much time should be spent on academic service activities? [My take -- 1/8 time, or at least one hour a day on average, for senior faculty. Less for junior faculty starting out, but increasing steadily toward that. Ostensibly, people in research labs should also be at 1/8 service time -- anyone know of any policies on that?]
  2. Should we be better rewarding service, and if so, how?

Saturday, October 20, 2007

Brief FOCS Update

I went to FOCS for the day of tutorials. Great talks by all of the speakers (Terence Tao, Dan Boneh, and Dan Spielman) -- and it seemed to me like a very large attendance for the tutorials. I understand slides and such will be put on the appropriate FOCS page soon, so rather than give detailed comments, I'll provide a pointer when they're up.

Also, I wanted to give a big thanks to the local arrangements team of Philip Klein, Anna Lysyanskaya, and Claire Mathieu, as well as everyone assisting them, since I know from experience that local arrangements is a thankless job. I found today's setup wonderful. The room at Brown was great -- perfect size, fine acoustics. Lots of food and refreshments were put out, and to save the conference money, it wasn't catered, but the local arrangements team brought stuff in themselves. (I arrived early and saw Anna driving up with a car full of goodies!) That's the kind of extra dedication that helps keep conferences affordable for everyone, at the expense of the personal time of the people who have to make it all run. So thanks again!

The FOCS setup has reminded me that I wanted to do a series of posts on the theme of service, something I'll try to get to this week.

Friday, October 19, 2007

Harvard Computer Science Hiring

We're eagerly looking for candidates in CS in all areas at the junior level (i.e., assistant profs). (Be sure to tell your non-theory friends as well.) Here's the ad for the search.

Wednesday, October 17, 2007

Harry Lewis's book, Excellence Without a Soul

Since my colleague Harry Lewis is kind enough to stop by and post comments from time to time, I would be remiss not to encourage everyone with an interest in the education of college students -- in the broad sense of whether and how universities should be teaching students to become adults -- to pick up his book Excellence Without a Soul: How a Great University Forgot Education (now also available in paperback). I highly recommend the book, and plan to give it to my daughters to read when they're teenagers so they're better prepared for the college experience.

The book takes a stark look at the directions of modern college education, painting a picture of increasing directionlessness at the leading research universities, using Harvard and Harry's experience as Dean of Harvard college as a backdrop. Whether you agree with it or not -- and I have to admit I found a strong resonance with most of the issues in the book -- it is definite food for thought, well-reasoned and well-argued through and through.

There are two very small criticisms I can make of the book. The first is that while the book sheds light on a number of challenging problems, it frustratingly offers little in the way of solutions. However, I think this was quite intentional. Indeed, one of the points of the book is that for many of these problems, what is needed isn't a "solution", but leadership, discussion, and the building of a consensus within the university.

The second nitpick concerns an issue raised repeatedly in the book: the invasion of consumer culture into education. Students pay a great deal for an education, particularly at private institutions, and they expect to get what they pay for; Harry argues forcefully that this trend is not good for the education of students. It would seem to me that this is another compelling reason why Harvard shouldn't charge tuition, as it might lessen the "I paid for this" attitude of many students (and parents), but perhaps Harry believes that even if there were no tuition, the consumer attitude would remain.

Monday, October 15, 2007

Broadening the Teaching of Theory

Offhand, I can think of the following primary "types" of computer science classes:
  1. Classes specialized for graduate students in a subfield (e.g., theory, or usually the next level down, e.g. cryptography, pseudorandomness).
  2. Classes for graduate students in CS.
  3. Classes specialized for undergraduate CS majors.
  4. Classes for undergraduates primarily in other sciences.
  5. Classes for general, non-science undergraduates.
I think most CS professors naturally gravitate to teaching types 1 and 3 -- classes in their area of specialization. As a community, though, I think theorists generally make at least some effort to broaden the scope of at least some classes to types 2 and 4. Certainly I try to encourage non-CS majors to take my undergraduate algorithms class -- what scientist shouldn't be learning to think algorithmically? -- and to get non-theory graduate students to take my randomized algorithms and algorithms at the end of the wire courses. I suspect, though, that as a community we could be doing more. While theory seems less isolated within CS to me today than it did, say, a decade ago, there still seem to be plenty of bridges left to build.

Perhaps more challenging are classes of type 5 -- classes for non-scientists. The question of how to design a science distribution course for non-scientists is not specific to computer science, but in CS it is perhaps especially challenging. Programming is often taken to be off-limits, since how much worthwhile programming can be taught to people who have never seen a computer program before? And theory seems to require too much math. Plus there is the age-old question of how to make it all relevant to non-scientists.

Harry Lewis co-designed a course, Bits, which I think is a great example of what's possible, and is the course I wish I could have designed, if I wasn't so busy teaching my regular classes. I enjoy this first paragraph from the course description:
This is a course about information for students who are not technically or mathematically oriented. It aims to explain the dramatic increase in the amount of information that is being communicated and stored, and the consequences for society of that increase in capacity. We are on the cusp of a revolution, potentially as consequential as the invention of printing or the replacement of animals by steam engines as the vehicles of transportation. The course aims to equip those who will be determining policies, whether as legislators, corporate leaders, or ordinary citizens in their multiple roles, with an awareness of the choices that lie only a short time ahead of us.
I am also strongly biased in that Harry followed my advice, with much of the technical meat in the beginning of the course covering basic information theory, specifically compression and coding. (Note: I emphasize that Harry followed my advice, not that he took my advice. As far as I know, he always planned to structure the course this way, and it's just happenstance that his take on what was worthwhile matched mine. I'm pleased nonetheless.)

The course is an interesting mix of science, technology, and policy. The lectures are apparently freely available online now that the course is over, for those who want to see them. Someday, perhaps, I'll get to design a course like this. I'd be curious to hear from others what other examples there are of CS-based courses with non-trivial technical content meant for non-science majors. (Jon Kleinberg's new course comes to mind, but it really seems to be type 4, built for scientists, including mathematically oriented economists and sociologists.)

Tuesday, October 09, 2007

The simplest insertion/deletion channel

The simplest binary insertion/deletion channel that I can think of is the following: with probability p, each bit independently results in two copies of itself. This is a special case of a subclass of channels that I have dubbed sticky channels, which are like sticky keyboards: each symbol can result in a random number of copies of that symbol.

Sticky channels have the nice property that contiguous blocks of 0s (resp. 1s) at the input correspond to contiguous blocks of 0s (resp. 1s) at the output. This property makes sticky channels easier to analyze than more general insertion/deletion channels.
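To make the model concrete, here's a minimal Python sketch (my own illustration, not code from the paper) of the duplication channel described above, along with a check of the block-preservation property. The function names are mine, chosen for clarity:

```python
import random
from itertools import groupby

def sticky_channel(bits, p, rng=None):
    """Simplest sticky channel: each input bit independently
    'sticks' (is emitted twice) with probability p."""
    rng = rng or random.Random(0)
    out = []
    for b in bits:
        out.append(b)
        if rng.random() < p:
            out.append(b)  # a duplicate copy is inserted
    return out

def run_lengths(bits):
    """Lengths of the maximal contiguous blocks of identical symbols."""
    return [len(list(g)) for _, g in groupby(bits)]

x = [0, 0, 1, 0, 1, 1, 1]
y = sticky_channel(x, 0.5)
# Duplication never changes the symbol alternation, so input and
# output have the same number of blocks, in the same symbol order;
# each output block is between 1x and 2x its input block's length.
assert len(run_lengths(y)) == len(run_lengths(x))
```

In effect, the receiver sees exactly the input's sequence of run lengths, with each run of length L stretched to somewhere between L and 2L; this is why analyzing a sticky channel reduces to reasoning about run lengths rather than arbitrary insertions.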

I've just had a paper on sticky channels accepted to Transactions on Information Theory; here's a link to a preprint. The main result is that for that simplest channel above, I can numerically obtain very tight bounds on the channel capacity. But of course I'd still like to know -- is there a simple formula that gives the capacity as a function of p? And is there are simple and efficient coding scheme that nearly reaches the capacity?