Tuesday, December 22, 2009
Blog happenings
As you may have noticed, this blog has been on a bit of a hiatus. On the one hand, my expectations for politics-by-blog have decreased to a minimum, and my boredom with the theory community has grown. On the other hand, I have not had enough time for exposition-by-blog --- or, better said, I discovered that my thought process is too obsessive and erratic to leave room for regular expository interruptions.
Posted by Mihai at 1:59 PM 14 comments
Friday, December 4, 2009
Talks
- a more practical one at Poly on Monday (December 7) at 6pm in room EC 101 (Fac. Automatică şi Calculatoare)
- a more theoretical one at UniBuc on Friday (December 11). Room and time to be announced here, but it will be earlier in the afternoon.
- Tabulation hash functions
Implementing hash tables with linear probing is more efficient than other implementations if the hash function is sufficiently random. Unfortunately, linear probing behaves very badly in conjunction with multiplication-based hash functions (the ones most used in practice), and as a result this algorithm is often avoided.
I will describe a very simple hash function, which is as fast as multiplication on current processors, but is based on indexing into precomputed tables (a minimal sketch in code follows after the abstracts). A mathematical analysis shows that this function guarantees constant running time for hash tables.
- Negative results for data structures
How do we prove that certain algorithmic results are impossible to obtain? For example, how do we prove that there is no linear-space data structure that supports range queries in constant time? In this lecture, I will describe a complete proof of this result, going through several simple but interesting steps.
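For the curious, here is a minimal sketch of simple tabulation hashing, which matches the description in the first abstract (split the key into characters and XOR together lookups from precomputed random tables); the exact construction and parameters in the talk may differ:

```python
import random

random.seed(0)
# Four precomputed tables of random 32-bit words, one per byte of a 32-bit key.
TABLES = [[random.getrandbits(32) for _ in range(256)] for _ in range(4)]

def tab_hash(x):
    # XOR together one table lookup per byte of the key.
    h = 0
    for i in range(4):
        h ^= TABLES[i][(x >> (8 * i)) & 0xFF]
    return h

# Example: map a few keys into a table of size 2^20 for linear probing.
print([tab_hash(x) % (1 << 20) for x in (12345, 67890, 2**31 - 1)])
```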
Posted by Mihai at 8:02 AM 8 comments
Tuesday, November 24, 2009
FOCS 2010
The FOCS 2010 website is already up. This promises to be a very interesting conference.
Posted by Mihai at 11:41 AM 1 comments
Tuesday, November 10, 2009
A Simple Encoding Proof
In this post, I discuss a nice and simple example of an encoding proof, showing that maintaining partial sums requires Ω(lg n) time per operation.
Go ahead and read the lecture notes in [PDF]. Or, if you prefer a more visual exposition, try the [PPTX] or [PPT] presentation. (These were extracted from my thesis and job talk.)
Posted by Mihai at 2:56 PM 20 comments
Friday, October 23, 2009
Encoding Proofs
Various fields have various notions of "nice proofs," be they combinatorial, or elementary, or bijective. In TCS, perhaps the correct standard for lower bound proofs should be "encoding proofs." In these proofs, one starts with the assumption that some algorithm exists, and derives from that some impossible encoding algorithm, e.g. one that can always compress n bits into n-1 bits.
- Most lower bounds cannot be taught in a regular class. But we can't go on saying how problems like P-vs-NP are so awesome, and keep training all our graduate students to round LPs better and squeeze randomness from stone.
- The reader will often not understand and appreciate the simple and beautiful idea, as it is too hard to pull apart from its technical realization. Many people in TCS seem to think lower bounds are some form of dark magic, which involves years of experience and technical development. There is certainly lots of dark magic in the step where you find small-but-cool tricks that are the cornerstone of the lower bound; the rest can be done by anybody.
- You start having lower-bounds researchers who are so passionate about the technical details that they actually think that's what was important! I often say "these two ideas are identical" only to get a blank stare. A lower bound idea never talks about entropy or rectangle width; such things are synonymous in the world of ideas.
Posted by Mihai at 8:48 PM 20 comments
Thursday, October 8, 2009
Nobels
Since we've been talking about prizes, let me mention the recently announced Nobel awards for 2009.
Posted by Mihai at 7:57 AM 8 comments
Friday, October 2, 2009
Follow-up
"I was on a hiring committee in one of these schools that decided not to interview you. Although I hesitated to post this comment, I think what I have to say will be helpful to your career."
Thank you for the comment; I appreciate the difficulty in writing on such a topic.
"The reason we decided against further considering your case was because of your reputation as a very difficult, arrogant, and opinionated person. We even read your blog and found many posts that confirmed this reputation."
I am aware that suggestions of this form were made. Of course, I may point out the significant pitfalls in searching for evidence (in a blog, of all places) for something that you are already biased to believe. I may also point out that the places that actually know me (the labs where I did internships) both made me offers, and a notable fraction of the universities where I interviewed were also interested in hiring me. So the suggestions may be a bit exaggerated, or perhaps there is a high variance in how people perceive me. If for some reason your university finds itself interested in my case, I would consider a conversation at a conference or an invitation for a talk as more reliable courses of action.
"At least in our university, a post like this one significantly hurts your chances of getting a job. We don't want to hire a person who writes a blog post insulting their departmental colleagues every time they're not nominated for a fellowship or an award. 'I can't believe my department decided to nominate X for a Sloan Fellowship when my research is so much deeper than X's.'"
If your department has no faculty engaging in the common activity of "bitching and moaning," I am not sure whether I should envy you for your luck, or worry about the hyper-formal environment that you may have in place. It is safe to say that I am not good at keeping rank formation; but it is also fair to say that I choose the battles where I pull out of the formation very selectively.
"You're smart, but there are *many* (equally) smart (or smarter) people who are also publicly nice."
It is my hope that I will prove you wrong on this statement.
Posted by Mihai at 9:23 PM 34 comments
Thursday, September 10, 2009
Labs vs Academia
The topic of the day seems to be labs vs academia (sparked by this, and indirectly by Muthu, picked up by Michael Mitzenmacher and Jon Katz). I thought I would contribute my own perspective, since I've been in labs for about a year, and went through university/lab and lab/lab transitions recently.
- In the Bay Area, Berkeley / Stanford weren't interested in interviewing me, but IBM gave me jobs (both a temporary and a full-time one, which was great).
- In the NY area, Princeton / NYU / Columbia weren't interested, but, again, ATT gave me a job.
- In Boston, the problem remains unsolved since MSR New England is not in the mood to hire young people. But arguably there are many more non-top-but-good universities, so this compensates (around San Francisco, for instance, this alternative is entirely lacking).
- Tradition and culture encourage you to leave for universities at some point;
- Labs reliably go through periods of instability, where things turn really bad, and most of the group leaves. Some labs never recover, while others recovered quite successfully, but with a significantly changed group (see IBM Almaden and ATT).
- If you spend too many years in a lab, there is a danger that you fade into TCS oblivion. (You get integrated more into practical groups, and stop being a theorist; you lose contact with the recent trends; etc.)
- The prospect of having students seems much more attractive once you get tired of the nitty-gritty details and would like to delegate some of them.
Posted by Mihai at 9:21 PM 16 comments
Wednesday, September 9, 2009
SODA / Data Structures
The SODA list of accepted papers, with abstracts, is here.
- Fully-Functional Succinct Trees (Kunihiko Sadakane, Gonzalo Navarro)
- On the Cell Probe Complexity of Dynamic Membership (Ke Yi, Qin Zhang)
- Data Structures for Range Minimum Queries in Multidimensional Arrays (Hao Yuan, Mikhail J. Atallah)
- A locality-sensitive hash for real vectors (Tyler Neylon)
- Deletion Without Rebalancing in Balanced Binary Trees (Siddhartha Sen, Robert E. Tarjan)
- Cache-Oblivious Dynamic Dictionaries with Optimal Update/Query Tradeoff (Gerth Stølting Brodal, Erik D. Demaine, Jeremy T. Fineman, John Iacono, Stefan Langerman, J. Ian Munro)
- Cell-probe lower bounds for prefix sums and matching brackets (Emanuele Viola)
+ A Lower Bound for Succinct Rank Queries (Mihai Patrascu)
- Applications of Forbidden 0-1 Matrices to Search Trees and Path Compression-Based Data Structures (Seth Pettie)
- A Nearly Optimal Algorithm for Approximating Replacement Paths and k Shortest Simple Paths in General Graphs (Aaron Bernstein) -- the main contribution is better preprocessing time of a data structure
- Counting Inversions, Offline Orthogonal Range Counting, and Related Problems (Timothy M. Chan, Mihai Patrascu) -- improves the preprocessing and offline times of some data structures
- Highway Dimension, Shortest Paths, and Provably Efficient Algorithms (Ittai Abraham, Amos Fiat, Andrew Goldberg, Renato Werneck) -- distance oracles are a very important topic in data structures; this paper is more about modelling, though.
- On the Exact Space Complexity of Sketching and Streaming Small Norms (Daniel M. Kane, Jelani Nelson, David P. Woodruff) -- whenever you talk about streaming and care about time complexity, this is clearly a data structure; if you solve the problem with k-wise independent hash functions, even more so.
- The 1+β Choice Process and Weighted Balls-into-Bins (Yuval Peres, Kunal Talwar, Udi Wieder) -- more on the power of two choices, though the process is probably unrealistic in a data structures context.
- Regular Expression Matching with Multi-Strings (Philip Bille, Mikkel Thorup) -- stringology is close enough :)
- Fast distance multiplication of unit-Monge matrices (Alexander Tiskin) -- also stringology.
Posted by Mihai at 11:34 PM 1 comments
Thursday, September 3, 2009
What is efficient?
In my previous post, I wrote about how calling P "efficient computation" is nothing more than bad philosophy. I think a lot more examples are in order.
Posted by Mihai at 4:31 PM 20 comments
Wednesday, September 2, 2009
Food for thought
Here is a short movie (with subtitles) on Romanian parents whose kids emigrated to the US. It is hilarious, at least for people who have gone through this.
Posted by Mihai at 9:55 PM 23 comments
Friday, August 28, 2009
Romania
Posted by Mihai at 9:28 AM 16 comments
Wednesday, August 26, 2009
Longest Multipermutation
Here's an algorithmic problem for your enjoyment:
Define a multipermutation to be a sequence of integers that contains all numbers from 1 to some K at least once. Examples include (1,3,1,2,2) or (1,4,3,2), but not (1,2,3,5,5).
Given an array of n integers, find the longest multipermutation in O(n) time.
I saw a recent paper doing this in superlinear time, and of course I had to find a linear-time algorithm :) Can you also find one?
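As a warm-up, here is a small checker for the definition as I read it (with K taken to be the maximum element); it only tests a candidate sequence and is, of course, not the linear-time algorithm asked for:

```python
def is_multipermutation(seq):
    # True iff seq contains every integer from 1 to max(seq) at least once
    # (and nothing outside that range).
    if not seq:
        return False
    return set(seq) == set(range(1, max(seq) + 1))

print(is_multipermutation([1, 3, 1, 2, 2]))  # True
print(is_multipermutation([1, 4, 3, 2]))     # True
print(is_multipermutation([1, 2, 3, 5, 5]))  # False: 4 is missing
```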
Posted by Mihai at 3:05 PM 17 comments
Sunday, August 23, 2009
A Technical Lemma
[You can also read this post in PDF.]
In a recent SODA submission, I had to prove a small technical lemma that I considered more or less straightforward. But what exactly is straightforward is in the eye of the beholder, and one of the referees did not like my (succinct and semicoherent) proof. I was thus asked to provide as detailed an explanation as I could, pronto. (Sound familiar? I maintain that we are all too often missing the forest for the trees, but that's a topic for another post.)
Anyway, I thought I would take the opportunity to explain this small technical lemma to my readers who fancy themselves working with entropy inequalities in the future. If you do lower bounds, such things will sooner or later feel quite standard (as standard as, say, maintaining some forward and backward pointers in your algorithm). So if you fancy yourself working in this area, you might consider going through this proof carefully as an exercise.
(A word of caution: Don't be discouraged if this seems overly technical. This is not "what we do" in lower bounds — the point is to discover non-straightforward stuff. Piano skills are not what makes a great composer.)
So, what exactly are we proving? Let A[0..n-1] be a bit vector, chosen uniformly at random. Let Q(i)=A[0]+...+A[i] be the partial sums of the vector (these were queries in my original problem).
I imagine the vector A being split into k blocks of n/k bits each. I denote by QΔ the Δ-th partial sum inside each block, i.e. the vector ( Q(Δ), Q(Δ + n/k), Q(Δ + 2n/k), ...).
We now look at Q0 (the first query in each block) and QΔ for some arbitrary Δ. The main technical question is: how much more efficient is it to encode Q0 and QΔ jointly, as opposed to encoding them separately?
Lemma 1. H(Q0) + H(QΔ) − H(Q0, QΔ) = Ω(k).
Proof. The lemma is simple and intuitive. First remark that H(a, a+b, a+b+c) = H(a, b, c); furthermore, this is H(a)+H(b)+H(c) if a, b, and c are independent. Thus, H(Q0) = H(sum in block 1) + H(sum in block 2) + ... Of course, the sum in a block is a binomial with n/k Bernoulli trials. Thus, H(Q0) = (k-1) * h(n/k), where I use h to denote the entropy of a binomial.
Just the same, H(QΔ) = h(Δ) + (k-1) * h(n/k). Thus H(Q0) + H(QΔ) ≈ 2k * h(n/k).
Finally, H(Q0, QΔ) = h(Δ) + h(n/k - Δ) + h(Δ) + h(n/k - Δ) + ... ≈ k*h(Δ) + k*h(n/k-Δ). Indeed, I can break (Q0, QΔ) into 2k independent components: the sum up to Δ, the sum from there up to n/k, the sum from there up to n/k + Δ, the sum from there up to 2n/k, and so on.
To conclude, we look up the entropy of a binomial on Wikipedia and find that h(m) = 0.5 * ln(πem/2) + O(1/m). Thus, doubling the number of trials m increases the entropy by ln(2)/2 - O(1/m) = Ω(1) bits. The lemma follows: either Δ ≥ n/2k, and we have h(n/k) - h(n/k-Δ) = Ω(1), or otherwise h(n/k) - h(Δ) = Ω(1). □
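As a numeric sanity check (not part of the proof), one can compare the exact entropy of a binomial with m fair-coin trials against the approximation above, converted to bits as 0.5·lg(πem/2), and watch it grow by about half a bit each time m doubles:

```python
from math import comb, log2, pi, e

def binom_entropy_bits(m):
    # Exact Shannon entropy, in bits, of Binomial(m, 1/2).
    probs = [comb(m, i) / 2**m for i in range(m + 1)]
    return sum(-p * log2(p) for p in probs if p > 0)

for m in (50, 100, 200, 400):
    exact = binom_entropy_bits(m)
    approx = 0.5 * log2(pi * e * m / 2)
    print(m, round(exact, 3), round(approx, 3))
# The entropy increases by about 0.5 bits each time m doubles.
```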
Now we get to the interesting part. I need to prove that there is a lot of loss in a separate encoding of Q0 and QΔ, even in a probabilistic universe where I've conditioned on some event. Of course, I need a lower bound on the probability of the event (for instance, if the event is A[0]=A[1]=...=0, all entropies are zero, so I can't prove anything).
How rare an event can I handle? The magic answer is Pr[E] ≥ 2^(-εk), for some small constant ε. Why is this the right answer?
- First, I should be able to handle such an event. If I have some variable X with some entropy H(X), and I tell you some event E has happened, how much did I knock off the entropy of X? You've just learned roughly lg(1/Pr[E]) bits of entropy (the fact that E happened). Intuitively, H(X) cannot decrease by more than what you learned. Thus, if I had a lower bound of Ω(k) on some entropic quantities, I should be able to maintain it under an event of measure 2^(-εk).
- Second, conditioning on such an event E is a very frequent trick in lower bounds — and 2^(-εk) is always the answer :) The event E fixes something (say, some M bits of memory have been read by the algorithm, so Pr[E] ≈ 2^(-M)). How can I bound the amount by which E simplified the problem? (In particular, I need to show the answer is not already known, otherwise there's no more lower bounding to do.)
A good way to bound the help E can render is to break the problem into 99*M subproblems. Intuitively, the M bits that the algorithm knows cannot help with 99*M independent subproblems; for some of them, the algorithm must be clueless. Thus, the logic goes in the other direction: I'm considering k blocks precisely because I need to argue about an algorithm that has learned εk bits, i.e. I am in a universe where we condition on an event of probability 2^(-εk).
Lemma 2. For any event E with Pr[E] ≥ 2^(-εk), we have H(Q0 | E) + H(QΔ | E) − H(Q0, QΔ | E) = Ω(k).
Proof. Remember the intuition from above: if I tell you that some event E happened, I am affecting your entropies by lg 1/Pr[E]. There's only one problem in implementing this intuition, namely that it's false. Imagine the following distribution: on odd days, a uniformly random vector of n bits; on even days, the vector (0,...,0). The entropy, which is an average, is n/2, i.e. quite high. But if I condition on an event with probability 1/2, I can bring the entropy down to zero.
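To make this counterexample concrete, here is a tiny numeric illustration with n = 10 (chosen arbitrarily); it is not part of the proof:

```python
from math import log2

def H(probs):
    # Shannon entropy, in bits, of a distribution given as a list of probabilities.
    return sum(-p * log2(p) for p in probs if p > 0)

n = 10
# Mixture: with prob 1/2 a uniform n-bit vector (odd day), with prob 1/2 all zeros (even day).
p_zero = 0.5 + 0.5 / 2**n            # the all-zeros vector
p_other = 0.5 / 2**n                 # each of the other 2^n - 1 vectors
print(H([p_zero] + [p_other] * (2**n - 1)))  # about n/2 + 1 bits: high on average
print(H([1 / 2**n] * 2**n))                  # conditioned on an odd day: n bits
print(H([1.0]))                              # conditioned on an even day: 0 bits
```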
What I need is concentration for entropies. And how do we always prove concentration? By using independence, which, luckily, we have here — remember how we broke the entropies into independent binomials. Of course, the formal implementation may not seem obvious yet, so let's do it carefully.
Let's assume Δ ≤ n/2k by symmetry. In this case, Lemma 1 boils down to: H(Q0) + H(QΔ) − H(Q0, QΔ) ≈ 2k*h(n/k) - k*h(Δ) - k*h(n/k-Δ) ≥ k * (h(n/k) - h(Δ)) = Ω(k). I want to exploit the independence of k quantities, each with expectation h(n/k) - h(Δ).
These quantities come from the definition of entropy. Remember that H(X) = E[ lg 1/p(X) ], where p(.) is the probability density function of X. Let's make the following convenient definitions:
- Xi = the sum of the bits in the i-th block. Let X = (X0, X1, ...).
- Yi = the sum of the first Δ bits in the i-th block. Let Y = (Y0, Y1, ...). Denote the probability densities for Y by q(.).
In the regime Δ ≤ n/2k, we used the lower bound H(Q0) + H(QΔ) − H(Q0, QΔ) ≥ H(X) - H(Y). Thus, I am interested in analyzing H(X|E) - H(Y|E).
To this end, I write H(X) - H(Y) = E[ lg q(Y)/p(X) ] = E[ lg (q(Y1) * q(Y2) * ...) / (p(X1) * p(X2) * ...) ] = E[ lg q(Y1)/p(X1) + lg q(Y2)/p(X2) + ... ].
This is in the desired form, the sum of k independent quantities, since (X1, Y1) is independent of (X2, Y2), etc. Each term has expectation h(n/k) - h(Δ) ≥ ln(2)/2 - O(1/m), as we showed in Lemma 1. Thus, I can apply Chernoff and conclude that the sum is Ω(k) with probability 1 - 2^(-Ω(k)).
This probability is so large that an event happening with probability 2^(-εk) does not scare me any more. Indeed, even if I condition on E, the sum is Ω(k) with probability 1 - o(1). Thus, its expectation is also Ω(k).
Have I just shown H(X|E) - H(Y|E) = Ω(k)? Well, not quite. I've shown E[ lg q(Y)/p(X) | E ] = E[lg 1/p(X) | E] - E[lg 1/q(Y) | E] = Ω(k). But the probability density function p(.) is just another function when I condition on E. Under this conditioning, the real probability density is another function p'(.), i.e. H(X|E) = E[ lg 1/p'(X) ].
First remark that the probability mass inside E is a subset of the mass in the general universe, i.e. p(x) ≥ p'(x) * Pr[E], for any x. Thus p'(x) ≤ p(x) * 2^(εk) and lg 1/p'(x) ≥ lg 1/p(x) - εk. So I can switch between p'(.) and p(.): E[lg 1/p'(X) | E] ≥ E[lg 1/p(X) | E] - εk.
Switching between q(.) and q'(.) is more tricky, since I want an upper bound. It is true that for many y, q'(y) could be much smaller than q(y), thus increasing the "entropy" lg 1/q'(y). Let's call y bad if q'(y) ≤ q(y) / 2^(2εk). How much mass could the bad occupy? We have Sum_y q(y) = 1, so Sum_{y ∈ Bad} q'(y) ≤ 2^(-2εk). Thus, even inside the event E, the bad y's only have mass 2^(-εk). Now H(Y|E) is the average over the good y's and the bad y's. For the good, lg 1/q'(y) ≤ lg 1/q(y) + 2εk. For the bad, lg 1/q'(y) ≤ n (the smallest event in my distribution has probability 2^(-n)). But the weight of the bad inside E is 2^(-εk) = o(1/n) for reasonable k. Thus, the contribution of the bad is o(1).
We have thus shown E[lg 1/q'(Y) | E] ≤ E[lg 1/q(Y) | E] + 2εk. We conclude that H(X|E) - H(Y|E) ≥ E[ lg q(Y)/p(X) | E ] - 3εk = Ω(k). □
That's it. My hope was to present the proof as a sequence of interesting ideas, rather than a long technical thing filled with calculations. I don't know if I succeeded in conveying this feeling; let me know if you need help with any of the steps.
Posted by Mihai at 5:41 AM 19 comments
Friday, August 21, 2009
An Evening in Bucharest
5:30pm. I am at the "Brewers' Court," meeting with an old friend from computer olympiads, now faculty at Bucharest Polytechnic. We order 1-liter beers and I describe an algorithm answering the main open problem from a recent paper. The first question I get is quite polite (I tend to get a lot of respect in Romania), but can roughly be translated as "Dude, does this make any sense?" As is often the case, I show that I cannot focus without coauthors. My algorithm is obviously nonsense, except that it contains enough original ideas that a correct algorithm can be reconstructed in 10 minutes.
7:00pm. I take a few more steps on a path that I have been pursuing for a long time: I try to communicate some of the energy, passion, and standards of US research (sorely lacking in the parts of Europe that I have had contact with). We both have 1-liter beers.
8:00pm. My godfather arrives. We all have 1-liter beers, and talk about work in networking (he is working in software at a major corporation in this field; I am at ATT, of course).
9:00pm. Another friend from high school (CN Carol I, class of 2001) arrives. We have 1-liter beers with some food, and gossip about another high-school friend (now CEO of a software company).
11:00pm. Two Romanian re-pats (old friends from MIT) arrive. We have 1-liter beers, and talk about their start-up, the child that one has, etc.
12:30am. Another re-pat (Harvard) arrives. We have 1-liter beers, and talk about the cars that they drive at 150mph, the good old times at Physics Olympiads, and life.
1:30am. Yet another re-pat (MIT) arrives. We have 1-liter beers, and talk about the hedge fund where he works, the likely inflation for the dollar in the next 10 years, and life in Romania vs the US.
3:30am. I make it to my sister's apartment and write blog posts (at the end of my European trip, I am still jet-lagged).
Posted by Mihai at 8:08 PM 0 comments
Monday, August 3, 2009
More on Problem 2
You might be wondering what caused the hiatus in posting CEOI problems. My modus operandi as a researcher is to indulge in whatever obsession I currently have. For the past week, the raging obsession has been Problem 2 (tempered, of course, by the bureaucracy of joining AT&T).
Observation 1. In an optimal solution, any two rectangles are either disjoint or "nested" (one of them is taller and has the base included in the other). If two rectangles were to intersect properly, the shorter one can be shrunk horizontally until the rectangles become disjoint.
Let the points have coordinates (x[i], y[i]). Assume they are sorted by x.
Observation 2. We can construct an optimal solution in which every rectangle satisfies:
- it extends horizontally from some x[i] to some x[j];
- it extends vertically up to A / (x[i] - x[j]), i.e. as high as possible.
- it covers points i and j, i.e. y[i], y[j] ≤ A / (x[i] - x[j]).
An O(n^4) solution. We thus know that there are only O(n^2) distinct rectangles to consider. Denote them by R(i,j), though not all pairs (i,j) correspond to valid rectangles.
If R(i,j) is used in the solution, we have a well-contained subproblem to solve: cover all points between x[i] and x[j], which are not already covered by R(i,j), that is, they are above the rectangle. Let A(i,j) be the minimal number of rectangles needed to cover all such points. This is our dynamic program.
We fill the dynamic program by increasing values of j-i. To compute A(i,j), let S be the set of points k (i < k < j) which are not covered by the rectangle. Note that i and j are always covered (by condition 3 above). Thus, any information pertinent to covering S has already been computed by our dynamic program.
Let S = {X_1, X_2, ...}. To compute A(i,j), we need to find splitters a < b < c < ... which minimize A(X_1, X_a) + A(X_{a+1}, X_b) + A(X_{b+1}, X_c) + ...
This is a classic dynamic programming problem itself, or, if you wish, a shortest paths problem where A(x,y) is the length of the edge from node x to node y. This subproblem can be solved in O(n^2) time (which is "linear time," since there are n^2 subproblems A(x,y)). Thus, the original dynamic program is filled in O(n^4) time.
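To illustrate just this splitter subproblem (not the full solution), here is a generic sketch in which cost[i][j] stands in for the already-computed value A(X_i, X_j); the function and parameter names are illustrative:

```python
def best_split(m, cost):
    # cost[i][j], for 1 <= i <= j <= m, is the (already computed) optimal number
    # of rectangles for covering the contiguous group X_i..X_j.
    INF = float('inf')
    best = [0] + [INF] * m            # best[j] = optimal total for X_1..X_j
    for j in range(1, m + 1):
        for i in range(1, j + 1):     # the last group is X_i..X_j
            if best[i - 1] + cost[i][j] < best[j]:
                best[j] = best[i - 1] + cost[i][j]
    return best[m]                    # minimum over all choices of splitters
```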
An O(n^3) solution. A common strategy for improving the running time is to try to maintain more information. Thus, I first reformulate the dynamic program above to use O(n^3) state.
Let T(i,j,k) be the optimal number of rectangles to cover all points (x,y) with x[i] ≤ x ≤ x[j] and y ≥ heights[k], where heights contains all y's in sorted order. Unlike the previous solution, I consider all heights k, not just the maximal one for a fixed i and j.
This dynamic program is easy to fill in O(n^4) time. For some triple (i,j,k), I will either:
- draw a rectangle from x[i] to x[j] of maximal height. If this height is more than heights[k], I get the optimum for covering the remaining points from T(i,j, new height).
- if there is no rectangle spanning x[i] to x[j], there must be a break point m, such that T(i,j,k) = T(i,m,k) + T(m+1,j,k). Iterate to find the optimal such m.
To compute the optimal T(i,j,0), proceed exactly as above, taking time O(n). The rest of T(i,j,k) for k>0 can be computed in constant time per entry. For each T(i,j,k+1), I have two possibilities:
- T(i,j,k+1) = T(i,j,k).
- T(i,j,k+1) < T(i,j,k). Thus, the optimal solution for k+1 is different from the optimal solution for k. But the sets of points that must be covered by these two solutions differ by only one point P (with y[P] = heights[k]). Thus, point P must not be covered by the solution for T(i,j,k+1) — if it were, the better solution for T(i,j,k+1) would also hold for T(i,j,k). Thus, x[P] is a break point in the solution T(i,j,k+1), and we have T(i,j,k+1) = T(i,P-1,k+1) + T(P+1,j,k+1).
Posted by Mihai at 3:13 PM 11 comments
Sunday, July 19, 2009
CEOI: Problem 2
The discussion on Problem 1 was quite active; I even posted a lower bound for the streaming complexity of the problem. The official solution is:
The basic idea is to sweep through each line i, and maintain the “height” of each column, i.e., the number of ones in that column extending upwards from line i. Given the heights of each column, one has to find a subset S of columns maximizing |S| • min(S). You can find this subset by trying values for min(S) and counting how many heights are greater or equal. By maintaining the heights in sorted order, this computation becomes linear time.
To make the algorithm run in O(N•M) time, the key observation is that, when the sweep line advances to another row, a height is either incremented or reset to zero. This means that one can maintain the sorted order in linear time: place the values that become zero up front, and maintain the order for the rest (which is unchanged by incrementing them). A short code sketch of this sweep appears at the end of the post.
Let us now move to a problem of average difficulty. Originally, I was worried that it would be too easy (the contestants really know dynamic programming), but it seems the details behind the dynamic program were unusual enough, and some people missed them.
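Here is a minimal sketch in code of the sweep just described; I do not restate the problem itself. For simplicity it re-sorts the heights on every row, giving O(N•M•lg M) instead of the O(N•M) obtained by maintaining the sorted order incrementally:

```python
def best_value(matrix):
    # Sweep the rows; heights[c] = number of consecutive ones in column c
    # ending at the current row. For each row, maximize |S| * min(S) over
    # subsets S of columns by sorting the heights in decreasing order.
    if not matrix:
        return 0
    best = 0
    heights = [0] * len(matrix[0])
    for row in matrix:
        heights = [h + 1 if bit else 0 for h, bit in zip(heights, row)]
        for size, h in enumerate(sorted(heights, reverse=True), start=1):
            best = max(best, size * h)
    return best
```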
Posted by Mihai at 5:40 AM 18 comments
Friday, July 17, 2009
CEOI: Problem 1
As I've announced before, I was the chair of the Scientific Committee at The 16th Central European Olympiad in Informatics (CEOI'09). I will have a few non-technical posts on the experience, but in the next 6 posts, I will simply let you enjoy the problems.
- You can form your own impression of the level of these contests (I think it is quite high, and it will allow us to recruit many researchers down the road. Consider that one has to solve and implement 3 problems in 5 hours.)
- These are great algorithmic puzzles, of the kind that is hard to come by (as opposed to math puzzles, which are everywhere, but I find boring).
- You may finally answer the question: Are you smarter than an 11th grader? :)
Posted by Mihai at 4:13 AM 16 comments
Friday, June 26, 2009
The Conceptual Wars
Scott Aaronson, in his regular post-FOCS-notification column, has identified me as a leader of the Technicians Resistance Army of TCS (a leader, at least, in irascibility).
Posted by Mihai at 9:26 PM 5 comments
Thursday, June 4, 2009
Conceptual Contributions Conference
At STOC, there was an announcement of a new conference dedicated to conceptual contributions, titled ICS (Innovations in Computer Science). Michael asks for more discussion on this conference.
Personally, I am enthusiastic about ICS. It seems like one of the best ideas in decades for improving the quality of STOC/FOCS.
Posted by Mihai at 6:02 PM 9 comments
Tuesday, May 12, 2009
Conference in Romania
I am on the program committee for Advances in the Theory of Computing (AITC'09), to take place in Timișoara, Romania on September 26-29. The conference is a new, experimental theory track for a larger and more established conference (The 11th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC'09).
The deadline for regular papers is May 30. You know you always wanted to visit Romania, so use the next couple of weeks to wrap up a result and send it in!
Alternatively, the conference will double up as a workshop (papers in the "presentation track" will not be published in proceedings, allowing you to submit elsewhere). The deadline for such presentations is July 1st.
So come to discuss the most exciting developments in theory, while enjoying your 1-liter beers with a view of one of the most beautiful Romanian cities.
Posted by Mihai at 5:38 PM 8 comments
Friday, May 8, 2009
Letter Grades
Posted by Mihai at 3:42 AM 13 comments
Sunday, May 3, 2009
Complexity Everywhere
I know that whenever I write about TCS politics on this blog, it ends up bad. For instance, I get a comment such as the following one (left by an anonymous to my last post):
"What makes it tough for some of the papers you cite is the view that shaving off log factors is often viewed as much less interesting than larger improvements."
This, of course, makes my Latin blood run even hotter, and I cannot help writing follow-up posts (this is the first). If only I could keep my promise of not writing about politics, my life would be so much simpler. (If only I could learn from history... I got to observe my father become a leading figure in Romanian dermatology a decade before he could get a faculty position — mainly due to his Latin blood. He got a faculty position well into his 50s, essentially going straight to department chair after the previous chair retired.)
So, let's talk about shaving off log factors (a long overdue topic on this blog). As one of my friends once said:
"All this talk about shaving off log factors from complexity people, who aren't even capable of shaving on a log factor into those circuit lower bounds..."
There is something very deep in this quote. Complexity theorists have gone way too long without making progress on proving hardness, their raison d'être. During this time, drawing targets around the few accidental arrows that hit walls became the accepted methodology. For instance, this led to an obsession with the polynomial / non-polynomial difference, where at least we had an accepted conjecture and some techniques for proving something.
Complexity theory is not about polynomial versus non-polynomial running times. Complexity theory is about looking at computational problems and classifying them "structurally" by their hardness. There are beautiful structures in data structures:
- dictionaries take constant time, randomized. (But if we could prove that deterministic dynamic dictionaries need superconstant time per operation, it would be a very powerful message about the power of randomness — one that computer scientists could understand better than "any randomized algorithm in time n^c can be simulated deterministically in time n^(10c) if E requires exponential size circuits.")
- predecessor search requires log-log time. The lower bound uses direct sum arguments for round elimination in communication complexity, a very "complexity topic." A large class of problems are equivalent to predecessor search, by reductions.
- the hardness of many problems is related to the structure of a binary hierarchy. These have bounds of Θ(lg n) or Θ(lg n / lglg n), depending on interesting information-theoretic issues (roughly, can you sketch a subproblem with low entropy?). There are many nonobvious reductions between such problems.
- we have a less sharp understanding of problems above the logarithmic barrier, but knowledge is slowly developing. For instance, I have a conjecture about 3-player number-on-forehead games that would imply n^Ω(1) lower bounds for a large class of problems (reductions, again!). [This was in my Dagstuhl 2008 talk; I guess I should write it down at some point.]
- the last class consists of the "really hard" problems: high-dimensional ones for which there is a sharp transition between "exponential space and really fast query time" and "linear space and really slow query time." Whether or not there are reductions among these is a question that has preoccupied people for quite a while (you need some gap amplification, a la PCP). Right now, we can only prove optimal bounds for decision trees (via communication complexity), and some weak connections to NP (if SAT requires strongly exponential time, partial match requires weakly exponential space).
Let's look at algorithms:
- Some problems take linear time (often in very non-obvious ways).
- Sorting seems to take super-linear time, and some problems seem to be as fast as sorting. My favorite example: undirected shortest paths takes linear time, but for directed graphs it seems you need sorting. Why?
- FFT seems to require Θ(n lg n) time. I cannot over-emphasize how powerful an interdisciplinary message it would be if we could prove this. There are related problems: if you can beat the permutation bound in external memory, you can solve FFT in o(n lg n). The permutation bound in external memory is, to me, the most promising attack on circuit lower bounds.
- some problems circle around the Θ(n sqrt(n)) bound, for reasons unclear. Examples: flow, shortest paths with negative lengths, min convolution with a mask. But we do have some reductions (bipartite matching is as hard as flow, bidirectionally).
- some problems circle around the n^2 bound. Here we do have the beginning of a classification: 3SUM-hard problems. But there are many more things that we cannot classify: edit distance and many other dynamic programs, min convolution (signal processing people thought hard about it), etc.
- some problems have an n*sort(n) upper bound, and are shown to be X+Y-hard. Though the time distinction between n^2 and n*sort(n) is tiny, the X+Y question is as tantalizing as they get.
- some problems can be solved in n^ω by fast matrix multiplication, while others seem to be stuck at n^3 (all pairs shortest paths, given-weight triangle). But interestingly, this class is related to the n^2 problems: if 3SUM needs quadratic time, given-weight triangle requires cubic time; and if min-convolution requires quadratic time, APSP requires cubic time.
- what can we say about all those dynamic programs that run in time n^5 or something like that? To this party, TCS comes empty-handed.
- how about problems with super-polynomial but sub-exponential running time? Ignoring this regime is why the misguided "polynomial / non-polynomial" distinction is often confused with the very meaningful "exponential hardness." There is much recent work here in fixed-parameter tractability. One can show, for instance, that k-clique requires n^Ω(k) time, or that some problems require 2^Ω(tree-width) time.
And what can we say about k-SUM and all the k-SUM-hard problems (computational geometry in k dimensions)? This is an important illustration of the "curse of dimensionality" in geometry. I can show that if 3SAT takes exponential time, k-SUM takes n^Ω(k) time.
Finally, what can we say about PTAS running times? In my paper with Piotr and Alex, we showed that some geometric problems require n^(Ω~(1/ε^2)) running time. This has a powerful structural message: the best thing to do is exhaustive search after a Johnson-Lindenstrauss projection.
- inside exponential running time, there is the little-known work of [Impagliazzo-Paturi] showing, for instance, that sparse-3SAT is as hard as general 3SAT. Much more can be done here.
Posted by Mihai at 3:21 PM 29 comments