Robert Haas

Thursday, June 28, 2012

Absurd Shared Memory Limits

Today, I fixed a problem. Or at least, I think I fixed it. Time will tell. But Thom Brown seems pretty happy, and so does Dan Farina. So let me tell you about it. Here's the executive summary: assuming the patch I committed today holds up, PostgreSQL 9.3 will largely eliminate the need to fiddle with operating system shared memory limits.

perf: the good, the bad, the ugly

For the past few months, I've spent a good deal of time using the Linux utility perf to profile PostgreSQL. On the whole, I've been favorably impressed. Before I started using perf, trying to profile generally meant using oprofile or gprof. Unfortunately, both of those tools have serious problems.

PostgreSQL 9.3 Development Schedule

At the PGCon 2012 Developer Meeting, we discussed the schedule for PostgreSQL 9.3 development. Here's what we agreed:

CommitFest #1: June 15, 2012 - July 15, 2012
CommitFest #2: September 15, 2012 - October 15, 2012
CommitFest #3: November 15, 2012 - December 15, 2012
Planning Week: December 8, 2012 - December 15, 2012
CommitFest #4: January 15, 2013 - February 15, 2013
Final Triage: February 1, 2013 - February 7, 2013

This is basically the same schedule we used for the PostgreSQL 9.2 development cycle, but we've added a "planning week" and a "final triage". The idea of the planning week, and also of the final triage week, is to identify which large patches still have a hope of getting done in time to be committed to PostgreSQL 9.3. In that way, if a patch has no chance of being ready for commit, people can stop working on it and focus their efforts on those that do. Hopefully, this will make the final CommitFest go a little more smoothly than it did this time, with less burn-out and a quicker denouement.

We also agreed on one significant change to the review process: if you submit a patch, you must also volunteer to review a patch. In earlier development cycles, we have been able to rely on volunteer reviewers to do much of the reviewing work, but that didn't work as well during the 9.2 cycle. Although I'm hopeful that this will ease some of the reviewing crunch, I'd also like to underscore the importance of volunteer reviewers. Reviewing is a great way to get involved in the development process without spending too much time on it, and often leads to significant improvements in design and usability as well as code quality. It also helps to ensure a variety of opinions about important patches, which usually leads to a stronger product overall, and, at least in my experience, it is also a great way to learn more about PostgreSQL.

Monday, April 09, 2012

Don't Take Me Too Seriously

My recent blog post showing linear read-scaling out to 64 cores has generated a lot of attention, which I frankly did not expect. To be sure, it's a cool result, but keep in mind that it's testing a certain very specific workload: very short transactions running very simple queries to do repeated primary key lookups on a single table. This, of course, is not an unimportant workload; in fact, you could argue that some of the NoSQL products on the market are basically purpose-built databases for exactly this kind of workload (sometimes coupled with a facility for durable or non-durable writes).

On the flip side, it's also not the only workload. The reason why PostgreSQL 9.2devel so completely stomps the performance of earlier releases on this test is that the transactions are very, very short. If you ran the same test, but with BEGIN before the first query and END after the last one, or if you striped the queries across multiple tables, you would eliminate the lock manager contention that holds performance on PostgreSQL 9.1 in check on this test.

For most people, therefore, there's probably no reason to panic just because you're running one of the existing releases. You may get better performance on PostgreSQL 9.2, when it's released, but chances are that if you had the sort of workload for which these changes make a truly dramatic difference, you wouldn't have picked PostgreSQL in the first place. What I think is exciting about these changes is not so much that existing users will see huge performance benefits (although some will; we have a lot of good changes in this release) but that PostgreSQL will become usable in environments where it currently can't compete.

Of course, that's not to say that we're going to put memcached out of business. There will probably always be cheaper alternatives to an RDBMS if the only work you need to do is primary key lookups and stores, and especially if you don't need durability. But many people need good performance on large numbers of simple queries and additionally need the ability to do some more complex processing, and I'm hopeful that these scalability changes will make it much simpler to deploy PostgreSQL effectively in such environments.

Although for most people there's no huge rush to upgrade, if you're running PostgreSQL 8.3.x or older, it's time to think hard about getting onto a newer version. Community support for PostgreSQL 8.3.x will end in Q1 of next year. If you're running anything older than that, it's already unsupported; moreover, every release from 7.4 through 8.3 featured major performance improvements.

Tuesday, April 03, 2012

Did I Say 32 Cores? How about 64?

Remember when I blogged about linear read scalability out to 32 cores? Well, the awesome Nate Boley provided me with access to his brand new 64-core server. I ran my usual suite of read-only pgbench tests just to baseline its performance, and found that the performance scaled linearly all the way out to 64 clients. OK, it wasn't quite linear: the 64-client performance was only 63.68 times the single-client performance. Still, I'll take it. Graph is below.

Why Is My Database Corrupted?

Working in the support department (as I do) is a good way to hear lots of horror stories. From time to time, we get reports of what can only be described as database corruption: errors reading tables (or even trying to connect to the database) due to missing files, corrupted data blocks or tuples, or rows that show up in the table but not its indexes. Attempting to decode a corrupted tuple can cause PostgreSQL to attempt to allocate a gigantic amount of memory (and fail with an ERROR) or even to die with a segmentation fault, aborting all in-progress transactions and restarting the server. Such occurrences are painful, to say the least, and users are often left wondering how it happens. How exactly does the database get corrupted?

Tuning shared_buffers and wal_buffers

I spend a lot of time answering questions about PostgreSQL, and one of the questions I get asked frequently is: how should I set shared_buffers? And, a bit less often, how should I set wal_buffers? I've got canned answers that I can rattle off so fast it'll make your head spin. Exceptions to my canned answers keep popping up, and it's starting to get hard to give an answer that actually captures all the complexity in this area, so here's a longer explanation.