Thursday, May 26, 2011
There have been several lively discussions this week on possible performance optimizations for PostgreSQL 9.2. PostgreSQL 9.1 is still in beta, and there's still plenty of bug-fixing going on there, but the number of people who need to and can be involved in that process is not as large as it was a month or two ago. So it's a good time to start thinking about development for PostgreSQL 9.2. Not a lot of code is being written yet, which is probably just right for where we are in the development cycle, but we're kicking the tires of various ideas and trying to figure out what projects are worth spending time on. Here are a few that I'm excited about right now.
Monday, May 23, 2011
PostgreSQL 9.2 Development: CommitFest Managers Needed
There's an old joke that project managers don't know how to do anything; they only know how to do it better. Of course, the project manager is the butt of the joke: how can you know how to do something better, if you don't know how to do it in the first place?
Wednesday, May 11, 2011
PostgreSQL 9.1 In Review
As the PostgreSQL 9.1 development cycle winds down to a close (we're now in beta), I find myself reflecting on what we got done during this development cycle, and what we didn't get done. Just over a year ago, I published a blog post entitled Big Ideas, basically asking for feedback on what we should attempt to tackle in PostgreSQL 9.1. The feedback was overwhelming, and led to a follow-on blog post summarizing the most-requested features. Here's the short version: Of the 10 or 11 most requested features, we implemented (drum roll, please) one. Credit for granular collation support goes to Peter Eisentraut, with much additional hackery by Tom Lane.
You can find a second list of possible 9.1 features in the minutes of the PGCon 2010 Developer Meeting. Of the 30 features listed there, we got, by my count, 12.5, or more than 40%. While even that may seem like a somewhat small percentage, a dozen or so major enhancements to the product is nothing to sneeze at. And certainly, if you were a feature longing to be implemented, you'd much prefer to be on the second list than the first.
Thursday, May 05, 2011
shared_buffers on 32-bit systems
Every once in a while, I read something that's so obvious after the fact that I wonder why I didn't think of it myself. A recent thread on pgsql-performance had that effect. The PostgreSQL documentation gives some rough guidelines for tuning shared_buffers, recommending (on UNIX-like systems) 25% of system memory, up to a maximum of 8GB. There are similar guidelines on the PostgreSQL wiki, in the article Tuning Your PostgreSQL Server.
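To make the arithmetic concrete, here's a minimal sketch of what that guideline works out to on a hypothetical 64-bit server with 16GB of RAM (the machine size is just an assumption for illustration, not something from the documentation):

    # postgresql.conf -- hypothetical 64-bit server with 16GB of RAM
    # 25% of system memory is 4GB, comfortably under the suggested 8GB ceiling
    shared_buffers = 4GB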
What became obvious to me in reading the thread - and probably should have been obvious to me all along - is that this advice only makes sense if you're running a 64-bit build of PostgreSQL, because, on a 32-bit build, each process is limited to 4GB of address space, of which (at least on Linux) 1GB is reserved for the kernel. That means that no matter how much physical memory the machine has, each PostgreSQL backend will be able to address at most 3GB of data. That includes (most significantly) shared_buffers, but also the process executable text, stack, and heap, including backend-local memory allocations for sorting and hashing, various internal caches that each backend process uses, local buffers for accessing temporary relations, and so on. And 3GB is really a theoretical upper limit, if everything is packed optimally into the available address space: I'm not sure how close you can get to that limit in practice before things start breaking.
So, if you're running a 32-bit PostgreSQL on a UNIX-like operating system, you probably need to limit shared_buffers to at most 2GB or 2.5GB. If that is less than about 25% of your system memory, you should seriously consider an upgrade to a 64-bit PostgreSQL.
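On Linux, one quick way to check which kind of build you have is to run the file command on the postgres executable, which will report whether it's a 32-bit or 64-bit ELF binary. Continuing the hypothetical 16GB machine from above, a 32-bit build might be capped along these lines (the 2GB figure is just the ceiling discussed above, not a documented recommendation):

    # postgresql.conf -- hypothetical 32-bit build on the same 16GB machine
    # capped well below the ~3GB per-process address space limit, leaving room
    # for executable text, stack, heap, and per-backend allocations
    shared_buffers = 2GB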
Tuesday, April 26, 2011
A Change of Role
I'm pleased to report that I have changed roles within EnterpriseDB and am now officially a member of EnterpriseDB's support and professional services team. I have a number of other responsibilities as well, so I will not be spending all of my time on support, but I have been and will continue to spend a significant chunk of time on it each day. When I've talked about this with people, they sometimes look at me as if I have two heads. After all, I am a backend server developer, and developers as a breed are not known for enjoying support.
Tuesday, April 19, 2011
PostgreSQL East, and The MySQL Conference and Expo
Last month, I attended (and spoke at) PostgreSQL East in New York City, which this year featured a MongoDB track. This past week, I was in Santa Clara at the O'Reilly MySQL Conference & Expo, which had a substantial PostgreSQL track this year, where I also spoke.
Both conferences had some very good talks. The first talk I attended at the MySQL conference turned out to be one of the best - it was entitled Linux and H/W optimizations for MySQL. I had a little difficulty understanding Yoshinori Matsunobu's accent at times, but the slides were excellent and very detailed. Some of his more interesting findings: (1) SSDs speed things up both on the master and on replication slaves, but the speedup is larger on the slaves, so it can make sense to put hard disks on the master and SSDs on the slaves so that single-threaded recovery there can keep up with the master; (2) while SSDs are much faster for random access, they are actually slower for sequential access and fsync, so a RAID array with a battery-backed or flash-backed write cache may still be a better option for those workloads; (3) Fusion I/O drives were far faster than Intel drives; (4) the Intel Nehalem architecture was much faster than the AMD Opteron architecture when used in combination with SSDs; and (5) HyperThreading helps more in SSD environments than it does otherwise, because the system as a whole becomes more heavily CPU-bound, and for the same reason, mutex contention hurts more.
Another very good talk was Peter Zaitsev's discussion of Innodb and XtraDB Architecture and Performance Optimization, which gave me the feeling of looking into a sort of carnival mirror, where you recognize yourself, but it's all distorted. Two of the problems that give PostgreSQL DBAs heartburn - bloat, and checkpoint I/O spikes (and less frequently, purge not keeping up a la vacuum not keeping up) - are apparently problems for MySQL as well, though with significantly different details. I'm not even going to attempt to summarize the differences, or say which problem is worse or occurs more often, because I honestly have no idea. I was a bit surprised to hear dump-and-reload recommended to recover from certain worst-case scenarios, though.
There were other good talks, too, which helped me understand what's going on in the world of MySQL forks. Apparently, the Drizzle team is busy removing features that they consider half-baked and modularizing the code so that it is easier to understand and improve, while the MariaDB team is busy adding optimizer features, including support for hash joins and persistent statistics. From what I understand, the MySQL optimizer has typically worked by gathering statistics through on-the-fly index probes, which can be a problem in some situations. It's not so easy to categorize the work that Oracle is doing, but it seems to involve a fair amount of filing down of rough edges, and various improvements to replication, including, perhaps most significantly, parallel replication apply.
At PostgreSQL East, I think my favorite talk was Ken Rosensteel's talk, somewhat misleadingly titled Large Customers Want PostgreSQL, Too. This talk turned out to be about migrating a large Oracle mainframe application to use PostgreSQL, and the challenges faced during that migration. He, or his team, built an Oracle-to-PostgreSQL converter for stored procedures; it was interesting to see that they got bitten by our bizarre casting rules around the smallint data type. They also ended up doing some very interesting work optimizing the performance of ECPG for small FETCH statements; these are areas of the code that I think don't normally get a lot of attention, and it was great to hear about the optimization work that got done.
I was disappointed that Jon Hoffman's talk on Experiences with Postgres and MongoDB at foursquare.com got cancelled; I think that would have been an interesting talk. I did have an opportunity to attend Jake Luciani's talk Comparing the Apache Cassandra Architecture to PostgreSQL, which turned out to be more about Cassandra than PostgreSQL, but was nevertheless interesting. I would have been interested to hear a more technical talk, though, about how problems like distributed serialization anomalies and distributed checkpointing are handled.
Next month, I'll be speaking at PGCon 2011 on Using The PostgreSQL System Catalogs and How To Get Your PostgreSQL Patch Accepted. And after that, unlike Bruce, I'm going to stay home for a few months!
Thursday, April 07, 2011
One-Way Tuning
I've been noticing that many of the parameters in postgresql.conf only ever need to be adjusted in one direction from the default.
shared_buffers, temp_buffers, work_mem, maintenance_work_mem, checkpoint_segments, checkpoint_timeout, checkpoint_completion_target, join_collapse_limit, and from_collapse_limit only ever need to be adjusted upward from the default.
bgwriter_delay, seq_page_cost, and random_page_cost only ever need to be adjusted downward from the default.
Does this mean we should change some of the defaults?
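For what it's worth, here's a sketch of how that pattern tends to show up in practice: a handful of illustrative, assumed values for a mid-sized dedicated server (not recommendations), with the stock defaults noted in comments. Every change moves in the direction described above.

    # postgresql.conf -- illustrative tuned values, assumed for a mid-sized server
    shared_buffers = 4GB                  # up from the 32MB default
    work_mem = 32MB                       # up from the 1MB default
    maintenance_work_mem = 512MB          # up from the 16MB default
    checkpoint_segments = 32              # up from the default of 3
    checkpoint_completion_target = 0.9    # up from the 0.5 default
    seq_page_cost = 0.5                   # down from the 1.0 default
    random_page_cost = 2.0                # down from the 4.0 default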