Friday, September 10, 2010

SE-Linux For PostgreSQL: Part 1

I made the trip down to OmniTI headquarters just south of Baltimore, MD this Wednesday for BWPUG. This month's topic was the ongoing project to integrate SE-Linux with PostgreSQL. Besides myself, Stephen Frost, Greg Smith, and Robert Treat were all there from the PostgreSQL community, along with David Quigley and Joshua Brindle from the SE-Linux community. It was a very productive meeting, and I learned a lot about what the SE-Linux community is looking for from PostgreSQL.

We first discussed the current status of the project. Following discussions with Stephen Frost, KaiGai Kohei, and Greg Smith at PGCon 2010, I wrote and committed four patches which have, I think, helped to clear the way for an eventual loadable module implementing basic SE-Linux support for PostgreSQL; and I also committed a fifth patch by KaiGai Kohei. These were, in order of commit:

1. Add a hook in ExecCheckRTPerms(). This is just a very simple hook to allow a loadable module to gain control at the time DML relation permissions are checked. Whenever a SELECT, INSERT, UPDATE, or DELETE statement is invoked, this function gets a list of the relation OIDs and can choose to allow the statement to proceed or throw an error. (It could also potentially do other things, like write to a log file, if that were useful for some reason.)

2. Centralize DML permissions-checking logic. KaiGai Kohei spotted the fact that the previous patch didn't actually work for a couple of important cases. In particular, COPY did not previously go through ExecCheckRTPerms(), and there is some hairy code inside the foreign key stuff that also needed adjustment to work properly with this hook. This patch, by KaiGai Kohei, cleaned all of that up. So as far as I know, we now have a single point for all DML permissions checking, and a hook function at that point. Yeah!
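Just to illustrate the shape of the thing, here's a minimal sketch of a loadable module that uses this hook. The hook name and signature are the ones that were committed (ExecutorCheckPerms_hook, in executor.h); everything else - the function name, the log message - is made up for the example:

#include "postgres.h"
#include "fmgr.h"
#include "executor/executor.h"
#include "nodes/parsenodes.h"

PG_MODULE_MAGIC;

void _PG_init(void);

/* Saved value of any hook installed before us, so we can chain to it. */
static ExecutorCheckPerms_hook_type prev_check_perms_hook = NULL;

/*
 * Invoked whenever DML relation permissions are checked.  Return true to
 * allow the statement; return false (or throw an error, if
 * ereport_on_violation is true) to block it.
 */
static bool
demo_check_perms(List *rangeTable, bool ereport_on_violation)
{
    ListCell   *l;

    foreach(l, rangeTable)
    {
        RangeTblEntry *rte = (RangeTblEntry *) lfirst(l);

        if (rte->rtekind != RTE_RELATION)
            continue;           /* only plain relations are of interest */

        /* A real security provider would consult its policy here. */
        elog(LOG, "DML access to relation with OID %u", rte->relid);
    }

    if (prev_check_perms_hook)
        return (*prev_check_perms_hook) (rangeTable, ereport_on_violation);
    return true;
}

void
_PG_init(void)
{
    prev_check_perms_hook = ExecutorCheckPerms_hook;
    ExecutorCheckPerms_hook = demo_check_perms;
}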

Unfortunately, in order to do label-based security, a simple hook function is not enough. You also need a place to store the labels, and ideally that place should be a PostgreSQL system catalog. I had initially thought that we would add a security label column to the system catalog for each object type, but that would require fairly invasive changes across the whole system and carry some minor performance penalty even for people who did not use it. At PGCon, we came up with the idea of storing all security labels for all objects in a separate catalog. Security labels are, in essence, just strings, which we don't try to interpret but which have some meaning (the details of which we need not understand) to an external security provider such as SE-Linux.

As luck would have it, we already have a model for such a facility: the COMMENT statement already knows how to store arbitrary strings which it does not attempt to interpret for arbitrary database objects, using a catalog (actually two catalogs) dedicated to that purpose. Unfortunately, the comment code is quite large, and, as it turned out, buggy, so it didn't seem like a good idea to copy-and-paste it into a new file and then hack it up from there, as I had initially hoped to do. So that led to three more patches.

3. Standardize get_whatever_oid functions for object types with unqualified names. As it turns out, one of the things that the comment code needed to do over and over again was examine the parse tree representation of an object and convert it to an OID by looking up the name in a system catalog. But there wasn't any standard way to do this, and in some cases the code was quite lengthy and already duplicated in multiple places throughout our source base. This patch cleaned that up by introducing a standard API and adjusting the existing OID-getter functions, or adding new ones, for tablespaces, databases, roles, schemas, languages, and access methods, to conform to that API.

4. Standardize get_whatever_oid functions for other object types. More of the same, this time for text search parsers, dictionaries, templates, and configs; as well as for conversions, constraints, operator classes, operator families, rules, triggers, and casts.
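The resulting API is about as simple as it gets. Here are a few representative signatures, roughly as committed; when missing_ok is true, the function returns InvalidOid for a nonexistent object instead of throwing an error:

Oid get_tablespace_oid(const char *tablespacename, bool missing_ok);
Oid get_database_oid(const char *dbname, bool missing_ok);
Oid get_role_oid(const char *rolname, bool missing_ok);
Oid get_namespace_oid(const char *nspname, bool missing_ok);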

5. Rewrite comment code for better modularity, and add necessary locking. This patch took the refactoring in the previous two patches one step further. The functions introduced by those patches provide a way to translate the name of an object of a given type into an OID. This patch creates a more general API that can be passed an object type and a parse tree and return an ObjectAddress, an internal representation that can point to a database object of any type. The ObjectAddress representation is used for management of dependencies between database objects (e.g. you can't drop a table if there's a view using it, unless you also drop the view) as well as by the comment code, and it will be useful for security label support as well.
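For the curious, ObjectAddress is just a three-field struct; this is its definition from the source, give or take the comments:

typedef struct ObjectAddress
{
    Oid         classId;        /* OID of the system catalog holding the object */
    Oid         objectId;       /* OID of the object itself */
    int32       objectSubId;    /* column number, or 0 if not a column */
} ObjectAddress;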

This new facility also fixes a longstanding locking bug in the COMMENT code, which still exists (and likely won't be fixed) in 9.0 and all prior releases. If an object is dropped concurrently with a COMMENT operation on that same object, the result can be an orphaned comment in the pg_description or pg_shdescription catalog. If another object of the same type is subsequently assigned the same OID, it will inherit the orphaned comment. This is fairly unlikely and, for comments, fairly innocuous, but it would obviously create a potential security hole for security labels.

With these preliminary patches in place, I think we're now well-positioned to introduce the major piece of functionality which we will need to support SE-Linux integration: an in-core security label facility for use by SE-Linux and perhaps other label-based security systems. Stephen Frost, KaiGai Kohei, and I have had extensive discussions about the design of this facility and there are currently two pending patches by KaiGai Kohei which are intended to implement that design: one adds the basic security facility and commands for manually applying labels, and the other adds hooks at table creation time to allow enhanced security providers to automatically set a label on newly created tables. I have not yet reviewed these patches in detail, but I hope to see them committed - likely with some modifications - within the next month.
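To give a flavor of what's proposed, manually applying a label would look something like this. The syntax is from the pending patches and could still change before commit, and the table name and label string here are made up; the label itself is opaque to PostgreSQL and meaningful only to the security provider:

SECURITY LABEL FOR selinux ON TABLE customer
    IS 'system_u:object_r:sepgsql_table_t:s0';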

In the second part of this blog post, I'll go over what I learned from David and Joshua (who were extremely helpful in explaining SE-Linux to me), the additional facilities which they felt would be necessary for a minimally useful SE-Linux integration, and what they'd like to see over the longer term.

Tuesday, August 24, 2010

Version Numbering

Over the last few days, there's been a debate raging on pgsql-hackers on the subject of version numbering. There are many thoughtful (and some less-thoughtful) opinions on the thread that you may wish to read, but I thought the most interesting was a link posted by Thom Brown to a blog post called The Golden Rules of Version Naming. If you haven't seen it, it's definitely worth a read.

Monday, August 16, 2010

Why We're Conservative With PostgreSQL Minor Releases

Last week, a PostgreSQL user filed bug #5611, complaining about a performance regression in PostgreSQL 8.4 as compared with PostgreSQL 8.2. The regression occurred because PostgreSQL 8.4 is capable of inlining SQL functions, while PostgreSQL 8.2 is not. The bug report was also surprising to me because, in my experience, inlining SQL functions has always improved performance, often dramatically. But this user unluckily managed to hit a case where the opposite is true: inlining caused a function which had previously been evaluated just once to be evaluated multiple times. Fortunately, there is an easy workaround: writing the function in the "plpgsql" language rather than the "sql" language defeats inlining.
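Here's a made-up example of the workaround pattern (not the actual function from the bug report): the SQL version below is a candidate for inlining in 8.4, while the equivalent PL/pgSQL version never is, because the planner treats a plpgsql function as a black box.

-- An inlinable SQL function: the body is a simple expression.
CREATE FUNCTION taxed(numeric) RETURNS numeric AS $$
    SELECT $1 * 1.06;
$$ LANGUAGE sql IMMUTABLE;

-- The same logic in PL/pgSQL, which defeats inlining.
CREATE FUNCTION taxed_plpgsql(amount numeric) RETURNS numeric AS $$
BEGIN
    RETURN amount * 1.06;
END;
$$ LANGUAGE plpgsql IMMUTABLE;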

Although the bug itself is interesting (let's face it, I'm a geek), what I found even more interesting was that I totally failed to appreciate the possibility that inlining an SQL function could ever fail to be a performance win. Prior to last week, if someone had asked me whether that was possible, I would have said that I didn't think so, but you never know...

And that is why the PostgreSQL project maintains stable branches for each of our major releases for about five years. Stable branches don't get new features; they don't get performance enhancements; they don't even get tweaks for things we wish we'd done differently or corrections to behavior of doubtful utility. What they do get is fixes for bugs (like: without this fix, your data might get corrupted; or, without this fix, the database might crash), security issues, and a smattering of documentation and translation updates. By the time we put out a new major release (in fact, about six months before it ships), development on that major release is over. Any further changes go into the next release.

On the other hand, we don't abandon our releases once they're out the door, either. We are just now in the process of ceasing to support PostgreSQL 7.4, which was released in November 2003. For nearly seven years, any serious bugs or security vulnerabilities which we have discovered either in that version or any newer version have been addressed by releasing a new version of PostgreSQL 7.4; the current release is 7.4.29. Absent a change in project policy, 7.4.30 will be the last 7.4.x release.

If you're running PostgreSQL 8.3 or older, and particularly if you're running PostgreSQL 8.2 or older, you should consider an upgrade, especially once PostgreSQL 9.0 comes out. Each release of PostgreSQL includes many exciting new features: new SQL constructs, sometimes new data types or built-in functions, and performance and manageability enhancements. Of course, before you upgrade to PostgreSQL 8.4 (or 9.0), you should carefully test your application to make sure that everything still works as you expect. For the most part, things tend to go pretty smoothly, but as bug #5611 demonstrates, not always.

Of course, this upgrade path is not for everyone. Application retesting can be difficult and time-consuming, especially for large installations. There is nothing wrong with staying on the major release of PostgreSQL that you are currently using. But it is very wise to upgrade regularly to the latest minor version available for that release. The upgrade process is generally as simple as installing the new binaries and restarting the server (but see the release notes for your version for details), and the PostgreSQL community is firmly committed to making sure that each of these releases represents an improvement to performance and stability rather than a step backwards.
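In concrete terms, for a typical source installation, a minor-version upgrade can be as little as this (a sketch; paths vary, and packaged installations have their own procedures). Minor releases don't change the on-disk format, so no dump and restore is needed:

pg_ctl -D /usr/local/pgsql/data stop
# install the new minor-release binaries over the old ones, then:
pg_ctl -D /usr/local/pgsql/data start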

Friday, August 13, 2010

How I Hack on PostgreSQL

Today's post by Dimitri Fontaine gave me the idea of writing a blog posting about the tools I use for PostgreSQL development. I'm not saying that what I do is the best way of doing it (and it's certainly not the only way of doing it), but it's one way of doing it, and I've had good luck with it.

What commands do I use? The following list shows the ten commands that occur most frequently in my shell history.

[rhaas pgsql]$ history  | awk '{print $2}' | sort | uniq -c | sort -rn | head
250 git
57 vi
31 %%
25 cd
24 less
20 up
18 make
13 pg_ctl
10 ls
8 psql

Wow, that's a lot of git. I didn't realize that approximately half of all the commands I type are git commands. Let's see some more details.

[rhaas pgsql]$ history  | awk '$2 == "git" { print $3}' | sort | uniq -c | sort -rn | head
93 diff
91 grep
15 log
10 commit
8 checkout
7 add
6 reset
5 clean
4 pull
3 branch

As you can see, I use git diff and git grep far more often than any other git commands. The most common things I do with git diff are just plain git diff, which displays the unstaged changes in my working tree (so I can see what I've changed, or what a patch I've just applied has changed), and git diff master, which shows all the differences between my working tree and the master branch (useful because I frequently use git branches to hack on whatever patch I'm working on). A great deal of the work of writing a good patch - or reviewing one - consists in looking at the code over and over again and thinking about whether every change can be justified and proven correct.

git grep does a recursive grep starting at the current directory, but only examines files checked into git (not build products, for example). I use this as a way to find where a certain function is defined (by grepping for the name of the function at the start of a line) and as a way to find all occurrences of an identifier in the code (which is an absolutely essential step in verifying the correctness of your own patch, or someone else's).
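For example (using a real PostgreSQL function name; the first search works because PostgreSQL coding style puts the function name at the start of a line in its definition):

git grep -n '^ExecCheckRTPerms' -- '*.c'     # find the definition
git grep -nw ExecCheckRTPerms                # find every use of the identifier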

As you can also see, my preferred editor is vi (really vim). This might not be the best choice for everyone, but I've been using it for close to 20 years, so it's probably too late to learn something else now. I think Dimitri Fontaine said it well in the post linked above: the best editor you can find is the one you master. Having said that, if you do even a small amount of programming, you're likely to spend a lot of time in whatever editor you pick, so it's probably worth the time it takes to learn a reasonably powerful one.

Saturday, August 07, 2010

Git is Coming to PostgreSQL

As discussed at the PGCon 2010 Developer Meeting, PostgreSQL is scheduled to adopt git as its version control system some time in the next few weeks. Andrew Dunstan, who maintains the PostgreSQL build farm, has adapted the build farm code to work with either CVS or git; meanwhile, Magnus Hagander has done a trial conversion so that we can all see what the new repository will look like. My small contribution was to write some documentation for the PostgreSQL committers, which has subsequently been further edited by Heikki Linnakangas (the link here is to his personal web page, whose one complete sentence is one of the funnier things I've read on the Internet).

I don't think the move to git is going to be a radical change; indeed, we're taking some pains to make sure that it isn't. But it will make my life easier in several small ways. First, the existing git clone of the PostgreSQL CVS repository is flaky and unreliable. The back-branches have had severe problems in this area for some time (some don't build), and the master branch (aka CVS HEAD) has periodic issues as well. At present, for example, the regression tests for contrib/dblink fail on a build from git, but pass on a build from CVS. While we might be able to fix (or minimize) these issues by fixing bugs in the conversion code, switching to git should eliminate them. Also, since I do my day-to-day PostgreSQL work using git, it will be nice to be able to commit that way too - it should be both faster (CVS is very slow by comparison) and less error-prone (no cutting and pasting the commit message, no forgetting to add a file in CVS that you already added in git).

Thursday, July 29, 2010

Multi-Tenancy and Virtualization

In a recent blog post on Gigaom, Simeon Simeonov argues that virtualization is on the way out, and discusses VMware's move toward platform-as-a-service computing. In a nutshell, his argument is that virtualization is inefficient, and is essentially a last resort when legacy applications can't play nicely together in the same sandbox. In other words, the real goal for IT shops and service providers is not virtualization per se, but multi-tenancy, cost-effective use of hardware, and high availability. Find any two servers in the average corporate data center and ask why they're not running on the same machine. It's a good bet you'll get one of the following four answers: (1) machine A is running a piece of software that misbehaves if run on the same machine as some piece of software running on machine B, (2) a single server couldn't handle the load, (3) one of those servers provides redundancy for the other, or (4) no particular reason, but we haven't gotten around to consolidating them yet. In my experience, the first answer is probably the most common. But as Simeonov points out, the ideal solution is not virtualization, but better software - specifically, platforms that can transparently service multiple customers.

PostgreSQL is very strong in this area. Hosting providers such as hub.org provision databases for multiple customers onto a single PostgreSQL instance; and here at EnterpriseDB, we support several customers who do much the same thing. Databases in PostgreSQL provide a high degree of isolation: many configuration parameters can be set on a per-database basis, extensions can be installed into a single database without affecting other databases that are part of the same instance, and each database can in turn contain multiple schemas. The ability to have multiple databases, each containing multiple schemas, makes the PostgreSQL model more flexible than Oracle or MySQL, each of which has only a single-tier system. In the upcoming PostgreSQL 9.0 release, the new GRANT ... ON ALL TABLES IN SCHEMA and ALTER DEFAULT PRIVILEGES features will further simplify user administration in multi-user and multi-tenant environments. Behind the scenes, a PostgreSQL instance uses a single buffer pool which can be efficiently shared among any number of databases without excessive lock contention. This is critical. Fragmenting memory into many small buffer pools prevents databases from scaling up (using more memory) when under heavy load, and at the same time prevents databases from scaling down (using less memory) when not in use. By managing all databases out of a single pool, PostgreSQL can allow a single database to use every block in the buffer pool - if no other databases are in use - or no blocks at all - if the database is completely idle.
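For instance, here's what the 9.0 syntax looks like, using made-up role and schema names:

-- Grant read access on every existing table in one tenant's schema...
GRANT SELECT ON ALL TABLES IN SCHEMA tenant_a TO tenant_a_readers;

-- ...and arrange for tables subsequently created in that schema (by the
-- role running this command) to get the same grant automatically.
ALTER DEFAULT PRIVILEGES IN SCHEMA tenant_a
    GRANT SELECT ON TABLES TO tenant_a_readers;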

Simeonov seems to feel that virtualization has already nearly run its course, and predicts that the market will hit its peak within three years. That doesn't seem likely to me. I think there is an awful lot of crufty hardware and software out there that could benefit from virtualization, but it's working right now, so no one is eager to make changes that might break something. As the physical equipment starts to fail, IT administrators will think about virtualization, but hardware that isn't touched can sometimes run for a surprisingly long time, so I don't expect server consolidation projects to disappear any time soon. More importantly, Simeonov seems to assume that all new applications will be developed using platform-as-a-service architectures such as Google App Engine, Bungee, Engine Yard, and Heroku. While some certainly will be, it seems unlikely that the traditional model of application development, using a dedicated web server and a dedicated database running on a physical or virtual machine, will disappear overnight. For one thing, choosing one of those vendors means being locked into that vendor's API - and choice of programming language. Bungee and Heroku are Ruby environments, for example, while Google App Engine offers Java and Python. Good luck making the switch!

So, if plain old virtual machines are going to be around for a while, how does PostgreSQL stack up in that environment? Not too bad. Of course, write-intensive workloads will suffer from the generalized slowness of virtualized I/O. But PostgreSQL is designed to run well even in a very small memory footprint, to take good advantage of the OS buffer cache and process scheduler, and to be portable across a wide variety of platforms. If your database is small enough to fit in memory, performance should be good. And if your database isn't small enough to fit in memory, there's not much point in virtualizing it: you're going to need a dedicated machine either way.

Sunday, July 25, 2010

Google and our Documentation

As I mentioned in a previous blog post, trying to find pages in the PostgreSQL documentation using Google doesn't work very well: most often, one gets links to older versions.

A recent thread on pgsql-performance (somewhat off-topic for that mailing list, but that's where it was) suggested that perhaps we could use Google's canonical URL feature to work around this problem.
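The idea is simple: each versioned copy of a documentation page would declare the current version as canonical via a link tag in its head section, so that search engines fold the duplicates together. Something like this (a sketch; the exact page is just an example):

<link rel="canonical"
      href="http://www.postgresql.org/docs/current/static/sql-createtable.html" />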

Another suggestion was that we ask people who link to our docs to link to http://postgresql.org/docs/current/ (or some sub-page) rather than linking to a specific version (e.g. the same URL with 8.4 in place of current). That way, as new versions come out, everyone's links will still be pointing at the latest version of the docs, helping the new versions accumulate "Google karma" more quickly than they would otherwise. Or at least, that's the idea: I have no idea whether it would actually work.