In a recent blog post on Performance Optimization, I mentioned that Noah Misch and I had discussed some methods of reducing the overhead of frequent relation locks. Every transaction that touches a given table or index locks it and, at commit time, unlocks it. This adds up to a lot of locking and unlocking, which ends up being very significant on machines with many CPU cores. I ended up spending a good chunk of last week hacking on this problem, with very promising results: I have a prototype patch that improves throughput on a SELECT-only pgbench test by about 3.5x on a system with 24 cores. Not bad for a couple days work.
And it looks like we haven't exhausted the possibilities of this line of attack. Some further investigation (inspired by a note from Stefan Kaltenbrunner) revealed that even with the patch applied, locking contention is still a severe problem on this workload. Each transaction takes and releases a lock on it's virtual transaction ID, or VXID. This scheme allows other processes to wait for the completion of a particular transaction. However, we don't actually use that mechanism very much: it appears that it is only needed for CREATE INDEX CONCURRENTLY, and for Hot Standby. I think it may be possible to arrange things so that the VXID locks are only taken when someone wants to wait for them, which would probably lead to a significant further boost in throughput.
These optimizations will primarily benefit read-only workloads. On at least some read-write workloads, throughput is limited by an LWLock known as WALInsertLock, which serializes insertions into the write-ahead log stream. I don't have a clear idea how to solve that problem at the moment, but intend to look into it further.
Nice work! Curious what the improvement is on an 8 way machine?
ReplyDeleteYes! It's a really important optimization. We encounter heavy locking on large SELECT's.
ReplyDelete@WildRAiD: I do not think PostgreSQL has any problems with locks on large SELECTs, assuming you mean selects which touch many rows. I believe it is when you have many short read-only transactions with a high level of concurrency you will hit this bottleneck.
ReplyDelete