Friday, November 04, 2022
pg_basebackup could not set compression worker count - unsupported parameter
Monday, May 02, 2022
Parallel Server-Side Backup Compression
I decided to do a little more research on the performance of server-side backup compression, which will be a new feature in PostgreSQL 15 unless, for some reason, the changes need to be reverted prior to release time. The network link I used for my previous testing was, as I mentioned, rather slow, and handicapped by both a VPN link and an SSH tunnel. Furthermore, I was testing using pgbench data, which is extremely compressible. In addition, at the time I did those tests, we had added support for LZ4 compression, but we had not yet added support for Zstandard compression. Now, however, we not only have Zstandard as an option, but it is possible to use the library's multi-threading capabilities. So, I wanted to find out how things would work out on a faster network link, with a better test data set, and with all the compression algorithms that we now have available.
Friday, February 11, 2022
Server-Side LZ4 Backup Compression
I have been working with my colleagues Tushar Ahuja, Jeevan Ladhe, and Dipesh Pandit to make some improvements to pg_basebackup for version 15. A lot of that work has felt a bit like boring but necessary refactoring, and it's easy to find yourself wondering whether it will really do anybody any good. I was feeling optimistic after today's commits, so I decided to give it a try.
Tuesday, January 18, 2022
Surviving Without A Superuser - Part Two
If PostgreSQL had the ability to give to a privileged non-superuser the right to administer objects belonging to some designated group of non-superusers just as if the privileged account were superuser, it would get us much closer to a world in which the database can be effectively administered by a non-superuser. A highly privileged user - let's call him sauron - could be given the right to administer tables, schemas, functions, procedures, and a variety of other objects owned by dependent users witchking and khamul just as if sauron were superuser. sauron might indeed feel himself to be virtually a superuser, at least within his own domain, as long as he didn't spend too much time thinking about the users over which he had not been given administrative rights. However, sauron might notice a few irksome limitations.
Wednesday, January 12, 2022
Who Contributed to PostgreSQL Development in 2020 and 2021?
I have done a few previous blog posts on who has contributed to PostgreSQL, but I did not do one last year. A couple people mentioned to me that they missed it, so I decided to do one this year, and I decided to gather statistics, using basically the same methodology that I have in the past, for both 2020 and 2021. As always, it is important to remember, first, that many people contribute to the project in ways that these statistics don't capture, and second, that the statistics themselves are prone to mistakes (since there is a bunch of manual work involved) and bias (since each commit is attributed to the first author for lack of knowledge of the relative contributions of the various authors). As usual, I have posted a dump of the database I used to generate this in case anyone wants to check it over for goofs or use it for any other purpose.
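To make the first-author bias concrete, here is a minimal sketch of that attribution rule. The commit list and the names in it are invented for illustration and do not come from the actual database dump; the function name is also hypothetical.

```python
from collections import defaultdict

# Hypothetical commits: (authors in the order they were credited, lines of new code).
# These are made-up values, not data from the PostgreSQL commit log.
commits = [
    (["alice", "bob"], 120),
    (["bob"], 30),
    (["carol", "alice"], 50),
]


def lines_by_first_author(commits):
    """Credit every line of each commit to its first listed author."""
    totals = defaultdict(int)
    for authors, lines in commits:
        totals[authors[0]] += lines
    return dict(totals)


totals = lines_by_first_author(commits)
print(totals)
```

In this toy data bob co-authored a 120-line commit but is credited only with his own 30 lines, which is the kind of distortion the caveat above is warning about.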
I calculate that there were 176 people who were the principal author of at least one PostgreSQL commit in 2020 and 182 such people in 2021. In 2020, 13 people contributed 66% of the lines of new code, and 35 people contributed 90% of the lines of new code. In 2021, these numbers were 14 and 41 respectively. In 2020, there were a total of 2181 commits from 26 committers, and in 2021, there were 2269 commits from 28 committers. In each year, about 5 committers committed about two thirds of the non-self-authored patches, with Tom Lane leading the pack in both years.
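As a rough cross-check of figures like "13 people contributed 66% of the lines of new code", the sketch below accumulates per-author percentages until a threshold is reached. The input values are the pct_lines figures for the first 13 authors in the 2020 table in this post; because those percentages are rounded to two decimal places, treat the result as approximate. The function name is my own, not part of any tooling used for the post.

```python
def contributors_needed(pct_lines, threshold):
    """Return how many of the largest contributors are needed to reach
    `threshold` percent of the total, or None if it is never reached."""
    running = 0.0
    for count, pct in enumerate(sorted(pct_lines, reverse=True), start=1):
        running += pct
        if running >= threshold:
            return count
    return None


# pct_lines for the top 13 authors of 2020, copied from the table in this post.
top_2020 = [25.95, 11.45, 4.27, 3.30, 3.12, 2.97, 2.63,
            2.41, 2.29, 2.20, 2.02, 1.98, 1.92]

needed = contributors_needed(top_2020, 66.0)
print(needed)
```

With these inputs the first twelve authors fall just short of 66%, and the thirteenth pushes the running total past it.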
Here are the top 35 contributors by lines of new code contributed in 2020. Asterisks indicate non-committers. Note that some of these people are committers now but were not committers at the time.
# | author | lines | pct_lines | commits
----+---------------------------+-------+-----------+---------
1 | Tom Lane | 65203 | 25.95 | 436
2 | Peter Eisentraut | 28771 | 11.45 | 229
3 | Paul Jungwirth [*] | 10723 | 4.27 | 2
4 | Heikki Linnakangas | 8293 | 3.30 | 31
5 | Robert Haas | 7831 | 3.12 | 37
6 | Tomas Vondra | 7461 | 2.97 | 32
7 | Andres Freund | 6614 | 2.63 | 59
8 | John Naylor | 6060 | 2.41 | 14
9 | Michael Paquier | 5744 | 2.29 | 103
10 | Ashutosh Bapat [*] | 5515 | 2.20 | 7
11 | Bruce Momjian | 5077 | 2.02 | 121
12 | Nikita Glukhov [*] | 4982 | 1.98 | 7
13 | Thomas Munro | 4820 | 1.92 | 57
14 | James Coleman [*] | 4798 | 1.91 | 16
15 | Mark Dilger [*] | 4779 | 1.90 | 12
16 | Kyotaro Horiguchi [*] | 4775 | 1.90 | 35
17 | Peter Geoghegan | 3696 | 1.47 | 109
18 | Amit Langote [*] | 3650 | 1.45 | 28
19 | Anastasia Lubennikova [*] | 3611 | 1.44 | 4
20 | Jeff Davis | 3408 | 1.36 | 40
21 | Alvaro Herrera | 3073 | 1.22 | 75
22 | Pavel Stehule [*] | 2752 | 1.10 | 4
23 | Julien Rouhaud [*] | 2656 | 1.06 | 19
24 | Alexander Korotkov | 2613 | 1.04 | 33
25 | Masahiko Sawada [*] | 2540 | 1.01 | 20
26 | Dilip Kumar [*] | 2306 | 0.92 | 15
27 | Justin Pryzby [*] | 2222 | 0.88 | 53
28 | Fujii Masao | 2139 | 0.85 | 51
29 | Daniel Gustafsson | 1835 | 0.73 | 54
30 | Corey Huinker [*] | 1777 | 0.71 | 1
31 | Dmitry Dolgov [*] | 1628 | 0.65 | 3
32 | Dean Rasheed | 1512 | 0.60 | 9
33 | David Rowley | 1382 | 0.55 | 35
34 | Vik Fearing [*] | 1285 | 0.51 | 5
35 | Karl Pinc [*] | 1278 | 0.51 | 1
And here are the top 41 contributors by lines of new code contributed in 2021.
# | author | lines | pct_lines | commits
----+------------------------------+-------+-----------+---------
1 | Tom Lane | 66210 | 26.09 | 438
2 | Tomas Vondra | 15357 | 6.05 | 50
3 | Dagfinn Ilmari Mannsåker [*] | 14715 | 5.80 | 10
4 | Peter Eisentraut | 12976 | 5.11 | 214
5 | Robert Haas | 7035 | 2.77 | 46
6 | Bruce Momjian | 7010 | 2.76 | 58
7 | Peter Geoghegan | 6889 | 2.71 | 91
8 | Amit Langote [*] | 6859 | 2.70 | 24
9 | Heikki Linnakangas | 6706 | 2.64 | 38
10 | Mark Dilger [*] | 6203 | 2.44 | 23
11 | David Rowley | 5848 | 2.30 | 50
12 | Alvaro Herrera | 5582 | 2.20 | 79
13 | Andres Freund | 5288 | 2.08 | 53
14 | Michael Paquier | 5057 | 1.99 | 127
15 | Thomas Munro | 4356 | 1.72 | 78
16 | Peter Smith [*] | 4194 | 1.65 | 29
17 | Vignesh C [*] | 3886 | 1.53 | 19
18 | Dilip Kumar [*] | 3496 | 1.38 | 19
19 | Craig Ringer [*] | 3070 | 1.21 | 6
20 | Masahiko Sawada [*] | 2879 | 1.13 | 32
21 | Andrew Dunstan | 2461 | 0.97 | 48
22 | Bharath Rupireddy [*] | 2336 | 0.92 | 41
23 | Daniel Gustafsson | 2127 | 0.84 | 38
24 | Justin Pryzby [*] | 2087 | 0.82 | 51
25 | Hayato Kuroda [*] | 2080 | 0.82 | 5
26 | Ajin Cherian [*] | 2021 | 0.80 | 10
27 | Kyotaro Horiguchi [*] | 1896 | 0.75 | 28
28 | John Naylor | 1814 | 0.71 | 20
29 | Greg Nancarrow [*] | 1745 | 0.69 | 12
30 | Julien Rouhaud [*] | 1707 | 0.67 | 16
31 | Edmund Horner [*] | 1654 | 0.65 | 1
32 | Noah Misch | 1605 | 0.63 | 30
33 | Dmitry Dolgov [*] | 1536 | 0.61 | 3
34 | Amit Kapila | 1434 | 0.57 | 21
35 | Fabien Coelho [*] | 1398 | 0.55 | 7
36 | Gilles Darold [*] | 1338 | 0.53 | 1
37 | Jacob Champion [*] | 1247 | 0.49 | 7
38 | Andrey Borodin [*] | 1224 | 0.48 | 6
39 | Dean Rasheed | 1131 | 0.45 | 12
40 | Nathan Bossart [*] | 1113 | 0.44 | 15
41 | Alexander Korotkov | 1049 | 0.41 | 17
Next let's look at which committers did the most work committing patches that they did not themselves write. Here is the data for 2020.
# | committer | lines | pct_lines | commits
----+--------------------+-------+-----------+---------
1 | Tom Lane | 18728 | 19.25 | 132
2 | Alexander Korotkov | 17043 | 17.51 | 16
3 | Michael Paquier | 11932 | 12.26 | 143
4 | Amit Kapila | 10327 | 10.61 | 74
5 | Alvaro Herrera | 6442 | 6.62 | 51
6 | Etsuro Fujita | 5393 | 5.54 | 2
7 | Tomas Vondra | 5008 | 5.15 | 17
8 | Peter Geoghegan | 3823 | 3.93 | 6
9 | Robert Haas | 3468 | 3.56 | 12
10 | Peter Eisentraut | 3051 | 3.14 | 32
11 | Noah Misch | 3046 | 3.13 | 5
12 | Fujii Masao | 2646 | 2.72 | 61
13 | Heikki Linnakangas | 1954 | 2.01 | 25
14 | Dean Rasheed | 1458 | 1.50 | 2
15 | David Rowley | 583 | 0.60 | 8
16 | Andres Freund | 516 | 0.53 | 6
17 | Thomas Munro | 516 | 0.53 | 13
18 | Andrew Gierth | 374 | 0.38 | 3
19 | Magnus Hagander | 257 | 0.26 | 18
20 | Michael Meskes | 201 | 0.21 | 1
21 | Bruce Momjian | 182 | 0.19 | 12
22 | Stephen Frost | 134 | 0.14 | 2
23 | Jeff Davis | 130 | 0.13 | 2
24 | Andrew Dunstan | 97 | 0.10 | 4
And here are the non-self-authored commits for 2021.
# | committer | lines | pct_lines | commits
----+--------------------+-------+-----------+---------
1 | Tom Lane | 25692 | 26.83 | 111
2 | Amit Kapila | 16269 | 16.99 | 105
3 | Robert Haas | 8886 | 9.28 | 30
4 | Michael Paquier | 8343 | 8.71 | 163
5 | Peter Eisentraut | 6787 | 7.09 | 35
6 | Alvaro Herrera | 6103 | 6.37 | 44
7 | Fujii Masao | 4379 | 4.57 | 70
8 | Tomas Vondra | 3277 | 3.42 | 19
9 | David Rowley | 2807 | 2.93 | 20
10 | Etsuro Fujita | 2216 | 2.31 | 5
11 | Michael Meskes | 1653 | 1.73 | 2
12 | Alexander Korotkov | 1540 | 1.61 | 5
13 | Thomas Munro | 1469 | 1.53 | 16
14 | Bruce Momjian | 1283 | 1.34 | 9
15 | Daniel Gustafsson | 1030 | 1.08 | 25
16 | Peter Geoghegan | 960 | 1.00 | 11
17 | Magnus Hagander | 823 | 0.86 | 12
18 | Noah Misch | 578 | 0.60 | 8
19 | Heikki Linnakangas | 553 | 0.58 | 7
20 | Stephen Frost | 291 | 0.30 | 3
21 | Dean Rasheed | 260 | 0.27 | 2
22 | John Naylor | 253 | 0.26 | 6
23 | Andrew Dunstan | 110 | 0.11 | 4
24 | Jeff Davis | 103 | 0.11 | 2
25 | Andres Freund | 76 | 0.08 | 3
26 | Joe Conway | 10 | 0.01 | 1
Finally, let's look at who sent a lot of emails to pgsql-hackers each year. Many of these people are also prolific patch authors, but some of them are more involved in discussion and review than in actually writing code. Here is everyone who sent at least 100 emails to the list in 2020.
count | name
-------+-----------------------
2539 | Tom Lane
1484 | Justin Pryzby
1389 | Michael Paquier
1156 | Amit Kapila
1068 | Alvaro Herrera
992 | Kyotaro Horiguchi
986 | Andres Freund
869 | Robert Haas
803 | Tomas Vondra
658 | Fujii Masao
620 | Peter Eisentraut
569 | Thomas Munro
530 | Masahiko Sawada
507 | Bruce Momjian
489 | Pavel Stehule
486 | Peter Geoghegan
453 | Julien Rouhaud
445 | Dilip Kumar
368 | Daniel Gustafsson
356 | David Rowley
355 | Stephen Frost
332 | Amit Langote
298 | Bharath Rupireddy
270 | Heikki Linnakangas
245 | James Coleman
235 | Ranier Vilela
214 | Andy Fan
210 | Andrew Dunstan
208 | Mark Dilger
203 | Magnus Hagander
197 | Takayuki Tsunakawa
189 | Alexander Korotkov
189 | Fabien Coelho
185 | David G. Johnston
177 | Konstantin Knizhnik
174 | David Steele
161 | Vignesh C
159 | Ashutosh Bapat
153 | Jeff Davis
152 | Noah Misch
145 | John Naylor
139 | Craig Ringer
130 | Alexey Kondratov
118 | Laurenz Albe
114 | Anastasia Lubennikova
112 | Dmitry Dolgov
110 | Wenjing Zeng
101 | Vik Fearing
And here is the same for 2021.
count | name
-------+-----------------------
2476 | Tom Lane
1385 | Amit Kapila
1378 | Michael Paquier
1178 | Justin Pryzby
1056 | Andres Freund
1056 | Alvaro Herrera
966 | Robert Haas
862 | Bharath Rupireddy
831 | Kyotaro Horiguchi
775 | Tomas Vondra
675 | Masahiko Sawada
648 | Andrew Dunstan
623 | Bruce Momjian
619 | Peter Geoghegan
615 | Dilip Kumar
525 | Fujii Masao
517 | Vignesh C
512 | Thomas Munro
501 | Peter Eisentraut
484 | Daniel Gustafsson
478 | Mark Dilger
460 | Julien Rouhaud
415 | Nathan Bossart
411 | Peter Smith
397 | Pavel Stehule
381 | David Rowley
360 | Stephen Frost
356 | Amit Langote
354 | Zhijie Hou
348 | Greg Nancarrow
291 | Zhihong Yu
242 | Joel Jacobson
235 | Heikki Linnakangas
231 | John Naylor
216 | Jacob Champion
199 | Fabien Coelho
196 | Takayuki Tsunakawa
196 | Magnus Hagander
183 | Ranier Vilela
177 | Noah Misch
175 | Osumi Takamichi
168 | Japin Li
155 | Yugo Nagata
153 | Jeff Davis
152 | Haiying Tang
149 | Amul Sul
147 | Andrey Borodin
140 | Ajin Cherian
138 | David Steele
132 | Laurenz Albe
128 | David G. Johnston
122 | Ronan Dunklau
111 | Simon Riggs
108 | Matthias Van De Meent
107 | Euler Taveira
106 | Andy Fan
103 | Dean Rasheed
I apologize if these email numbers are not completely accurate, and especially to our contributors from China and Japan. Some people posted under multiple names, or using names not written in the character set with which I am most familiar, and I did my best to figure out which posts were actually from the same person and how to best render that person's name. However, I suspect that I have not been able to be entirely consistent about the ordering of family names as opposed to given names, and I may have made some other mistakes as well. I apologize for and regret my errors.
As always, thanks to all who have contributed in any way, whether these lists capture that contribution or not!