ZODB Benchmarks
There doesn't seem to be any ZODB benchmarks readily available. Compelled by a project that needs to scale to very high numbers I started compiling some.
PLEASE NOTE THAT THE RESULTS IN THIS ARTICLE IS INCORRECT. ACCURATE RESULTS ARE DOCUMENTED IN A FOLLOW-UP POST NAMED ZODB BENCHMARKS REVISITED.
During the Plone conference in Naples this year I started working on collective.zodbbench with the goal to collect performance data for basic operations like inserts and lookups so anybody using the ZODB can make more informed decisions when developing ZODB applications or when choosing the database for your application. Another goal I had was to take this data to the ZODB-DEV mailing list and get explanations for areas where performance is poor and even inspire some improvements where we really need it.
The first benchmark shows how fast can you insert 1Kb objects into a OOBTree. This is not a lot of data and is probably not representative of what we typically store in the ZODB, but it does exercise the BTree implementation. You will notice that performance deteriorates linearly from around 2000 inserts per second when we start to around 250 inserts per second when the BTree contains 10 million objects. One very important observation I made when doing this benchmark is what the impact of the cache_size parameter is. When I ran the test with the default ZODB cache size the insertion rate deteriorated a lot more rapidly and it was barely 50 inserts per second on BTree with 10 million records.

In an attempt to determine if the ZODB's drop in performance is normal I created a test with Postgres purely to observe transaction rate and not to compare it with the ZODB. Notice how the insert rate remains fairly consistent at around 7000 inserts per second. Roughly every million inserts there is sharp drop in the insert rate. Laurence Rowe on the ZODB-DEV explained this as follow:
It looks like ZODB performance in your test has the same O(log n) performance as PostgreSQL checkpoints (the periodic drops in your graph). This should come as no surprise. B-Trees have a theoretical Search/Insert/Delete time complexity equal to the height of the tree, which is (up to) log(n). So why is PosgreSQL so much faster? It's using a Write-Ahead-Log for inserts. Instead of inserting into the (B-Tree based) data files at every transaction commit it writes a record to the WAL. This does not require traversal of the B-Tree and has O(1) time complexity. The penalty for this is that read operations become more complex, they must look first in the WAL and overlay those results with the main index. The WAL is never allowed to get too large, or its in memory index would become too big.
I was digging further trying to find any bottlenecks that might help in speeding up the ZODB so I did some profiling and noticed that there was a huge amount of calls to the persistent_id method of ObjectWriter in serialize.py. There was 1.3 million calls to this method while only 20000 objects were being persisted while profiling.
6108977 function calls (6108973 primitive calls) in 57.280 CPU seconds
Ordered by: cumulative time
List reduced from 232 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 57.280 57.280 profile_zodb.py:70(run)
1 0.000 0.000 57.280 57.280 :1(?)
1 0.260 0.260 57.280 57.280 profile_zodb.py:24(_btrees_insert)
1 0.000 0.000 57.280 57.280 profile:0(run())
1001 0.030 0.000 51.060 0.051 _manager.py:88(commit)
1001 0.040 0.000 50.990 0.051 _transaction.py:365(commit)
1001 0.110 0.000 50.730 0.051
_transaction.py:486(_commitResources)
1001 0.020 0.000 48.060 0.048 Connection.py:496(commit)
1001 0.220 0.000 48.040 0.048 Connection.py:512(_commit)
9889 0.940 0.000 47.340 0.005 Connection.py:561(_store_objects)
20372 0.480 0.000 39.790 0.002 serialize.py:381(serialize)
20372 0.500 0.000 38.950 0.002 serialize.py:409(_dump)
40750 7.790 0.000 38.020 0.001 :0(dump)
1338046 17.560 0.000 30.230 0.000 serialize.py:184(persistent_id)
2177223 9.150 0.000 9.150 0.000 :0(isinstance)
20373 1.550 0.000 5.240 0.000 FileStorage.py:631(store)
2964 0.050 0.000 4.980 0.002 Connection.py:749(setstate)
2964 0.100 0.000 4.930 0.002 Connection.py:769(_setstate)
2964 0.080 0.000 4.180 0.001 serialize.py:603(setGhostState)
2964 0.030 0.000 4.100 0.001 serialize.py:593(getState)
Jim Fulton explained that this is because it's called for *all* objects, not just persistent objects. This includes, ints, strings (including attribute names), etc. And then he revealed an undocumented feature that made an impressive difference to ZODB performance:
Note that there is a undocumented feature in cPickle that I added
years ago to deal with this issue but never got around to pursuing.
Maybe someone else would be able to spend the time to try it out and
report back.
If you set inst_persistent_id, rather than persistent_id, on a
pickler, then the hook will only be called for instances. This
should eliminate that vast majority of the calls.
At least for the first million inserts the improvement is significant:

The final benchmark tested average lookup speed on BTrees:
| Number of Objects |
Average Lookup Time in Seconds |
|---|---|
| 100000 | 0.000311 |
| 1000000 | 0.000648 |
| 10000000 | 0.23082 |
Notice that lookup speed drops to 230 milliseconds when there is about 10 million records in the database. There was a bug in the code that caused this result to be incorrect. The true average lookup speed is 0.0023082s (23 ms) which is very fast and completely acceptable.
I have tried many things during my benchmarks and it has took up to 3 months to compile them in between working on projects. I takes a very long time to do these test and I often had them running for a whole weekend. Some of the things I tried that did not yield any significant performance gain was:
- Reducing number of calls to fsync
- Remove calls to fsync completely
- Recompiling BTrees with increased bucket sizes to reduces the overhead of bucket splits.
I find the statistics gathered from these benchmarks worrying and I hope that they inspire further investigation into the ZODB and encourage improvement. It is clear to me that ZODB performance (the BTree implementation specifically) for large datasets must be investigated. For an application like Plone an insertion rate of 250 objects per second results in terrible performance if you consider that storing a single document instance leads to close to 120 objects being inserted in a single transaction. And this explains why you can hardly insert more than 2 or 3 documents per second if you are uploading them into a Plone site. Separating data into separate BTrees or files will not solve the problem either since catalog indexes can easily contain 10 million or more objects in large deployments. If the lookup speed on a large index is 200 milliseconds and 1 second is the maximum response time you can afford, realise that you can only do 5 lookups in that second. You see the problem? As pointed out above, there is nothing wrong with the lookup speed. So the only real concern I have is that the insertion rate is dropping to rapidly.
More conclusions in a later post.
PLEASE NOTE THAT THE RESULTS IN THIS ARTICLE IS INCORRECT. ACCURATE RESULTS ARE DOCUMENTED IN A FOLLOW-UP POST NAMED ZODB BENCHMARKS REVISITED.






