Plone performance sprint Bristol 2008
It's been some time since the performance sprint in Bristol now, but it certainly deserves a summary.
Location
Turns out Bristol is a great city with a wonderful atmosphere. It has plenty of history and old buildings (unlike Norway). Netsight also picked a great location for the sprint, and provided plenty of refreshments to keep the geeks going through long days.
Sprint topics
Automated performance tests
The most important result from the performance sprint is probably the automated FunkLoad tests. In Copenhagen in 2007 we looked into using JMeter, but it was too much hassle to work with, and we didn't get beyond creating some test plan templates for Plone.
With collective.loadtesting we have something that is a lot easier to set up and use in a buildout, and we can have daily performance runs that keep track of the general performance of Plone (although the daily runs don't seem to be set up yet). This means we can immediately spot changes that hurt performance, and make informed decisions about whether the advantages of new functionality or code changes are worth the performance loss.
Instrumentation
Instrumentation was the other important topic at the sprint. We always need more performance data, more accurate information, and better ways of presenting that data to enable us to understand and improve the system we're profiling.
Enter mr.bent. Mr. Bent knows his numbers, and is a framework for allowing profile data to be collected in a Python application and viewed at different logical levels.
One example of usage is collective.performancecolouriser, which color-codes the different page components by their timing info so the developer can see at a glance where the time goes.
ExtendedPathIndex refactoring
While the ExtendedPathIndex has worked quite well for a long time, the implementation was rather opaque, or obfuscated if you like.
Martijn Pieters has been planning a rewrite for quite some time, and at the sprint he finally had the opportunity to do so. He found several bugs in the old implementation, added tests and implemented all features for all scenarios, while also improving performance by being smarter about set operation ordering.
The improvements were released as version 2.5, and will be included in Plone at the earliest convenience.
Catalog
experimental.catalogqueryplan saw two improvements. The first one is faster set operations when intersecting a large and a small set. In Plone, especially with Membrane, there are some cases where you have a rather small result set from one index (like user or group ids) and a large result set from another (implemented interfaces or permissions). In those cases it is more efficient to check whether the large set has each key of the small set by direct lookup than to scan all keys for a match. The size check is implemented in Python as a temporary monkeypatch used for the catalogqueryplan indexes, and gives a noticeable improvement. The exact improvement depends on whether the values in the small set are found at the beginning ('Low values' in the graph) or at the end ('high values' in the graph) of the large set. Checking for a key might be as much as 20x faster.
The graph shows the timing of an intersection between a set of 10000 items and a subset of 10 items, repeated 100 times. The green bar is the regular C implementation from IIBTree, the yellow bar is our Python implementation using has_key with fallback to the C implementation, and the blue bar is the builtin Python set (instead of the IISet from Zope).
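To make the idea concrete, here is a minimal sketch of the lookup-based intersection. It uses plain Python sets as a stand-in for IISet, and the function name and the ratio threshold are made up for illustration; this is not the actual monkeypatch from experimental.catalogqueryplan.

```python
def intersect_with_size_check(a, b, ratio=10):
    """Intersect two sets, probing the large one when sizes differ a lot.

    'ratio' is an assumed threshold, not a value taken from the package.
    """
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    if len(small) * ratio <= len(large):
        # Few keys to check: look each one up directly in the large set.
        return {key for key in small if key in large}
    # Sizes are comparable: fall back to the regular intersection.
    return a & b


large = set(range(10000))
small = {9990, 9995, 10001}
result = intersect_with_size_check(small, large)
```

With the values above, only the two keys that exist in the large set survive, and the large set is never scanned.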
Inside the indexes, we now sort the sets by length before doing intersection operations, so the smallest sets are used first. When doing OR operations, we use a single multiunion instead of several pairwise unions. This should be noticeable for the permissions index (allowedRolesAndUsers), for example.
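The ordering trick can be sketched roughly like this, again with builtin Python sets rather than the BTrees types the indexes really use. The helper names are hypothetical, and union_all only mimics what multiunion from BTrees.IIBTree does in a single call.

```python
def intersect_all(sets):
    """Intersect several sets, starting with the smallest ones."""
    ordered = sorted(sets, key=len)
    if not ordered:
        return set()
    result = set(ordered[0])
    for other in ordered[1:]:
        if not result:
            # Nothing left to match, so the remaining sets can be skipped.
            break
        result &= other
    return result


def union_all(sets):
    """Combine all sets in one operation instead of pairwise unions."""
    return set().union(*sets)


matches = intersect_all([set(range(1000)), {5, 7, 2000}, set(range(0, 1000, 5))])
allowed = union_all([{1, 2}, {2, 3}, {10}])
```

Starting with the smallest set keeps the intermediate result small, so each later intersection has less work to do.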
The improvements are released as version 1.1, and can easily be tested by including experimental.catalogqueryplan in your buildout.
The second improvement is the inclusion of a new index. BooleanIndex is a simplified FieldIndex which only stores True values and ignores everything else. This will lower the object count slightly for indexes like default page and folderish and will enable the use of the intersection improvements for smaller sets. This only works as long as there are fewer True than False values indexed.
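As a rough illustration of the concept, an index that only remembers True values could look something like the sketch below. The class and method names are invented for this example and do not reflect the real BooleanIndex code; dropping an id when it is reindexed with a non-True value is also an assumption.

```python
class BooleanIndexSketch:
    """Toy index that stores only the ids whose value is True."""

    def __init__(self):
        self._true_ids = set()

    def index(self, doc_id, value):
        if value is True:
            self._true_ids.add(doc_id)
        else:
            # Everything that is not True is ignored (and forgotten).
            self._true_ids.discard(doc_id)

    def unindex(self, doc_id):
        self._true_ids.discard(doc_id)

    def true_ids(self):
        # Return a copy so callers cannot modify the index by accident.
        return set(self._true_ids)


idx = BooleanIndexSketch()
idx.index(1, True)
idx.index(2, False)
idx.index(3, True)
idx.index(3, False)
```

Since only the True ids are kept, the stored set stays small as long as True is the rare value, which is what lets it benefit from the small-set intersection described above.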
-- Helge Tesdal