Catalog query plan
While the catalog tool in Zope is immensely useful, we have seen some slowdowns in large Plone sites with a combination of additional indexes and lots of content.
Analyzing Catalog implementation
The catalog implementation is using BTree set operations like union, multiunion and intersection. Those operations are fairly fast, especially when everything is in memory. However, the catalog implementation is rather naïve which leads to lots of set operations on rather big sets.
Example
A typical simple scenario is an intranet with 150000 content items, most of them published, and local roles used to handle access to most of the content.
We want to look for pending Documents in a section of the site. In addition to the specified requirements, the catalog tool in Plone would add a search for allowedRolesAndUsers and effectiveRange to only return results the user is allowed to see.
There is no explicit ordering of the indexes, so in our first example our ordering is portal_type, allowedRolesAndUsers, review_state, effectiveRange, path. The first index, portal_type returns 90000 items. Then we continue on to allowedRolesAndUsers, which returns 40000 items. This is intersected with the the previous set, yielding a set of 20000. As all the content in this section is published, there are no items in the review_state=pending, which gives an empty set when intersected with the previous set. effectiveRange returns 140000 items, which is intersected with empty set. path returns 400 items which is once again intersected with the empty result set. The end result is an empty set and the query typically takes 0.15 seconds on my laptop.
As a basic first improvement, it would help to immediately return after getting an empty result set from an index, but we can go even further.
Query plan
Search engines and databases uses query optimizers to select query plans that will minimize the result set as early as possible, because working with large amounts of data is time consuming.
What we want to do is to search against the indexes giving the smallest result set first. However, for that to be useful, we need to pass that result along into the indexes to allow the indexes to limit the result set as soon as possible internally. When calculating a path search, there is no need to look in all 150000 results if the portal type index has already limited the possible result to 10000. If we have already limited the result to 10000 results, all set operations are going to be significantly faster.
Solution
We identify different searches by the list of indexes that are searched. If there are no query plans for a set of indexes, the query is run like normal while storing the number of results for each index. When all indexes have been checked, the list is sorted on number of results and stored as a query plan. Next time a search on the same indexes comes in, the query plan is looked up.
To get different query plans for similar queries, you can provide additional bogus index names. They will be ignored by the catalog, but will become part of the key. This means that if you search for Documents in draft state for a worklist, you can have a different ordering than when searching for published Documents, as there are likely to be very few items in draft state, but many in published state.
Results
I tested this in an example site locally with a 15GB Data.fs, 150000 content items, and running a JMeter test plan that logged in, viewed a couple of pages and logged out again.
When measuring in the regular catalog, there were 1220 queries taking 81 seconds with the regular catalog, which is 66ms per query. When using the query plan, the queries took 14.1 seconds, which is 11ms per query. The catalog implementation is also used for membrane and reference catalog to look up users and objects, but those have significantly fewer indexes than the regular catalog tool, so we also measure on the Catalog Tool.
When measuring only the catalog tool, there were 510 queries taking 80.9 seconds with the regular catalog, which is 159ms per query. With the query plan it took 14.3 seconds, which is 28ms per query. We consider 14.3 vs 14.1 to be within normal variations when measuring, and the numbers indicate that the time spent in membrane and reference catalog can be ignored compared to the time spent in the catalog tool.
Going from 159ms to 28ms for typical queries is a noticeable improvement. If you are mildly abusing the catalog, this might add up to a second or more on a single page view.
Interested?
You can try these performance improvements by installing the experimental.catalogqueryplan package in your Plone site. Just add it to the eggs list in your buildout configuration and either import experimental.catalogqueryplan from your code, or load its zcml.
Sponsored by
--
Helge Tesdal