How we sped up our API
Our backend, the eXist database (https://exist-db.org), is an integral part of the ART-DECOR® tool suite. The add-on Roaster, the Open API Router for eXist from the TEI Publisher Project Team, routes all REST requests to the corresponding XQueries.
The ART-DECOR® Expert Team is not only our toolsmith team, driving the development of ART-DECOR®; it is also something of an alchemist’s kitchen. Recently, we rearranged the architecture of our Open API Roaster software component (see repository) and made it tremendously faster.
The API allows full create-read-update-delete (CRUD) access to various objects. Each type of object has its own recognizable URL path. Every combination of a URL path and an HTTP method connects to an XQuery function through the Open API operationId. This function handles the requested CRUD logic. The functions are organized by object type into XQuery module (xqm) files.
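The chain described above (URL path plus HTTP method, then operationId, then function) can be sketched in a few lines. The sketch is in Python rather than XQuery, purely for illustration. It is not Roaster's implementation, and all paths, operationIds and handler names in it are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Params = Dict[str, str]

# Hypothetical handlers; in the real system these would be XQuery functions
# living in per-object xqm modules.
def read_project(params: Params) -> str:
    return f"read project {params['id']}"

def delete_project(params: Params) -> str:
    return f"delete project {params['id']}"

def read_scenario(params: Params) -> str:
    return f"read scenario {params['id']}"

# (HTTP method, path template) -> operationId. All entries are assumptions.
OPERATION_IDS: Dict[Tuple[str, str], str] = {
    ("GET", "/project/{id}"): "project:read",
    ("DELETE", "/project/{id}"): "project:delete",
    ("GET", "/scenario/{id}"): "scenario:read",
}

# operationId -> function that implements the requested CRUD logic.
HANDLERS: Dict[str, Callable[[Params], str]] = {
    "project:read": read_project,
    "project:delete": delete_project,
    "scenario:read": read_scenario,
}

def match(template: str, path: str) -> Optional[Params]:
    """Return the path parameters if path fits the template, else None."""
    template_parts = template.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(template_parts) != len(path_parts):
        return None
    params: Params = {}
    for expected, actual in zip(template_parts, path_parts):
        if expected.startswith("{") and expected.endswith("}"):
            params[expected[1:-1]] = actual  # placeholder matches any value
        elif expected != actual:
            return None
    return params

def dispatch(method: str, path: str) -> str:
    """Resolve method + path to an operationId and call its handler."""
    for (route_method, template), operation_id in OPERATION_IDS.items():
        if route_method != method.upper():
            continue
        params = match(template, path)
        if params is not None:
            return HANDLERS[operation_id](params)
    raise LookupError(f"no operation for {method} {path}")
```

Calling dispatch("GET", "/project/demo1") resolves to the hypothetical operationId project:read and runs read_project. A method and path combination that is not declared raises LookupError.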
This means that the central controller can determine exactly which xqm file holds the logic for a request simply by inspecting the requested path. This is an extremely lightweight introspection without overhead. The central controller then hands the request off to a second, object-specific controller.
The second controller can operate in ‘user’ or ‘admin’ mode through sticky bits on user and group. It imports the required function module and lets Roaster orchestrate the request logic. Without this import, Roaster cannot resolve the function called from Open API. Because the controller only needs to handle the paths of a single object, it has only one, sometimes two, xqm imports instead of the 22 xqm files needed to power all paths. The more xqm files are imported, the longer it takes before eXist-db has loaded the xql and starts processing the request. This loading happens again on every incoming request, so every request benefits from shorter load times.
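To make the import argument concrete, here is a toy cost model in Python. It assumes that every imported module adds the same fixed load cost, which is a simplification and not a measurement. The module names are placeholders, and only the counts (22 versus one or two) come from the text above.

```python
from typing import Sequence

# 22 placeholder module names standing in for the real xqm files.
ALL_MODULES = [f"object_{n:02d}.xqm" for n in range(1, 23)]

def load_cost(imports: Sequence[str], cost_per_import: float = 1.0) -> float:
    """Per-request load overhead under the assumed linear model."""
    return len(imports) * cost_per_import

# One controller that imports everything, paid on every request.
monolithic = load_cost(ALL_MODULES)

# Object-specific controllers with one or two imports.
single_import = load_cost(["object_01.xqm"])
double_import = load_cost(["object_01.xqm", "object_02.xqm"])

print(f"all modules: {monolithic:.0f} units per request")
print(f"one module:  {single_import:.0f} unit per request")
print(f"two modules: {double_import:.0f} units per request")
```

Under this assumption the per-request load overhead shrinks in proportion to the number of imports. Real modules differ in size and in their own dependencies, so actual ratios will vary.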
Before arriving at this design, we considered a few theories about where the time was being lost.
Theory 1: Loading large JSON blobs and converting them into maps, as happens with the API JSON file that holds the paths, takes a long time. Splitting up that file could therefore help performance. We investigated this theory, but it was not the issue per se.
Theory 2: Every time a path is called, the called xql file walks through its imports and loads other libraries before it can start the actual processing. Rearranging that procedure, so that an initial step first chooses what to call, may bring a performance boost. That did it!
The controller has now been rearranged to be a little smarter. Of course, it still respects our permission requirements, but it now calls distinct xql files based on the first part of the REST call. For example, the controller handles a call starting with /project by calling just the project xql, a call starting with /scenario by calling the scenario xql, and so on.
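The selection step can be pictured as a lookup on the first path segment. The following Python sketch only illustrates the idea and is not the eXist controller itself. The file names project.xql and scenario.xql are assumed from the description above, and the error handling is hypothetical.

```python
from typing import Dict

# First path segment -> object-specific xql. File names are assumptions.
OBJECT_XQL: Dict[str, str] = {
    "project": "project.xql",
    "scenario": "scenario.xql",
}

def first_segment(path: str) -> str:
    """Return the first segment of a request path, ignoring any query string."""
    path_only = path.split("?", 1)[0]
    return path_only.lstrip("/").split("/", 1)[0]

def select_xql(path: str) -> str:
    """Pick the xql that should handle the request, based on its first segment."""
    segment = first_segment(path)
    try:
        return OBJECT_XQL[segment]
    except KeyError:
        raise LookupError(f"no controller registered for '/{segment}'") from None

print(select_xql("/project/demo1/history"))
print(select_xql("/scenario"))
```

The lookup never opens or parses any module. It is a single dictionary access, which is why this kind of dispatch adds so little to each request.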
With only a few exceptions in our setting, we have now split the ART-DECOR API into 22 xql files and gained a big performance boost without doing anything other than refactoring the eXist controller and query logic. Special paths go to special places, and that reduced loading times dramatically.
The exact benefits differ. Of course, we have long-running queries in which the query itself takes up most of the processing time. There, the initial load is relatively small compared to the total processing time, but these queries profit from the redesign, too. For the majority of calls, our performance tests comparing the old and the new API showed that the median processing time dropped to 0.1–0.09 times its former value, which makes the new API 10 to 11 times faster.
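The step from the time factor to the speed-up is simply the reciprocal. A short Python check follows, using only the two factors quoted above. The function name is chosen for illustration.

```python
def speedup(time_factor: float) -> float:
    """Convert 'new time / old time' into 'how many times faster'."""
    if time_factor <= 0:
        raise ValueError("time factor must be positive")
    return 1.0 / time_factor

for factor in (0.1, 0.09):
    print(f"time factor {factor} -> {speedup(factor):.1f}x faster")
# time factor 0.1 -> 10.0x faster
# time factor 0.09 -> 11.1x faster
```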