Optimizing Kube’s storage

Near the middle / end of my internship, I got to modify parts of the storage system in Sink, the library handling all the data of Kube.

The goal was to both to speed up the storage and reducing disk space. These two goals often go hands in hand in databases, since smaller data means faster disk lookup, and more data put in memory, available for direct usage.

Reducing key sizes

The first important modification was for keys to be stored in a binary format instead of the displayable format. This allowed data retrieval to be faster, because the binary format would be much smaller than the display format.

On the memory usage side, this is a bit more awkward to measure: on the one hand, the since the size of keys is smaller in memory, but on the other hand, LMDB (the database system we are using) will try to put more data in memory.

But on the whole, either we are using less memory, or Kube will be faster altogether. The reality is probably a mix of the two.

As for the numbers, these are the results of two different benchmarks:

Develop branch This patch
Current Rss usage [kb]: 40700 Current Rss usage [kb]: 39112
On disk [kb]: 10788 On disk [kb]: 8836
Write amplification: 12.0075 Write amplification: 9.83485
Develop branch This patch
Total pages: 760 Total pages: 603
Used size: 1425408 Used size: 1191936
Total on disk: 3293184 Total on disk: 2650112
Write amplification: 3.63268 Write amplification: 3.51402

As we can see on both tables, we use less disk space after the patch, but memory usage has not gone down in all places.

Overall we use approximately 20% less disk space.

Separating UIDs and revisions

Another important modification on the storage system was separating UIDs and revisions. Before this patch, we used to store the UID and the revision as keys inside the database.

What’s more is that the key was in the format “{this-is-a-uid}0000000000042”. The reason for padding the number with zeroes was that we need the data to be ordered, and we needed the keys to be stored as a string for the UID.

However, every revision is unique, so we can store only the revision as a key. This also allows us to store the key as a number, use the “integer key” feature of LMDB so the sorting stays correct and save a nice amout of space.

But, since we can keep old revisions, we need to track them. Therefore we need another table for mapping UIDs and revisions.

The rationale behind this patch is that since the “uid to revisions” database will be much smaller than the main database, it can be put into memory, leading to faster lookups.

Even though the results were not as consequent as the “reducing key sizes” patch, it seems we still got a small improvement performance wise. We will better see how different the performance is in the near future, though.

In the meantime, we can use the benchmarks graphics to see how performance, memory and disk usage was impacted (courtesy of our CI):

Disk usage with dummy writes
Disk usage with dummy writes
Write amplification
Write amplification
Memory usage with dummy writes
Memory usage with dummy writes
Initial query time
Initial query time
Pipeline performance
Pipeline performance

In these figures above, the blue circles shows the impact of the first patch. The green circles shows the impact of the second patch.

The important part of these wonderful graphs are that:

  • Disk usage has gone down, especially the write amplification
  • Memory usage has gone down a bit.
  • In “Memory usage with dummy writes”, we can also see the difference between counting the memory mapping of LMBD or not when measuring memory usage. This was explained in another blog post by Christian.
  • The “Initial query time” graph is interesting: it shows that in the initial version of the first patch, some keys were converted from the binary and the displayable format back and forth, leading to performance regression. This has been fixed quickly after and shows the true improvement of reducing key sizes.
  • Querying a mail from the database is now faster, but the whole pipeline is slower, so this still needs to be investigated.

Overall, I’m quite happy with both of these modifications, even if the second one has smaller benefits. That’s because in addition to the disk usage improvement and memory improvement, the code is a bit cleaner and more readable.


2 thoughts on “Optimizing Kube’s storage”

  1. Is there any potential for Kolab to hire you? You’ve made an impact on Kube, and so far it’s a fantastic client. Excited to see where it goes.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s