The year of Kube

After having reached the first milestone of a read-only prototype of Kube, it’s time to provide a lookout of what we plan to achieve in 2016.
I have put together a Roadmap, of what I think are realistic goals that we can achieve in 2016. Obviously this will evolve over time and we’ll keep adjusting this as we advance faster/slower or simply move in other directions.

Since we’re building a completely new technology stack, a lot of the roadmap revolves around ensuring that we can create what we envision technology wise,
and that we have the necessary infrastructure to move fast while having confidence in the quality. It’s important that we do this before growing the codebase too much so we can still make the necessary adjustments without having too much code to adjust.

On the UX side we’ll want to work on concepts and prototypes, although we’ll probably keep the first implemented UI’s to something fairly simple and standard.
Over time we have to build a vision where we want to go in the long run so this can steer the development. This will be a long and ongoing process involving not only wire-frames and mockups, but hopefully also user research and analysis of our problem space (how do we communicate rather than how does GMail work).

However, since we can’t just stomp that grander vision out of the ground, the primary goal for us this year, is a simple email client that doesn’t do much, but does what it does well. Hopefully we can go beyond that with some other components available (calendar, addressbook, …), or perhaps something simple available on mobile already, but we’ll have to see how fast it goes first. Overall we’ll want to focus on quality rather than quantity to prove what quality level we’re able to reach and to ensure we’re well lined up to move fast in the following year(s).

The Roadmap

I split the roadmap into four quarters, each having it’s own focus. Note that Akonadi Next has been renamed to Sink to avoid confusion (now that Akonadi 5 is released and we planned for Akonadi2…).

1. Quarter

– Read-only Kube Mail prototype.
– Fully functional Kube Mail prototype but with very limited functionality set (read and compose mail).
– Testenvironment that is also usable by designers.
– Logging mechanism in Sink and potentially Kube so we can produce comprehensive logs.
– Automatic gathering of performance statistics so we can benchmark and prove progress over time.
– The code inventory1 is completed and we know what features we used to have in Kontact.
– Sink Maildir resource.
– Start of gathering of requirements for Kube Mail (features, ….).
– Start of UX design work.

We focus on pushing forward functionality wise, and refactoring the codebase every now and then to get a feeling how we can build applications with the new framework.
The UI is not a major focus, but we may start doing some preparatory work on how things eventually should be. Not much attention is paid to usability etc.
Once we have the Kube Mail prototype ready, with a minimum set of features, but a reasonable codebase and stability (so it becomes somewhat useful for the ones that want to give it a try), we start communicating about it more with regular blogposts etc.

2. Quarter

– Build on Windows.
– Build on Mac.
– Comprehensive automated testing of the full application.
– First prototype on Android.
– First prototype on Plasma Mobile?
– Sink IMAP resource.
– Sink Kolab resource.
– Sink ICal resource.
– Start of gathering of performance requirements for Kube Mail (responsiveness, disk-usage, ….)
– Define target feature set to reach by the end of the year.

We ensure the codebase builds on all major platforms and ensure it keeps building and working everywhere. We ensure we can test everything we need, and work out what we want to test (i.e. including UI or not). Kube is extended with further functionality and we develop the means to access a Kolab/IMAP Server (perhaps with mail only).

3. Quarter

– Prototype for Kube Shell.
– Prototype for Kube Calendar.
– Potentially prototype for other Kube applications.
– Rough UX Design for most applications that are part of Kube.
– Implementation of further features in Kube Mail according to the defined feature set.

We start working on prototypes with other datatypes, which includes data access as well as UI. The implemented UI’s are not final, but we end up with a usable calendar. We keep working on the concepts and designs, and we approximately know what we want to end up with.

4. Quarter

– Implementation of the final UI for the Kube Mail release.
– Potentially also implementation of a final UI for other components already.
– UX Design for all applications “completed” (it’s never complete but we have a version that we want to implement).
– Tests with users.

We polish Kube Mail, ensure it’s easy to install and setup on all platforms and that all the implemented features work flawlessly.

Progress so far

Currently we have a prototype that has:
– A read-only maildir resource.
– HTML rendering of emails.
– Basic actions such as deleting a mail.

My plan is to hook the Maildir resource up with offlineimap, so I can start reading my mail in Kube within the next weeks 😉

Next to this we’re working on infrastructure, documentation, planning, UI Design…
Current progress can be followed in our Phabricator projects 23, and the documentation, while still lagging behind, is starting to take shape in the “docs/” subdirectory of the respective repositories45.

There’s meanwhile also a prototype of a docker container to experiment with available 6, and the Sink documentation explains how we currently build Sink and Kube inside a docker container with kdesrcbuild.

Join the Fun

We have weekly hangouts on that you are welcome to join (just contact me directly or write to the kde-pim mailinglist). The notes are on and are regularly sent to the kdepim mailinglist as well.
As you can guess the project is in a very early state, so we’re still mostly trying to get the whole framework into shape, and not so much writing the actual application. However, if you’re interested in trying to build the system on other platforms, working on UI concepts or generally tinker around with the codebase we have and help shaping what it should become, you’re more than welcome to join =)

  1. git:// 
  4. git:// 
  5. git:// 

Akonadi Next Cmd

For Akonadi Next I built a little utility that I intend to call “akonadi_cmd” and it’s slowly becoming useful.

It started as the first Akonadi Next client, for me to experiment a bit with the API, but it recently gained a bunch of commands and can now be used for various tasks.

The syntax is the following:
akonadi_cmd COMMAND TYPE ...

The Akonadi Next API always works on a single type, so you can i.e. query for folders, or mails, but not for folders and mails. Instead you query for the mails with a folder filter, if that’s what you’re looking for. akonadi_cmd’s syntax reflects that.


The list command allows to execute queries and retreive results in form of lists.
Eventually you will be able to specify which properties should be retrieved, for now it’s a hardcoded list for each type. It’s generally useful to check what the database contains and whether queries work.
Like list, but only output the result count.
Some statistics how large the database is, how the size is distributed accross indexes, etc.
Allows to create/modify/delete entities. Currently this is only of limited use, but works already nicely with resources. Eventually it will allow to i.e. create/modify/delete all kinds of entities such as events/mails/folders/….
Drops all caches of a resource but leaves the config intact. This is useful while developing because it i.e. allows to retry a sync, without having to configure the resource again.
Allows to synchronize a resource. For an imap resource that would mean that the remote server is contacted and the local dataset is brought up to date,
for a maildir resource it simply means all data is indexed and becomes queriable by akonadi.

Eventually this will allow to specify a query as well to i.e. only synchronize a specific folder.

Provides the same contents as “list” but in a graphical tree view. This was really just a way for me to test whether I can actually get data into a view, so I’m not sure if it will survive as a command. For the time being it’s nice to compare it’s performance to the QML counterpart.

Setting up a new resource instance

akonadi_cmd is already the primary way how you create resource instances:

akonadi_cmd create resource org.kde.maildir path /home/developer/maildir1

This creates a resource of type “org.kde.maildir” and a configuration of “path” with the value “home/developer/maildir1”. Resources are stored in configuration files, so all this does is write to some config files.

akonadi_cmd list resource

By listing all available resources we can find the identifier of the resource that was automatically assigned.

akonadi_cmd synchronize org.kde.maildir.instance1

This triggers the actual synchronization in the resource, and from there on the data is available.

akonadi_cmd list folder org.kde.maildir.instance1

This will get you all folders that are in the resource.

akonadi_cmd remove resource org.kde.maildir.instance1

And this will finally remove all traces of the resource instance.


What’s perhaps interesting from the implementation side is that the command line tool uses exactly the same models that we also use in Kube.

    Akonadi2::Query query;
    query.resources << res.toLatin1();

    auto model = loadModel(type, query);
    QObject::connect(, &QAbstractItemModel::rowsInserted, [model](const QModelIndex &index, int start, int end) {
        for (int i = start; i <= end; i++) {
            std::cout << "\tRow " << model->rowCount() << ":\t ";
            std::cout << "\t" << model->data(model->index(i, 0, index), Akonadi2::Store::DomainObjectBaseRole).value<Akonadi2::ApplicationDomain::ApplicationDomainType::Ptr>()->identifier().toStdString() << "\t";
            for (int col = 0; col < model->columnCount(QModelIndex()); col++) {
                std::cout << "\t|" << model->data(model->index(i, col, index)).toString().toStdString();
            std::cout << std::endl;
    QObject::connect(, &QAbstractItemModel::dataChanged, [model, &app](const QModelIndex &, const QModelIndex &, const QVector<int> &roles) {
        if (roles.contains(Akonadi2::Store::ChildrenFetchedRole)) {
    if (!model->data(QModelIndex(), Akonadi2::Store::ChildrenFetchedRole).toBool()) {
        return app.exec();

This is possible because we’re using QAbstractItemModel as an asynchronous result set. While one could argue whether that is the best API for an application that is essentially synchronous, it still shows that the API is useful for a variety of applications.

And last but not least, since I figured out how to record animated gifs, the above procedure in a live demo 😉


Bringing Akonadi Next up to speed

It’s been a while since the last progress report on akonadi next. I’ve since spent a lot of time refactoring the existing codebase, pushing it a little further,
and refactoring it again, to make sure the codebase remains as clean as possible. The result of that is that an implementation of a simple resource only takes a couple of template instantiations, apart from code that interacts with the datasource (e.g. your IMAP Server) which I obviously can’t do for the resource.

Once I was happy with that, I looked a bit into performance, to ensure the goals are actually reachable. For write speed, operations need to be batched into database transactions, this is what allows the db to write up to 50’000 values per second on my system (4 year old laptop with an SSD and an i7). After implementing the batch processing, and without looking into any other bottlenecks, it can process now ~4’000 values per second, including updating ten secondary indexes. This is not yet ideal given what we should be able to reach, but does mean that a sync of 40’000 emails would be done within 10s, which is not bad already. Because commands first enter a persistent command queue, pulling the data offline is complete even faster actually, but that command queue afterwards needs to be processed for the data to become available to the clients and all of that together leads to the actual write speed.

On the reading side we’re at around 50’000 values per second, with the read time growing linearly with the amount of messages read. Again far from ideal, which is around 400’000 values per second for a single db (excluding index lookups), but still good enough to load large email folders in a matter of a second.

I implemented the benchmarks to get these numbers, so thanks to HAWD we should be able to track progress over time, once I setup a system to run the benchmarks regularly.

With performance being in an acceptable state, I will shift my focus to the revisioned, which is a prerequisite for the resource writeback to the source. After all, performance is supposed to be a desirable side-effect, and simplicity and ease of use the goal.


Coming up next week is the yearly Randa meeting where we will have the chance to sit together for a week and work on the future of Kontact. This meetings help tremendously in injecting momentum into the project, and we have a variety of topics to cover to direct the development for the time to come (and of course a lot of stuff to actively hack on). If you’d like to contribute to that you can help us with some funding. Much appreciated!

Kontact on Windows

I recently had the dubious pleasure of getting Kontact to work on windows, and after two weeks of agony it also yielded some results =)

Not only did I get Kontact to build on windows (sadly still something to be proud off), it is also largely functional. Even timezones are now working in a way that you can collaborate with non-windows users, although that required one or the other patch to kdelibs.

To make the whole excercise as reproducible as possible I collected my complete setup in a git repository [0]. Note that these builds are from the kolab stable branches, and not all the windows specific fixes have made it back upstream yet. That will follow as soon as the waters calm a bit.

If you want to try it yourself you can download an installer here [1],
and if you don’t (I won’t judge you for not using windows) you can look at the pretty pictures.



Reproducible testing with docker

Reproducible testing is hard, and doing it without automated tests, is even harder. With Kontact we’re unfortunately not yet in a position where we can cover all of the functionality by automated tests.

If manual testing is required, being able to bring the test system into a “clean” state after every test is key to reproducibility.

Fortunately we have a lightweight virtualization technology available with linux containers by now, and docker makes them fairly trivial to use.


Docker allows us to create, start and stop containers very easily based on images. Every image contains the current file system state, and each running container is essentially a chroot containing that image content, and a process running in it. Let that process be bash and you have pretty much a fully functional linux system.

The nice thing about this is that it is possible to run a Ubuntu 12.04 container on a Fedora 22 host (or whatever suits your fancy), and whatever I’m doing in the container, is not affected by what happens on the host system. So i.e. upgrading the host system does not affect the container.

Also, starting a container is a matter of a second.

Reproducible builds

There is a large variety of distributions out there, and every distribution has it’s own unique set of dependency versions, so if a colleague is facing a build issue, it is by no means guaranteed that I can reproduce the same problem on my system.

As an additional annoyance, any system upgrade can break my local build setup, meaning I have to be very careful with upgrading my system if I don’t have the time to rebuild it from scratch.

Moving the build system into a docker container therefore has a variety of advantages:
* Builds are reproducible across different machines
* Build dependencies can be centrally managed
* The build system is no longer affected by changes in the host system
* Building for different distributions is a matter of having a couple of docker containers

For building I chose to use kdesrc-build, so building all the necessary repositories is the least amount of effort.

Because I’m still editing the code from outside of the docker container (where my editor runs), I’m simply mounting the source code directory into the container. That way I don’t have to work inside the container, but my builds are still isolated.

Further I’m also mounting the install and build directories, meaning my containers don’t have to store anything and can be completely non-persistent (the less customized, the more reproducible), while I keep my builds fast and incremental. This is not about packaging after all.

Reproducible testing

Now we have a set of binaries that we compiled in a docker container using certain dependencies, so all we need to run the binaries is a docker container that has the necessary runtime dependencies installed.

After a bit of hackery to reuse the hosts X11 socket, it’s possible run graphical applications inside a properly setup container.

The binaries are directly mounted from the install directory, and the prepared docker image contains everything from the necessary configurations to a seeded Kontact configuration for what I need to test. That way it is guaranteed that every time I start the container, Kontact starts up in exactly the same state, zero clicks required. Issues discovered that way can very reliably be reproduced across different machines, as the only thing that differs between two setups is the used hardware (which is largely irrelevant for Kontact).

..with a server

Because I’m typically testing Kontact against a Kolab server, I of course also have a docker container running Kolab. I can again seed the image with various settings (I have for instance a John Doe account setup, for which I have the account and credentials already setup in client container), and the server is completely fresh on every start.

Wrapping it all up

Because a bunch of commands is involved, it’s worthwhile writing a couple of scripts to make the usage a easy as possible.

I went for a python wrapper which allows me to:
* build and install kdepim: “devenv srcbuild install kdepim”
* get a shell in the kdepim dir: “devenv srcbuild shell kdepim”
* start the test environment: “devenv start set1 john”

When starting the environment the first parameter defines the dataset used by the server, and the second one specifies which client to start, so I can have two Kontact instances with different users for invitation handling testing and such.

Of course you can issue any arbitrary command inside the container, so this can be extended however necessary.

While that would of course have been possible with VMs for a long time, there is a fundamental difference in performance. Executing the build has no noticeable delay compared to simply issuing make, and that includes creating a container from an image, starting the container, and cleaning it up afterwards. Starting the test server + client also takes all of 3 seconds. This kind of efficiency is really what enables us to use this in a lather, rinse, repeat approach.

The development environment

I’m still using the development environment on the host system, so all file-editing and git handling etc. happens as usual so far. I still require the build dependencies on the host system, so clang can compile my files (using YouCompleteMe) and hint if I made a typo, but at least these dependencies are decoupled from what I’m using to build Kontact itself.

I also did a little bit of integration in Vim, so my Make command now actually executes the docker command. This way I get seamless integration and I don’t even notice that I’m no longer building on the host system. Sweet.

While I’m using Vim, there’s no reason why that shouldn’t work with KDevelop (or whatever really..).

I might dockerize my development environment as well (vim + tmux + zsh + git), but more on that in another post.

Overall I’m very happy with the results of investing in a couple of docker containers, and I doubt we could have done the work we did, without that setup. At least not without a bunch of dedicated machines just for that. I’m likely to invest more in that setup, and I’m currently contemplating dockerizing also my development setup.

In any case, sources can be found here:

Progress on the prototype for a possible next version of akonadi

Ever since we introduced our ideas the next version of akonadi, we’ve been working on a proof of concept implementation, but we haven’t talked a lot about it. I’d therefore like to give a short progress report.

By choosing decentralized storage and a key-value store as the underlying technology, we first need to prove that this approach can deliver the desired performance with all pieces of the infrastructure in place. I think we have mostly reached that milestone by now. The new architecture is very flexible and looks promising so far. We managed IMO quite well to keep the levels of abstraction to a necessary minimum, which results in a system that is easily adjusted as new problems need to be solved and feels very controllable from a developer perspective.

We’ve started off with implementing the full stack for a single resource and a single domain type. For this we developed a simple dummy-resource that currently has an in-memory hash map as backend, and can only store events. This is a sufficient first step, as turning that into the full solution is a matter of adding further flatbuffer schemas for other types and defining the relevant indexes necessary to query what we want to query. By only working on a single type we can first carve out the necessary interfaces and make sure that we make the effort required to add new types minimal and thus maximize code reuse.

The design we’re pursuing, as presented during the pim sprint, consists of:

  • A set of resource processes
  • A store per resource, maintained by the individual resources (there is no central store)
  • A set of indexes maintained by the individual resources
  • A clientapi that knows how to access the store and how to talk to the resources through a plugin provided by the resource implementation.

By now we can write to the dummyresource through the client api, the resource internally queues the new entity, updates it’s indexes and writes the entity to storage. On the reading part we can execute simple queries against the indexes and retrieve the found entities. The synchronizer process can meanwhile generate also new entities, so client and synchronizer can write concurrently to the store. We therefore can do the full write/read roundtrip meaning we have most fundamental requirements covered. Missing are other operations than creating new entities (removal and modifications), and the writeback to the source by the synchronizer. But that’s just a matter of completing the implementation (we have the design).

To the numbers: Writing from the client is currently implemented in a very inefficient way and it’s trivial to drastically improve this, but in my latest test I could already write ~240 (small) entities per second. Reading works around 40k entities per second (in a single query) including the lookup on the secondary index. The upper limit of what the storage itself can achieve (on my laptop) is at 30k entities per second to write, and 250k entities per second to read, so there is room for improvement =)

Given that design and performance look promising so far, the next milestone will be to refactor the codebase sufficiently to ensure new resources can be added with sufficient ease, and making sure all the necessary facilities (such as a proper logging system), or at least stubs thereof, are available.

I’m writing this on a plane to Singapore which we’re using as gateway to Indonesia to chase after waves and volcanoes for the next few weeks, but after that I’m  looking forward to go full steam ahead with what we started here. I think it’s going to be something cool =)

On Domain Models and Layers in kdepim

In our current kdepim code we use some classes throughout the codebase. I’m going to line out the problems with that and propose how we can do better.

The Application Domain

Each application has a “domain” it was created for. KOrganizer has for instance the calendar domain, and kmail the email domain, and each of those domains can be described with domain objects, which make up the domain model. The domain model of an application is essential, because it is what defines how we can represent the problems of that domain. If Korganizer didn’t have a domain model with attendees to events, we wouldn’t have any way to represent attendees internally, and thus couldn’t develop a feature based on that.

The logic implementing the functionality on top of those domain objects is the domain logic. It implements for instance what has to happen if we remove an event from a calendar, or how we can calculate all occurrences of a recurring event.

In the calendaring domain we use KCalCore to provide many of those domain objects and a large part of the domain logic. KCalCore::Event for instance, represents an event, can hold all necessary data of that event, and has the domain logic directly built-in to calculate recurrences.
Since it is a public library, it provides domain-objects and the domain-logic for all calendaring applications, which is awesome, right? Only if you use it right.


KCalCore provides additionally to the containers and the calendaring logic also serialization to the iCalendar format, which is also why it more or less tries to adhere to the iCalendar RFC, for both representation an interpretation of calendaring data. This is of course very useful for applications that deal with that, and there’s nothing particularly wrong with it. One could argue that serialization and interpretation of calendaring data should be split up, but since both is described by the same RFC I think it makes a lot of sense to keep the implementations together.


A problem arises when classes like KCalCore::Event are used as domain objects, and interface for the storage layer, and as actual storage format, which is precisely what we do in kdepim.

The problem is that we introduce very high coupling between those components/layers and by choosing a library that adheres to an RFC the whole system is even locked down by a fully grown specification. I suppose that would be fine if only one application is using the storage layer,
and that application’s sole purpose is to implement exactly that RFC and nothing else, ever. In all other cases I think it is a mistake.

Domain Logic

The domain logic of an application has to evolve with the application by definition. The domain objects used for that are supposed to model the problem at hand precisely, in a way that a domain logic can be built that is easy to understand and evolve as requirements change. Properties that are not used by an application only hide the important bits of a domain objects, and if a new feature is added it must be possible to adjust the domain object to reflect that. By using a class like KCalCore::Event for the domain object, these adjustments become largely impossible.

The consequence is that we employ workarounds everywhere. KCalCore doesn’t provide what you need? Simply store it as “custom property”. We don’t have a class for calendars? Let’s use Akonadi::Collection with some custom attributes. Mechanisms have been designed to extend these rigid structures so we can at least work with it, but that only lead to more complex code that is ever harder to understand.

Instead we could write domain logic that expresses precisely what we need, and is easier to understand and maintain.

Zanshin for instance took the calendaring domain, and applied the GettingThingsDone (GTD) methodology to it. It takes a rather simple approach to todo’s and initially only required a description, a due date and a state. However, it introduced the notion that only “projects” can have subtodo’s. This restriction needs to be reflected in the domain model, and implemented in the domain logic.
Because there are no projects in KCalCore, it was simply defined that todo’s with a magic property “X-Project” are defined as project. There’s nothing wrong with that itself, but you don’t want to litter your code with “if (todo->hasProperty(X-Project))”. So what do you do? You create a wrapper. And that wrapper is now already your new domain object with a nice interface. Kevin fortunately realized that we can do better, and rewrote zanshin with its own custom domain objects, that simply interface with the KCalCore containers in a thin translation layer to akonadi. This made the code much clearer, and keeps those “x-property”-workarounds in one place only.


A useful approach to think about application architecture are IMO layers. It’s not a silver bullet, and shouldn’t be done too excessively I think, but in some cases layer do make a lot of sense. I suggest to think about the following layers:

  • The presentation layer: Displays stuff and nothing else. This is where you expose your domain model to the UI, and where your QML sits.
  • The domain layer: The core of the application. This is where all the useful magic happens.
  • The data access layer: A thin translation layer between domain and storage. It makes it possible to use the same storage layer from multiple domains and to replace the storage layer without replacing all the rest.
  • The storage layer: The layer that persists the domain model. Akonadi.

By keeping these layer’s in mind we can do a better job at keeping the coupling at a reasonable level, allowing individual components to  change as required.

The presentation layer is required in any case if we want to move to QML. With QML we can no longer have half of the domain logic in the UI code, and most of the domain model should probably be exposed as a model that is directly accessible by QML.

The data access layer is where akonadi provides a standardized interface for all data, so multiple applications can shared the same storage layer. This is currently made up by the i.e. KCalCore for calendars, the akonadi client API, and a couple of akonadi objects, such as Akonadi::Item and Akonadi::Collection. As this layer defines what data can be accessed by all applications, it needs to be flexible and likely has to be evolved frequently.

The way forward

For akonadi’s client API, aka the data access layer, I plan on defining a set of interfaces for things like calendars, events, mailfolders, emails, etc. This should eventually replace KCalCore, KContacts and friends from being the canonical interface to the data.

Applications should eventually move to their own domain logic implementation. For reading and structuring data, models are IMO a suitable tool, and if we design them right this will also pave the way for QML interfaces. Of course i.e. KCalCore still has its uses for its calendaring routines, or as a serialization library to create iTip messages, but we should IMO stop using it for everything. The same of course applies to KContacts.

What we still could do IMO, is share some domain logic between applications, including some domain objects. A KDEPIM::Domain::Contact could be used across applications, just like KContact::Adressee was. This keeps different applications from implementing the same logic, but of course also introduces coupling between those again.

What IMO has to stay separate is the data access layer, which implements an interface to the storage layer, and that doesn’t necessarily conform to the domain layer (you could i.e. store “Blog posts” as notes in storage). This separation is IMO useful, as I expect the application domain to evolve separately from what actual storage backends provide (see zanshin).

This is of course quite a chunk of work, that won’t happen at once. But need to know now where we want to end up in a couple of years, if we intend to ever get there.