Ever since we introduced our ideas the next version of akonadi, we’ve been working on a proof of concept implementation, but we haven’t talked a lot about it. I’d therefore like to give a short progress report.
By choosing decentralized storage and a key-value store as the underlying technology, we first need to prove that this approach can deliver the desired performance with all pieces of the infrastructure in place. I think we have mostly reached that milestone by now. The new architecture is very flexible and looks promising so far. We managed IMO quite well to keep the levels of abstraction to a necessary minimum, which results in a system that is easily adjusted as new problems need to be solved and feels very controllable from a developer perspective.
We’ve started off with implementing the full stack for a single resource and a single domain type. For this we developed a simple dummy-resource that currently has an in-memory hash map as backend, and can only store events. This is a sufficient first step, as turning that into the full solution is a matter of adding further flatbuffer schemas for other types and defining the relevant indexes necessary to query what we want to query. By only working on a single type we can first carve out the necessary interfaces and make sure that we make the effort required to add new types minimal and thus maximize code reuse.
The design we’re pursuing, as presented during the pim sprint, consists of:
A set of resource processes
A store per resource, maintained by the individual resources (there is no central store)
A set of indexes maintained by the individual resources
A clientapi that knows how to access the store and how to talk to the resources through a plugin provided by the resource implementation.
By now we can write to the dummyresource through the client api, the resource internally queues the new entity, updates it’s indexes and writes the entity to storage. On the reading part we can execute simple queries against the indexes and retrieve the found entities. The synchronizer process can meanwhile generate also new entities, so client and synchronizer can write concurrently to the store. We therefore can do the full write/read roundtrip meaning we have most fundamental requirements covered. Missing are other operations than creating new entities (removal and modifications), and the writeback to the source by the synchronizer. But that’s just a matter of completing the implementation (we have the design).
To the numbers: Writing from the client is currently implemented in a very inefficient way and it’s trivial to drastically improve this, but in my latest test I could already write ~240 (small) entities per second. Reading works around 40k entities per second (in a single query) including the lookup on the secondary index. The upper limit of what the storage itself can achieve (on my laptop) is at 30k entities per second to write, and 250k entities per second to read, so there is room for improvement =)
Given that design and performance look promising so far, the next milestone will be to refactor the codebase sufficiently to ensure new resources can be added with sufficient ease, and making sure all the necessary facilities (such as a proper logging system), or at least stubs thereof, are available.
I’m writing this on a plane to Singapore which we’re using as gateway to Indonesia to chase after waves and volcanoes for the next few weeks, but after that I’m looking forward to go full steam ahead with what we started here. I think it’s going to be something cool =)
Wouldn’t it be great if Kontact would allow you to select a set of folders you’re interested in, that setting would automatically be respected by all your devices and you’d still be able to control for each individual folder whether it should be visible and available offline?
I’ll line out a system that allows you to achieve just that in a groupware environment. I’ll take Kolab and calendar folders as example, but the concept applies to all groupware systems and is just as well applicable to email or other groupware content.
User Scenarios
Anna has access to hundreds of shared calendars, but she usually only uses a few selected ones. She therefore only has a subset of the available calendars enabled, that are shown to her in the calendar selection dialog, available for offline usage and also get synchronized to her mobile phone. If she realizes she no longer requires a calendar, she simply disables it and it disappears from the Kontact, the Webclient and her phone.
Joe works with a small team that shares their calendars with him. Usually he only uses the shared team-calendar, but sometimes he wants to quickly check if they are in the office before calling them, and he’s often doing this in the train with unreliable internet connection. He therefore disables the team member’s calendars but still enables synchronization for them. This hides the calendars from all his devices, but he still can quickly enable them on his laptop while being offline.
Fred has a mailing list folder that he always reads on his mobile, but never on his laptop. He keeps the folder enabled, but hides it on his laptop so his folder list isn’t unnecessarily cluttered.
What these scenarios tell us is that we need a flexible mechanism to specify the folders we want to see and the folders we want synchronized. Additionally we want, in today’s world where we have multiple devices, to synchronize the selection of folders that are important to us. It is likely I’d like to see the calendar I have just enabled in Kontact also on my phone. However, we always want to keep the possibility to alter that default setting on specific devices.
Current State
If you’re using a Kolab Server, you can use IMAP subscriptions to control what folders you want to see on your devices. Kontact currently respects that setting in that it makes all folders visible and available for offline usage. Additionally you have local subscriptions to disable certain folders (so they are not downloaded or displayed) on a specific device. That is not very flexible though, and personally I ended up having pretty much all folders enabled that I ever used, leading to cluttered folder selections and lot’s of bandwith and storage space used to keep everything available offline.
To change the subscription state, KMail offers to open the IMAP-subscription dialog which allows to toggle the subscription state of individual folders. This works, but is not well integrated (it’s a separate dialog), and is also not well integrable since it’s IMAP specific.
Because the solution is not well integrated, it tends to be rather static in my experience. I tend to subscribe to all folders that I ever use, which results in a very long and cluttered folder-list.
A new integrated subscription system
What would be much better, is if the back-end could provide a default setting that is synchronized to the server, and we could quickly enable or disable folders as we require them. Additionally we can override the default settings for each individual folder to optimize our setup as required.
To make the system more flexible, while not unnecessarily complex, we need a per folder setting that allows to override a backend provided default value. Additionally we need an interface for applications to alter the subscription state through Akonadi (instead of bypassing it). This allows for a well integrated solution that doesn’t rely on a separate, IMAP-specific dialog.
Each folder requires the following settings:
An enabled/disabled state that provides the default value for synchronizing and displaying a folder.
An explicit preference to synchronize a folder.
An explicit preference to make a folder visible.
A folder is visible if:
There is an explicit preference that the folder is visible.
There is no explicit preference on visibility and the folder is enabled.
A folder is synchronized if:
There is an explicit preference that the folder is synchronized.
There is no explicit preference on synchronization and the folder is enabled.
The resource-backend can synchronize the enabled/disabled state which should give a default experience as expected. Additionally it is possible to override that default state using the explicit preference on a per folder level.
User Interaction
By default you would be working with the enabled/disabled state, that is synchronized by the resource backend. If you enable a folder it becomes visible and synchronized, if you disable it, it becomes invisible and not synchronized. For the enabled/disabled state we can build a very easy user interface, as it is a single boolean state, that we can integrate into the primary UI.
Because the enabled/disabled state is synchronized, an enabled calendar will automatically appear on your MyKolab.com web interface and your mobile. One click, and you’re all set.
Example mockup of folder sync properties
In the advanced settings, you can then override visibility and synchronization preference at will as a local-only setting, giving you full flexibility. This can be hidden in a properties dialog, so it doesn’t clutter the primary UI.
This makes the default usecase very simple to use (you either want a folder or you don’t want it), while we keep full flexibility in overriding the default behaviour.
IMAP Synchronization
The IMAP resource will synchronize the enabled/disabled state with IMAP subscriptions if you have subscriptions enabled in the resource. This way we can use the enabled/disabled state as interface to change the subscriptions, and don’t have to use a separate dialog to toggle that state.
Interaction with existing mechanisms
This mechanism can probably replace local subscriptions eventually. However, in order not to break existing setups I plan to leave local subscriptions working as they currently are.
Conclusion
By implementing this proposal we get the required flexibility to make sure the resources of our machine are optimally used, while different clients still interact with each other as expected. Additionally we gain a uniform interface to enable/disable a collection that can be synchronized by backends (e.g. using the IMAP subscription state). This will allow applications to nicely integrate this setting, and should therefore make this feature a lot easier to use and overall more agile.
New doors are opened as this will enable us to do on-demand loading of folders. By having the complete folder list available locally (but disabled by default and thus hidden), we can use the collections to load their content temporarily and on-demand. Want to quickly look at that shared calendar you don’t have enabled? Simply search for it and have a quick look, the data is synchronized on-demand and the folder is as quickly gone as you found it, once it is no longer required. This will diminish the requirement to have folders constantly clutter your folder list even further.