Putting the code where it belongs

I have been working on better ways to write asynchronous code. In this post I’m going to analyze one of our current tools, KJob, looking at how it helps us write asynchronous code and what is missing. I’m then going to present my prototype solution to address these problems.

KJob

In KDE we have the KJob class to wrap asynchronous operations. KJob gives us a framework for progress and error reporting and a uniform start method, and by subclassing it we can easily write our own reusable asynchronous operations. Such an asynchronous operation typically takes a couple of arguments and returns a result.

A KJob, in its simplest form, is the asynchronous equivalent of a function call:

int doSomething(int argument) {
    return getNumber(argument);
}
struct DoSomething : public KJob {
    Q_OBJECT
public:
    DoSomething(int argument) : mArgument(argument) {}

    void start() {
        KJob *job = getNumberAsync(mArgument);
        connect(job, SIGNAL(result(KJob*)), this, SLOT(onJobDone(KJob*)));
        job->start();
    }

    int mResult;
    int mArgument;

private slots:
    void onJobDone(KJob *job) {
        mResult = job->result();
        emitResult();
    }
};

What you’ll notice immediately is that this involves a lot of boilerplate code. It also introduces a lot of complexity into a seemingly trivial task. This is partially because we have to create a class when we actually wanted a function, and partially because we have to use class members to replace the variables on the stack that we don’t have available during an asynchronous operation.

So while KJob gives us a tool to wrap asynchronous operations in a way that makes them reusable, it comes at the cost of quite a bit of boilerplate code. It also means that what can be written synchronously in a simple function requires a class when the same code is written asynchronously.

Inversion of Control

A typical operation is of course slightly more complex than doSomething, and often consists of several (asynchronous) operations itself.

What in imperative code looks like this:

int doSomethingComplex(int argument) {
    return operation2(operation1(argument));
}

…results in an asynchronous operation that is scattered over multiple result handlers somewhat like this:

...
void start() {
    KJob *job = operation1(mArgument);
    connect(job, SIGNAL(result(KJob*)), this, SLOT(onOperation1Done(KJob*)));
    job->start();
}

void onOperation1Done(KJob *operation1Job) {
    KJob *job = operation2(operation1Job->result());
    connect(job, SIGNAL(result(KJob*)), this, SLOT(onOperation2Done(KJob*)));
    job->start();
}

void onOperation2Done(KJob *operation2Job) {
    mResult = operation2Job->result();
    emitResult();
}
...

We are forced to split the code over several functions due to the inversion of control introduced by handler-based asynchronous programming. Unfortunately these additional functions (the handlers) that we are now forced to use don’t help the program structure in any way. This also manifests itself in the rather useless function names that typically follow a pattern such as on$OperationDone() or similar. Further, because the code is scattered over functions, values that are available on the stack in a synchronous function have to be stored explicitly as class members, so they are available in the handler where they are required for a further step.

The traditional way to make code easy to comprehend is to split it up into functions that are then called by a higher-level function. This kind of function composition is no longer possible with asynchronous programs using our current tools. All we can do is chain handler after handler. Due to the lack of this higher-level function that composes the functionality, a reader is also forced to read every single line of the code, instead of simply skimming the function names and only drilling deeper if more detailed information about the inner workings is required.
Since we are no longer able to structure the code in a useful way using functions, only classes, and in our case KJobs, are left to structure the code. However, creating subjobs is a lot of work when all you need is a function, and while it helps the structure, it scatters the code even more, making it potentially harder to read and understand. Due to this we also often end up with large and complex job classes.

Last but not least, we lose all available control structures through the inversion of control. If you write asynchronous code you don’t have the ifs, fors and whiles available that are fundamental to writing code. Well, obviously they are still there, but you can’t use them as usual because you can’t plug a complete asynchronous operation inside an if{} block. The best you can do is initiate the operation inside the imperative control structures and deal with the results later on in handlers. Because we need control structures to build useful programs, these are usually emulated by building complex state machines where each function depends on the current class state. A typical (anti)pattern of that kind is a for loop creating jobs, with a decreasing counter in the handler to check whether all jobs have been executed. These state machines greatly increase the complexity of the code, are highly error prone, and make larger classes incomprehensible without drawing complex state diagrams (or simply staring at the screen long enough while tearing your hair out).
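To make that counter antipattern concrete, here is a minimal sketch of how such an emulated loop typically ends up looking (all names, such as FetchAllJob, fetchItem and Item, are made up for this illustration):

class FetchAllJob : public KJob
{
    Q_OBJECT
public:
    void start() {
        //The for loop returns immediately, so the remaining work
        //has to be tracked in a class member.
        mPending = mItems.size();
        for (const Item &item : mItems) {
            KJob *job = fetchItem(item); //hypothetical subjob factory
            connect(job, SIGNAL(result(KJob*)), this, SLOT(onFetchDone(KJob*)));
            job->start();
        }
    }

private slots:
    void onFetchDone(KJob*) {
        //The emulated "end of the loop"
        if (--mPending == 0) {
            emitResult();
        }
    }

private:
    QList<Item> mItems;
    int mPending;
};

Every class that needs such a loop has to reinvent this bookkeeping, and the real state machines are usually considerably bigger than this.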

Oh, and before I forget, of course we also no longer get any useful backtraces from gdb as pretty much every backtrace comes straight from the eventloop and we have no clue what was happening before.

To summarize, inversion of control causes:

  • code is scattered over functions that are not helpful to the structure
  • composing functions is no longer possible, since what would normally be written in a function is written as a class.
  • control structures are not usable; a state machine is required to emulate them.
  • backtraces become mostly useless

As an analogy, your typical asynchronous class is the functional equivalent of a single synchronous function (often over 1000 LOC!) that uses gotos and some local variables to build its control structures. I think it’s obvious that this is a pretty bad way to write code, to say the least.

JobComposer

Fortunately we received a new tool with C++11: lambda functions.
Lambda functions allow us to write functions inline with minimal syntactical overhead.

Armed with this I set out to find a better way to write asynchronous code.

A first obvious solution is to simply write the result handler as a lambda function instead of a slot, which would allow us to write code like this:

make_async(operation1(), [] (KJob *job) {
    //Do something after operation1()
    make_async(operation2(job->result()), [] (KJob *job) {
        //Do something after operation2()
        ...
    });
});

It’s a simple and concise solution; however, you can’t really build reusable building blocks (like functions) with it. You’ll get one nested tree of lambdas that depend on each other by accessing the results of the previous jobs. What makes this solution non-composable is that the lambda function we pass to make_async starts the asynchronous task, but also extracts results from the previous job. Therefore you couldn’t, for instance, return an async task containing operation2 from a function (because in the same line we extract the result of the previous job).
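As an aside, make_async does not exist as such; a minimal sketch of what this helper could look like on top of KJob, assuming Qt 5’s connect-to-functor overload, is roughly this:

//Start the job and invoke the handler once the result signal is emitted.
template <typename Handler>
void make_async(KJob *job, Handler handler)
{
    QObject::connect(job, &KJob::result, [handler](KJob *finishedJob) {
        handler(finishedJob);
    });
    job->start();
}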

What we require instead is a way of chaining asynchronous operations together, while keeping the glue code separated from the reusable bits.

JobComposer is my proof of concept to help with this:

class JobComposer : public KJob
{
    Q_OBJECT
public:
    //KJob start function
    void start();

    //This adds a new continuation to the queue
    void add(const std::function<void(JobComposer&, KJob*)> &jobContinuation);

    //This starts the job, and connects to the result signal. Call from continuation.
    void run(KJob*);

    //This starts the job, and connects to the result signal. Call from continuation.
    //Additionally, an error-case continuation can be provided that is called in case of
    //error, and that can be used to determine whether further continuations should be
    //executed or not.
    void run(KJob*, const std::function<bool(JobComposer&, KJob*)> &errorHandler);

    //...
};
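The post only shows the interface, so purely for orientation, here is one way those methods could drive the continuation queue internally. This is a sketch, not the actual prototype: the members mContinuations, mCurrentStep and mJobStarted, the runNext() helper and the onJobDone() slot are assumptions, and the error-handler overload of run() is left out.

void JobComposer::add(const std::function<void(JobComposer&, KJob*)> &continuation)
{
    mContinuations.append(continuation); //assumed QList<std::function<...>> member
}

void JobComposer::start()
{
    runNext(nullptr); //there is no previous job for the first continuation
}

void JobComposer::run(KJob *job)
{
    mJobStarted = true;
    connect(job, SIGNAL(result(KJob*)), this, SLOT(onJobDone(KJob*)));
    job->start();
}

void JobComposer::onJobDone(KJob *job) //assumed private slot
{
    if (job->error()) {
        //A full implementation would consult the error handler passed to run()
        setError(job->error());
        setErrorText(job->errorText());
        emitResult();
        return;
    }
    runNext(job);
}

void JobComposer::runNext(KJob *previousJob)
{
    if (mCurrentStep >= mContinuations.size()) {
        emitResult();
        return;
    }
    const auto continuation = mContinuations.at(mCurrentStep++);
    mJobStarted = false;
    continuation(*this, previousJob);
    if (!mJobStarted) {
        //The continuation did not start another job (e.g. a final result
        //handler), so move on immediately.
        runNext(previousJob);
    }
}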

The basic idea is to wrap each step in a lambda function that issues the asynchronous operation. Each such continuation (the lambda function) receives a pointer to the previous job to extract results.

Here’s an example of how this could be used:

auto task = new JobComposer;
task->add([](JobComposer &t, KJob*){
    KJob *op1Job = operation1();
    t.run(op1Job, [](JobComposer &t, KJob *job) {
        kWarning() << "An error occurred: " << job->errorString()
    });
});
task->add([](JobComposer &t, KJob *job){
    KJob *op2Job = operation2(static_cast<Operation1*>(job)->result());
    t.run(op2Job, [](JobComposer &t, KJob *job) {
        kWarning() << "An error occurred: " << job->errorString()
    });
});
task->add([](JobComposer &t, KJob *job){
    kDebug() << "Result: " << static_cast<Operation2*>(job)->result();
});
task->start();

What you see here is the equivalent of:

int tmp = operation1();
int res = operation2(tmp);
kDebug() << res;

There are several important advantages of using this over writing traditional asynchronous code with KJob alone:

  • The code above, which would normally be spread over several functions, can be written within a single function.
  • Since we can write all code within a single function we can compose functions again. The JobComposer above could be returned from another function and integrated into another JobComposer.
  • Values that are required for a certain step can either be extracted from the previous job, or simply captured in the lambda functions (no more passing of values as members).
  • You only have to read the start() function of a job that is written this way to get an idea of what is going on, not the complete class.
  • A “backtrace” functionality could be built into JobComposer that would allow us to get useful information about the state of the program even though we’re in the eventloop (a possible sketch follows this list).
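To illustrate that last point, a hedged sketch of what such a backtrace could look like, reusing the assumed members from the sketch above (the description argument and backtrace() do not exist in the prototype; they are made up here):

//Each continuation gets a human-readable description when it is added.
void JobComposer::add(const QString &description,
                      const std::function<void(JobComposer&, KJob*)> &continuation)
{
    mDescriptions.append(description); //assumed QStringList member
    mContinuations.append(continuation);
}

//Dump the steps that have been executed so far; calling this from a
//breakpoint inside the eventloop tells us how we got into the current state.
QStringList JobComposer::backtrace() const
{
    return mDescriptions.mid(0, mCurrentStep);
}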

This is of course only a rough prototype, and I’m sure we can craft something better. But at least in my experiments it proved to work very nicely.
What I think would be useful as well are a couple of helper jobs that replace the missing control structures, such as a ForeachJob which triggers a continuation for each result, or a job to execute tasks in parallel (instead of serially, as JobComposer does).
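A sketch of what such a parallel helper could look like (the ParallelCompositeJob used in the showcase below may well differ in detail); the point is that the counter bookkeeping from the antipattern above is written once, in a reusable job, instead of being repeated in every class:

class ParallelCompositeJob : public KJob
{
    Q_OBJECT
public:
    void addSubjob(KJob *job) {
        mJobs.append(job);
    }

    void start() {
        mPending = mJobs.size();
        if (mPending == 0) {
            emitResult();
            return;
        }
        for (KJob *job : mJobs) {
            connect(job, SIGNAL(result(KJob*)), this, SLOT(onSubjobDone(KJob*)));
            job->start();
        }
    }

private slots:
    void onSubjobDone(KJob *job) {
        if (job->error() && !error()) {
            //Report the first error, but still wait for the remaining jobs
            setError(job->error());
            setErrorText(job->errorText());
        }
        if (--mPending == 0) {
            emitResult();
        }
    }

private:
    QList<KJob*> mJobs;
    int mPending;
};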

As a little showcase I rewrote a job of the imap resource.
You’ll see a bit of function composition, a ParallelCompositeJob that executes jobs in parallel, and you’ll notice that only relevant functions are left and all class members are gone. I find the result a lot better than the original, and the refactoring was trivial and quick.

I’m quite certain that if we build these tools, we can vastly improve our asynchronous code making it easier to write, read, and debug.
And I think it’s past time that we build proper tools.


Author: cmollekopf

Christian Mollekopf is an open source software enthusiast with a special interest in personal organization tools. He started to contribute actively to KDE in 2008 and currently works for Kolab Systems leading the development for the next generation desktop client.

7 thoughts on “Putting the code where it belongs”

    1. Threads are a much clumsier solution to certain problems than jobs driven by an event loop. In my experience, I see mostly bad software spawn a thread for every little task.

    2. I don’t think so. Threads are a useful tool for many problems, but by no means a silver bullet for code that has to wait on I/O. They come with a certain performance overhead, but most importantly they come at the cost of complexity. While it of course helps if you can write synchronous code in a thread, that only works if the task is fairly isolated, and as soon as you have synchronisation points between threads things quickly get complex. The only thing that is better with threads is that we have some more tools available such as QFuture and the QtConcurrent framework.

      Writing asynchronous code with continuations is also not an entirely new thing, it’s just not very popular yet in the Qt world AFAIK.

  1. Very nice!

    I wonder if we might want to model the new agent/resource base interface along a concept like this.
    E.g. have the base class create a JobComposer that holds the operation’s context (e.g. the item for “itemAdded”) and then the subclass would add its continuations as a kind of “functional payload”.

    1. Thanks =)

      I’m not yet sure what the ideal resource interface would look like, but what I could see working is that an implementation would simply provide jobs for the various operations. The context could be made available by passing a “context object” (containing all the necessary values) to those jobs.

      The jobs themselves could then be implemented using JobComposer.

      This way the scheduler would simply create and execute jobs, and not rely on some taskDone() method hopefully being called eventually.
      Implementing a resource would thus only require implementing a bunch of jobs.

  2. After a cursory 1st read and without being an expert on either technology my 1st reaction was ‘or you migrate to ObjC++ already’ 😉
