Bounded Workloads

Pace yourself - do less work, run slower, and have happy users!

When we have some complex piece of software with multiple subsystems or components, we'll sometimes find that we can't just let them all run as hard as they can. For example, we may want to keep the CPU usage very light on a device so it can regularly enter low power mode.

For today's post, I'm going to use a game loop as an example. Feel free to apply this to your own problem domain. We'll assume we have a few subsystems - input management, physics computation, rendering - as well as a subsystem coordinator that calls all of these from within the game loop.

If you don't know what a game loop is, Wikipedia has you covered.

Basic Strategies

There are many ways in which you might want to put bounds on how much work you're going to accomplish. Here are some basic strategies to consider.

Do one thing. Do one quantum of work, then leave the rest for later. The code that allocates memory on the GPU and uploads data might choose to do so just once every frame. That leaves plenty of resources for other work.
Do a few things. Similar to the above, but you might find through experience that some constant number is OK. For example, process up to three new meshes per frame.
Do as many things as you can within a deadline. For example, do as much work as possible within 2 milliseconds. This typically means that you need to keep track of the work incurred, as well as allow yourself some margin to overshoot (or choose to end early).
Do as many things as you can within a resource budget. This is a variation of the item above, but the resource is something other than wall-clock time. For example, your pre-fetcher might read as much as it needs to or 1MB, whichever is smaller.
Do as many things as you can within a problem-domain budget. For example, in something like an AR system, update state starting from the device origin and going out no more than 2 meters further than the last budget threshold.
Do only the high-priority work. For example, do only work that is clean-up and returns resources to the rest of the system.
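
To make the "do a few things" and "within a deadline" strategies concrete, here's a minimal C++ sketch. The names WorkItem, run_up_to, and run_within are made up for illustration, and the sketch assumes each queued item is one small quantum of work. The deadline version stops starting new items once the budget minus a margin has elapsed, so the last item it starts can still overshoot a little.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical types for illustration: one WorkItem is one quantum of work.
using WorkItem = std::function<void()>;
using Clock = std::chrono::steady_clock;

// "Do a few things": run at most max_items items and leave the rest queued.
// Returns how many items were run.
std::size_t run_up_to(std::deque<WorkItem>& queue, std::size_t max_items) {
    std::size_t done = 0;
    while (done < max_items && !queue.empty()) {
        WorkItem item = std::move(queue.front());
        queue.pop_front();
        item();
        ++done;
    }
    return done;
}

// "Do as many things as you can within a deadline": keep starting items
// until (budget - margin) has elapsed. The margin leaves room for the
// last item started to finish without blowing the budget.
std::size_t run_within(std::deque<WorkItem>& queue,
                       std::chrono::microseconds budget,
                       std::chrono::microseconds margin) {
    assert(margin <= budget);
    const auto stop_at = Clock::now() + (budget - margin);
    std::size_t done = 0;
    while (!queue.empty() && Clock::now() < stop_at) {
        WorkItem item = std::move(queue.front());
        queue.pop_front();
        item();
        ++done;
    }
    return done;
}
```

A mesh uploader might call run_up_to(queue, 3) once per frame, while a background updater might call run_within with a 2 millisecond budget and a margin sized from how long a typical item takes.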

In many of the strategies above, you might need a way to prioritize what work gets done. I won't get too much into that, as it tends to be quite domain-specific. The last item is a special case, where all the work that is considered high priority gets done as soon as possible.
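
Prioritization really is domain-specific, so take the following only as one possible shape. It's a sketch that combines a priority cutoff with a resource budget: jobs come out highest priority first, and draining stops at the first job that falls below the cutoff or would exceed the byte budget. Job, JobQueue, and drain are hypothetical names, and the cost of each job is assumed to be known up front.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical job record: a priority and a known cost in bytes.
struct Job {
    int priority;
    std::size_t cost_bytes;
};

// Orders the queue so the highest-priority job is on top.
struct ByPriority {
    bool operator()(const Job& a, const Job& b) const {
        return a.priority < b.priority;
    }
};

using JobQueue = std::priority_queue<Job, std::vector<Job>, ByPriority>;

// Takes jobs in priority order until the next job is below min_priority
// or does not fit in what is left of budget_bytes.
// Returns the number of bytes spent; unprocessed jobs stay in the queue.
std::size_t drain(JobQueue& queue, int min_priority, std::size_t budget_bytes) {
    std::size_t spent = 0;
    while (!queue.empty()) {
        const Job& next = queue.top();
        if (next.priority < min_priority) break;            // only high-priority work
        if (next.cost_bytes > budget_bytes - spent) break;  // would exceed the budget
        spent += next.cost_bytes;  // real code would perform the job here
        assert(spent <= budget_bytes);
        queue.pop();
    }
    return spent;
}
```

Passing a very high min_priority gives the "do only the high-priority work" behavior, and passing a very low one turns this into a plain resource budget.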

Compound Strategies

If you have multiple kinds of things you want to do, there are also multiple ways to combine the above.

Everyone gets their own strategy. For example, you have three subsystems, and they all get a turn, but they all have their own way of working within specific bounds.
Everyone shares one strategy. For example, you have three subsystems, and they all get to do a single quantum of progress, or they all get some specific time bound.
Everyone gets their own strategy, but also there's some global strategy that's being applied. For example, everyone gets to do high-priority work, then any other work they want within their own constraints, and the coordinator makes sure that some time bound isn't exceeded as it passes control around.
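
The last combination can be sketched roughly as follows. This assumes a hypothetical Subsystem interface with two callbacks and a coordinator function called run_frame; none of these names come from a real engine. Every subsystem runs its high-priority work unconditionally, then each gets a turn at optional work only while the frame's time bound hasn't been reached, and is handed the deadline so it can apply its own strategy inside it.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical subsystem interface for illustration.
struct Subsystem {
    // Work that must happen every frame (for example, releasing resources).
    std::function<void()> high_priority;
    // Optional work; the subsystem is expected to stop by the given deadline,
    // using whatever bounding strategy it likes internally.
    std::function<void(Clock::time_point)> optional;
};

// Runs one frame under a global time bound.
// Returns how many subsystems were given a turn at optional work.
std::size_t run_frame(std::vector<Subsystem>& subsystems,
                      std::chrono::microseconds frame_budget) {
    assert(frame_budget.count() >= 0);
    const auto deadline = Clock::now() + frame_budget;

    // Pass 1: everyone gets to do high-priority work, regardless of time.
    for (auto& s : subsystems) s.high_priority();

    // Pass 2: hand out turns until the global bound is reached.
    std::size_t turns = 0;
    for (auto& s : subsystems) {
        if (Clock::now() >= deadline) break;
        s.optional(deadline);
        ++turns;
    }
    return turns;
}
```

One design choice worth noting: subsystems late in the list can starve if earlier ones always use up the frame, so a real coordinator might rotate the starting index each frame.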

Taking a Nap

How do we avoid doing work? That may seem obvious, but there are multiple ways of going about it, and tradeoffs to consider between them.

Let someone else handle it. For example, in our game subsystem example, we might return and let the subsystem coordinator choose what to do next. This typically works best when the subsystem is running on the coordinator's thread.
Let yourself be scheduled later. On Windows, you can call Sleep(0) to give up the rest of your quantum.
Let yourself be scheduled later, redux. Sleep(0) puts the thread on the pending queue, but if you're the one with the highest priority, you'll still get immediately rescheduled. You should try for Sleep(1) or greater if you want to let lower-priority threads have a quick go.
Sleep for a little bit. This is a longer, proper waiting period, for example a half-second or so to retry something. These tend not to be too great, because you can't easily interrupt or inspect state without a whole bunch of work.
Set up a timer. Here you set up a timer to get called back, for example to retry an HTTP operation after a variable number of seconds. You'd typically let control back to a dispatching queue or have a threadpool help you out.
Set up a delayed continuation. This is a more generic version of the above. While setting up timers is a fine strategy, you need to consider cancelation (in-flight and before the thing fires), error reporting, keeping state alive, and returning to the call context if it will fire later on. A proper continuation system (parallel tasks, System.Threading.Tasks.Task, IAsyncOperation) can help with this so you don't have as much one-off work to do.
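
A plain sleep is hard to interrupt, and cancelation is one of the things a timer or continuation has to handle. As a small illustration of the gap, here's a sketch of a delay that can be cut short, built on the standard library's condition variable rather than on any Windows API. CancellableDelay is a hypothetical helper, not something from a real framework.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: a delay that another thread can cancel.
class CancellableDelay {
public:
    // Blocks for up to `duration`.
    // Returns true if the full delay elapsed, false if it was cancelled.
    bool wait(std::chrono::milliseconds duration) {
        assert(duration.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns the predicate's value: true means cancelled.
        const bool cancelled =
            cv_.wait_for(lock, duration, [this] { return cancelled_; });
        return !cancelled;
    }

    // Wakes any waiter early. Later calls to wait() return false at once.
    void cancel() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            cancelled_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool cancelled_ = false;
};
```

A retry loop could call wait(std::chrono::milliseconds(500)) between attempts and stop as soon as it returns false, which covers the "before the thing fires" half of cancelation. In-flight cancelation, error reporting, and state lifetime are still on you, which is where a proper continuation system earns its keep.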

As a general comment, Sleep is a terrible API for controlling who gets to run. I've only ever used it in high-performance code where I have no other synchronization and might be waiting for, say, a concurrent memcpy and atomic update to finish - a kind of spinlock helper, so to speak.
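
For the curious, that kind of helper might look something like the sketch below. This is not the code I described above, just an illustration under assumptions: it uses the portable std::this_thread::yield as a stand-in for Sleep, the name spin_until and its parameters are made up, and the writer is assumed to publish its data and then set the flag with release ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical bounded spin-wait: polls `ready` up to max_polls times.
// The first spins_before_yield polls are a tight spin; after that the
// thread yields between polls so others get a chance to run.
// Returns true if the flag was seen set, false if we gave up.
bool spin_until(const std::atomic<bool>& ready,
                std::size_t max_polls,
                std::size_t spins_before_yield) {
    assert(spins_before_yield <= max_polls);
    for (std::size_t poll = 0; poll < max_polls; ++poll) {
        // Acquire pairs with the writer's release store, so data written
        // before the flag was set is visible once we see it.
        if (ready.load(std::memory_order_acquire)) return true;
        if (poll >= spins_before_yield) std::this_thread::yield();
    }
    return ready.load(std::memory_order_acquire);
}
```

Putting a bound on the polling keeps with the theme of this post: if the flag doesn't show up in time, the caller gets control back and can decide what to do instead of spinning forever.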

Another interesting point is that whenever you delay work, whether through a continuation or a timer, you'll often find that properly reporting errors in these code paths is tricky (for example, a failure to allocate a timer). The Windows threadpool APIs are quite good at this, in that they let you allocate everything up front and then guarantee no allocation failures during execution.

Enjoy your little-at-a-time work!

Tags: design perf
