Progress and preservation: Specification

Wednesday, September 1, 2010

Specification safari: it's not fair

Lock-freedom is great: it guarantees that a system will make progress regardless of the vagaries of scheduling. The delay of one thread -- due to a cache miss, or I/O, or preemption -- doesn't force a delay on other threads. In practice, lock-free algorithms permit a greater degree of concurrency, and by Amdahl's law that translates to increased parallel speedup.

But what if we do know something about the scheduler? Most commonly, we know that the scheduler is fair. Fairness and lock-freedom seem orthogonal, since lock-freedom assumes nothing about the scheduler. So I was startled to realize that, in practice, the two properties do interact.

Most well-known lock-free algorithms do not preserve fairness! Concretely, if a thread using a lock-free queue calls enqueue, that call may never return, despite enqueue being a "nonblocking" operation. What gives?

The problem is that other threads in the system may be repeatedly competing to enqueue, a race-condition made safe by the fact that they're all using CAS to do this. But CAS does not guarantee fairness, so a thread can repeatedly lose the race. The scheduler has fulfilled its obligations, since the thread is continually given chances to try CAS. At one level of abstraction (the queue implementation), the thread is not being starved. But at a higher level of abstraction (the queue client), it is.

Monday, August 30, 2010

Specification safari: SOS and liveness

I've been thinking about specifications lately -- particularly for reactive systems, where you're more interested in ongoing interaction than a final result. There are a ton of competing approaches. But thanks to Lamport, we can at least separate safety requirements ("nothing bad happens") from liveness requirements ("something good keeps happening").

For my purposes, operational semantics is attractive for specifying safety: it's reasonable to think about what I'm doing in terms of abstract states and transitions, and the semantics nicely characterizes the permitted transitions.

But as it turns out, operational semantics specifies more than just safety properties. There's an implicit requirement that if the abstract machine can take a transition it must do so. You can see this if you try to design a machine that, from one state, nondeterministically (1) halts or (2) transitions to a different, final state. Try it: you can't do it! You have to add an auxiliary stuck state to signify halting.

This might seem like an obvious or trivial point. But it gets interesting once you realize that each transition in the operational semantics (the abstract machine) is likely to be realized by a large number of steps in the implementation (the concrete machine). In my case, the implementation of a single abstract step is given by a loop whose termination depends on progress of the system as a whole -- hardly trivial!

Operational semantics isn't saying everything I need to say, but it's coming a lot closer than I expected.

Progress and preservation

Wednesday, September 1, 2010

Specification safari: it's not fair

Monday, August 30, 2010

Specification safari: SOS and liveness

Blog Archive

About Me