by Michael B. Jones, Daniela Rosu, and Marcel-Catalin
summarized by Adam M. Costello
Provide guaranteed CPU resource allocation independently to multiple real-time applications running concurrently.
Rialto, an OS for set-top boxes doing speech input, text-to-speech, audio/video in/out, GUI.
Activity = the entity to which guarantees are granted and to which CPU time is charged. Each thread runs on behalf of one activity. An application may have more than one activity. An activity is not confined to one address space, e.g. an RPC server thread's time is charged to the caller's activity.
CPU reservation = a guarantee of the form “activity A will get the CPU for X time out of every interval of length Y, if it's ready to run”. The application gets to choose X & Y (though they get scaled down within a factor of 2). Reservation requests can be denied if there is not enough time in the schedule.
Time constraint = a guarantee of the form “thread T will get the CPU for at least X time between start time S and deadline D”. The start time need not be the time the request is issued; it can be in the future. Constraints may be requested independently of reservations, and are granted or denied immediately.
Guarantees, once granted, will always be honored, and are unaffected by the behavior of other activities. Unreserved time and time reserved but unused is shared fairly among all runnable activities, both real-time and non-real-time.
Providing both reservations and constraints independently. Priority scheduling provides no guarantees except at the highest priority level. In other previous systems constraints were either absent or were dependent on sufficient reservations, whereas in Rialto they can be satisfied with unreserved time.
There is a precomputed repeating schedule, so the dispatching code is O(1). The schedule must be modified or rebuilt whenever a reservation or constraint is added or removed. Constraint feasibility can be calculated easily because the schedule is completely laid out in advance.
The schedule is represented as a tree, each node labeled with a duration and an activity. Every path from root to leaf has the same total duration, which is the minimum period (Y value) for a reservation. The dispatcher repeatedly follows a path from root to leaf, alternating which child is selected at a particular node on successive traversals through that node. So a node that has d branches in its ancestry is visited only once out of every 2^d periods.
Constraint obligations are represented by lists of assignments, one per node. An assignment assigns the CPU to a particular constraint during a particular absolute interval covered by that node.
Constructing the tree is non-trivial and uses heuristics.
Threads blocked on other threads donate their time, just like priority inheritance, but only when the blocked thread is trying to meet a constraint deadline or its activity has no other runnable threads.
Time constraints can be nested, but this is not explained well.
Reservation and constraint requests take time proportional to the number of outstanding reservations and constraints.
Without constraints, response time to randomly timed events increases linearly with the number of competing spinning threads, but remains constant if they use constraints. [But threads shouldn't spin!]
A 15 fps animation (one frame every 67 ms) with a background load suffered severe frame droppage for Y > 10 ms. I don't understand why.
Multiple concurrent animations had less frame droppage under Rialto than under a round-robin scheduler, but the quantum wasn't given for the latter, so I can't interpret this result.
The authors “consider CPU reservation to be necessary for real-time applications.” Their scheduler is very nifty, and provides impressively strong guarantees. But...
How useful are those guarantees? Enough to be worth the complexity, both in the OS and in the altered application programming style? If you have enough CPU cycles for what you're doing, a simple priority scheduler with a small time quantum should work, shouldn't it? And if you don't have enough cycles, something is going to suck no matter what. The user can see the suckage and kill something, or applications could detect the suckage and back off.