|
|
Cyber
Physical Computing
Department of Computer Science
The
|
|
|||||||
|
|
||||||||
For inquiries,
please see contact info.
Sponsor: NSF
Timing
properties of software (relating to its ability to meet timing constraints)
have traditionally been analyzed using real-time scheduling theory. This theory
is perhaps one of the most studied and well understood aspects of real-time
computing. However, the current state of the art is far from complete.
Most
prior research addressed small systems such as single processors or
multiprocessor machines. In contrast, future computing systems will be marked
by massive distribution. They will see increased convergence of computation,
communication, and sensing, and will augment a widely-distributed physical
environment, interacting with it under constraints of real time and space.
Despite the very large volume of real-time computing literature that presents different
analysis techniques for computing the ability of systems to meet time
constraints (called schedulability
analysis in real-time computing terminology), there is a lack of a
fundamental theory that allows systematic analysis of distributed real-time systems by appropriately composing properties
of their components. Consider, by contrast, the fundamental laws of circuit
theory used to analyze linear electric circuits. In that theory, a small number
of fundamental rules (e.g., Kirchhoff laws) allow a designer to analyze complex
circuits of arbitrary interconnection topology, reducing them to their
effective transfer functions and deducing their end-to-end characteristics,
such as total impedance, current draw, and voltage drop. Feasible region
calculus aims to develop a similar theory for distributed real-time computing
based on very encouraging recent results from our group, which suggest rules
for composition of temporal behavior of real-time system components.
Observe
that the capacity of a distributed system is a complicated concept because the
rate at which a distributed system can execute an application is a function of
not only the speed and topology of the system but also the application flow
graph. For example, an application with more internal parallelism will run
faster on a given system than one that is primarily serialized. Furthermore, in
a time-constrained system, tasks that finish late do not contribute useful
work. Hence, it is important to quantify only the fraction of work that can be done
on time. We call it the real-time
capacity of the system. A mathematical foundation is being developed for
composing real-time capacity expressions given application flow graphs, time
constraints and resource topology. This foundation will utilize a small set of
composition rules to arrive at the corresponding application-aware capacity
expressions. We shall also consider the limits when system components become
infinitesimal and the number of components needed for an application grows to
infinity. In such cases, it will be more appropriate to consider resource
density and liquid task flows rather than individual instances of resources and
tasks. Capacity expressions will be re-written in terms of density and flow
distributions. Such models will be suitable for future computing artifacts such
as smart paint and amorphous computing systems where individual computing
``particles'' become insignificant and computations may involve very large
numbers of components.
2. Utilization-Based Feasible Regions
One
of the big gaps in the theoretical underpinnings of real-time computing have
traditionally lied in the inability to relate aggregate metrics (such as
utilization) to per-task temporal performance metrics (such as response time of
individual tasks) except in some restricted special cases (such as that of
scheduling periodic tasks). The first major result of feasible region calculus
has been the derivation of regions, defined in the state space of resource
utilization, that imply satisfaction of all individual temporal performance
constraints. Utilization bounds for schedulability existed previously only for
periodic tasks. However, the periodicity requirement was a limit on
applicability. Moreover, no framework existed for composing the bounds of
individual components to compute those of aggregate systems. Composition is
complicated by the fact that in a multistage system a task can still meet its
end-to-end deadline despite overload on some resources as long as others are
underloaded. Hence, looking at resources in isolation may be misleading.
Instead, a surface exists in a space whose dimensions are the utilizations of
individual resources such that all utilization vectors below that surface
result in schedulable systems. The shape of that surface is a key research
problem addressed by feasible region calculus.
Observe
that most scheduling problems are NP-complete, which typically calls for
heuristic solutions. We demonstrate that computationally tractable sufficient
solutions to the general problem of temporal analysis of distributed resource
constrained communicating tasks are also possible if the feasible region is
appropriately defined. The following references constitute a brief tutorial on
feasible region calculus.
The
first basic result: a utilization bound for aperiodic tasks under
fixed-priority scheduling. This result drops the periodicity requirement that
prevented prior work from being applicable to a large category of systems with
no regularity in task arrival patterns.
Extensions
of the basic results to multiprocessors (these are derived mainly for the sake
of completeness and intellectual curiosity).
Extensions
that account for mutual exclusion (blocking). These allow modeling systems of
non-independent tasks, in which semaphores or other synchronization constraints
are used to regulate access to passive resources.
Extensions
to resource pipelines. These basic results allow composing feasible regions of
individual components to compute those of distributed systems.
An
application of feasible region calculus; computing the real-time capacity of a
sensor network.
2. Non-Utilization-Based Feasible Regions
The
scheduling policy has a dramatic effect on the ability of a system to meet
deadlines. Hence, feasible region calculus must allow composition under
different scheduling policies. An interesting and important consideration is
that often the scheduling policy is imposed on the application developer by legacy
factors such as operating system support. Hence, the scheduling policy is often
a constraint and not a design parameter. This is very different from the
typical direction in real-time computing that focuses on analyzing a handful of
``good'' policies with no general framework for incorporating arbitrary (often
suboptimal) scheduling policies into the analysis as design constraints. A
significant contribution of out work therefore lies in its ability to analyze
arbitrary scheduling policies within the same general framework for quantifying
real-time capacity. To the best of our knowledge, ours is the first such
general framework in real-time computing literature.
The
incorporation of arbitrary scheduling policies into the capacity analysis has
very important ramifications on the analytic engine used. Previous efforts have
attempted to relate latency (or ability to meet deadlines) to system state
(typically, resource utilization) in distributed systems. The implicit
assumption has been that utilization is a good predictor of latency or the
ability of a system to meet deadlines. Unfortunately, we have shown that,
depending on the resource scheduling policy, utilization may or may not be a
good predictor of ability to meet latency constraints. Instead, we developed an
automated method for choosing, for any scheduling policy, a family of load
metrics (not necessarily utilization) to represent system state such that it
becomes possible to predict (based on any metric in that family) whether or not
deadlines are met. Specifically, if the load, as measured by the new metrics,
is within a certain range, it is assured that all distributed tasks finish on
time. As before, composition rules are defined to determine feasible regions
defined on non-utilization-based load metrics.
·
Xue Liu and Tarek Abdelzaher,
"Non-utilization Bounds: On Bounding New Load Metrics for Schedulability
Analysis," invited to ACM Transactions on Embedded Systems, 2006.
3.
Sub-additive Delay Composition
The
above line of research focused on extensions to the derivation of utilization bounds
or utilization-like bounds for schedulability. These extensions addressed
aperiodic tasks, new load metrics, and pipelines. The idea was to quantify
feasible regions defined in the space of resource utilization such that
end-to-end deadlines are met as long as the system operated within the feasible
regions derived.
Feasible
regions, like utilization bounds, however, are high-level derived constructs. A
more basic and fundamental question is to ask: how does delay compose in
distributed systems? Uniprocessor schedulability results are “easy”
because delays on a single processor compose by addition. If task A is
preempted by task B and task C, then it is preempted by the sum of B’s
and C’s (worst case) computation times. This additive property is the
foundation for schedulability analysis expressions for uniprocessors. The
problem with distributed systems is that delays do not necessarily compose
additively. For example, if task A is preempted by B and C on processors X and
Y, where X and Y form a pipeline, then task A is not necessarily preempted by
the sum of B’s and C’s (worst case) computation times on X and Y. This
is due to the inherence concurrency in distributed systems. If A waits for B
and C on processor X, then while A finally executes on X, B and C might execute
on Y. Hence, A does not wait for all of B and C on Y. The delay composition is
therefore sub-additive. The interesting question is, can we describe how delays
compose in distributed systems? The following papers derive the basic delay
composition theorems for resource pipelines and directed acyclic graphs:
The
above papers lead to the development of a delay composition algebra. It enables
a new approach to schedulability analysis in distributed systems: namely, a
reduction-based approach. The
algebra reduces distributed system workload to an equivalent uniprocessor
workload that can be analyzed using uniprocessor schedulability analysis
techniques to infer end-to-end delay and schedulability properties of each of
the original distributed jobs.
The
algebraic operators in the delay composition algebra perform reductions on a
resource graph (representing a distributed system) into a smaller graph by
merging nodes. Each node represents a resource in the system and is
characterized by a load matrix that summarizes its workload. When nodes are
merged in the original system, their workloads are merged as well into an
equivalent workload for the resulting node. The main function of the algebraic
operators of delay composition algebra is to perform workload reductions as
nodes merge. The reductions stop when only one node remains, meaning that the
distributed system workload is fully reduced to that of an equivalent
uniprocessor. Analyzing the workload on the resulting uniprocessor using any of
the standard uniprocessor schedulability analysis techniques determines the
schedulability of the original distributed task set.
A
key consideration in developing operators of the new algebra is that the
operand set must be closed under composition. In other words, both the operands
and the result of each operator must have the same structure. As mentioned
above, the operands refer to the load matrices of the nodes being merged. The
result is the load matrix of the resulting merged node. Since merged nodes that
result from composition represent more than one physical node, the workload
representation for a node should be the same whether this node represents a
single resource or the result of merging an entire subsystem. A key
contribution of this project has been the attainment of such a general representation
(the load matrix of a node) and the derivation of simple algebraic operators
that manipulate this representation to reduce distributed systems workload into
a single-node workload that is equivalent from a schedulability perspective.
Observe
that if the operand matrices include variables (e.g., task computation times
that are yet unspecified), applying our algebraic operators to reduce the
distributed system to a single node yields equivalent uniprocessor load
expressions that are a function of those variables. Furthermore, applying
schedulability tests (e.g., uniprocessor utilization bounds) to the equivalent
load yields schedulability expressions that are a direct function of the
original load variables of the distributed system. This makes it possible to quantify the
distributed system parameter space for which schedulability is attained.
In
summary, existing techniques for analyzing delay and schedulability of jobs in
distributed systems can be broadly classified into two categories: (i)
decomposition-based, and (ii) extension-based. The decomposition-based
techniques break the system into multiple subsystems, analyze each subsystem
independently using current uniprocessor analysis techniques, then combine the
results. The extension-based techniques explore ways to extend current
uniprocessor analyses to accommodate distributed tasks and resources. In
contrast, we developed a third category of techniques for analyzing distributed
systems based on reduction (as opposed to decomposition or extension). Rather
than breaking up the problem into subproblems, or extending uniprocessor
analyses to more complex systems, we systematically reduce the distributed
system schedulability problem to a single simple problem on a uniprocessor. The
uniprocessor problem is then solved thereby determining the schedulability of
the original distributed system. The following paper describes the delay
composition algebra: