Assignment 1 ECE 633
OPERATING SYSTEM
ECE633
Prepared For
While many different devices are configured to allow multitasking, there is still an internal process that
creates a hierarchy of functions. In order for certain functions to take place, other functions must occur
beforehand. While all of the functions may appear to the end user to be taking place at the
same time, this is not necessarily the case.
A race condition is created when two or more operations are vying with each other to reach completion
ahead of the other operations. When all the individual functions are properly arranged, this leads to the
successful execution of all the functions in a timely manner. However, if the sequence of operations is
thrown out of balance, this creates a bottleneck. In the worst-case scenario, the race condition will make
it impossible for the system to continue in its attempt to process all the functions in the order currently
engaged. Because the system may need to process the fifth function in the string before the first and
second functions can be completed, the entire string must be aborted and re-established in the proper
order.
*Multitasking:
Multitasking is the act of doing multiple things at once. It is often encouraged among office workers and students,
because it is believed that multitasking is more efficient than focusing on a single task at a time. Numerous studies on
multitasking have been carried out, with mixed results. It would appear that in some cases, multitasking is indeed an
effective way to utilize time, while in other instances, the quality of the work suffers as a result of split attention.
Example
If a system receives commands to read existing data while writing new data, this can lead to a conflict that
causes the system to shut down in some manner. The system may display some type of error message if
the amount of data being processed places an undue strain on available resources, or the system may
simply shut down. When this happens, it is usually a good idea to reboot the system and begin the
sequence again. If the amount of data being processed is considerable, it may be better to allow the
assimilation of the new data to be completed before attempting to read any of the currently stored data.
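As a minimal sketch of such a read-while-write conflict (the shared record, its invariant, and all names are illustrative, not taken from any particular system), the following C program lets one thread update a two-field record while another thread reads it without any synchronization; depending on timing, the reader can observe a half-written value.

    /* A minimal sketch of a read-while-write conflict, assuming POSIX threads.
     * The record and its invariant (x == y) are purely illustrative.
     * Build with: cc -pthread race.c */
    #include <pthread.h>
    #include <stdio.h>

    struct record { int x; int y; };            /* writer keeps x == y            */
    static struct record shared = {0, 0};       /* no lock protects this data     */

    static void *writer(void *arg)
    {
        (void)arg;
        for (int i = 1; i <= 1000000; i++) {
            shared.x = i;                        /* the update is not atomic ...   */
            shared.y = i;                        /* ... so a reader can see x != y */
        }
        return NULL;
    }

    static void *reader(void *arg)
    {
        long torn = 0;
        (void)arg;
        for (int i = 0; i < 1000000; i++) {
            struct record snap = shared;         /* unsynchronized read            */
            if (snap.x != snap.y)
                torn++;                          /* observed a half-written record */
        }
        printf("reader saw %ld inconsistent snapshots\n", torn);
        return NULL;
    }

    int main(void)
    {
        pthread_t w, r;
        pthread_create(&w, NULL, writer, NULL);
        pthread_create(&r, NULL, reader, NULL);
        pthread_join(w, NULL);
        pthread_join(r, NULL);
        return 0;
    }

Guarding both the update and the read with the same pthread_mutex_t, or deferring the reads until the writer has finished, removes the conflict.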
Preventing Race Conditions
1. Dimensions of Comparison
For the purposes of this survey, the different techniques examined will be compared
across three primary dimensions: ease of use (which includes annotation burden,
expressiveness, and scalability), soundness, and precision.
By ease of use we mean, broadly, how easy or difficult it would be to integrate
a given tool or technique into a development process. When evaluating, we would
like to know what burden is associated with the annotations necessary to make
a technique work. We are also interested in whether or not a given tool or technique
restricts the programmer’s ability to use the idioms and coding styles with which
he or she is familiar (we are calling this property expressiveness). Finally, research
tools are often known for not scaling well to large, realistic systems. Therefore, we
will examine scalability as well.
When discussing soundness, we are discussing the level of assurance provided
by a particular technique. If a tool is sound, then that tells us as programmers that
if the tool signals that there are no race conditions then their absence is in fact
guaranteed. We also might say that a tool is mostly sound if there are only certain,
unimportant situations in which false negatives may occur. Precision is related. If
a technique is perfectly precise, it is said to be ‘complete.’ A complete tool will
signal only on actual race conditions and never give us false positives. If a tool is not
complete, then the fewer false positives it gives, the more precise we
will claim the tool to be. It is important to note that these dimensions are not all
entirely independent from one another. For example, a tool with a large annotation
burden might also not be scalable for that exact same reason.
After an in-depth look at the papers and techniques surveyed, we will return to
these categories to see how each style of analysis compares.
2. Race-Free Type Systems
There are techniques besides type systems for achieving the same purpose, such as runtime
detection and compulsory compiler analysis. We might call these language-based
techniques purely because a language standard required them. However, these sorts
of techniques can essentially be discussed separately from the language, whereas a
language’s type system is a fundamental part of the language itself. In some ways,
adding race-detection, or more accurately, race prevention to the type system seems
natural. We have gradually seen more and more bugs that plagued programmers
of earlier generations be pushed off to the side thanks to type theory. For examples of this phenomenon
we can look to option types, which eliminate null dereferences, and type-safe languages, which prevent
many classes of bugs associated
with pointer arithmetic.
In these systems, an object such as a node in a linked data structure can be associated with two different locks: one lock that protects the object itself and another
lock that protects the contents of that particular node. This form of
lock parameterization is a vast improvement in expressiveness over previous type
systems in that it allows different objects of the same class to be protected by different locks.
Furthermore, it even allows different fields of the same object to be
protected by different locks. When a field is guarded by a lock, the type system knows that it needs to
verify all accesses of that field are protected by that lock
parameter. A method such as pop, for example, may rely on its caller to acquire the appropriate
lock. This is signified by a requires clause, which allows
the type checker to assume that the specified lock has already been acquired when
type checking the body of the method. Elsewhere, at each call site for this method,
the type checker will know to verify that the correct lock has been acquired. (This
is the previously mentioned effects clause.) On the downside, it seems somewhat
unnecessary to specify both the lock that protects a field and, when that field is accessed inside a
method, whether that method requires the caller to hold the lock in
advance. It would seem that once we knew which lock was protecting the field we
could simply search the code to make sure that it follows this protocol. This is true, but
it would require whole-program analysis and would prevent separate compilation.
Therefore, most type-based systems require both annotations.
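The surveyed systems target Java (and, for Cyclone, a dialect of C), but the discipline they check can be sketched in plain C with POSIX threads. The comments below stand in for the annotations the type checker would actually consume; every name is illustrative.

    /* A hedged sketch of the discipline a race-free type system enforces.
     * The "guarded by" and "requires" comments stand in for annotations. */
    #include <pthread.h>
    #include <stdlib.h>

    struct stack {
        pthread_mutex_t lock;       /* the lock parameter for this object */
        int *items;                 /* guarded by: lock                   */
        size_t count;               /* guarded by: lock                   */
    };

    /* requires: caller already holds s->lock.
     * The checker assumes the lock is held while checking this body and
     * verifies that assumption at every call site. */
    static int stack_pop_locked(struct stack *s)
    {
        return s->items[--s->count];     /* guarded fields: legal only under lock */
    }

    int stack_pop(struct stack *s)
    {
        pthread_mutex_lock(&s->lock);    /* call site satisfies the requires clause */
        int v = stack_pop_locked(s);
        pthread_mutex_unlock(&s->lock);
        return v;
    }

    int main(void)
    {
        int storage[2] = {10, 20};
        struct stack s = { PTHREAD_MUTEX_INITIALIZER, storage, 2 };
        return stack_pop(&s) == 20 ? 0 : 1;
    }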
Upon entering a method whose effects clause states which locks must be held when
that method is called, all three techniques [8] [2] [14] follow the same
basic algorithm (a toy sketch in code follows the list):
1. Add all locks listed in the effects clause to the current set of held locks and
begin to step through method statements.
2. When a locking statement is encountered (e.g., synchronized (this) {...}), add
that particular lock to the set of held locks.
3. When encountering a variable dereference, look up that variable's type (which
contains the lock that must be used to protect it) and verify that that lock exists
in the set of held locks. (If not, signal an error.)
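Viewed as code, the algorithm is simple set bookkeeping. The toy checker below is only a sketch: real checkers walk typed abstract syntax trees, and every structure and name here is invented for illustration.

    /* A toy rendering of the lockset-based checking algorithm above.
     * A "method body" is just an array of statements; 'held' is a bitmask
     * of lock ids; 'effects' models the method's requires clause. */
    #include <stdio.h>

    enum kind { ACQUIRE, RELEASE, ACCESS };

    struct stmt {
        enum kind kind;
        int lock;   /* lock acquired/released, or the lock guarding the field */
    };

    static int check_method(const struct stmt *body, int n, unsigned effects)
    {
        unsigned held = effects;                            /* step 1 */
        for (int i = 0; i < n; i++) {
            switch (body[i].kind) {
            case ACQUIRE:                                   /* step 2 */
                held |= 1u << body[i].lock;
                break;
            case RELEASE:
                held &= ~(1u << body[i].lock);
                break;
            case ACCESS:                                    /* step 3 */
                if (!(held & (1u << body[i].lock))) {
                    printf("possible race: field guarded by lock %d "
                           "accessed without it\n", body[i].lock);
                    return 0;
                }
                break;
            }
        }
        return 1;
    }

    int main(void)
    {
        struct stmt ok[]  = { {ACQUIRE, 0}, {ACCESS, 0}, {RELEASE, 0} };
        struct stmt bad[] = { {ACCESS, 1} };                /* lock 1 never held */
        printf("well-locked body: %s\n", check_method(ok, 3, 0) ? "passes" : "fails");
        printf("unlocked access:  %s\n", check_method(bad, 1, 0) ? "passes" : "fails");
        return 0;
    }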
While the specific works examined here do not mention exactly how this algorithm operates in the face of
branching statements, since the type systems are sound,
we assume that a conservative merging technique will be used. This is necessary,
since it is possible for different locks to be acquired in different control flow paths.
Up until this point, we have mentioned shared fields or variables without actually explaining how these
type systems approach them. A naive type system might
assume that every variable could potentially be shared. This would require unnecessary locking and
would certainly be unacceptable to programmers given the
frequency of thread-local variables. In reality the type system must be informed
that a variable is thread-local so that it does not look for explicit lock usage. In [2],
objects can be owned by another object, itself, or thisThread, a special per-thread owner. Both other
systems have nearly identical constructs, which do not
affect the run-time behavior of the application. (Curiously, however, Cyclone [8]
still requires the programmer to call synch on a nonlock variable, even though
it has no runtime effect.)
Despite the similarities in basic technique, each paper adds on certain features
that help make their system more programmer friendly in one way or another. Each
has a different system of type inference along with defaults that let the programmer
get away without writing very many lock annotations. Cyclone has polymorphic
functions, a feature not possessed by Java at the time of [2] and [14], and therefore
must add some more machinery in order to make its lock types work with these
functions [8]. Boyapati et al. integrate their type system with a similar system for
preventing deadlocks [2], while Sasturkar et al. [14] integrate their technique with
a type system for preserving method atomicity: the property that a method
behaves as if it executed without interleaving from other threads. They furthermore add a readonly type in addition
to the shared variable type. Since a race condition requires at least one of the
conflicting accesses to be a write, this type allows their system to have shared
read-only variables without lock protection.
When considering a type system for preventing data races, one has to consider
certain trade-offs. For one, these type systems only enforce one programmer discipline, that of protecting
shared data with locks. All of these type systems are
sound in that they will not miss any potential race conditions; however, they limit
the programs that can be written, and some race-free programs are not allowed. For
instance, programmer knowledge about the thread ordering, which may arise from
explicit forks and joins, can obviate the need for locking. These systems would still
require locks to be used. Also, variables often have an initialization phase (before
other threads have been forked off) during which they do not need to be locked but
where these systems would force them to be. Finally, there is the simple burden of
annotation. In addition to traditional typing annotations, all three systems add a host
of new locking annotations that impose an additional burden on the programmer.
While this cost is certainly worth it for concurrent code with high reliability requirements, it is not at all
clear that corporations developing this type of software
would be willing to adopt a non-standard language for this purpose. The best possible outcome would be
that ideas developed in this field make it into a future
version of an industrial-strength language.
3. Dynamic and Hybrid Race Detectors
While in general it is more desirable to find software defects before run-time using
static analysis tools, there are certain benefits that dynamic race detection tools
can offer. The general rule of thumb, at least at this point in time, is that where
static analysis tools are forced to be conservative and produce more false positives,
dynamic analysis tools can use very intimate knowledge about the runtime behavior
of the application in order to increase precision.
There are two major techniques used when designing a run-time race detector: tracking the happens-before ordering of events and checking a lockset discipline.
4. Flow-Based Static Analysis
While research into other, less "traditional" forms of race detection and analysis
has been fairly steady in the past few years, the same cannot be said for flow-based
analyses. Recent papers on the subject were fewer and farther between, and it was
necessary to go back to 2003 [4], 2001 [5], and 2000 [3] in order to find significant
discussions of this line of research. It is quite interesting to note that some of the
other techniques that we have examined and will examine claim that they could be even more
effective if used in combination with existing flow-based static analysis techniques.
In particular, when discussing model-checking [16] and dynamic race analysis [10]
the developers of these techniques recommend using a traditional static analysis in
order to determine all the possible statements where races might occur. Then, based
on that information, they propose reducing the overhead of their respective techniques
by examining only those statements.
Achieving soundness in the detection of race conditions is old hat for static race
detectors. What makes these tools painful to use is the maddening number of
false positives that can be reported. The work surveyed in this area seems to take
three distinct approaches to the issue of making static analysis of race conditions
more precise: 1) What essentially amounts to engineering effort and the willing
sacrifice of some soundness, 2) improving and awaiting further improvements in
the underlying static analysis technologies that make detection of race conditions
(and many other potential bugs) possible, and 3) using programmer annotations to
increase precision. Here, in turn, we will take a look at the results of some of these
efforts.
Of all the systems reviewed in this survey, RacerX seems to be the best prepared
to be run in an actual software development environment. No other tool was capable of
examining such large (millions of lines of source code) and realistic (Linux,
FreeBSD, and a large commercial system) systems with comparable speed or with so few
false positives (on the order of tens, versus hundreds and thousands for other
systems). The authors start with a basic, interprocedural lockset analysis and then
push on the concept using heuristics, statistical analysis and ranking to produce a
result that, while no longer sound, tends to actually find bugs.
The lockset analysis itself is relatively straightforward. RacerX starts by building a control-flow graph of
the entire application. Using a flow-sensitive procedure
(one that remembers the past results of each execution path) the analysis adds locks
to the lockset when encountering locking statements and removes them when encountering unlocking
statements. As the tool encounters accesses of thread-shared
variables, it verifies that their locks are in the current lockset; if not, a warning
is produced.
But how does RacerX know which variables are shared and which locks protect
those that are? These issues are solved in some interesting ways. Belief analysis is
used to determine if code is even multi-threaded and to help determine if a given
variable needs to be protected. The use of any sort of concurrency statement implies to the system that
that section of code is probably multi-threaded. (The term
probably is used because nothing is known for sure and the likelihoods of different
scenarios are worked into the tool’s final ranking of possible races.) Similarly, if
accesses of a variable tend to occur inside a lock it’s a good indication that that
variable does in fact need to be protected; “bonus points” if that access is the first or
last statement in a critical section. To determine which lock should protect a given
variable, RacerX counts the number of times that a variable is protected by a given
lock versus the number of times that it is not. If this ratio is statistically
significant, then an association is created (a small numerical sketch of the idea follows).
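The counting step can be pictured with a tiny calculation; the sample-size and ratio thresholds below are invented stand-ins for RacerX's much richer statistical ranking.

    /* A hedged sketch of the lock-association statistic described above.
     * The thresholds are illustrative inventions, not RacerX's actual test. */
    #include <stdio.h>

    /* Should accesses of a variable be associated with a given lock? */
    static int likely_protected_by(int accesses_with_lock, int accesses_without_lock)
    {
        int total = accesses_with_lock + accesses_without_lock;
        if (total < 5)                          /* too few samples to say anything  */
            return 0;
        double ratio = (double)accesses_with_lock / total;
        return ratio >= 0.9;                    /* "almost always" held => associate */
    }

    int main(void)
    {
        printf("19 of 20 accesses under lock L: %s\n",
               likely_protected_by(19, 1) ? "associate" : "ignore");
        printf(" 3 of  7 accesses under lock L: %s\n",
               likely_protected_by(3, 4) ? "associate" : "ignore");
        return 0;
    }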
Some mundane (from a research point of view) engineering details also help
RacerX to be a more useful tool. For example, the set of statements that constitute
a lock or an unlock can be specified by the user. This helps the tool to be more
useful in applications that have a large number of locking idioms. Also, in order
to improve performance, RacerX caches the locksets that have already arrived at a
given function or statement. If a duplicate lockset revisits a statement or function
call site through another path but with the same locks held, the previous results can
be reused. (A function will be skipped if its cache exceeds a certain size, however,
which is one particular source of unsoundness.)
The number and variety of techniques used by the authors to rank the potential
severity of a particular warning are too great to discuss in full here. Suffice it to say that
the authors are serious about developing a more precise race detection system and
appear to have the analysis experience to include helpful heuristics. Houdini/rcc [5]
is another tool that has taken a similar, engineering-driven approach of using a simple
analysis and then aggressively attacking false positives. It is similarly unsound.
Static data race detection analyses like the one developed by Choi et al. [3]
rely in great part (by the authors’ own admission) on the precision of alias and
escape analysis available. The authors discuss how their system performs a
path-sensitive dataflow analysis over a program's inter-thread call graph, or ICG (a traditional
call graph that additionally encodes thread spawns as directed edges). Over the
course of this analysis, sets of abstract objects (abstractions of actual objects that
would exist at runtime) are built up. These abstract objects are used with alias
analysis to determine whether a race-detection predicate is true or not. The race-detection
predicate encodes four clauses that must all be true for a race condition to occur (a sketch of the predicate as code follows the list):
1. Two ICG nodes must access the same object in memory.
2. Two nodes must occur within the execution of different threads.
3. The two events represented by these nodes must be protected by different synchronization objects.
4. No specific thread ordering (join, wait, notify) is enforced between those two
threads.
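As a rough sketch, the predicate is just the conjunction of these four clauses. The helper queries below are hypothetical stand-ins for what the alias, escape, and ordering analyses would actually compute; none of them comes from the paper itself.

    /* A minimal sketch of the four-clause race-detection predicate over
     * abstract inter-thread call graph (ICG) nodes.  Every helper here is a
     * hypothetical placeholder for a real analysis result. */
    #include <stdbool.h>

    struct icg_node;   /* an abstract event: some thread touching some object */

    bool may_access_same_object(const struct icg_node *a, const struct icg_node *b);
    bool may_run_in_different_threads(const struct icg_node *a, const struct icg_node *b);
    bool share_a_common_lock(const struct icg_node *a, const struct icg_node *b);
    bool ordered_by_join_wait_notify(const struct icg_node *a, const struct icg_node *b);

    /* Report a potential race only when all four clauses hold. */
    bool may_race(const struct icg_node *a, const struct icg_node *b)
    {
        return may_access_same_object(a, b)            /* clause 1 */
            && may_run_in_different_threads(a, b)      /* clause 2 */
            && !share_a_common_lock(a, b)              /* clause 3 */
            && !ordered_by_join_wait_notify(a, b);     /* clause 4 */
    }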
Alias analysis is used as a way to test for all of these conditions, even the last
condition, which is determined by a combination of path ordering encoded in the
ICG and knowledge about whether or not two variables represent the same thread.
Because of the wide applicability of alias analysis and escape analysis, a large
amount of research in these areas is currently ongoing (e.g. [13]). However, these
seem to be fundamentally hard problems to solve, ones that push up against the
boundaries of what is and is not decidable. Therefore, the time when alias analysis
is good enough to allow sufficiently precise detection of race conditions may be a
long way off in the future.
Fluid [7] has chosen to take a somewhat different approach in its solution to the
problem of detecting race conditions (and a variety of other defects). This tool uses
a large number of sophisticated and detailed annotations that, taken together, make up the
concurrency policy of a given software system. The authors argue that traditionally
the concurrency policy is expressed anyway, in the form of comments and other
documentation which are not machine checkable. Their system essentially uses
annotations to make checking modular, as well as for explaining to analyses exactly
which usages are correct and which may constitute an unsafe operation.
These annotations allow the developers to describe many different policies. Annotations can
be used to define groups of state which may potentially cross object boundaries.
Once state has been grouped together, annotations can be used to specify which
locks are being used to protect that state, and whether or not the state needs to
be protected at all. It also allows for a very precise description of which methods
can be inter-leaved and which ones cannot, something not really available in other
systems. This includes the ability to specify whether object clients or the objects
themselves are in charge of assuring proper state protection. These annotations can
then be used by relatively straightforward analyses in order to verify that they are
consistent with the code itself. In some ways this approach appears similar to the
ones used by type-based race detection mechanisms, although, thanks to additional
analyses such as thread-coloring, it is somewhat more expressive.
The thread-coloring analysis, for example, can determine that certain data is only ever accessed by single-threaded code,
so locking would be unnecessary there.
AVOIDING RACE CONDITIONS IN THE FILE SYSTEM
A secure program must be written to make sure that an attacker can't manipulate a shared resource
in a way that causes trouble, and sometimes this isn't as easy as it seems. One of the most common
shared resources is the file system. All programs share the file system, so it sometimes requires
special effort to make sure an attacker can't manipulate the file system in a way that causes
problems.
Many programs intended to be secure have had a vulnerability called a time-of-check/time-of-use
(TOCTOU) race condition. This just means that the program checked that a situation was OK, then
later used that information, but an attacker can change the situation between those two steps. This
is particularly a problem with the file system. Attackers can often create an ordinary file or a symbolic
link between the steps. For example, if a privileged program checks if there's no file of a given name,
and then opens that file for writing, an attacker could create a symbolic link of that name between
those two steps (to /etc/passwd or some other sensitive file, for instance).
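A minimal sketch of that vulnerable check-then-use pattern in C (the path, the check, and the data written are all illustrative):

    /* A sketch of the TOCTOU pattern described above; the path is illustrative.
     * Between the access() check and the open(), an attacker who can write the
     * directory may replace /tmp/report with a symlink to a sensitive file. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/tmp/report";

        if (access(path, W_OK) == 0) {          /* time of check                     */
            /* ... window in which the attacker swaps in a symbolic link ...         */
            int fd = open(path, O_WRONLY);      /* time of use                       */
            if (fd >= 0) {
                (void)write(fd, "data\n", 5);   /* may now write through the symlink */
                close(fd);
            }
        } else {
            perror("access");
        }
        return 0;
    }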
These problems can be avoided by obeying a few simple rules:
Don't use access(2) to determine if you can do something. Often, attackers can change the
situation after the access(2) call, so any data you get from calling access(2) may no
longer be true. Instead, set your program's privileges to be equal to the privileges you intend
(for example, set its effective id or file system id, effective gid, and use setgroups to clear out
any unneeded groups), then make the open(2) call directly to open or create the file you
want. On a UNIX-like system, the open(2) call is atomic (other than for old NFS systems
versions 1 and 2).
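Following that rule, the separate check disappears and the program simply attempts the open with the privileges it really has. A sketch (the path is the same illustrative one as above; O_NOFOLLOW is an extra precaution not required by the rule):

    /* A hedged sketch of the recommended pattern: no access() check, just one
     * atomic open() whose permission check is performed by the kernel itself.
     * O_NOFOLLOW additionally refuses to open through a symbolic link. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        /* Privileges (effective uid/gid, supplementary groups) are assumed to
         * have been set to their intended values before this point. */
        int fd = open("/tmp/report", O_WRONLY | O_NOFOLLOW);
        if (fd < 0) {
            fprintf(stderr, "open failed: %s\n", strerror(errno));
            return 1;
        }
        (void)write(fd, "data\n", 5);
        close(fd);
        return 0;
    }

Because the permission check and the open are a single system call, there is no window between a check and a use for an attacker to exploit.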
When creating a new file, open it using the modes O_CREAT | O_EXCL (the O_EXCL makes
sure the call only succeeds if a new file is created). Grant only very narrow permissions at
first -- at least forbidding arbitrary users from modifying it. Generally, this means you need to
use umask and/or open's parameters to limit initial access to just the user and maybe the
user group. Don't try to reduce permissions after you create the file because of a related race
condition. On most UNIX-like systems, permissions are only checked on open, so an
attacker could open the file while the permission bits said it was OK, and keep the file open
with those permissions indefinitely. You can later change the rights to be more expansive if
you desire. You'll also need to prepare for having the open fail. If you absolutely must be
able to open a new file, you'll need to create a loop that (1) creates a "random" file name, (2)
opens the file with O_CREAT | O_EXCL, and (3) stops repeating when the open succeeds (a sketch of this loop follows).
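A sketch of that loop (the naming scheme is deliberately simplistic, and the /tmp location is only for illustration; in practice mkstemp(3) packages the same O_CREAT | O_EXCL idiom):

    /* A minimal sketch of the create-new-file loop described above.  The
     * "random" names are simplistic on purpose; mkstemp(3) wraps the same
     * O_CREAT | O_EXCL idiom for temporary files. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    static int create_private_file(char *path_out, size_t path_len)
    {
        for (int attempt = 0; attempt < 100; attempt++) {
            snprintf(path_out, path_len, "/tmp/app-%ld-%d-%d",
                     (long)getpid(), attempt, rand());
            /* O_EXCL makes open() fail unless it created the file itself;
             * mode 0600 restricts access to the owner from the very start. */
            int fd = open(path_out, O_WRONLY | O_CREAT | O_EXCL, 0600);
            if (fd >= 0)
                return fd;              /* success: we own a brand-new file  */
        }
        return -1;                      /* give up after repeated collisions */
    }

    int main(void)
    {
        char path[128];
        srand((unsigned)time(NULL) ^ (unsigned)getpid());
        int fd = create_private_file(path, sizeof path);
        if (fd < 0) {
            perror("create_private_file");
            return 1;
        }
        printf("created %s\n", path);
        close(fd);
        return 0;
    }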
When performing operations on a file's meta-information, such as changing its owner, stat-ing
the file, or changing its permission bits, first open the file and then use the operations on
open files. Where you can, avoid the operations that take file names, and use the operations
that take file descriptors instead. This means using the fchown(), fstat(),
or fchmod() system calls instead of the functions taking file names, such
as chown(), chgrp(), and chmod(). Doing so will prevent the file from being replaced while
your program is running (a possible race condition). For example, if you close a file and then
use chmod() to change its permissions, an attacker may be able to move or remove the file
between those two steps and create a symbolic link to another file (say /etc/passwd).
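A short sketch of the descriptor-based pattern (the path is illustrative):

    /* A hedged sketch: inspect and modify the file through its descriptor,
     * not its name, so the object cannot be swapped out from underneath. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/report", O_RDWR | O_NOFOLLOW);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        struct stat st;
        if (fstat(fd, &st) == 0)                /* fstat(), not stat(path)   */
            printf("mode is %o\n", (unsigned)(st.st_mode & 07777));

        if (fchmod(fd, 0600) != 0)              /* fchmod(), not chmod(path) */
            perror("fchmod");

        close(fd);
        return 0;
    }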
If your program walks the file system, recursively iterating through subdirectories, be careful
if an attacker could ever manipulate the directory structure you're walking. A common
example of this is an administrator, system program, or privileged server running your
program while walking through parts of the file system controlled by ordinary users. The
GNU file utilities (fileutils) can do recursive directory deletions and directory moves, but
before V4.1, it simply followed the ".." special entry as it walked the directory structure. An
attacker could move a low-level directory to a higher level while files were being deleted.
Fileutils would then follow the ".." directory up much higher, possibly up to the root of the file
system. By moving directories at the right time, an attacker could delete every file in the
computer. You just can't trust ".." or "." if they're controlled by an attacker.
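One hedged sketch of a more careful recursive removal, assuming the POSIX openat()/fdopendir()/unlinkat() interfaces: descend through directory file descriptors rather than re-resolving path strings, and never follow the "." and ".." entries.

    /* A hedged sketch of a more careful recursive delete, assuming the POSIX
     * openat()/fdopendir()/unlinkat() interfaces.  Walking through descriptors
     * and skipping "." and ".." keeps an attacker who is moving directories
     * around from redirecting the walk toward the root. */
    #include <dirent.h>
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    static int remove_tree_at(int parent_fd, const char *name)
    {
        int fd = openat(parent_fd, name, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
        if (fd < 0)                                  /* not a directory (or a symlink): */
            return unlinkat(parent_fd, name, 0);     /* remove the entry itself         */

        DIR *dir = fdopendir(fd);                    /* dir now owns fd                 */
        if (dir == NULL) {
            close(fd);
            return -1;
        }
        struct dirent *ent;
        while ((ent = readdir(dir)) != NULL) {
            if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
                continue;                            /* never follow the special entries */
            remove_tree_at(dirfd(dir), ent->d_name);
        }
        closedir(dir);
        return unlinkat(parent_fd, name, AT_REMOVEDIR);
    }

    int main(int argc, char **argv)
    {
        return argc > 1 ? (remove_tree_at(AT_FDCWD, argv[1]) ? 1 : 0) : 0;
    }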
If you can, don't place files in directories that can be shared with untrusted users. Failing that, try to
avoid using directories that are shared between users. Feel free to create directories that only a
trusted special process can access.
Consider avoiding the traditional shared directories /tmp and /var/tmp. If you can just use a pipe to
send data from one place to another, you'll simplify the program and eliminate a potential security
problem. If you need to create a temporary file, consider storing temporary files somewhere else.
This is particularly true if you're not writing a privileged program; if your program isn't privileged, it's
safer to place temporary files inside that user's home directory, being careful to handle a root user
who has "/" as their home directory. That way, even if you don't create the temporary file "correctly,"
an attacker usually won't be able to cause a problem -- because the attacker won't be able to
manipulate the contents of the user's home directory.
But avoiding shared directories is not always possible, so we'll need to understand how to handle
shared directories like /tmp. This is sufficiently complicated that it deserves a section of its own.
REFERENCES:
http://www.wisegeek.com/what-is-a-race-condition.htm
http://www.wisegeek.com/what-is-multitasking.htm
http://www.cs.cmu.edu/~nbeckman/papers/race_detection_survey.pdf
http://www.ibm.com/developerworks/library/l-sprace.html