Detecting race conditions in iOS

I've been working on my new startup, Streethawk (formerly known as Dealsta), for the last four months.  Actually I've been working on it with Dave since last year, but I jumped on it full time pretty much the day my beautiful baby girl, Shivani Gayathri, was born.  I think in a way she showed me how much of a wuss I was for not doing something I was passionate about.  But that’s another story.

So I jumped on the iOS client for our app and have been doing pretty much that full time for the last six months.  It has been an amazing ride and I have learnt so much I cannot believe it.  50 KLOC in four months does that to you!  But there’s still so much to learn and hardly enough time.

One issue I wanted to address was race conditions.  There is nothing magical to it and this is just one approach.  iOS lets you lock access to a resource via the @synchronized directive or via semaphore objects (dispatch_semaphore_t).  Excessive locking imposes additional overhead and can cause “hangups” on the UI/main thread, giving the app a jerky feel.  (With dispatch semaphores a kernel-level trap only happens if there really are multiple threads waiting on the resource, so in a sense dispatch semaphores are cheaper and faster than locks based on @synchronized, but that's a very, very simplistic explanation.)


I am a big, big user of blocks (and GCD) in my iOS apps.  In fact my style of asynchronous development leans heavily towards blocks dispatched to async queues.  Heck, I even wrote a Request class wrapping NSURLConnection that does everything via callback blocks rather than delegates, which were just too messy and spread out everywhere.

The general way of using dispatch queues in GCD is to do something like (nothing fancy here):

// create a serial dispatch queue (from OS X Lion onwards you can also create concurrent queues)...
dispatch_queue_t    my_queue = dispatch_queue_create(queue_name, NULL);

dispatch_async(my_queue, ^{
    // do the work here in the queue.
});


Now what is interesting with all this is that iOS takes care of how many threads to create across all your queues and manages the scheduling of tasks across them.  There are some interesting memory management facts you need to know about, but you are fine otherwise.  There are a few advantages to this (among others):

  1. You migrate away from having to manage threads and scheduling manually.  The OS decides how many threads to create across the several queues and manages the scheduling across them.
  2. You can replace the locks that enforce exclusive access to a resource or a critical section, because (with serial queues) only one task from each queue runs at a time (also constrained by how many threads the OS creates).
  3. The code actually looks more “continuous” spatially even though it is not temporally continuous.

<End Digression>

Back to it. Regardless of which approach you pick, concurrency risks race conditions.  So what you really need is a way to detect when multiple threads are accessing a particular section and, instead of locking out the other threads, log it or, even better, assert and trigger a dump of the call stack.

For this I have created an RCTool class that simply has the following 3 primitives:

test_begin_lock(const char *filename, const char *function, int line, int mode, NSString *resname, BOOL alreadyLocked)

This creates an entry recording the current thread (the one calling this method) as a “requestor” of a lock at this point in time.

The key parameters here are “mode” and “resname”.

“resname” is the name of the resource being locked or checked.  It must uniquely identify the resource being verified, so it is very app and developer specific.

“mode” controls what kind of access is required for the thread entering this section and/or accessing the particular resource.

There are three scenarios here:

  1. If there are no other threads holding a lock at this point in time, i.e. no calls to test_begin_lock without a matching test_end_lock (see below), then the current thread is now the “starter” of the lock at this section.
  2. If there are other threads holding locks at this section, this is where the “mode” parameter comes in.  The mode parameter is either WRITE (i.e. only one thread, the current thread, is allowed beyond this point) or READ_WRITE (i.e. multiple READ threads are allowed, but only the first WRITE thread and no subsequent threads of any kind are allowed).  So the following are possible:
    1. If another thread has a WRITE lock (and only one such thread can exist) then this method asserts.
    2. If there are other READ threads and this thread is seeking a READ lock, this thread is added to this group of threads and allowed to proceed (ok to have multiple readers).
    3. If there are other READ threads and this thread is seeking a WRITE lock, then the method asserts.

Assertions in this method are hints towards which parts of the code have possible conflicts and are good candidates for re-factoring (or as I call it, temporal factoring) into queues or other synchronisation methods.

test_end_lock(const char *filename, const char *function, int line, int mode, NSString *resname)

This method marks the end of a critical section for a particular thread (one that was begun with the previous method).  Note that several threads could be “alive” in the section at once, for example if all of them were READ threads.
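
To make the bookkeeping concrete, here is a minimal sketch of the begin/end checks in plain C.  It is not the RCTool implementation: the names sketch_begin_lock, sketch_end_lock and sketch_conflicts are made up for illustration, the resource name is a C string rather than an NSString, the alreadyLocked flag is left out, and it only counts readers and writers per resource instead of recording which thread holds what.  It also logs and counts a conflict rather than asserting, so you can poke at it without crashing.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define LOCK_NO_WRITE       1   /* read access: no writer may be active */
#define LOCK_NO_READWRITE   3   /* write access: nobody else may be active */

#define SKETCH_MAX_RES 16       /* assumption: a small fixed table is enough */

typedef struct {
    char name[64];              /* resource name, empty string = free slot */
    int  readers;               /* sections currently open for reading */
    int  writers;               /* sections currently open for writing */
} sketch_entry;

static sketch_entry    sketch_table[SKETCH_MAX_RES];
static pthread_mutex_t sketch_mutex = PTHREAD_MUTEX_INITIALIZER;
int sketch_conflicts = 0;       /* number of conflicts seen so far */

/* Look up the entry for resname, claiming a free slot on first use.
   Returns NULL if the table is full. Caller must hold sketch_mutex. */
static sketch_entry *sketch_find(const char *resname) {
    sketch_entry *free_slot = NULL;
    for (int i = 0; i < SKETCH_MAX_RES; i++) {
        sketch_entry *e = &sketch_table[i];
        if (e->name[0] == '\0') {
            if (!free_slot) free_slot = e;
        } else if (strncmp(e->name, resname, sizeof e->name - 1) == 0) {
            return e;
        }
    }
    if (free_slot) strncpy(free_slot->name, resname, sizeof free_slot->name - 1);
    return free_slot;
}

/* Returns 1 if entering is fine, 0 if it overlaps with a conflicting holder. */
int sketch_begin_lock(const char *file, const char *func, int line,
                      int mode, const char *resname) {
    int ok = 1;
    pthread_mutex_lock(&sketch_mutex);
    sketch_entry *e = sketch_find(resname);
    if (e) {  /* a full table means the resource is simply not tracked */
        int want_write = (mode == LOCK_NO_READWRITE);
        /* A writer is already in, or we want to write while readers are in. */
        if (e->writers > 0 || (want_write && e->readers > 0)) {
            ok = 0;
            sketch_conflicts++;
            fprintf(stderr, "possible race on '%s' at %s:%d (%s)\n",
                    resname, file, line, func);
        }
        /* Register this section either way so the matching end balances. */
        if (want_write) e->writers++; else e->readers++;
    }
    pthread_mutex_unlock(&sketch_mutex);
    return ok;
}

/* Marks the end of a section opened with sketch_begin_lock in the same mode. */
void sketch_end_lock(const char *file, const char *func, int line,
                     int mode, const char *resname) {
    (void)file; (void)func; (void)line;   /* only needed for logging */
    pthread_mutex_lock(&sketch_mutex);
    sketch_entry *e = sketch_find(resname);
    if (e) {
        if (mode == LOCK_NO_READWRITE) { if (e->writers > 0) e->writers--; }
        else                           { if (e->readers > 0) e->readers--; }
    }
    pthread_mutex_unlock(&sketch_mutex);
}
```

Two overlapping read sections on the same resource pass quietly, while a write section that begins before those readers have ended (or any section that begins while a writer is still in) gets reported with the file, line and function of the offending call.  Because it only keeps counts, this sketch would also flag the same thread re-entering its own section, which a per-thread record could tell apart.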

Now this is no substitute for one of the many static or runtime analysis tools that cost a few arms and a few legs.  But what this tool does is let you run your code normally and be told about a problem *before* the first race condition bites, rather than somewhere down the track long after the data corruption has occurred.

Secondly, this is a debug-only tool.  Naturally one would use this tool/method/class to detect possible race conditions and critical sections so they can *fix* them rather than just leaving them there.  So this doesn't make sense in production.

HOWEVER, it isn't useful or easy to add and remove these calls repeatedly across several parts of your code depending on whether you are building for production.  So I use a couple of macros that allow all this to be turned on/off easily and that pass the function, filename and line number automatically.  It is something like this:

#define LOCK_NO_WRITE       1
#define LOCK_NO_READWRITE   3

#if IN_PRODUCTION
// In production the macros expand to nothing.
#define BEGIN_LOCK(mode, res, ...)
#define BEGIN_SYNCHED_LOCK(mode, res, ...)
#define END_LOCK(mode, res, ...)
#else
// ##__VA_ARGS__ drops the trailing comma when no format arguments are passed.
#define BEGIN_LOCK(mode, res, ...)            \
    test_begin_lock(__FILE__, __FUNCTION__, __LINE__, mode, [NSString stringWithFormat:res, ##__VA_ARGS__], NO)
#define BEGIN_SYNCHED_LOCK(mode, res, ...)    \
    test_begin_lock(__FILE__, __FUNCTION__, __LINE__, mode, [NSString stringWithFormat:res, ##__VA_ARGS__], YES)
#define END_LOCK(mode, res, ...)              \
    test_end_lock(__FILE__, __FUNCTION__, __LINE__, mode, [NSString stringWithFormat:res, ##__VA_ARGS__])
#endif

So to obtain a READ lock simply call:

BEGIN_LOCK(LOCK_NO_WRITE, @"resource_name");
// do things here that only read from the resource
END_LOCK(LOCK_NO_WRITE, @"resource_name");

And to get a write lock (to prevent other readers as well):

BEGIN_LOCK(LOCK_NO_READWRITE, @"resource_name");
// Here the resource is also being written to
END_LOCK(LOCK_NO_READWRITE, @"resource_name");

That's it.  In dev simply set IN_PRODUCTION to zero and off you go.  You will see this slowing down your phone or simulator but at least it's for a good cause!  The actual implementation is attached here.
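
If you want to play with the on/off macro idea outside of Xcode, the same pattern can be sketched in plain C.  This is only an illustration under a few assumptions: SKETCH_BEGIN, sketch_check, load_settings and the sketch_* variables are hypothetical names, the resource name is a plain C string, and the check function just records where it was called from instead of doing any real lock bookkeeping.

```c
#include <stdio.h>

#ifndef IN_PRODUCTION
#define IN_PRODUCTION 0         /* dev build by default */
#endif

#define LOCK_NO_WRITE 1

int         sketch_checks_run = 0;     /* how many checks actually executed */
const char *sketch_last_func  = NULL;  /* function name of the last call site */
int         sketch_last_line  = 0;     /* line number of the last call site */

/* Stand-in for the real check: only records and prints the call site. */
void sketch_check(const char *file, const char *func, int line,
                  int mode, const char *resname) {
    sketch_checks_run++;
    sketch_last_func = func;
    sketch_last_line = line;
    fprintf(stderr, "check mode=%d res=%s at %s:%d (%s)\n",
            mode, resname, file, line, func);
}

#if IN_PRODUCTION
/* Production: the macro compiles away to a no-op statement. */
#define SKETCH_BEGIN(mode, res) ((void)0)
#else
/* Dev: the macro fills in file, function and line of the call site. */
#define SKETCH_BEGIN(mode, res) \
    sketch_check(__FILE__, __func__, __LINE__, (mode), (res))
#endif

/* Example call site: nothing here mentions file, function or line. */
void load_settings(void) {
    SKETCH_BEGIN(LOCK_NO_WRITE, "settings");
    /* read-only work on the resource would go here */
}
```

Compiling with -DIN_PRODUCTION=1 turns every SKETCH_BEGIN into a no-op without touching the call sites, which is the whole point of wrapping the checks in macros.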
