Static Analysis Problem Type Reference

Data race from cilk_spawn

The data usage pattern in a cilk_spawn creates a data race.

A data race occurs when two threads access the same memory location without proper synchronization and at least one of the accesses is a write. A race can cause the program to produce non-deterministic results when it executes in parallel. The Intel® Cilk™ Plus cilk_spawn construct creates the possibility of parallel execution: the routine called by the cilk_spawn may execute in parallel with the "continuation", which consists of the code sequence from the call return up to the next explicit or implicit cilk_sync. This error indicates that a data race exists between these two threads.

There are many ways that race conditions can be created. A simple example is when two threads write to a shared variable. In this case, the final value of the variable depends on which thread writes last.

There are many possible solutions to race conditions, and it is up to you to decide which fits best. Not every sequential algorithm can be parallelized. Sometimes it is possible to restructure the algorithm to group the units of work in a different way that can run in parallel.

Beyond changing the algorithm, there are two main solutions to data races. One is locking, discussed below. The other is to have each thread copy the value of the shared variable into a thread-private variable. Each thread can then read and write its thread-private variable without fear of interference. At the end of the parallel region, the private copies are merged back into the shared variable. This strategy is supported by the Intel® Cilk™ Plus "hyperobject." For details about hyperobjects and how to use them, consult the Intel® Cilk™ Plus documentation that is part of the Intel C/C++ Compiler documentation.

The locking approach establishes a region of exclusive access: a thread must acquire a lock before entering the region, and since only one thread can own the lock at a time, only one thread can be inside it. If a thread attempts to acquire a lock that is owned by another thread, it must wait for the owner to relinquish the lock. These wait periods reduce performance; the impact depends on how likely it is that the lock is in use and how long it is typically held. In extreme cases, locking overhead can destroy all benefit of parallel execution. For this reason, the hyperobject approach is often preferred.

Note that neither locking nor hyperobjects guarantees that operations on shared objects happen in the same order as they would in a sequential execution. The example below shows a data race that cannot be fixed with either hyperobjects or locks. The only solution in a case like this is to move the cilk_sync above the last assignment to the variable named shared. This ends the continuation before it can access the shared variable, removing the conflict.

ID  Code Location      Description

1   Bad memory access  The place where the shared variable was used
2   Memory write       The place where the shared variable was written
3   cilk_spawn site    The cilk_spawn that can initiate parallelism

Example


#include <stdio.h>
#include <cilk/cilk.h>

// sleep() is not portable; on Windows, emulate it with Sleep(),
// which takes milliseconds rather than seconds
#ifdef _WIN32
#include <windows.h>
#define sleep(s) Sleep((s) * 1000)
#else
#include <unistd.h>
#endif

int shared = 0;

void set_shared_variable(int val) {
    sleep(val); // this sleep lets us control which write finishes first
    shared = val;
}

int main(int argc, char **argv)
{
    cilk_spawn set_shared_variable(5);
    set_shared_variable(3);
    cilk_sync;
    
    // in a serial execution, the spawned call runs to completion first,
    // so shared would always be 3; with parallel execution, the spawned
    // call sleeps longer, finishes last, and leaves shared equal to 5

    if (shared == 3) {
        printf("same as sequential execution\n");
    } else {
        printf("different from sequential execution\n");
    }
    return 0;
}