Static Analysis Problem Type Reference

Data race from cilk_for

The data usage pattern in a cilk_for loop creates a data race.

A data race occurs when two threads access the same memory location without proper synchronization and at least one of the accesses is a write. This can cause the program to produce non-deterministic results when it runs in parallel. In Intel® Cilk™ Plus, cilk_for loops are parallelized by assigning different loop iterations to different threads. This strategy requires the loop iterations to be independent so they can run in any order.

There are many ways to create a race condition. The simplest is to write to a shared variable in multiple loop iterations. In this case, the final value of the variable depends on which iteration writes last. In sequential execution, this is always the last iteration, but in parallel execution this is not guaranteed.

Another common cause of race conditions is a so-called loop-carried data dependency. This refers to a case where a value written in one loop iteration is read or written by another iteration. In the read case, a variable is read in a loop iteration without having been previously written in that same iteration. Usually this occurs when an array is indexed improperly. For example, if "x" is a loop counter and "a" is an array, then "a[x]" refers to a different element in every loop iteration. Therefore, a loop can write "a[x]" without creating a loop-carried dependency. However, if that loop also reads or writes "a[x + 1]", then there is a loop-carried dependency because "a[x]" in one loop iteration is the same element as "a[x + 1]" in the previous iteration.
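The effect of such a dependency can be seen without any threads at all, simply by running the same iterations in a different order. The following is a minimal sketch in standard C++, not taken from the Intel® Cilk™ Plus documentation; the helper name run_loop and the array size are assumptions chosen for illustration.

```cpp
#include <vector>

// Hypothetical helper: executes the loop body
//     a[x] = a[x] + 1;  sum += a[x + 1];
// for x = 1 .. n-2, visiting the iterations either in ascending order
// (as a sequential loop would) or in descending order (one of the many
// orders a parallel loop might produce).
int run_loop(bool forward) {
    const int n = 8;
    std::vector<int> a(n);
    for (int x = 0; x < n; ++x) a[x] = x;

    int sum = 0;
    for (int k = 1; k < n - 1; ++k) {
        int x = forward ? k : (n - 1 - k);
        a[x] = a[x] + 1;   // writes a[x]
        sum += a[x + 1];   // reads the element that iteration x + 1 writes
    }
    return sum;
}
```

With these inputs, run_loop(true) returns 27 and run_loop(false) returns 32. The result depends on iteration order, which is exactly what makes the loop unsafe to parallelize.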

There are many possible solutions to race conditions, and it is up to you to decide which is best. Not all sequential algorithms can be parallelized. Sometimes it is possible to change the algorithm so that the units of work are grouped in a different way that can be parallelized.

Beyond changing the algorithm, there are two main solutions to data race conditions. One solution involves locking, which is discussed below. The preferred solution is usually to replace the shared variable with an Intel® Cilk™ Plus hyperobject. A hyperobject provides each thread with its own private copy of a shared object. Each thread can then read and write the hyperobject without fear of interaction. At the end of the cilk_for, the per-thread copies are merged back into the shared variable. The manner in which the per-thread data is combined depends on the type of the hyperobject. For example, you can create a hyperobject that adds all the per-thread values to create the final value. For more details about hyperobjects and how to use them, consult the Intel® Cilk™ Plus documentation that is provided as part of the Intel C/C++ Compiler documentation.
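The idea behind an additive hyperobject can be approximated in standard C++ with one private accumulator per thread and an explicit merge step. This is only a sketch of the concept using std::thread, not the Cilk Plus reducer implementation; the function name private_copy_sum and the way iterations are divided among threads are assumptions made for illustration.

```cpp
#include <numeric>
#include <thread>
#include <vector>

// Sums 1..n using `workers` threads (workers must be at least 1).
// Each thread accumulates into its own private value, and the private
// values are combined only after every thread has finished.
long private_copy_sum(int n, int workers) {
    std::vector<long> partial(workers, 0);  // one slot per thread
    std::vector<std::thread> pool;

    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&partial, n, workers, w] {
            long local = 0;                 // private to this thread
            for (int i = 1 + w; i <= n; i += workers) local += i;
            partial[w] = local;             // each thread writes only its own slot
        });
    }
    for (std::thread& t : pool) t.join();

    // Merge step: add the per-thread values to form the final value.
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

For example, private_copy_sum(100, 4) returns 5050 regardless of how the threads are scheduled, because no two threads ever touch the same accumulator.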

The locking approach establishes a region of exclusive access. When entering that region, a thread acquires a lock, and only one thread can own the lock at a time. If a thread attempts to acquire a lock that is owned by another thread, it must wait for the owning thread to relinquish the lock. These wait periods reduce performance. The impact generally depends on how likely it is that a lock will be in use and how long it is typically held. In extreme cases, the locking overhead can destroy all the benefits of parallel execution. For this reason, the hyperobject approach is often preferred.
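A region of exclusive access might look like the following in standard C++. This is a minimal sketch that uses std::thread and std::mutex rather than cilk_for; the function name locked_sum and the division of iterations among threads are assumptions made for illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Sums 1..n using `workers` threads that all update one shared variable.
// Every update happens while holding sum_lock, so updates cannot overlap.
long locked_sum(int n, int workers) {
    long sum = 0;          // shared variable
    std::mutex sum_lock;   // guards every access to sum
    std::vector<std::thread> pool;

    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&sum, &sum_lock, n, workers, w] {
            for (int i = 1 + w; i <= n; i += workers) {
                // Acquire the lock; it is released when guard goes out of scope.
                std::lock_guard<std::mutex> guard(sum_lock);
                sum += i;
            }
        });
    }
    for (std::thread& t : pool) t.join();
    return sum;
}
```

The result is correct (locked_sum(100, 4) returns 5050), but every addition now pays for a lock acquisition, and threads wait whenever the lock is already held. That cost is the performance concern described above.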

Sometimes data dependencies can be resolved by splitting a loop into two loops. For example, suppose we have a loop where "x" is the loop counter and we are accessing an array "a". The loop contains one statement that assigns to "a[x]" and a later statement that reads from "a[x - 1]". This creates a data race, because in parallel mode there is no guarantee that iteration "x - 1" will complete before iteration "x" reads the value it writes. However, if the second statement can be moved to a second loop that runs after the first loop has finished, then the problem is resolved.
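The loop-splitting transformation can be sketched as follows in standard C++. The function names, the second array "b", and the value "2 * x" are assumptions chosen for illustration. The split version deliberately runs each loop in descending order to show that, once split, the order of iterations within each loop no longer matters.

```cpp
#include <vector>

// Original form: one loop writes a[x] and then reads a[x - 1].
// Correct only when iterations run in ascending order.
std::vector<int> fused_sequential(int n) {
    std::vector<int> a(n, 0), b(n, 0);
    for (int x = 1; x < n; ++x) {
        a[x] = 2 * x;
        b[x] = a[x - 1];   // depends on the previous iteration's write
    }
    return b;
}

// Split form: all writes to "a" finish before any read of "a" begins.
// Within each loop the iterations are independent, so any order works;
// descending order is used here to demonstrate that.
std::vector<int> split_any_order(int n) {
    std::vector<int> a(n, 0), b(n, 0);
    for (int x = n - 1; x >= 1; --x) a[x] = 2 * x;
    for (int x = n - 1; x >= 1; --x) b[x] = a[x - 1];
    return b;
}
```

Both functions return the same contents for "b". In a Cilk Plus program, each of the two split loops would be a candidate for cilk_for, provided the first loop completes before the second begins.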

ID     Code Location        Description

2-N    Bad memory access    The place a variable was read or written that contributed to the race condition

Example


#include <stdio.h>
#include "cilk/cilk.h"
#include "cilk/reducer_opadd.h"

void do_work1()
{
    int i;
    int a[100];
    // hyperobject that forms final value by adding
    // all the per-thread values together.
    cilk::reducer_opadd<int> sum(0);

    for (i = 1; i < 100; i++) {
        a[i] = i;
    }

    // The loop-carried data dependency on "a" in the following loop
    // means the iterations cannot be run safely in parallel.
    // In sequential mode, the i-th array element is only modified
    // in the i-th iteration.  Therefore, "a[i + 1]" is never modified
    // by a previous loop iteration, and will contain the value
    // assigned earlier: "i + 1".  This is not true in parallel mode.
    // For example, suppose the i = 10 iteration runs before the
    // i = 9 iteration.  Then the statement a[i] = a[i] + 1; in
    // iteration 10 will set a[10] to 11 before the statement
    // sum = sum + a[i + 1] in iteration 9 is executed.
    // When it is, it will add 11 to sum instead of 10.
    // This is not what happens in sequential mode.
    //
    // Note that the use of "sum" would otherwise create a data dependency,
    // but it does not because "sum" is a hyperobject.

    cilk_for (int i = 1; i < 99; i++) {
        a[i] = a[i] + 1;
        sum = sum + a[i + 1];
    }

    printf("%d\n", sum.get_value()); 
}

void do_work2()
{
    int i;
    int a[100], b[100];
    // hyperobject that forms final value by adding
    // all the per-thread values together.
    cilk::reducer_opadd<int> sum(0);

    for (i = 1; i < 100; i++) {
        a[i] = i;
        b[i] = 100 - i;
    }

    // The following loop demonstrates a "possible" data dependency.
    //
    // A data dependency exists if the same element of an array
    // is written on one loop iteration and read on another or
    // written on two different loop iterations.  This loop writes
    // the "a[i]-th" element of "b" and reads the "i-th" element.
    // Therefore, a write-write dependency exists if and only if
    // "a[i]" in one iteration is equal to "a[i]" in another iteration.
    // A read-write dependency exists if and only if "a[i]" in one
    // iteration is equal to "i" in another iteration.
    //
    // These facts are hard to determine statically, so a "possible data
    // dependency" error will be issued. In this program this is a
    // false positive because the previous loop sets "a[i]" equal to "i".
    // Therefore, "a[i]" has a different value on every iteration, so no
    // write-write dependency exists.  Also, "a[i]" in one iteration never
    // equals "i" in a different iteration, so no read-write dependency
    // exists either.
    //
    // Note that the use of "sum" would otherwise create a data dependency,
    // but it does not because "sum" is a hyperobject.

    cilk_for (int i = 1; i < 100; i++) {
        b[a[i]] = i;
        sum = sum + b[i];
    }
    printf("%d\n", sum.get_value());
}

int main(int argc, char **argv)
{
    do_work1();
    do_work2();
    return 0;
}