Intel® C++ Compiler XE 13.1 User and Reference Guides

offload

Executes the statements on the target. This pragma only applies to Intel® MIC Architecture.

Syntax

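#pragma offload specifier[ specifier...]
<expression-stmt>

Where specifier can be any of the following:

  • target( target-name[ :target-number ] )

  • if( if-specifier )

  • signal( tag )

  • wait( tag )

  • mandatory

  • offload-parameter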

The following are arguments to use in specifier:

Arguments

target-name

An identifier that represents the target. The only allowable target name is mic.

target-number

Required for signal and wait clauses.

An integer expression whose value is interpreted as follows:

-1

This value specifies execution on the target. The runtime system chooses the specific target. Execution on the CPU is not allowed. If the correct target hardware needed to run the offloaded program is not available on the system, the program fails with an error message.

>=0

A value greater than or equal to zero specifies execution on a specific target. The number of the specific target is determined as follows:

target=target-number % number_of_targets

If the correct target hardware needed to run the offloaded program is not available on the system, the program fails with an error message.

<-1

This value is reserved.

If you don't specify this argument, the runtime system chooses whether to execute the code on the CPU, a target, or a specific target if multiple targets are available.

For example, in a system with four targets:

  • Specifying 2 or 6 tells the runtime system to execute the code on target 2, because both 2 % 4 and 6 % 4 equal 2.

  • Specifying 1000 tells the runtime system to execute the code on target 0, because 1000 % 4 = 0.
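The mapping above can be sketched in plain C. This is only a model of the documented arithmetic, not part of the compiler or runtime: the function name resolve_target and the sentinel constants are hypothetical and exist purely for illustration.

```c
/* Illustrative sentinel results (hypothetical names, not a runtime API). */
enum {
    TARGET_RUNTIME_CHOICE = -1,  /* target-number == -1: runtime picks a target */
    TARGET_RESERVED       = -2,  /* target-number <  -1: reserved value */
    TARGET_UNAVAILABLE    = -3   /* no target hardware present */
};

/* Models the documented rule: target = target-number % number_of_targets. */
int resolve_target(int target_number, int number_of_targets)
{
    if (target_number < -1)
        return TARGET_RESERVED;
    if (number_of_targets <= 0)
        return TARGET_UNAVAILABLE;
    if (target_number == -1)
        return TARGET_RUNTIME_CHOICE;
    return target_number % number_of_targets;
}
```

With four targets, resolve_target(2, 4) and resolve_target(6, 4) both yield 2, and resolve_target(1000, 4) yields 0, matching the examples above.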

if-specifier

A Boolean expression.

If the expression evaluates to true, then the program attempts to offload the statement. If the specified target is absent from the system or not available at that time because it is fully loaded, then the statement executes on the CPU.

If the expression evaluates to false, then the statement with the offload specification executes on the CPU and none of the other offload clauses have any effect.

If the expression evaluates to false and you use either the signal or wait clause in this pragma, then the behavior is undefined.

Note

Do not use this clause and a mandatory clause in the same directive.
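As a minimal sketch of the if clause, the following function attempts an offload only when the input is large. The function name sum_array and the cutoff OFFLOAD_MIN_ELEMENTS are hypothetical. The pragma is guarded by __INTEL_OFFLOAD, which is assumed here to be defined only when the compiler is building with offload support, so that the sketch also builds with other compilers and then simply runs on the CPU.

```c
/* Hypothetical cutoff below which offloading is assumed not to pay off. */
#define OFFLOAD_MIN_ELEMENTS 4096

/* Sums data[0..n-1]; an offload is attempted only for large inputs. */
float sum_array(const float *data, int n)
{
    float total = 0.0f;

#ifdef __INTEL_OFFLOAD  /* assumption: defined only when offload is enabled */
#pragma offload target(mic) if(n >= OFFLOAD_MIN_ELEMENTS) \
        in(n) in(data : length(n)) inout(total)
#endif
    {
        int i;
        for (i = 0; i < n; ++i)
            total += data[i];
    }
    return total;
}
```

When the if expression is false, the block runs on the CPU and the in and inout clauses have no effect, so small inputs avoid the cost of the data transfer.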

signal

An optional integer expression that serves as a handle on an asynchronous data transfer or computational activity. The computation performed by the offload clause and any results returned from the offload using out clauses occur concurrently with CPU execution of the code after the pragma. If this clause is not used, the entire offload and its associated data transfers execute synchronously: the CPU does not continue past the pragma until the offload has completed.

This clause refers to a specific target device so you must specify a target-number in the target clause that is greater than or equal to zero.

wait

An optional integer expression to specify a wait for the completion of a previously initiated asynchronous data transfer or asynchronous computation.

This clause refers to a specific target device so you must specify a target-number in the target clause that is greater than or equal to zero.

Querying a signal before the signal has been initiated results in undefined behavior and a runtime abort of the application. For example, querying a signal on target:0 that was initiated for target:1 results in a runtime abort of the application because the signal was initiated for target:1, so there is no signal associated with target:0.
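The following sketch pairs a signal clause with a later wait clause on the same target. It is an illustration under assumptions, not a definitive usage pattern: the function name, the tag value 1, and the choice of work in each region are hypothetical, and the __INTEL_OFFLOAD guard is assumed to be defined only when offload support is enabled, so other compilers run both blocks on the CPU.

```c
/* Squares a[0..n-1] into b asynchronously, then returns the sum of a. */
float square_async_and_sum(const float *a, float *b, int n)
{
    int tag = 1;          /* hypothetical handle for the asynchronous offload */
    float total = 0.0f;

#ifdef __INTEL_OFFLOAD  /* assumption: defined only when offload is enabled */
#pragma offload target(mic:0) signal(tag) \
        in(n) in(a : length(n)) out(b : length(n))
#endif
    {
        int i;
        for (i = 0; i < n; ++i)
            b[i] = a[i] * a[i];
    }

    /* With signal, the CPU may reach this point while the target is still
       computing, so b must not be read here. */

#ifdef __INTEL_OFFLOAD
#pragma offload target(mic:0) wait(tag) \
        in(n) in(a : length(n)) inout(total)
#endif
    {
        int i;
        for (i = 0; i < n; ++i)
            total += a[i];
    }

    /* The second offload waited on tag and ran synchronously, so the
       contents of b are expected to be complete from here on. */
    return total;
}
```

Both pragmas name target 0 explicitly, because signal and wait require a target-number greater than or equal to zero and the wait must refer to the target on which the signal was initiated.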

mandatory

An optional clause to specify that execution on the target is required. Execution on the CPU is not allowed. If the correct target hardware needed to run the offloaded program is not available on the system, the program fails with an error message.

Note

Do not use this clause and the if-specifier clause in the same directive.

offload-parameter

Is one of the following:

  • in ( variable-ref [, variable-ref ...] [ : modifier [ modifier ...] ] )

  • out ( variable-ref [, variable-ref ...] [ : modifier [ modifier ...] ] )

  • inout ( variable-ref [, variable-ref ...] [ : modifier [ modifier ...] ] )

  • nocopy ( variable-ref [, variable-ref ...] [ : modifier [ modifier ...] ] )

When a program runs in a heterogeneous environment, program variables are copied back and forth between CPU and the target. This clause is a specification for controlling the direction in which variables are copied, and for pointers, the amount of data that is copied.

in

The variables are strictly an input to the target region. Their values are not copied back after the region completes.

out

The variables are strictly an output of the target region. The host CPU does not copy the variables to the target.

inout

The variables are copied from the CPU to the target and back from the target to the CPU.

nocopy

A variable whose value is reused from a previous target execution, or one that is used entirely within the offloaded code section, may be named in a nocopy clause to avoid any copying.
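As a hedged illustration of the copy directions, the sketch below marks read-only inputs with in and the updated array with inout. The function name saxpy is hypothetical, the length modifier is the one described under modifier, and the __INTEL_OFFLOAD guard is assumed to be defined only when offload support is enabled, so other compilers run the block on the CPU.

```c
/* Computes y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(float a, const float *x, float *y, int n)
{
#ifdef __INTEL_OFFLOAD  /* assumption: defined only when offload is enabled */
#pragma offload target(mic) in(a, n) in(x : length(n)) inout(y : length(n))
#endif
    {
        int i;
        for (i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];   /* x is only read; y is read and written */
    }
}
```

Here x travels only from the CPU to the target, while y is copied in both directions because the region both reads and updates it.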

An in or out element-count-expr expression (see description below within modifier) is evaluated at a point in the program before the statement or clause in which it is used.

An array variable whose size is known from the declaration is copied in its entirety. If a subset of an array is to be processed, use a pointer to the starting element of the subset and the element-count-expr to transfer the array subset.

Because a data pointer variable not listed in an in clause is uninitialized within the construct, it must be assigned a value before it can be de-referenced.
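The array-subset technique described above might look like the following sketch, which transfers only count elements starting at index start. The function name scale_subset is hypothetical, and the __INTEL_OFFLOAD guard is assumed to be defined only when offload support is enabled, so other compilers run the block on the CPU.

```c
/* Multiplies array[start .. start+count-1] by factor. */
void scale_subset(float *array, int start, int count, float factor)
{
    float *subset = &array[start];   /* pointer to the first element of the subset */

#ifdef __INTEL_OFFLOAD  /* assumption: defined only when offload is enabled */
#pragma offload target(mic) in(count, factor) inout(subset : length(count))
#endif
    {
        int i;
        for (i = 0; i < count; ++i)
            subset[i] *= factor;
    }
}
```

Elements outside the subset are neither transferred nor modified. The caller is expected to pass a positive count, because a zero or negative element-count-expr results in a runtime error when the offload is performed.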

variable-ref

Is one of the following:

  • a C/C++ identifier.

  • variable-ref.identifier

  • array-slice

array-slice

variable-ref '[' integral-expression [ : integral-expression ] ']'

An array-slice is an array expression that denotes one contiguous set of array elements.

modifier

Is one of the following:

  • length(element-count-expr)

    where element-count-expr is an integral expression, computed at runtime. Use it with:

    • Pointer variables.

      Pointer variable values themselves are never copied across the host/target interface because there is no correspondence between the memory addresses of the host CPU and the target. Instead, objects that a pointer points to are copied to or from the target, and the value of the pointer variable is recreated. By default a single element is copied.

      You can use element-count-expr to specify how many elements of the pointer type should be considered as the data the pointer points to. If the expression value is zero or negative, a runtime error occurs.

    • Variable-length arrays.

      element-count-expr specifies a number of elements copied between the CPU and target.

  • alloc_if( condition ) | free_if( condition )

    where condition is a Boolean expression.

    alloc_if specifies a Boolean condition that controls whether the allocatable variables in the in clause will be allocated a new block of memory on the target when the offload is executed on the target. If the expression evaluates to true, a new memory allocation is performed for each variable listed in the clause. If the condition evaluates to false, the existing allocated values on the target are reused (data persistence). You must ensure that a block of memory of sufficient size has been previously allocated for the variables on the target by using a free_if (0) clause on an earlier offload.

    free_if specifies a Boolean condition that controls whether to deallocate the memory allocated for the allocatable variables in an in clause. If the expression evaluates to true, the memory pointed to by each variable listed in the clause is deallocated. If the condition evaluates to false, no action is taken on the memory pointed to by the variables in the list. A subsequent clause will be able to reuse the allocated memory (data persistence).

    The following are the default settings for the alloc_if and free_if modifiers:

    Clause    alloc_if    free_if
    in        true        true
    inout     true        true
    out       true        true
    nocopy    false       false

    For more information, see Managing Memory Allocation for Pointer Variables.

  • align(expression) where the value of expression should be a power of two. This modifier applies to pointer variables and requests the specified minimum alignment for pointer data allocated on Intel® MIC Architecture.

  • alloc (array-slice) where array-slice specifies a set of elements of the array that need allocation. Data specified by the in/out expression is transferred into the corresponding section of the array allocated on the target. For more information, see Allocating Memory for Parts of Arrays.

  • into (var-exp) where var-exp is a variable expression. The into modifier allows data to be transferred from one variable on the CPU to another on the target, and vice versa. Only one item is allowed in variable-ref when using the into modifier. For more information, see Moving Data from One Variable to Another.
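The data persistence behavior of alloc_if and free_if can be sketched with two offloads that share one target-side copy of a buffer. This is a minimal sketch under assumptions: the function name sums_with_persistence is hypothetical, target 0 is named explicitly on the assumption that both offloads must reach the same device for the data to persist, and the __INTEL_OFFLOAD guard is assumed to be defined only when offload support is enabled, so other compilers run both blocks on the CPU.

```c
/* Computes the sum and the sum of squares of p[0..n-1] in two offloads. */
void sums_with_persistence(float *p, int n, float *sum_out, float *sum_sq_out)
{
    float sum = 0.0f;
    float sum_sq = 0.0f;

    /* First offload: allocate p on the target, copy it in, and keep it. */
#ifdef __INTEL_OFFLOAD  /* assumption: defined only when offload is enabled */
#pragma offload target(mic:0) in(n) \
        in(p : length(n) alloc_if(1) free_if(0)) inout(sum)
#endif
    {
        int i;
        for (i = 0; i < n; ++i)
            sum += p[i];
    }

    /* Second offload: reuse the retained copy without transferring it,
       then free it. */
#ifdef __INTEL_OFFLOAD
#pragma offload target(mic:0) in(n) \
        nocopy(p : length(n) alloc_if(0) free_if(1)) inout(sum_sq)
#endif
    {
        int i;
        for (i = 0; i < n; ++i)
            sum_sq += p[i] * p[i];
    }

    *sum_out = sum;
    *sum_sq_out = sum_sq;
}
```

The free_if(0) on the first offload is what makes the later alloc_if(0) valid, as required by the description of alloc_if above. The final free_if(1) releases the target memory once it is no longer needed.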

Description

offload both transfers data and offloads computation.

You can place the offload pragma before any statement, including a compound statement, an OpenMP* parallel pragma, or a single call statement, to specify remote execution of that compound statement, top-level OpenMP* construct, or call.

Note

Do not use the __MIC__ preprocessor macro inside an offload pragma. You can, however, use it in a subprogram called from the pragma.

Conceptually, this is the sequence of events when this pragma is encountered:

  1. If there is no if clause, go to step 3.

  2. On the host CPU, evaluate the if expression. If it evaluates to true, go to step 3. Otherwise, execute the region on the host CPU and be done.

  3. Attempt to acquire the target. If successful, go to step 4. Otherwise, execute the region on the host CPU and be done.

  4. On the host CPU, compute all alloc_if, free_if, and element-count-expr expressions used in in and out clauses, and element-count-expr expressions used in out clauses.

  5. On the host CPU, gather all variable values that are inputs to the offload.

  6. Send the input values from the host CPU to the target.

  7. On the target, allocate memory for variable-length out variables.

  8. On the target, copy input values into corresponding target variables.

  9. On the target, execute the offloaded region.

  10. On the target, compute all element-count-expr expressions used in out clauses.

  11. On the target, gather all variable values that are outputs of the offload.

  12. Send output values back from the target to the host CPU.

  13. On the host CPU, copy values received into corresponding host CPU variables.
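Steps 1 through 3 of the sequence above amount to a small decision procedure, modeled here in plain C. This is a conceptual sketch only: the type and function names are hypothetical and do not correspond to any runtime interface.

```c
/* Where the offloaded region ends up running (hypothetical model type). */
typedef enum { RUN_ON_HOST, RUN_ON_TARGET } placement;

/* Models steps 1-3: the if clause is checked first, then target acquisition. */
placement decide_placement(int has_if_clause, int if_value, int target_acquired)
{
    /* Steps 1-2: a false if expression keeps the region on the host CPU. */
    if (has_if_clause && !if_value)
        return RUN_ON_HOST;

    /* Step 3: failing to acquire the target also falls back to the host CPU. */
    if (!target_acquired)
        return RUN_ON_HOST;

    /* Steps 4-13 (data transfer and execution) then proceed on the target. */
    return RUN_ON_TARGET;
}
```

The model covers only the sequence as listed, and leaves out the mandatory clause, which disallows execution on the CPU.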

Example

The following example demonstrates how to use a variable-length array to specify a number of elements copied between the CPU and target.

void sample(const int nx)
{
  float temp[nx];
  #pragma offload target(mic) in(temp : length(nx))
  {
    ...
  }
}

The following example demonstrates variable-ref in the in/out clauses:

typedef int ARRAY[10][10]; 
int a[1000][500];
int *p;
ARRAY *q;
int *r[10][10];
int i, j;
struct { int y; } x;
#pragma offload …  in( a )
#pragma offload … out( a[i:j][:] )
#pragma offload …  in( p[0:100] )
#pragma offload …  in( (*q)[5][:] )
#pragma offload …  in( r[5][5][0:2] )
#pragma offload … out( x.y )
