The order in which image or buffer memory is read or written by shaders is largely undefined. For some shader types (vertex, tessellation evaluation, and in some cases, fragment), even the number of shader invocations that may perform loads and stores is undefined.
In particular, the following rules apply:
Note: The above limitations on shader invocation order make some forms of synchronization between shader invocations within a single set of primitives unimplementable. For example, having one invocation poll memory written by another invocation assumes that the other invocation has been launched and will complete its writes in finite time.
Stores issued to different memory locations within a single shader invocation may not be visible to other invocations in the order they were performed. The OpMemoryBarrier instruction can be used to provide stronger ordering of reads and writes performed by a single invocation. OpMemoryBarrier guarantees that any memory transactions issued by the shader invocation prior to the instruction complete prior to the memory transactions issued after the instruction. Memory barriers are needed for algorithms that require multiple invocations to access the same memory and require the operations to be performed in a partially-defined relative order. For example, if one shader invocation does a series of writes, followed by an OpMemoryBarrier instruction, followed by another write, then the results of the series of writes before the barrier become visible to other shader invocations no later than the results of the final write become visible to those invocations. In practice, this means that another invocation that sees the results of the final write also sees the previous writes. Without the memory barrier, the final write may be visible before the previous writes.
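The write, barrier, write pattern described above can be sketched on the CPU. This is an analogy only, not shader code: it assumes that `std::atomic_thread_fence` plays the role of OpMemoryBarrier and that two `std::thread` objects stand in for two shader invocations. The names `Shared`, `writer`, `reader`, and `run_barrier_demo` are hypothetical. The acquire fence on the reading side is a requirement of the C++ memory model and is not something the text above describes.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// CPU-side analogy only (assumption): threads stand in for shader
// invocations and std::atomic_thread_fence stands in for OpMemoryBarrier.
struct Shared {
    int payload[4] = {0, 0, 0, 0};  // target of the "series of writes"
    std::atomic<int> flag{0};       // target of the "final write"
};

void writer(Shared& s) {
    for (int i = 0; i < 4; ++i) {
        s.payload[i] = i + 1;                             // series of writes
    }
    std::atomic_thread_fence(std::memory_order_release);  // the barrier
    s.flag.store(1, std::memory_order_relaxed);           // final write
}

int reader(Shared& s) {
    // Spin until the final write is observed. This is safe here only
    // because CPU threads are guaranteed to make progress.
    while (s.flag.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();
    }
    // Reader-side fence required by the C++ memory model.
    std::atomic_thread_fence(std::memory_order_acquire);
    int sum = 0;
    for (int v : s.payload) {
        sum += v;  // every earlier write is visible at this point
    }
    return sum;
}

// Runs one writer and one reader and returns the sum the reader observed.
int run_barrier_demo() {
    Shared s;
    int sum = 0;
    std::thread t_reader([&] { sum = reader(s); });
    std::thread t_writer([&] { writer(s); });
    t_writer.join();
    t_reader.join();
    return sum;
}
```

Calling `run_barrier_demo()` returns the sum of the four payload values, because a reader that sees the flag also sees every write made before the fence. The spin loop in `reader` is exactly the kind of polling that is not reliable between shader invocations, since nothing guarantees that the writing invocation has been launched.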
The built-in atomic memory transaction instructions can be used to read and write a given memory address atomically. While atomic instructions issued by multiple shader invocations are executed in undefined order relative to each other, each one performs both a read and a write of a memory address and guarantees that no other memory transaction will write to the underlying memory between the read and the write.
Note: Atomics allow shaders to use shared global addresses for mutual exclusion or as counters, among other uses.
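The counter use case can be illustrated with a CPU-side sketch. As before, this is an analogy under stated assumptions rather than shader code: `std::atomic<int>::fetch_add` stands in for an atomic add instruction, `std::thread` objects stand in for shader invocations, and the function names `run_counter_demo` and `slots_are_unique` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// CPU-side analogy only (assumption): threads stand in for shader
// invocations and fetch_add stands in for an atomic add instruction.

// Each worker increments a shared counter; no increment is lost even
// though the workers run in an undefined relative order.
int run_counter_demo(int invocations, int increments_each) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < invocations; ++i) {
        workers.emplace_back([&counter, increments_each] {
            for (int k = 0; k < increments_each; ++k) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}

// Each worker claims one slot index from the value fetch_add returns.
// Because the read and the write are indivisible, no two workers can
// receive the same index.
bool slots_are_unique(int invocations) {
    std::atomic<int> next{0};
    std::vector<int> claimed(invocations, -1);
    std::vector<std::thread> workers;
    for (int i = 0; i < invocations; ++i) {
        workers.emplace_back([&next, &claimed, i] {
            claimed[i] = next.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    std::vector<bool> seen(invocations, false);
    for (int slot : claimed) {
        if (slot < 0 || slot >= invocations || seen[slot]) {
            return false;  // out of range or claimed twice
        }
        seen[slot] = true;
    }
    return true;
}
```

Which worker receives which slot varies from run to run, mirroring the undefined relative order of atomic operations across invocations. Only the final count and the uniqueness of the claimed slots are deterministic.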