Name

    AMD_shader_ballot

Name Strings

    GL_AMD_shader_ballot

Contact

    Qun Lin, AMD (quentin.lin 'at' amd.com)

Contributors

    Qun Lin, AMD
    Graham Sellers, AMD
    Daniel Rakos, AMD
    Rex Xu, AMD
    Dominik Witczak, AMD

Status

    Shipping

Version

    Last Modified Date:         10/19/2016
    Author Revision:            4

Number

    ???

Dependencies

    This extension is written against the OpenGL Shading Language
    Specification, Version 4.50.

    This extension requires ARB_shader_group_vote and ARB_shader_ballot.

    This extension interacts with ARB_gpu_shader_int64.

    This extension interacts with AMD_gpu_shader_half_float.

Overview

    The extensions ARB_shader_group_vote and ARB_shader_ballot introduced the
    concept of sub-groups and a set of operations that allow data exchange
    across shader invocations within a sub-group.

    This extension further extends the capabilities of these extensions with
    additional sub-group operations.

IP Status

    None.

New Procedures and Functions

    None.

New Tokens

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_AMD_shader_ballot : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_AMD_shader_ballot 1

Additions to Chapter 8 of the OpenGL Shading Language (GLSL) Specification,
version 4.50 (Built-in functions)

    Add Section 8.18, Shader Invocation Group Functions

    The <min>, <max>, <add> group invocation functions process values of the
    specified value <v> across all active shader invocations in the sub-group
    with three special group operations according to the following table:

        Group Operation     Description
        ---------------     ---------------------------------------------------------
        Reduce              A reduction operation for values of the specified value
                            <v> in the sub-group.
        InclusiveScan       A binary operation <op> with an identity <I> and <n>
                            (where <n> is the size of the sub-group) elements
                            { a[0], a[1], .., a[n-1] } resulting in
                            { a[0], (a[0] op a[1]), ..,
                              (a[0] op a[1] op .. op a[n-1]) }.
                            <op> could be any of <min>, <max>, <add>.
        ExclusiveScan       A binary operation <op> with an identity <I> and <n>
                            (where <n> is the size of the sub-group) elements
                            { a[0], a[1], .., a[n-1] } resulting in
                            { I, a[0], (a[0] op a[1]), ..,
                              (a[0] op a[1] op .. op a[n-2]) }.
                            <op> could be any of <min>, <max>, <add>.
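
    The three group operations in the table above can be modeled on the CPU.
    The following is a minimal, non-normative sketch in C, assuming a fully
    active sub-group stored as a plain array of 32-bit signed integers. The
    names group_reduce, group_inclusive_scan, group_exclusive_scan, op_add,
    op_min and op_max are hypothetical helpers introduced for illustration
    only; they are not part of this extension. The caller passes the identity
    <I> that matches the operation (0 for add, INT_MAX for min, INT_MIN for
    max).

```c
#include <limits.h>

/* Hypothetical binary operations standing in for <add>, <min>, <max>. */
typedef int (*binary_op)(int, int);

int op_add(int a, int b) { return a + b; }
int op_min(int a, int b) { return a < b ? a : b; }
int op_max(int a, int b) { return a > b ? a : b; }

/* Reduce: every invocation receives a[0] op a[1] op .. op a[n-1]. */
void group_reduce(binary_op op, int identity, const int *a, int *out, int n)
{
    int acc = identity;
    for (int i = 0; i < n; i++)
        acc = op(acc, a[i]);
    for (int i = 0; i < n; i++)
        out[i] = acc;
}

/* InclusiveScan: out[i] = a[0] op a[1] op .. op a[i]. */
void group_inclusive_scan(binary_op op, int identity, const int *a, int *out, int n)
{
    int acc = identity;
    for (int i = 0; i < n; i++) {
        acc = op(acc, a[i]);
        out[i] = acc;
    }
}

/* ExclusiveScan: out[0] = I and out[i] = a[0] op .. op a[i-1] for i > 0. */
void group_exclusive_scan(binary_op op, int identity, const int *a, int *out, int n)
{
    int acc = identity;
    for (int i = 0; i < n; i++) {
        out[i] = acc;          /* value accumulated before this element */
        acc = op(acc, a[i]);
    }
}
```

    For example, with the inputs { 3, 1, 4, 1, 5, 9, 2, 6 } and op_add with
    identity 0, the inclusive scan yields { 3, 4, 8, 9, 14, 23, 25, 31 }, the
    exclusive scan yields { 0, 3, 4, 8, 9, 14, 23, 25 }, and the reduction
    yields 31 for every element. How inactive invocations are skipped on real
    hardware is not modeled here.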

    The identity <I> in the group operations <InclusiveScan> and
    <ExclusiveScan> is decided according to the following table:

        Function   Data Type                              Identity
        --------   -----------------------------------    ----------
        Min        32-bit signed integer                  INT_MAX
                   64-bit signed integer                  INT64_MAX
                   32-bit unsigned integer                UINT_MAX
                   64-bit unsigned integer                UINT64_MAX
                   16-bit/32-bit/64-bit floating-point    +INF
        Max        32-bit signed integer                  INT_MIN
                   64-bit signed integer                  INT64_MIN
                   32-bit/64-bit unsigned integer         0
                   16-bit/32-bit/64-bit floating-point    -INF
        Add        32-bit/64-bit signed integer           0
                   32-bit/64-bit unsigned integer         0
                   16-bit/32-bit/64-bit floating-point    0

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsAMD(genType v)                 | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsAMD(genIType v)               | invocations in the sub-group with <Reduce> group          |
    | genUType minInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType minInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsNonUniformAMD(genType v)       | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsNonUniformAMD(genIType v)     | invocations in the sub-group with <Reduce> group          |
    | genUType minInvocationsNonUniformAMD(genUType v)     | operation. These functions could be used in non-uniform   |
    | genDType minInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsInclusiveScanAMD(genType v)    | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsInclusiveScanAMD(genIType v)  | invocations in the sub-group with <InclusiveScan> group   |
    | genUType minInvocationsInclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType minInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsInclusiveScanNonUniformAMD(    | Returns the minimum value of <v> across all active shader |
    |     genType v)                                       | invocations in the sub-group with <InclusiveScan> group   |
    | genIType minInvocationsInclusiveScanNonUniformAMD(   | operation. These functions could be used in non-uniform   |
    |     genIType v)                                      | control flow. These functions operate component-wise.     |
    | genUType minInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genUType v)                                      |                                                           |
    | genDType minInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsExclusiveScanAMD(genType v)    | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsExclusiveScanAMD(genIType v)  | invocations in the sub-group with <ExclusiveScan> group   |
    | genUType minInvocationsExclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType minInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsExclusiveScanNonUniformAMD(    | Returns the minimum value of <v> across all active shader |
    |     genType v)                                       | invocations in the sub-group with <ExclusiveScan> group   |
    | genIType minInvocationsExclusiveScanNonUniformAMD(   | operation. These functions could be used in non-uniform   |
    |     genIType v)                                      | control flow. These functions operate component-wise.     |
    | genUType minInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genUType v)                                      |                                                           |
    | genDType minInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsAMD(genType v)                 | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsAMD(genIType v)               | invocations in the sub-group with <Reduce> group          |
    | genUType maxInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType maxInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsNonUniformAMD(genType v)       | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsNonUniformAMD(genIType v)     | invocations in the sub-group with <Reduce> group          |
    | genUType maxInvocationsNonUniformAMD(genUType v)     | operation. These functions could be used in non-uniform   |
    | genDType maxInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsInclusiveScanAMD(genType v)    | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsInclusiveScanAMD(genIType v)  | invocations in the sub-group with <InclusiveScan> group   |
    | genUType maxInvocationsInclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType maxInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsInclusiveScanNonUniformAMD(    | Returns the maximum value of <v> across all active shader |
    |     genType v)                                       | invocations in the sub-group with <InclusiveScan> group   |
    | genIType maxInvocationsInclusiveScanNonUniformAMD(   | operation. These functions could be used in non-uniform   |
    |     genIType v)                                      | control flow. These functions operate component-wise.     |
    | genUType maxInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genUType v)                                      |                                                           |
    | genDType maxInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsExclusiveScanAMD(genType v)    | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsExclusiveScanAMD(genIType v)  | invocations in the sub-group with <ExclusiveScan> group   |
    | genUType maxInvocationsExclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType maxInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsExclusiveScanNonUniformAMD(    | Returns the maximum value of <v> across all active shader |
    |     genType v)                                       | invocations in the sub-group with <ExclusiveScan> group   |
    | genIType maxInvocationsExclusiveScanNonUniformAMD(   | operation. These functions could be used in non-uniform   |
    |     genIType v)                                      | control flow. These functions operate component-wise.     |
    | genUType maxInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genUType v)                                      |                                                           |
    | genDType maxInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsAMD(genType v)                 | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsAMD(genIType v)               | shader invocations in the sub-group with <Reduce> group   |
    | genUType addInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType addInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsNonUniformAMD(genType v)       | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsNonUniformAMD(genIType v)     | shader invocations in the sub-group with <Reduce> group   |
    | genUType addInvocationsNonUniformAMD(genUType v)     | operation. These functions could be used in non-uniform   |
    | genDType addInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsInclusiveScanAMD(genType v)    | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsInclusiveScanAMD(genIType v)  | shader invocations in the sub-group with <InclusiveScan>  |
    | genUType addInvocationsInclusiveScanAMD(genUType v)  | group operation. These functions must be used in uniform  |
    | genDType addInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsInclusiveScanNonUniformAMD(    | Returns the sum of the value of <v> across all active     |
    |     genType v)                                       | shader invocations in the sub-group with <InclusiveScan>  |
    | genIType addInvocationsInclusiveScanNonUniformAMD(   | group operation. These functions could be used in         |
    |     genIType v)                                      | non-uniform control flow. These functions operate         |
    | genUType addInvocationsInclusiveScanNonUniformAMD(   | component-wise.                                           |
    |     genUType v)                                      |                                                           |
    | genDType addInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsExclusiveScanAMD(genType v)    | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsExclusiveScanAMD(genIType v)  | shader invocations in the sub-group with <ExclusiveScan>  |
    | genUType addInvocationsExclusiveScanAMD(genUType v)  | group operation. These functions must be used in uniform  |
    | genDType addInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsExclusiveScanNonUniformAMD(    | Returns the sum of the value of <v> across all active     |
    |     genType v)                                       | shader invocations in the sub-group with <ExclusiveScan>  |
    | genIType addInvocationsExclusiveScanNonUniformAMD(   | group operation. These functions could be used in         |
    |     genIType v)                                      | non-uniform control flow. These functions operate         |
    | genUType addInvocationsExclusiveScanNonUniformAMD(   | component-wise.                                           |
    |     genUType v)                                      |                                                           |
    | genDType addInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType swizzleInvocationsAMD(                       | Swizzles data within a group of 4 consecutive invocations |
    |     genType data, uvec4 offset)                      | of the sub-group based on <offset> as described below:    |
    | genIType swizzleInvocationsAMD(                      |                                                           |
    |     genIType data, uvec4 offset)                     |   for (i = 0; i < gl_SubGroupSizeARB; i+=4) {             |
    | genUType swizzleInvocationsAMD(                      |     dataOut[i+0] = isActive[i+offset.x] ?                 |
    |     genUType data, uvec4 offset)                     |                    dataIn[i+offset.x] : 0;                |
    |                                                      |     dataOut[i+1] = isActive[i+offset.y] ?                 |
    |                                                      |                    dataIn[i+offset.y] : 0;                |
    |                                                      |     dataOut[i+2] = isActive[i+offset.z] ?                 |
    |                                                      |                    dataIn[i+offset.z] : 0;                |
    |                                                      |     dataOut[i+3] = isActive[i+offset.w] ?                 |
    |                                                      |                    dataIn[i+offset.w] : 0;                |
    |                                                      |   }                                                       |
    |                                                      |                                                           |
    |                                                      | Where:                                                    |
    |                                                      | - isActive[i] tells whether the invocation with the index |
    |                                                      |   <i> is currently active in the sub-group.               |
    |                                                      | - dataIn[i] is the value of <data> for invocation index   |
    |                                                      |   <i>.                                                    |
    |                                                      | - dataOut[i] is the return value of the function for      |
    |                                                      |   invocation index <i>.                                   |
    |                                                      |                                                           |
    |                                                      | Components of <offset> must be constant integer           |
    |                                                      | expressions with a value in the range [0, 3].             |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType swizzleInvocationsMaskedAMD(                 | Swizzles data within a group of 32 consecutive            |
    |     genType data, uvec3 mask)                        | invocations with a limited mask as described below:       |
    | genIType swizzleInvocationsMaskedAMD(                |                                                           |
    |     genIType data, uvec3 mask)                       |   for (i = 0; i < gl_SubGroupSizeARB; i++) {              |
    | genUType swizzleInvocationsMaskedAMD(                |     j = (((i & 0x1f) & mask.x) | mask.y) ^ mask.z;        |
    |     genUType data, uvec3 mask)                       |     j |= (i & 0x20); // which group of 32                 |
    |                                                      |     dataOut[i] = isActive[j] ? dataIn[j] : 0;             |
    |                                                      |   }                                                       |
    |                                                      |                                                           |
    |                                                      | Where:                                                    |
    |                                                      | - isActive[i] tells whether the invocation with the index |
    |                                                      |   <i> is currently active in the sub-group.               |
    |                                                      | - dataIn[i] is the value of <data> for invocation index   |
    |                                                      |   <i>.                                                    |
    |                                                      | - dataOut[i] is the return value of the function for      |
    |                                                      |   invocation index <i>.                                   |
    |                                                      |                                                           |
    |                                                      | Components of <mask> must be constant integer expressions |
    |                                                      | with a value in the range [0, 31].                        |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType writeInvocationAMD(                          | Returns <inputValue> for all active invocations in the    |
    |     genType inputValue,                              | sub-group except for the invocation whose invocation      |
    |     genType writeValue,                              | index within the sub-group is <invocationIndex> for which |
    |     uint invocationIndex)                            | <writeValue> is returned as described below:              |
    | genIType writeInvocationAMD(                         |                                                           |
    |     genIType inputValue,                             |   for (i = 0; i < gl_SubGroupSizeARB; i++) {              |
    |     genIType writeValue,                             |     out[i] = (i == invocationIndex) ?                     |
    |     uint invocationIndex)                            |              writeValue : inputValue;                     |
    | genUType writeInvocationAMD(                         |   }                                                       |
    |     genUType inputValue,                             |                                                           |
    |     genUType writeValue,                             | Where out[i] is the return value of the function for      |
    |     uint invocationIndex)                            | invocation index <i>.                                     |
    |                                                      |                                                           |
    |                                                      | <writeValue> and <invocationIndex> must be dynamically    |
    |                                                      | uniform within the sub-group, otherwise the return value  |
    |                                                      | of the function is undefined.                             |
    +------------------------------------------------------+-----------------------------------------------------------+

Dependencies on ARB_gpu_shader_int64

    If the shader enables ARB_gpu_shader_int64, this extension adds additional
    shader invocation group functions.

    Add Section 8.18, Shader Invocation Group Functions

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsAMD(genI64Type v)           | Returns the minimum value of <v> across all active shader |
    | genU64Type minInvocationsAMD(genU64Type v)           | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsNonUniformAMD(genI64Type v) | Returns the minimum value of <v> across all active shader |
    | genU64Type minInvocationsNonUniformAMD(genU64Type v) | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsInclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type minInvocationsInclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsInclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type minInvocationsInclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsExclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type minInvocationsExclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsExclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type minInvocationsExclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsAMD(genI64Type v)           | Returns the maximum value of <v> across all active shader |
    | genU64Type maxInvocationsAMD(genU64Type v)           | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsNonUniformAMD(genI64Type v) | Returns the maximum value of <v> across all active shader |
    | genU64Type maxInvocationsNonUniformAMD(genU64Type v) | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsInclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type maxInvocationsInclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsInclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type maxInvocationsInclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsExclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type maxInvocationsExclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsExclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type maxInvocationsExclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsAMD(genI64Type v)           | Returns the sum of the value of <v> across all active     |
    | genU64Type addInvocationsAMD(genU64Type v)           | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsNonUniformAMD(genI64Type v) | Returns the sum of the value of <v> across all active     |
    | genU64Type addInvocationsNonUniformAMD(genU64Type v) | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsInclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genI64Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    | genU64Type addInvocationsInclusiveScanAMD(           | group operation. These functions must be used in uniform  |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsInclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genI64Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    | genU64Type addInvocationsInclusiveScanNonUniformAMD( | group operation. These functions could be used in         |
    |     genU64Type v)                                    | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsExclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genI64Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    | genU64Type addInvocationsExclusiveScanAMD(           | group operation. These functions must be used in uniform  |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsExclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genI64Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    | genU64Type addInvocationsExclusiveScanNonUniformAMD( | group operation. These functions could be used in         |
    |     genU64Type v)                                    | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | uint mbcntAMD(uint64_t mask)                         | Returns the bit count of gl_SubGroupLtMaskARB with        |
    |                                                      | <mask> as described below:                                |
    |                                                      |                                                           |
    |                                                      |   bitCount(gl_SubGroupLtMaskARB & mask).                  |
    +------------------------------------------------------+-----------------------------------------------------------+

Dependencies on AMD_gpu_shader_half_float

    If the shader enables AMD_gpu_shader_half_float, this extension adds
    additional shader invocation group functions.

    Add Section 8.18, Shader Invocation Group Functions

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsAMD(genF16Type v)           | Returns the minimum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsNonUniformAMD(genF16Type v) | Returns the minimum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsInclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsInclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsExclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsExclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsAMD(genF16Type v)           | Returns the maximum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsNonUniformAMD(genF16Type v) | Returns the maximum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsInclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsInclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsExclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsExclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsAMD(genF16Type v)           | Returns the sum of the value of <v> across all active     |
    |                                                      | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsNonUniformAMD(genF16Type v) | Returns the sum of the value of <v> across all active     |
    |                                                      | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsInclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genF16Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    |                                                      | group operation. These functions must be used in uniform  |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsInclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genF16Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    |                                                      | group operation. These functions could be used in         |
    |                                                      | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsExclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genF16Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    |                                                      | group operation. These functions must be used in uniform  |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsExclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genF16Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    |                                                      | group operation. These functions could be used in         |
    |                                                      | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Errors

    None.

Issues

Revision History

    Rev.  Date        Author    Changes
    ----  ----------  --------  --------------------------------------------------
     4    10/19/2016  rexu      Add interactions with ARB_gpu_shader_int64 and
                                AMD_gpu_shader_half_float. New group invocation
                                functions are added to support 64-bit integer
                                type, 16-bit/64-bit floating-point type, and the
                                <InclusiveScan> and <ExclusiveScan> group
                                operations. Clarify that <mask> in
                                swizzleInvocationsMaskedAMD() should be constant
                                integer expression with a value in the range
                                [0, 31].

     3    08/16/2016  rexu      Clarify that minInvocationsAMD, maxInvocationsAMD,
                                addInvocationsAMD, along with their non-uniform
                                versions, operate component-wise rather than on
                                vector.

     2    08/11/2016  rexu      Add non-uniform versions of minInvocationsAMD,
                                maxInvocationsAMD, and addInvocationsAMD. Support
                                those operations in non-uniform control flow.

     1    04/21/2016  qlin      Internal revisions.
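
    As a non-normative illustration, the loops given in Section 8.18 for
    swizzleInvocationsAMD and swizzleInvocationsMaskedAMD can be exercised on
    the CPU. The sketch below is written in C and makes several assumptions
    that are not part of this extension: a sub-group size of 64 (the constant
    SUBGROUP_SIZE stands in for gl_SubGroupSizeARB), 32-bit unsigned data, and
    the hypothetical helper names swizzle_invocations and
    swizzle_invocations_masked. The offset and mask vectors are passed as
    small arrays in x, y, z, w order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stands in for gl_SubGroupSizeARB; 64 lets the 0x20 bit matter. */
#define SUBGROUP_SIZE 64

/* Model of swizzleInvocationsAMD: within each group of 4 consecutive
 * invocations, lane k reads from lane offset[k] (each offset in [0, 3]).
 * Inactive source invocations contribute 0. */
void swizzle_invocations(const uint32_t *data_in, const bool *is_active,
                         const unsigned offset[4], uint32_t *data_out)
{
    for (unsigned i = 0; i < SUBGROUP_SIZE; i += 4) {
        for (unsigned k = 0; k < 4; k++) {
            unsigned src = i + offset[k];
            data_out[i + k] = is_active[src] ? data_in[src] : 0;
        }
    }
}

/* Model of swizzleInvocationsMaskedAMD: mask[0], mask[1], mask[2] correspond
 * to mask.x (AND), mask.y (OR) and mask.z (XOR), each in [0, 31]. */
void swizzle_invocations_masked(const uint32_t *data_in, const bool *is_active,
                                const unsigned mask[3], uint32_t *data_out)
{
    for (unsigned i = 0; i < SUBGROUP_SIZE; i++) {
        unsigned j = (((i & 0x1f) & mask[0]) | mask[1]) ^ mask[2];
        j |= (i & 0x20); /* stay inside the same group of 32 */
        data_out[i] = is_active[j] ? data_in[j] : 0;
    }
}
```

    Under these assumptions, an offset of { 1, 0, 3, 2 } swaps neighbouring
    invocations within each group of 4, a mask of { 0x1f, 0, 1 } does the same
    through the masked variant, and a mask of { 0, 0, 0 } broadcasts the first
    invocation of each group of 32.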