NIR Texture Instructions

Even though texture instructions could be supported as intrinsics, the vast number of combinations makes doing so impractical. Instead, NIR has a dedicated texture instruction. There are several texture operations:

enum nir_texop

Texture instruction opcode

enumerator nir_texop_tex

Regular texture look-up

enumerator nir_texop_txb

Texture look-up with LOD bias

enumerator nir_texop_txl

Texture look-up with explicit LOD

enumerator nir_texop_txd

Texture look-up with partial derivatives

enumerator nir_texop_txf

Texel fetch with explicit LOD

enumerator nir_texop_txf_ms

Multisample texture fetch

enumerator nir_texop_txf_ms_fb

Multisample texture fetch from framebuffer

enumerator nir_texop_txf_ms_mcs_intel

Multisample compression value fetch

enumerator nir_texop_txs

Texture size

enumerator nir_texop_lod

Texture LOD query

enumerator nir_texop_tg4

Texture gather

enumerator nir_texop_query_levels

Texture levels query

enumerator nir_texop_texture_samples

Texture samples query

enumerator nir_texop_samples_identical

Query whether all samples are definitely identical.

enumerator nir_texop_tex_prefetch

Regular texture look-up, eligible for pre-dispatch

enumerator nir_texop_fragment_fetch_amd

Multisample fragment color texture fetch

enumerator nir_texop_fragment_mask_fetch_amd

Multisample fragment mask texture fetch

enumerator nir_texop_descriptor_amd

Returns a buffer or image descriptor.

enumerator nir_texop_sampler_descriptor_amd

Returns a sampler descriptor.

enumerator nir_texop_lod_bias_agx

Returns the sampler’s LOD bias

enumerator nir_texop_hdr_dim_nv

Maps to TXQ.DIMENSION

enumerator nir_texop_tex_type_nv

Maps to TXQ.TEXTURE_TYPE

As with other instruction types, there is still an array of sources, except that each source also has a type associated with it. There are various source types, each corresponding to a piece of information that the different texture operations require.

enum nir_tex_src_type

Texture instruction source type

enumerator nir_tex_src_coord

Texture coordinate

Must have nir_tex_instr.coord_components components.

enumerator nir_tex_src_projector

Projector

The texture coordinate (except for the array component, if any) is divided by this value before LOD computation and sampling.

Must be a float scalar.
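
The division described for the projector can be sketched in plain C. This is an illustration only, not NIR builder code: the helper name and the flat float-array layout are hypothetical, and the array layer is assumed to be the last coordinate component.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of NIR: divides every coordinate
 * component by the projector, skipping the array layer, which is
 * assumed here to be the last component when is_array is set. */
static void
sketch_apply_projector(float *coord, unsigned coord_components,
                       bool is_array, float projector)
{
   assert(coord_components >= 1 && coord_components <= 4);

   unsigned n = is_array ? coord_components - 1 : coord_components;
   for (unsigned i = 0; i < n; i++)
      coord[i] /= projector;
}
```

With a 2D array coordinate (s, t, layer), only s and t are divided; the layer passes through unchanged.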

enumerator nir_tex_src_comparator

Shadow comparator

For shadow sampling, the fetched texel values are compared against the shadow comparator using the compare op specified by the sampler object and converted to 1.0 if the comparison succeeds and 0.0 if it fails. Interpolation happens after this conversion so the actual result may be anywhere in the range [0.0, 1.0].

Only valid if nir_tex_instr.is_shadow and must be a float scalar.
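
To make the "compare first, interpolate afterwards" ordering concrete, here is a minimal sketch in plain C. It assumes a "less or equal" compare op and a bilinear filter over four already-fetched texels; both choices, and all function names, are assumptions made for the example rather than anything NIR prescribes.

```c
#include <assert.h>

/* Stand-in for the sampler's compare op; only "less or equal" is
 * modelled here, as an assumption. */
static float
sketch_compare_lequal(float comparator, float texel)
{
   return comparator <= texel ? 1.0f : 0.0f;
}

/* Bilinear filter over four already-fetched depth texels, ordered
 * {top-left, top-right, bottom-left, bottom-right}. fx and fy are the
 * fractional filter weights in [0, 1]. */
static float
sketch_shadow_bilinear(const float texels[4], float comparator,
                       float fx, float fy)
{
   assert(fx >= 0.0f && fx <= 1.0f && fy >= 0.0f && fy <= 1.0f);

   /* Convert each texel to 1.0 or 0.0 before any interpolation. */
   float r[4];
   for (int i = 0; i < 4; i++)
      r[i] = sketch_compare_lequal(comparator, texels[i]);

   float top = r[0] + (r[1] - r[0]) * fx;
   float bottom = r[2] + (r[3] - r[2]) * fx;
   return top + (bottom - top) * fy;
}
```

When only some of the four comparisons pass, the filtered result lands strictly between 0.0 and 1.0, which is why the instruction's result is not limited to the two extremes.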

enumerator nir_tex_src_offset

Coordinate offset

An integer value that is added to the texel address before sampling. This is only allowed with operations that take an explicit LOD as it is applied in integer texel space after LOD selection and not normalized coordinate space.

enumerator nir_tex_src_bias

LOD bias

This value is added to the computed LOD before mip-mapping.

enumerator nir_tex_src_lod

Explicit LOD

enumerator nir_tex_src_min_lod

Min LOD

The computed LOD is clamped to be at least as large as min_lod before mip-mapping.

enumerator nir_tex_src_ms_index

MSAA sample index

enumerator nir_tex_src_ms_mcs_intel

Intel-specific MSAA compression data

enumerator nir_tex_src_ddx

Explicit horizontal (X-major) coordinate derivative

enumerator nir_tex_src_ddy

Explicit vertical (Y-major) coordinate derivative

enumerator nir_tex_src_texture_deref

Texture variable dereference

enumerator nir_tex_src_sampler_deref

Sampler variable dereference

enumerator nir_tex_src_texture_offset

Texture index offset

This is added to nir_tex_instr.texture_index. Unless nir_tex_instr.texture_non_uniform is set, this is guaranteed to be dynamically uniform.

enumerator nir_tex_src_sampler_offset

Dynamically uniform sampler index offset

This is added to nir_tex_instr.sampler_index. Unless nir_tex_instr.sampler_non_uniform is set, this is guaranteed to be dynamically uniform. This should not be present until GLSL ES 3.20, GLSL 4.00, or ARB_gpu_shader5, because the ES 3.10 and GL 3.30 specifications said “When aggregated into arrays within a shader, samplers can only be indexed with a constant integral expression.”

enumerator nir_tex_src_texture_handle

Bindless texture handle

This is, unfortunately, a bit overloaded at the moment. There are generally two types of bindless handles:

  1. GL_ARB_bindless_texture handles. These are part of the GL/Gallium-level API and are always a 64-bit integer.

  2. HW-specific handles. GL_ARB_bindless_texture handles may be lowered to these. Also, these are used by many Vulkan drivers to implement descriptor sets, especially for UPDATE_AFTER_BIND descriptors. The details of hardware handles (bit size, format, etc.) are HW-specific.

Because of this overloading and the resulting ambiguity, we currently don’t validate anything for these.

enumerator nir_tex_src_sampler_handle

Bindless sampler handle

See nir_tex_src_texture_handle.

enumerator nir_tex_src_plane

Plane index for multi-plane YCbCr textures

enumerator nir_tex_src_backend1

Backend-specific vec4 tex src argument.

Can be used to have NIR optimization (copy propagation, lower_vec_to_regs) apply to the packing of the tex srcs. This lowering must only happen after nir_lower_tex().

The nir_tex_instr_src_type() of this argument is float, so no lowering will happen if nir_lower_int_to_float is used.

enumerator nir_tex_src_backend2

Second backend-specific vec4 tex src argument, see nir_tex_src_backend1.

Of particular interest are the texture/sampler deref/index/handle source types. First, note that textures and samplers are specified separately in NIR. While not required for OpenGL, this is required for Vulkan and OpenCL. Some OpenGL [ES] drivers have to deal with hardware that does not have separate samplers and textures. While not recommended, an OpenGL-only driver may assume that the texture and sampler derefs will always point to the same resource, if needed. Note that this pretty well paints your compiler into a corner and makes any future port to Vulkan or OpenCL harder, so such assumptions should really only be made if targeting OpenGL ES 2.0 era hardware.

Also, like a lot of other resources, there are multiple ways to represent a texture in NIR. It can be referenced by a variable dereference, an index, or a bindless handle. When using an index or a bindless handle, the texture type information is generally not available. To handle this, various information from the type is redundantly stored in the nir_tex_instr itself.

struct nir_tex_instr

Represents a texture instruction

nir_instr instr

Base instruction

enum glsl_sampler_dim sampler_dim

Dimensionality of the texture operation

This will typically match the dimensionality of the texture deref type if a nir_tex_src_texture_deref is present. However, it may not if texture lowering has occurred.

nir_alu_type dest_type

ALU type of the destination

This is the canonical sampled type for this texture operation and may not exactly match the sampled type of the deref type when a nir_tex_src_texture_deref is present. For OpenCL, the sampled type of the texture deref will be GLSL_TYPE_VOID and this is allowed to be anything. With SPIR-V, the signedness of integer types is allowed to differ. For all APIs, the bit size may differ if the driver has done any sort of mediump or similar lowering since texture types always have 32-bit sampled types.

nir_texop op

Texture opcode

nir_def def

Destination

nir_tex_src *src

Array of sources

This array has nir_tex_instr.num_srcs elements

unsigned int num_srcs

Number of sources

unsigned int coord_components

Number of components in the coordinate, if any

bool is_array

True if the texture instruction acts on an array texture

bool is_shadow

True if the texture instruction performs a shadow comparison

If this is true, the texture instruction must have a nir_tex_src_comparator.

bool is_new_style_shadow

If is_shadow is true, whether this is the old-style shadow that outputs 4 components or the new-style shadow that outputs 1 component.

bool is_sparse

True if this texture instruction should return a sparse residency code. The code is in the last component of the result.

unsigned int component

nir_texop_tg4 component selector

This determines which RGBA component is gathered.

unsigned int array_is_lowered_cube

Validation needs to know this for gradient component count

unsigned int is_gather_implicit_lod

True if this tg4 instruction has an implicit LOD or LOD bias, instead of using level 0

int8_t tg4_offsets[4][2]

Gather offsets

bool texture_non_uniform

True if the texture index or handle is not dynamically uniform

bool sampler_non_uniform

True if the sampler index or handle is not dynamically uniform.

This may be set when VK_EXT_descriptor_indexing is supported and the appropriate capability is enabled.

This should always be false in GLSL (GLSL ES 3.20 says “When aggregated into arrays within a shader, opaque types can only be indexed with a dynamically uniform integral expression”, and GLSL 4.60 says “When aggregated into arrays within a shader, [texture, sampler, and samplerShadow] types can only be indexed with a dynamically uniform expression, or texture lookup will result in undefined values.”).

unsigned int texture_index

The texture index

If this texture instruction has a nir_tex_src_texture_offset source, then the texture index is given by texture_index + texture_offset.

unsigned int sampler_index

The sampler index

The following operations do not require a sampler and, as such, this field should be ignored:

  • nir_texop_txf

  • nir_texop_txf_ms

  • nir_texop_txs

  • nir_texop_query_levels

  • nir_texop_texture_samples

  • nir_texop_samples_identical

If this texture instruction has a nir_tex_src_sampler_offset source, then the sampler index is given by sampler_index + sampler_offset.

struct nir_tex_src

A texture instruction source

nir_src src

Base source

nir_tex_src_type src_type

Type of this source

Texture instruction helpers

There are a number of helper functions for working with NIR texture instructions. They are documented here in no particular order.

nir_tex_instr *nir_tex_instr_create(nir_shader *shader, unsigned int num_srcs)

Creates a NIR texture instruction

bool nir_tex_instr_need_sampler(const nir_tex_instr *instr)

Returns true if the texture operation requires a sampler as a general rule

Note that the specific hw/driver backend could still require a sampler object/configuration packet in any case, for some other reason.

See also nir_tex_instr.sampler_index.
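
As a rough model of this check, the sketch below encodes only the sampler-less operations listed under nir_tex_instr.sampler_index. The enum is a reduced stand-in for nir_texop, and the function is hypothetical; the real helper may treat additional (for example vendor-specific) opcodes as sampler-less too.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced stand-in for nir_texop, for illustration only. */
typedef enum {
   SKETCH_TEXOP_TEX,
   SKETCH_TEXOP_TXB,
   SKETCH_TEXOP_TXL,
   SKETCH_TEXOP_TXD,
   SKETCH_TEXOP_TXF,
   SKETCH_TEXOP_TXF_MS,
   SKETCH_TEXOP_TXS,
   SKETCH_TEXOP_QUERY_LEVELS,
   SKETCH_TEXOP_TEXTURE_SAMPLES,
   SKETCH_TEXOP_SAMPLES_IDENTICAL,
   SKETCH_TEXOP_COUNT,
} sketch_texop;

/* Returns false for the operations documented as not requiring a
 * sampler, and true for everything else. */
static bool
sketch_need_sampler(sketch_texop op)
{
   assert(op < SKETCH_TEXOP_COUNT);

   switch (op) {
   case SKETCH_TEXOP_TXF:
   case SKETCH_TEXOP_TXF_MS:
   case SKETCH_TEXOP_TXS:
   case SKETCH_TEXOP_QUERY_LEVELS:
   case SKETCH_TEXOP_TEXTURE_SAMPLES:
   case SKETCH_TEXOP_SAMPLES_IDENTICAL:
      return false;
   default:
      return true;
   }
}
```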

unsigned int nir_tex_instr_result_size(const nir_tex_instr *instr)

Returns the number of components returned by this nir_tex_instr

Useful for code building texture instructions when you don’t want to think about how many components a particular texture op returns. This does not include the sparse residency code.

static inline unsigned int nir_tex_instr_dest_size(const nir_tex_instr *instr)

Returns the destination size of this nir_tex_instr including the sparse residency code, if any.
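
The relationship between the result size and the destination size can be sketched as follows. The struct is a hypothetical two-field stand-in for nir_tex_instr, and result_size stands in for whatever nir_tex_instr_result_size() would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in carrying only the fields this sketch needs. */
typedef struct {
   unsigned result_size; /* components returned by the texture op */
   bool is_sparse;       /* true if a residency code is appended */
} sketch_tex_info;

/* Destination size: the result components plus one extra component
 * for the sparse residency code, if any. */
static unsigned
sketch_dest_size(const sketch_tex_info *tex)
{
   assert(tex != NULL);
   return tex->result_size + (tex->is_sparse ? 1u : 0u);
}

/* Index of the residency code (the last destination component), or -1
 * if the instruction is not sparse. */
static int
sketch_residency_component(const sketch_tex_info *tex)
{
   assert(tex != NULL);
   return tex->is_sparse ? (int)tex->result_size : -1;
}
```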

bool nir_tex_instr_is_query(const nir_tex_instr *instr)

Returns true if this texture operation queries something about the texture rather than actually sampling it.

bool nir_tex_instr_has_implicit_derivative(const nir_tex_instr *instr)

Returns true if this texture instruction does implicit derivatives

This is important as there are extra control-flow rules around derivatives and texture instructions which perform them implicitly.

nir_alu_type nir_tex_instr_src_type(const nir_tex_instr *instr, unsigned int src)

Returns the ALU type of the given texture instruction source

unsigned int nir_tex_instr_src_size(const nir_tex_instr *instr, unsigned int src)

Returns the number of components required by the given texture instruction source

static inline int nir_tex_instr_src_index(const nir_tex_instr *instr, nir_tex_src_type type)

Returns the index of the texture instruction source with the given nir_tex_src_type or -1 if no such source exists.
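
The lookup amounts to a linear search over the typed source array. The sketch below models that with reduced stand-in types (none of the sketch_* names exist in NIR); it is meant to show the shape of the search and the -1 convention, not the actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the NIR types, for illustration only. */
typedef enum {
   SKETCH_TEX_SRC_COORD,
   SKETCH_TEX_SRC_LOD,
   SKETCH_TEX_SRC_OFFSET,
} sketch_tex_src_type;

typedef struct {
   sketch_tex_src_type src_type;
   int value; /* placeholder for the nir_src payload */
} sketch_tex_src;

typedef struct {
   const sketch_tex_src *src;
   unsigned num_srcs;
} sketch_tex_instr;

/* Linear search over the typed sources; returns -1 when no source of
 * the requested type is present. */
static int
sketch_tex_instr_src_index(const sketch_tex_instr *instr,
                           sketch_tex_src_type type)
{
   assert(instr != NULL);

   for (unsigned i = 0; i < instr->num_srcs; i++) {
      if (instr->src[i].src_type == type)
         return (int)i;
   }
   return -1;
}
```

Callers would typically check for a negative return value before indexing into the source array, since most source types are optional.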

void nir_tex_instr_add_src(nir_tex_instr *tex, nir_tex_src_type src_type, nir_def *src)

Adds a source to a texture instruction

void nir_tex_instr_remove_src(nir_tex_instr *tex, unsigned int src_idx)

Removes a source from a texture instruction

Texture instruction lowering

Because most hardware only supports some subset of all possible GLSL/SPIR-V texture operations, NIR provides a quite powerful lowering pass which is able to implement more complex texture operations in terms of simpler ones.

bool nir_lower_tex(nir_shader *shader, const nir_lower_tex_options *options)

Lowers complex texture instructions to simpler ones

struct nir_lower_tex_options

unsigned int lower_txp

Bitmask of (1 << GLSL_SAMPLER_DIM_x) values that controls for which sampler types a texture projector is lowered.

bool lower_txp_array

If true, lower texture projector for any array sampler dims

bool lower_txf_offset

If true, lower away nir_tex_src_offset for all texelfetch instructions.

bool lower_rect_offset

If true, lower away nir_tex_src_offset for all rect textures.

nir_instr_filter_cb lower_offset_filter

If not NULL, this filter will return true for tex instructions that should lower away nir_tex_src_offset.

bool lower_rect

If true, lower rect textures to 2D, using txs to fetch the texture dimensions and dividing the texture coords by the texture dims to normalize.

bool lower_1d

If true, lower 1D textures to 2D. This requires the GL/VK driver to map 1D textures to 2D textures with height=1.

lower_1d_shadow does this lowering for shadow textures only.

unsigned int lower_y_uv_external

If true, convert YUV to RGB.

unsigned int saturate_s

To emulate certain texture wrap modes, this can be used to saturate the specified tex coord to [0.0, 1.0]. The bits are indexed by sampler number, i.e., if, for example:

(conf->saturate_s & (1 << n))

is true, then the s coord for sampler n is saturated.

Note that clamping must happen after projector lowering so any projected texture sample instruction with a clamped coordinate gets automatically lowered, regardless of the ‘lower_txp’ setting.
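
A small plain-C sketch of the per-sampler bit test and the resulting clamp. The helper names are hypothetical, and the sketch operates on a scalar float rather than emitting NIR.

```c
#include <assert.h>
#include <stdbool.h>

/* True if bit `sampler_index` is set in the saturate mask. */
static bool
sketch_should_saturate(unsigned saturate_mask, unsigned sampler_index)
{
   assert(sampler_index < 32);
   return (saturate_mask & (1u << sampler_index)) != 0;
}

/* Clamp a value to [0.0, 1.0]. */
static float
sketch_saturate(float v)
{
   return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Returns the s coordinate as the lowered instruction would see it. */
static float
sketch_lower_s(unsigned saturate_s, unsigned sampler_index, float s)
{
   return sketch_should_saturate(saturate_s, sampler_index)
             ? sketch_saturate(s)
             : s;
}
```

For example, a mask of (1 << 0) | (1 << 3) saturates the s coordinate for samplers 0 and 3 and leaves every other sampler untouched.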

unsigned int lower_srgb

Bitmap of textures that need sRGB-to-linear conversion. If (lower_srgb & (1 << texture_index)) then the rgb (xyz) components of the texture are lowered to linear.

bool lower_txd_cube_map

If true, lower nir_texop_txd on cube maps with nir_texop_txl.

bool lower_txd_3d

If true, lower nir_texop_txd on 3D surfaces with nir_texop_txl.

bool lower_txd_array

If true, lower nir_texop_txd on any array surfaces with nir_texop_txl.

bool lower_txd_shadow

If true, lower nir_texop_txd on shadow samplers (except cube maps) with nir_texop_txl. Notice that cube map shadow samplers are lowered with lower_txd_cube_map.

bool lower_txd

If true, lower nir_texop_txd on all samplers to a nir_texop_txl. Implies lower_txd_cube_map and lower_txd_shadow.

bool lower_txd_clamp

If true, lower nir_texop_txd when it uses min_lod.

bool lower_txb_shadow_clamp

If true, lower nir_texop_txb instructions that try to use shadow compare and min_lod at the same time to a nir_texop_lod, some math, and nir_texop_tex.

bool lower_txd_shadow_clamp

If true, lower nir_texop_txd on shadow samplers when it uses min_lod with nir_texop_txl. This includes cube maps.

bool lower_txd_offset_clamp

If true, lower nir_texop_txd when it uses both offset and min_lod with nir_texop_txl. This includes cube maps.

bool lower_txd_clamp_bindless_sampler

If true, lower nir_texop_txd with min_lod to a nir_texop_txl if the sampler is bindless.

bool lower_txd_clamp_if_sampler_index_not_lt_16

If true, lower nir_texop_txd with min_lod to a nir_texop_txl if the sampler index is not statically determinable to be less than 16.

bool lower_txs_lod

If true, lower nir_texop_txs with a non-0-lod into nir_texop_txs with 0-lod followed by a nir_ishr.
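
The arithmetic behind this lowering is the usual mip-level minification: shift the level-0 size right by the LOD. The sketch below models one spatial dimension in plain C. The clamp to a minimum of 1 reflects how mip sizes are normally defined and is an assumption about the lowering, as is the hypothetical function name.

```c
#include <assert.h>

/* Size of one spatial dimension at mip level `lod`, given its size at
 * level 0. Assumes the result never drops below one texel. */
static unsigned
sketch_minify(unsigned size_at_lod0, unsigned lod)
{
   assert(lod < 32);

   unsigned size = size_at_lod0 >> lod;
   return size > 0 ? size : 1;
}
```

Only spatial dimensions shrink with the mip level; an array layer count would be passed through as-is.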

bool lower_txs_cube_array

If true, lower nir_texop_txs for cube arrays to a nir_texop_txs with a 2D array type followed by a nir_idiv by 6.
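
In scalar terms, the 2D-array view of a cube array has six layers per cube, so the cube count is the layer count divided by 6. A minimal sketch with a hypothetical size struct:

```c
#include <assert.h>

/* Hypothetical size triple, for illustration only. */
typedef struct {
   unsigned width;
   unsigned height;
   unsigned layers;
} sketch_tex_size;

/* Converts the size reported for the 2D-array view (layers counts
 * individual faces) to the cube-array size (layers counts cubes). */
static sketch_tex_size
sketch_cube_array_size(sketch_tex_size as_2d_array)
{
   assert(as_2d_array.layers % 6 == 0);

   sketch_tex_size out = as_2d_array;
   out.layers = as_2d_array.layers / 6;
   return out;
}
```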

bool lower_tg4_broadcom_swizzle

If true, apply a .bagr swizzle on tg4 results to handle Broadcom’s mixed-up tg4 locations.
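
Reading the swizzle in the usual GLSL sense, .bagr selects source components (2, 3, 1, 0). The plain-C sketch below applies that permutation to a four-component result; the function name is hypothetical.

```c
#include <assert.h>

/* Applies a .bagr swizzle: out = (in.b, in.a, in.g, in.r).
 * `in` and `out` must not alias. */
static void
sketch_swizzle_bagr(const float in[4], float out[4])
{
   assert(in != out);

   out[0] = in[2]; /* b */
   out[1] = in[3]; /* a */
   out[2] = in[1]; /* g */
   out[3] = in[0]; /* r */
}
```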

bool lower_tg4_offsets

If true, lower tg4 with 4 constant offsets to 4 tg4 calls.

bool lower_to_fragment_fetch_amd

Lower txf_ms to fragment_mask_fetch and fragment_fetch and samples_identical to fragment_mask_fetch.

enum nir_lower_tex_packing (*lower_tex_packing_cb)(const nir_tex_instr *tex, const void *data)

Callback used to lower packed sampler return formats. It is called for all tex instructions.

bool lower_lod_zero_width

If true, lower nir_texop_lod to return -FLT_MAX if the sum of the absolute values of derivatives is 0 for all coordinates.
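
One way to read this condition, sketched in plain C: for every coordinate component, |ddx| + |ddy| must be zero. The sketch models only a scalar LOD value, and both that simplification and the helper names are assumptions; the actual pass operates on NIR values.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

static float
sketch_absf(float v)
{
   return v < 0.0f ? -v : v;
}

/* True when |ddx[i]| + |ddy[i]| is zero for every component. */
static bool
sketch_derivatives_are_zero(const float *ddx, const float *ddy, unsigned n)
{
   assert(ddx != NULL && ddy != NULL);

   for (unsigned i = 0; i < n; i++) {
      if (sketch_absf(ddx[i]) + sketch_absf(ddy[i]) != 0.0f)
         return false;
   }
   return true;
}

/* Replaces the queried LOD with -FLT_MAX for zero-width footprints. */
static float
sketch_lower_lod_zero_width(float queried_lod, const float *ddx,
                            const float *ddy, unsigned n)
{
   return sketch_derivatives_are_zero(ddx, ddy, n) ? -FLT_MAX : queried_lod;
}
```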

void *callback_data

Payload data to be sent to callback / filter functions.

enum nir_lower_tex_packing

enumerator nir_lower_tex_packing_none = 0

No packing

enumerator nir_lower_tex_packing_16

The sampler returns up to 2 32-bit words of half floats or 16-bit signed or unsigned ints based on the sampler type

enumerator nir_lower_tex_packing_8

The sampler returns 1 32-bit word of 4x8 unorm
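
For the 4x8 unorm case, unpacking one 32-bit word into four floats can be sketched as below. The component order (component 0 in the least significant byte) is an assumption modelled on GLSL's unpackUnorm4x8; actual hardware layouts may differ, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpacks four 8-bit unorm values from one 32-bit word. Component 0 is
 * assumed to live in the least significant byte. */
static void
sketch_unpack_unorm_4x8(uint32_t word, float out[4])
{
   assert(out != NULL);

   for (unsigned i = 0; i < 4; i++) {
      uint32_t byte = (word >> (8 * i)) & 0xffu;
      out[i] = (float)byte / 255.0f;
   }
}
```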