NIR Texture Instructions

Even though texture instructions could be supported as intrinsics, the vast number of combinations means that doing so is practically impossible. Instead, NIR has a dedicated texture instruction. There are several texture operations:

enum nir_texop

Texture instruction opcode

Values:

enumerator nir_texop_tex

Regular texture look-up

enumerator nir_texop_txb

Texture look-up with LOD bias

enumerator nir_texop_txl

Texture look-up with explicit LOD

enumerator nir_texop_txd

Texture look-up with partial derivatives

enumerator nir_texop_txf

Texel fetch with explicit LOD

enumerator nir_texop_txf_ms

Multisample texture fetch

enumerator nir_texop_txf_ms_fb

Multisample texture fetch from framebuffer

enumerator nir_texop_txf_ms_mcs_intel

Multisample compression value fetch

enumerator nir_texop_txs

Texture size

enumerator nir_texop_lod

Texture LOD query

enumerator nir_texop_tg4

Texture gather

enumerator nir_texop_query_levels

Texture levels query

enumerator nir_texop_texture_samples

Texture samples query

enumerator nir_texop_samples_identical

Query whether all samples are definitely identical.

enumerator nir_texop_tex_prefetch

Regular texture look-up, eligible for pre-dispatch

enumerator nir_texop_fragment_fetch

Multisample fragment color texture fetch

enumerator nir_texop_fragment_mask_fetch

Multisample fragment mask texture fetch

As with other instruction types, there is still an array of sources, except that each source also has a type associated with it. There are various source types, each corresponding to a piece of information that the different texture operations require.

enum nir_tex_src_type

Texture instruction source type

Values:

enumerator nir_tex_src_coord

Texture coordinate

Must have nir_tex_instr::coord_components components.

enumerator nir_tex_src_projector

Projector

The texture coordinate (except for the array component, if any) is divided by this value before LOD computation and sampling.

Must be a float scalar.
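
As an illustration of the projector semantics described above, the following sketch divides every coordinate component except the array component by the projector. It is not the NIR implementation; `sketch_vec4` and `sketch_apply_projector` are hypothetical names introduced only for this example.

```c
#include <stdbool.h>

/* Hypothetical 4-component vector used only for this sketch. */
typedef struct {
   float v[4];
} sketch_vec4;

/* Divides the coordinate by the projector, leaving the trailing array
 * component (the layer index) untouched when is_array is set. */
static sketch_vec4
sketch_apply_projector(sketch_vec4 coord, unsigned coord_components,
                       bool is_array, float projector)
{
   unsigned n = is_array ? coord_components - 1 : coord_components;

   for (unsigned i = 0; i < n; i++)
      coord.v[i] /= projector;

   return coord;
}
```

For a 2D array texture with three coordinate components, only the first two are divided; the layer index passes through unchanged.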

enumerator nir_tex_src_comparator

Shadow comparator

For shadow sampling, the fetched texel values are compared against the shadow comparator using the compare op specified by the sampler object and converted to 1.0 if the comparison succeeds and 0.0 if it fails. Interpolation happens after this conversion so the actual result may be anywhere in the range [0.0, 1.0].

Only valid if nir_tex_instr::is_shadow and must be a float scalar.
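
The compare-then-filter ordering described above can be sketched as follows. This is a simplified model and not driver code: it assumes a "less than or equal" compare op and plain bilinear filtering over a 2x2 footprint, and both function names are hypothetical.

```c
/* Hypothetical "less than or equal" compare op; the real op comes from the
 * sampler object. Returns 1.0 on success and 0.0 on failure. */
static float
sketch_compare_lequal(float comparator, float texel)
{
   return comparator <= texel ? 1.0f : 0.0f;
}

/* Bilinear filtering of a 2x2 footprint, applied after the comparison.
 * texel[] is ordered (0,0), (1,0), (0,1), (1,1); fx and fy are the
 * fractional filter weights in [0, 1]. */
static float
sketch_shadow_bilinear(const float texel[4], float comparator,
                       float fx, float fy)
{
   float r[4];

   for (int i = 0; i < 4; i++)
      r[i] = sketch_compare_lequal(comparator, texel[i]);

   float top = r[0] + (r[1] - r[0]) * fx;
   float bottom = r[2] + (r[3] - r[2]) * fx;
   return top + (bottom - top) * fy;
}
```

When half of the footprint passes the comparison and the weights are 0.5, the filtered result is 0.5, which is why the final value can fall anywhere in [0.0, 1.0].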

enumerator nir_tex_src_offset

Coordinate offset

An integer value that is added to the texel address before sampling. This is only allowed with operations that take an explicit LOD as it is applied in integer texel space after LOD selection and not normalized coordinate space.

enumerator nir_tex_src_bias

LOD bias

This value is added to the computed LOD before mip-mapping.

enumerator nir_tex_src_lod

Explicit LOD

enumerator nir_tex_src_min_lod

Min LOD

The computed LOD is clamped to be at least as large as min_lod before mip-mapping.
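
A minimal sketch of how nir_tex_src_bias and nir_tex_src_min_lod could combine, assuming the bias is applied before the clamp, as in the usual graphics API LOD pipeline. Sampler-level LOD clamps are left out, and the function name is hypothetical.

```c
/* Applies an LOD bias and then clamps to a minimum LOD.
 * The bias-then-clamp ordering is an assumption of this sketch. */
static float
sketch_apply_bias_and_min_lod(float computed_lod, float bias, float min_lod)
{
   float lod = computed_lod + bias;        /* nir_tex_src_bias */
   return lod < min_lod ? min_lod : lod;   /* nir_tex_src_min_lod */
}
```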

enumerator nir_tex_src_ms_index

MSAA sample index

enumerator nir_tex_src_ms_mcs_intel

Intel-specific MSAA compression data

enumerator nir_tex_src_ddx

Explicit horizontal (X-major) coordinate derivative

enumerator nir_tex_src_ddy

Explicit vertical (Y-major) coordinate derivative

enumerator nir_tex_src_texture_deref

Texture variable dereference

enumerator nir_tex_src_sampler_deref

Sampler variable dereference

enumerator nir_tex_src_texture_offset

Texture index offset

This is added to nir_tex_instr::texture_index. Unless nir_tex_instr::texture_non_uniform is set, this is guaranteed to be dynamically uniform.

enumerator nir_tex_src_sampler_offset

Sampler index offset

This is added to nir_tex_instr::sampler_index. Unless nir_tex_instr::sampler_non_uniform is set, this is guaranteed to be dynamically uniform.

enumerator nir_tex_src_texture_handle

Bindless texture handle

This is, unfortunately, a bit overloaded at the moment. There are generally two types of bindless handles:

  1. GL_ARB_bindless_texture handles. These are part of the GL/Gallium-level API and are always a 64-bit integer.

  2. HW-specific handles. GL_ARB_bindless_texture handles may be lowered to these. Also, these are used by many Vulkan drivers to implement descriptor sets, especially for UPDATE_AFTER_BIND descriptors. The details of hardware handles (bit size, format, etc.) are HW-specific.

Because of this overloading and the resulting ambiguity, we currently don’t validate anything for these.

enumerator nir_tex_src_sampler_handle

Bindless sampler handle

See nir_tex_src_texture_handle.

enumerator nir_tex_src_plane

Plane index for multi-plane YCbCr textures

enumerator nir_tex_src_backend1

Backend-specific vec4 tex src argument.

Can be used to have NIR optimization (copy propagation, lower_vec_to_movs) apply to the packing of the tex srcs. This lowering must only happen after nir_lower_tex().

The nir_tex_instr_src_type() of this argument is float, so no lowering will happen if nir_lower_int_to_float is used.

enumerator nir_tex_src_backend2

Second backend-specific vec4 tex src argument, see nir_tex_src_backend1.

enumerator nir_num_tex_src_types

Of particular interest are the texture/sampler deref/index/handle source types. First, note that textures and samplers are specified separately in NIR. While not required for OpenGL, this is required for Vulkan and OpenCL. Some OpenGL [ES] drivers have to deal with hardware that does not have separate samplers and textures. While not recommended, an OpenGL-only driver may assume that the texture and sampler derefs will always point to the same resource, if needed. Note that this pretty well paints your compiler into a corner and makes any future port to Vulkan or OpenCL harder, so such assumptions should really only be made if targeting OpenGL ES 2.0 era hardware.

Also, like a lot of other resources, there are multiple ways to represent a texture in NIR. It can be referenced by a variable dereference, an index, or a bindless handle. When using an index or a bindless handle, the texture type information is generally not available. To handle this, various information from the type is redundantly stored in the nir_tex_instr itself.

struct nir_tex_instr

Represents a texture instruction

Public Members

nir_instr instr

Base instruction

enum glsl_sampler_dim sampler_dim

Dimensionality of the texture operation

This will typically match the dimensionality of the texture deref type if a nir_tex_src_texture_deref is present. However, it may not if texture lowering has occurred.

nir_alu_type dest_type

ALU type of the destination

This is the canonical sampled type for this texture operation and may not exactly match the sampled type of the deref type when a nir_tex_src_texture_deref is present. For OpenCL, the sampled type of the texture deref will be GLSL_TYPE_VOID and this is allowed to be anything. With SPIR-V, the signedness of integer types is allowed to differ. For all APIs, the bit size may differ if the driver has done any sort of mediump or similar lowering since texture types always have 32-bit sampled types.

nir_texop op

Texture opcode

nir_dest dest

Destination

nir_tex_src *src

Array of sources

This array has nir_tex_instr::num_srcs elements

unsigned num_srcs

Number of sources

unsigned coord_components

Number of components in the coordinate, if any

bool is_array

True if the texture instruction acts on an array texture

bool is_shadow

True if the texture instruction performs a shadow comparison

If this is true, the texture instruction must have a nir_tex_src_comparator.

bool is_new_style_shadow

Only meaningful if is_shadow is true. True for the new-style shadow that outputs 1 component; false for the old-style shadow that outputs 4 components.

bool is_sparse

True if this texture instruction should return a sparse residency code. The code is in the last component of the result.

unsigned component

nir_texop_tg4 component selector

This determines which RGBA component is gathered.

unsigned array_is_lowered_cube

Validation needs to know this for gradient component count

int8_t tg4_offsets[4][2]

Gather offsets

bool texture_non_uniform

True if the texture index or handle is not dynamically uniform

bool sampler_non_uniform

True if the sampler index or handle is not dynamically uniform

unsigned texture_index

The texture index

If this texture instruction has a nir_tex_src_texture_offset source, then the texture index is given by texture_index + texture_offset.

unsigned sampler_index

The sampler index

The following operations do not require a sampler and, as such, this field should be ignored:

  • nir_texop_txf

  • nir_texop_txf_ms

  • nir_texop_txs

  • nir_texop_query_levels

  • nir_texop_texture_samples

  • nir_texop_samples_identical

If this texture instruction has a nir_tex_src_sampler_offset source, then the sampler index is given by sampler_index + sampler_offset.

struct nir_tex_src

A texture instruction source

Public Members

nir_src src

Base source

nir_tex_src_type src_type

Type of this source

Texture instruction helpers

There are a number of helper functions for working with NIR texture instructions. They are documented here in no particular order.

nir_tex_instr *nir_tex_instr_create(nir_shader *shader, unsigned num_srcs)

Creates a NIR texture instruction

static inline bool nir_tex_instr_need_sampler(const nir_tex_instr *instr)

Returns true if the texture operation requires a sampler as a general rule

Note that the specific hw/driver backend could still require a sampler object/configuration packet in any case, for some other reason.

See nir_tex_instr::sampler_index.
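
The general rule can be sketched as a switch over the opcode, using the list of sampler-less operations given under nir_tex_instr::sampler_index. The enum below is a hypothetical stand-in for a subset of nir_texop so that the example is self-contained; it is not the NIR definition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a subset of nir_texop. */
typedef enum {
   SKETCH_TEXOP_TEX,
   SKETCH_TEXOP_TXB,
   SKETCH_TEXOP_TXF,
   SKETCH_TEXOP_TXF_MS,
   SKETCH_TEXOP_TXS,
   SKETCH_TEXOP_QUERY_LEVELS,
   SKETCH_TEXOP_TEXTURE_SAMPLES,
   SKETCH_TEXOP_SAMPLES_IDENTICAL,
} sketch_texop;

/* Returns false for the operations documented as not requiring a sampler
 * and true for everything else. */
static bool
sketch_texop_needs_sampler(sketch_texop op)
{
   switch (op) {
   case SKETCH_TEXOP_TXF:
   case SKETCH_TEXOP_TXF_MS:
   case SKETCH_TEXOP_TXS:
   case SKETCH_TEXOP_QUERY_LEVELS:
   case SKETCH_TEXOP_TEXTURE_SAMPLES:
   case SKETCH_TEXOP_SAMPLES_IDENTICAL:
      return false;
   default:
      return true;
   }
}
```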

static inline unsigned nir_tex_instr_result_size(const nir_tex_instr *instr)

Returns the number of components returned by this nir_tex_instr

Useful for code building texture instructions when you don’t want to think about how many components a particular texture op returns. This does not include the sparse residency code.

static inline unsigned nir_tex_instr_dest_size(const nir_tex_instr *instr)

Returns the destination size of this nir_tex_instr including the sparse residency code, if any.
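
The relationship between the result size and the destination size can be sketched as follows. This models only regular sampling operations, using the is_shadow, is_new_style_shadow and is_sparse fields described earlier; query and gather operations are not modeled, and both function names are hypothetical.

```c
#include <stdbool.h>

/* Result size of a regular sampling op in this sketch: a new-style shadow
 * lookup returns 1 component, everything else returns 4. */
static unsigned
sketch_sample_result_size(bool is_shadow, bool is_new_style_shadow)
{
   return (is_shadow && is_new_style_shadow) ? 1 : 4;
}

/* Destination size: the result plus one trailing component for the sparse
 * residency code, if requested. */
static unsigned
sketch_dest_size(unsigned result_size, bool is_sparse)
{
   return result_size + (is_sparse ? 1 : 0);
}
```

With is_sparse set, the residency code sits at component index `sketch_dest_size(...) - 1`, that is, immediately after the sampled result.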

static inline bool nir_tex_instr_is_query(const nir_tex_instr *instr)

Returns true if this texture operation queries something about the texture rather than actually sampling it.

static inline bool nir_tex_instr_has_implicit_derivative(const nir_tex_instr *instr)

Returns true if this texture instruction does implicit derivatives

This is important as there are extra control-flow rules around derivatives and texture instructions which perform them implicitly.

static inline nir_alu_type nir_tex_instr_src_type(const nir_tex_instr *instr, unsigned src)

Returns the ALU type of the given texture instruction source

static inline unsigned nir_tex_instr_src_size(const nir_tex_instr *instr, unsigned src)

Returns the number of components required by the given texture instruction source

static inline int nir_tex_instr_src_index(const nir_tex_instr *instr, nir_tex_src_type type)

Returns the index of the texture instruction source with the given nir_tex_src_type or -1 if no such source exists.
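
The lookup amounts to a linear search over the typed source array. The sketch below uses a hypothetical three-value enum and a bare array of source types in place of nir_tex_src_type and nir_tex_instr, so it shows the idea rather than the actual helper.

```c
/* Hypothetical stand-in for a few nir_tex_src_type values. */
typedef enum {
   SKETCH_TEX_SRC_COORD,
   SKETCH_TEX_SRC_BIAS,
   SKETCH_TEX_SRC_LOD,
} sketch_tex_src_type;

/* Returns the index of the first source of the given type, or -1 if the
 * instruction has no such source. */
static int
sketch_tex_src_index(const sketch_tex_src_type *src_types, unsigned num_srcs,
                     sketch_tex_src_type type)
{
   for (unsigned i = 0; i < num_srcs; i++) {
      if (src_types[i] == type)
         return (int)i;
   }

   return -1;
}
```

Callers typically test the result against zero before using it, since -1 signals that the source is absent.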

void nir_tex_instr_add_src(nir_tex_instr *tex, nir_tex_src_type src_type, nir_src src)

Adds a source to a texture instruction

void nir_tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)

Removes a source from a texture instruction

Texture instruction lowering

Because most hardware only supports some subset of all possible GLSL/SPIR-V texture operations, NIR provides a powerful lowering pass that can implement more complex texture operations in terms of simpler ones.

bool nir_lower_tex(nir_shader *shader, const nir_lower_tex_options *options)

Lowers complex texture instructions to simpler ones

struct nir_lower_tex_options

Public Members

unsigned lower_txp

Bitmask of (1 << GLSL_SAMPLER_DIM_x) values controlling the sampler dimensionalities for which the texture projector is lowered.
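
A sketch of how such a mask might be built and tested. The enum is a hypothetical stand-in for enum glsl_sampler_dim, and its numeric values are illustrative only.

```c
#include <stdbool.h>

/* Hypothetical stand-in for enum glsl_sampler_dim. */
typedef enum {
   SKETCH_SAMPLER_DIM_1D,
   SKETCH_SAMPLER_DIM_2D,
   SKETCH_SAMPLER_DIM_3D,
   SKETCH_SAMPLER_DIM_CUBE,
   SKETCH_SAMPLER_DIM_RECT,
} sketch_sampler_dim;

/* The bit that selects one sampler dimensionality in the mask. */
static unsigned
sketch_txp_bit(sketch_sampler_dim dim)
{
   return 1u << dim;
}

/* True if the projector should be lowered for the given dimensionality. */
static bool
sketch_should_lower_txp(unsigned lower_txp, sketch_sampler_dim dim)
{
   return (lower_txp & sketch_txp_bit(dim)) != 0;
}
```

Setting the mask to `~0u` requests projector lowering for every dimensionality.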

bool lower_txf_offset

If true, lower away nir_tex_src_offset for all texelfetch instructions.

bool lower_rect_offset

If true, lower away nir_tex_src_offset for all rect textures.

bool lower_rect

If true, lower rect textures to 2D, using txs to fetch the texture dimensions and dividing the texture coords by the texture dims to normalize.
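
The normalization step can be illustrated per axis: an unnormalized rect coordinate is divided by the texture dimension that txs would report. The function name is hypothetical, and the sketch works on plain values rather than NIR SSA values.

```c
/* Converts an unnormalized rect-texture coordinate along one axis into a
 * normalized 2D coordinate by dividing by the texture size on that axis. */
static float
sketch_normalize_rect_coord(float texel_coord, int size)
{
   return texel_coord / (float)size;
}
```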

unsigned lower_y_uv_external

Bitmask; if the bit for a texture is set, convert YUV to RGB.

unsigned saturate_s

To emulate certain texture wrap modes, this can be used to saturate the specified tex coord to [0.0, 1.0]. The bits are indexed by sampler number, i.e., if, for example:

(conf->saturate_s & (1 << n))

is true, then the s coord for sampler n is saturated.

Note that clamping must happen after projector lowering so any projected texture sample instruction with a clamped coordinate gets automatically lowered, regardless of the ‘lower_txp’ setting.
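
A sketch of the per-sampler bit test combined with the saturate. It assumes the sampler index is below 32 so that the shift is well defined; the function names are hypothetical, and projector lowering is not shown.

```c
/* Clamps a value to [0.0, 1.0]. */
static float
sketch_saturate(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Saturates the s coordinate only if the bit for this sampler is set in
 * saturate_s. sampler_index is assumed to be less than 32. */
static float
sketch_lower_s_coord(unsigned saturate_s, unsigned sampler_index, float s)
{
   if (saturate_s & (1u << sampler_index))
      return sketch_saturate(s);

   return s;
}
```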

unsigned lower_srgb

Bitmap of textures that need srgb to linear conversion. If (lower_srgb & (1 << texture_index)) then the rgb (xyz) components of the texture are lowered to linear.

bool lower_txd_cube_map

If true, lower nir_texop_txd on cube maps with nir_texop_txl.

bool lower_txd_3d

If true, lower nir_texop_txd on 3D surfaces with nir_texop_txl.

bool lower_txd_shadow

If true, lower nir_texop_txd on shadow samplers (except cube maps) with nir_texop_txl. Notice that cube map shadow samplers are lowered with lower_txd_cube_map.

bool lower_txd

If true, lower nir_texop_txd on all samplers to a nir_texop_txl. Implies lower_txd_cube_map and lower_txd_shadow.

bool lower_txb_shadow_clamp

If true, lower nir_texop_txb that try to use shadow compare and min_lod at the same time to a nir_texop_lod, some math, and nir_texop_tex.

bool lower_txd_shadow_clamp

If true, lower nir_texop_txd on shadow samplers when it uses min_lod with nir_texop_txl. This includes cube maps.

bool lower_txd_offset_clamp

If true, lower nir_texop_txd when it uses both offset and min_lod with nir_texop_txl. This includes cube maps.

bool lower_txd_clamp_bindless_sampler

If true, lower nir_texop_txd with min_lod to a nir_texop_txl if the sampler is bindless.

bool lower_txd_clamp_if_sampler_index_not_lt_16

If true, lower nir_texop_txd with min_lod to a nir_texop_txl if the sampler index is not statically determinable to be less than 16.

bool lower_txs_lod

If true, lower nir_texop_txs with a non-0-lod into nir_texop_txs with 0-lod followed by a nir_ishr.
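
The arithmetic behind this lowering can be sketched for a single dimension. The shift corresponds to the nir_ishr mentioned above; the clamp to a minimum of 1 is an assumption of this sketch, following the usual rule that a mip level is never smaller than one texel. The function name is hypothetical.

```c
/* Derives the size of one dimension at a given LOD from its size at LOD 0.
 * lod is assumed to be small enough for the shift to be well defined. */
static int
sketch_txs_at_lod(int size_at_lod0, unsigned lod)
{
   int size = size_at_lod0 >> lod;   /* the shift step */
   return size < 1 ? 1 : size;       /* assumed clamp to one texel */
}
```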

bool lower_txs_cube_array

If true, lower nir_texop_txs for cube arrays to a nir_texop_txs with a 2D array type followed by a nir_idiv by 6.
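
A sketch of the division step: when a cube array is queried as a 2D array, each cube contributes six layers, so the layer count is divided by 6 to recover the number of cubes. The struct and function names are hypothetical.

```c
/* Hypothetical size triple as a txs on a 2D array view might report it. */
typedef struct {
   int width;
   int height;
   int layers;
} sketch_size3;

/* Converts a 2D-array size into a cube-array size by dividing the layer
 * count by the six faces of a cube. */
static sketch_size3
sketch_cube_array_size(sketch_size3 size_as_2d_array)
{
   size_as_2d_array.layers /= 6;
   return size_as_2d_array;
}
```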

bool lower_tg4_broadcom_swizzle

If true, apply a .bagr swizzle on tg4 results to handle Broadcom’s mixed-up tg4 locations.
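
The mechanics of a .bagr swizzle can be sketched as below, treating x, y, z, w as r, g, b, a. Which gathered texel the hardware places in which component is hardware-specific and not modeled here; the type and function names are hypothetical.

```c
/* Hypothetical 4-component vector used only for this sketch. */
typedef struct {
   float x, y, z, w;
} sketch_vec4;

/* Applies a .bagr swizzle: the result's x, y, z, w are taken from the
 * input's b (z), a (w), g (y) and r (x) components respectively. */
static sketch_vec4
sketch_swizzle_bagr(sketch_vec4 in)
{
   sketch_vec4 out = { in.z, in.w, in.y, in.x };
   return out;
}
```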

bool lower_tg4_offsets

If true, lower tg4 with 4 constant offsets to 4 tg4 calls.

enum nir_lower_tex_packing lower_tex_packing[32]

To lower packed sampler return formats.

Indexed by sampler-id.

enum nir_lower_tex_packing

Values:

enumerator nir_lower_tex_packing_none = 0

No packing

enumerator nir_lower_tex_packing_16

The sampler returns up to 2 32-bit words of half floats or 16-bit signed or unsigned ints based on the sampler type

enumerator nir_lower_tex_packing_8

The sampler returns 1 32-bit word of 4x8 unorm
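
As an illustration of what unpacking such a word involves, the sketch below extracts one 8-bit unorm component and converts it to a float. It assumes component 0 lives in the least significant byte, which matches the GLSL unpackUnorm4x8 convention but may not match every piece of hardware; the function name is hypothetical.

```c
#include <stdint.h>

/* Extracts one 8-bit unorm component (0..3) from a packed 32-bit word and
 * converts it to a float in [0.0, 1.0]. Component 0 is assumed to be in the
 * least significant byte. */
static float
sketch_unpack_unorm_4x8(uint32_t word, unsigned component)
{
   uint32_t byte = (word >> (8 * component)) & 0xffu;
   return (float)byte / 255.0f;
}
```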