Mesa 24.0.0 Release Notes / 2024-02-01

Mesa 24.0.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 24.0.1.

Mesa 24.0.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
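Because the reported version depends on the driver, an application may want to check the string returned by glGetString(GL_VERSION) before relying on 4.6 features. The following is a minimal sketch, assuming the string has already been obtained from a live context and follows the usual "major.minor[.release] vendor-info" layout (with an "OpenGL ES" prefix on ES contexts). The helper names parse_gl_version and supports_gl are hypothetical and not part of Mesa or any GL binding.

```python
import re
from typing import Optional, Tuple

# Matches the leading "major.minor" of a GL_VERSION string, with an optional
# "OpenGL ES " or "OpenGL ES-CM " style prefix as used by ES contexts.
_GL_VERSION_RE = re.compile(r"^(?:OpenGL ES(?:-[A-Z]+)? )?(\d+)\.(\d+)")


def parse_gl_version(version_string: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from a GL_VERSION string, or None if unparseable."""
    match = _GL_VERSION_RE.match(version_string)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_gl(version_string: str, required: Tuple[int, int]) -> bool:
    """True if the reported version is at least `required` (hypothetical helper)."""
    reported = parse_gl_version(version_string)
    return reported is not None and reported >= required
```

For example, supports_gl(version_string, (4, 6)) is only true when the driver actually reports 4.6 or later for the current context; a compatibility context that reports a lower version would fail the check.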

Mesa 24.0.0 implements the Vulkan 1.3 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
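The apiVersion property is a packed 32-bit integer rather than a string. As a rough illustration, the sketch below decodes it using the bit layout of the Vulkan version macros (variant in bits 31-29, major in 28-22, minor in 21-12, patch in 11-0). It assumes the raw value has already been read from VkPhysicalDeviceProperties; the names VkVersion, decode_api_version, and make_api_version are hypothetical helpers, not Vulkan or Mesa API.

```python
from typing import NamedTuple


class VkVersion(NamedTuple):
    variant: int
    major: int
    minor: int
    patch: int


def decode_api_version(value: int) -> VkVersion:
    """Split a packed Vulkan apiVersion value into its four fields."""
    return VkVersion(
        variant=(value >> 29) & 0x7,
        major=(value >> 22) & 0x7F,
        minor=(value >> 12) & 0x3FF,
        patch=value & 0xFFF,
    )


def make_api_version(variant: int, major: int, minor: int, patch: int) -> int:
    """Pack four fields into a single value, mirroring VK_MAKE_API_VERSION."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch
```

An application that requires Vulkan 1.3 could compare decode_api_version(value)[1:3] against (1, 3) for each physical device, since different drivers may report different versions.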

SHA256 checksum

dc7e8c077bc5884df95478263b34bdebb7e88e600689cb56fb07be2b8c304c36  mesa-24.0.0.tar.xz
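One way to check a downloaded tarball against the checksum above is sketched below using Python's standard hashlib module. It assumes the archive has been saved locally under the file name listed above; the function names sha256_of_file and verify are illustrative only.

```python
import hashlib

# Checksum published above for mesa-24.0.0.tar.xz.
EXPECTED_SHA256 = "dc7e8c077bc5884df95478263b34bdebb7e88e600689cb56fb07be2b8c304c36"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest matches `expected`."""
    return sha256_of_file(path) == expected.lower()
```

Calling verify("mesa-24.0.0.tar.xz") should return True for an intact download and False if the file was truncated or altered.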

New features

  • VK_EXT_image_compression_control on RADV

  • VK_EXT_device_fault on RADV

  • OpenGL 3.3 on Asahi

  • Geometry shaders on Asahi

  • GL_ARB_texture_cube_map_array on Asahi

  • GL_ARB_clip_control on Asahi

  • GL_ARB_timer_query on Asahi

  • GL_EXT_disjoint_timer_query on Asahi

  • GL_ARB_base_instance on Asahi

  • OpenGL 4.6 (up from 4.2) on d3d12

  • VK_EXT_depth_clamp_zero_one on RADV

  • GL_ARB_shader_texture_image_samples on Asahi

  • GL_ARB_indirect_parameters on Asahi

  • GL_ARB_viewport_array on Asahi

  • GL_ARB_fragment_layer_viewport on Asahi

  • GL_ARB_cull_distance on Asahi

  • GL_ARB_transform_feedback_overflow_query on Asahi

  • VK_KHR_calibrated_timestamps on RADV

  • VK_KHR_vertex_attribute_divisor on RADV

  • VK_KHR_maintenance6 on RADV

  • VK_KHR_ray_tracing_position_fetch on RADV

  • EGL_EXT_query_reset_notification_strategy

Bug fixes

  • vlc crashes when playing 1920x1080 video with Radeon RX6600 hardware acceleration and deinterlacing enabled.

  • [radeonsi] Regression: graphical artifacting on water texture in OpenGOAL

  • Assertion when creating dmabuf-compatible VkImage on Tigerlake

  • VAAPI: EFC on VCN2 produces broken H264 video and crashes the HEVC encoder

  • [AMDGPU RDNA3] Antialiasing is broken in Blender

  • MTL: vulkan cooperative matrix tests gpu hang on MTL

  • Assassin’s Creed Odyssey wrong colors on Arc A770

  • The Finals fails to launch with DX12 on Intel Arc unless “force_vk_vendor” is set to -1.

  • VA-API CI tests freeze

  • radv: games render with garbage output on RX5600M through PRIME with DCC

  • radv: RGP reports for mesh shaders are confusing

  • zink crashes on nvidia

  • d3d10umd: Build failure regression with MSVC during 23.3 development cycle

  • Error during SPIR-V parsing of OpCopyLogical

  • rusticl: fails to find SPIRV-Tools headers via pkg-config under non-default prefix

  • Conservative depth output doesn’t work with RADV

  • RADV: DOA-X3 (yuzu) missing hair, eyes and skybox

  • intel: Require 64KB alignment when using CCS and multiple engines

  • radv: Atlas Fallen corrupted rendering

  • r300: nir pass to lower indirect regression

  • r300: LRP present even with .lower_flrp32=true

  • 23.3.2 regression: kms_swrast_dri.so segfaults

  • Radeon: YUYV DMA BUF eglCreateImageKHR fails

  • No support for a644

  • anv: importing memory for a compressed image using modifier is hitting an assert

  • Large regression in `glbench --tests context` on Intel

  • Android 14 depends on Vulkan EXT_swapchain_maintenance1, which breaks radv

  • nvk,nak: Implement shaderFloat64

  • Mesa is not compatible with Python 3.12 due to use of distutils

  • anv: glcts regression on zink

  • nir: Trivial loop not unrolling

  • Possible regression with AMD GPU with flatpak apps

  • nvk,nak: Implement VK_KHR_vulkan_memory_model

  • Compiling Mesa with X in custom prefix fails in Intel Vulkan driver

  • anv: implement recommended AUX-TT invalidation on compute/transfer queues

  • !26307 broke some piglit tests with rusticl on radeonsi on Navi 14

  • Compute shader with imageStore() to a swapchain image (from a display surface) produces incorrect results (Raspberry, Vulkan).

  • nvk: Implement VK_EXT_multi_draw

  • radv/aco: Crysis 2 Remastered RT reflections are blocky around the edges with ACO, renders normally with LLVM

  • radv: Major regression in main branch causing all Vulkan apps to crash on 6600M (Navi 23)

  • [23.3.0] Parallel build failure - fatal error: vtn_generator_ids.h: No such file or directory

  • crocus: Assertion failures in NIR divergence analysis

  • nak: Implement nir_op_fmulz

  • nvk,nak: Implement VK_KHR_shader_float_controls

  • 748b7f80ef1cf6a3fed9991d70230e69fef51a0e - Regression on Doom Eternal w/ RT Reflections

  • glFlush() blocks until close to GPU completion on Radeon R9 270

  • nvk: Implement VK_EXT_texel_buffer_alignment

  • rusticl: fails to find X11 headers via pkg-config under non-default prefix

  • nvk,nak: Implement VK_EXT_shader_image_atomic_int64

  • nvk,nak: Implement VK_KHR_shader_atomic_int64

  • nvk,nak: Implement VK_KHR_shader_subgroup_extended_types

  • nvk,nak: Implement shaderInt64

  • nvk: Implement VK_EXT_subgroup_size_control

  • mesa:freedreno / afuc-disasm unit test failure

  • anv: Resident Evil 2 hang

  • Mesa 23.3.0 release build fails on 22.04 LTS

  • Segfault in SDL2 game when using environment variables: `SDL_VIDEODRIVER=wayland DRI_PRIME=1`

  • Mesa 22.3.0 SEGFAULT in nir shader creation for r600 cards on FreeBSD

  • radeonsi: merge request 26055 causes thousands of piglit failures

  • iris: INTEL_COMPUTE_CLASS causes gpu hangs on MTL platforms

  • anv: piglit tests regressed for zink

  • aco,radeonsi: GFX11 dEQP-GLES31.functional.separate_shader.random.0 fail when AMD_DEBUG=useaco

  • crash in si_update_tess_io_layout_state during _mesa_ReadPixels (radeonsi_dri, mesa 23.2.1)

  • Compilation error with current LLVM git (createLoopSinkPass)

  • [RADV] War Thunder has some grass flickering.

  • radv: satisfactory broken shader

  • RADV problem with R7 M440 in some games

  • nvk,nak: Weird fog effect in old GTA games with DXVK

  • gpu driver crashes when opening ingame map playing dead space 2023

  • [anv] Valheim water misrendering

  • radv, zink: dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component16 fails on gfx9

  • Armored Core 6 (1888160) fake_sparse support

  • radv: fix sparseResidencyImage3D on GFX8

  • build still broken on Slackware 15.0 i586

  • mesa fails to build on arch

  • EGL/v3d: EGL applications under an X compositor don’t work

  • nvk,nak: Implement VK_KHR_fragment_shader_barycentric

  • RADV: trunc_coord breaks ambient occlusion in Dirt Rally and other games

  • radv: Mass Effect Legendary Edition: a line going across the screen is visible in some areas with Ambient Occlusion enabled

  • LTO-related build failures

  • anv: DIRT5 gfx11_generated_draws_spv_source triggers “assert(!copy_value_is_divergent(src) || copy_value_is_divergent(dest));”

  • nvk: Implement VK_KHR_synchronization2

  • nvk: Implement bufferDeviceAddressCaptureReplay

  • nvk,nak,codegen: Implement VK_KHR_pipeline_executable_properties

  • panfrost: gbm_bo_get_offset() wrongly returns 0 for second plane of NV12 buffers

  • Satisfactory since Update 8 needs force_vk_vendor set

  • [RADV][TONGA] - BeamNG.drive (284160) - Artifacts are present when looking at the skybox.

  • LEGO Star Wars: The Skywalker Saga graphical glitches (DXVK) on R9 380

  • [radv] Crypt not rendering properly

  • Leaks of DescriptorSet debug names

  • [Tracing flake] Missing geometry in trace@freedreno-a630@freedoom@freedoom-phase2-gl-high.trace

  • Unreal Engine 5.2 virtual shadow maps have glitchy/lazy tile updates

  • RADV: Visual glitches in Unreal Engine 5.2.1 when using material with anisotropy and light channel 2

  • radv: Regression with UE5 test

  • SIGSEGV with MESA_VK_TRACE=rgp and compute only queue

  • mesa: vertex attrib regression

  • [ANV] Corruptions in Battlefield 4

  • anv regression w/ commit e488773b29d97 (“anv: Fast clear depth/stencil surface in vkCmdClearAttachments”)

  • freedreno uses wrong patch size

  • ir3: dEQP-GLES31.functional.synchronization.inter_invocation.image_atomic_read_write crash on a6xx gen4

  • a630: antichamber crashes with pack_A6XX_GRAS_CL_GUARDBAND_CLIP_ADJ: Assertion

  • mesa:amd+compiler / aco_tests assembler.gfx11.vop12c_v128/gfx11 failure with llvm-17

  • ci_run_n_monitor crash because of incorrect parsing of dag

  • Zink + Venus: driver can’t handle INVALID<->LINEAR!

  • anv not initializing engine correctly with INTEL_COPY_CLASS=1

  • Anv: Particles have black square artifacts on Counter Strike 2 on Skylake

  • Lords of the Fallen 2023 Red Eye mode crashing game and desktop

  • [radeonsi] [vulkan] [23.3-rc1 regression] Video output corrupted in QMplay2 with Vulkan renderer

  • [BISECTED] ac/radeon commit somehow breaks nv12 surface from HEVC decode

  • radv: Chrome crashes when ANGLE uses GPL

  • Parsec displays completely green screen with hardware decoder selected while using Mesa 23.3 and Mesa 24

  • H264 to H264 transcode output corruption with gst-vaapi

  • opencl-jpeg-encoder does not work with nouveau/rusticl, works with nouveau/clover

  • [rusticl] [radeonsi] [darktable4] [ppc64le] Darktable always renders black images despite not throwing any error

  • [R600] X-plane 11 demo (Linux Native) crashes upon launch on HD5870 and HD6970

  • [CI] .gitlab-ci/setup-test-env.sh date -d parsing fails on Alpine Linux containers

  • ANV not handling VkMutableDescriptorTypeCreateInfoEXT::pMutableDescriptorTypeLists[i] being out of range

  • Ubuntu 23.10 build error with rusticl_opencl_bindings.rs

  • Rusticl fails to build

  • tu: Wolfenstein: The New Order misrenders on a740

  • DRI_PRIME fails with ACO only radeonsi

  • ci_run_n_monitor: undetected sanity dep breaks the pipeline

Changes

Alejandro Piñeiro (10):

  • broadcom/qpu: use back BITFIELD64_RANGE for ANYOPMASK

  • broadcom/compiler: add v3d_pack_unnormalized_coordinates helper

  • broadcom: only support v42 and v71

  • broadcom/compiler: set properly lod query

  • broadcom/cle: remove v33 and v41 from xml definition

  • broadcom/cle: rename xml files

  • docs/v3d: update v3d documentation

  • nir: add new opcodes to map new v71 packing/conversion instructions

  • broadcom/compiler: update image store lowering to use v71 new packing/conversion instructions

  • broadcom/compiler: remove one superfluous call to nir_opt_undef

Alessandro Astone (2):

  • asahi: Use the compat version of qsort_r

  • zink: Fix resizable BAR detection logic

Alexander von Gluck IV (3):

  • egl/haiku: Cleanup includes; minor build fix

  • hgl: Redefine visual options in hgl_context.h

  • egl/haiku: Remove some dead cleanup code

Alyssa Rosenzweig (286):

  • hasvk: Support building on non-Intel

  • crocus: Support building on non-Intel

  • meson: Add vulkan-drivers=all option

  • meson: Add gallium-drivers=all option

  • gitlab: Highlight .cl as C

  • nir,vtn: Add exported bool to nir_function

  • nir: Add nir_remove_non_exported

  • nir/builder: Add nir_call helper

  • meson: Simplify clc expression

  • meson: Require clc for asahi

  • vtn: Add spirv_library_to_nir_builder feature

  • clc: Add missing idep_vtn

  • agx: Fix lower regular texture metadata

  • agx: Vectorize load/stores

  • agx: Fuse (unmasked) extr_agx

  • agx: Fuse ubitfield_extract

  • asahi: Fix agx_pack unrolling

  • asahi: Make GenXML compatible with OpenCL

  • asahi: Unpack at 32-bit granularity

  • asahi: Reexpress genxml pack macro

  • asahi: Add folder for internal shaders

  • asahi: Add asahi_clc infrastructure

  • asahi: Pass valid memctx to open_device

  • asahi: Deserialize libagx when opening device

  • asahi,agx: Plumb libagx

  • asahi: Add software-defined field to texture desc

  • agx: Use CL for texture lowerings

  • asahi: Remove placeholder shader

  • asahi: Fix tools=all builds

  • ci: Opt out asahi from clang-format

  • ttn: Set sample shading for sample ID reads

  • compiler: Make shader_enums.h CL-safe

  • compiler: Inline mesa_vertices_per_prim

  • compiler: Make u_decomposed_prims_for_vertices available to CL

  • nir/lower_gs_intrinsics: Include primitive counts

  • nir/lower_gs_intrinsics: Append EndPrimitive

  • nir/lower_gs_intrinsics: Count decomposed primitives too

  • nir: Also gather decomposed primitive count

  • nir: Add intrinsics for lowering GS

  • nir: Add intrinsics for lowering bindless textures/samplers

  • nir/print: handle adjacency

  • asahi: Clamp 8-bit integer RTs

  • agx: Legalize image MS index

  • agx: Fix fragment side effects scheduling

  • agx: Check for spilling in release builds

  • docs/features: Mark ARB_mdi done on asahi

  • agx: Cleanup 8-bit math before lowering

  • agx: Require 32-bit alignment for EOT offset

  • agx: Add scaffolding for subgroup ops

  • agx: Translate simple subgroup ops

  • asahi: Pack non-border colour sampler desc

  • agx: Allow drivers to lower texture handles

  • asahi: Lower samplers to bindless if needed

  • agx: Lower LOD bias earlier

  • agx: Handle bindless samplers

  • asahi: Handle load_sampler_handle

  • asahi: Add sampler heap data structure

  • asahi: Use the sampler heap

  • asahi: Upload tex/samplers properly with merged shaders

  • asahi: Don’t hazard track fake resources

  • asahi: Refactor encoder data structure

  • asahi: Factor out agx_launch

  • asahi: Make encoder_allocate public

  • asahi: Add data structures for geometry shaders

  • asahi: Add helpers for lowering GS

  • asahi: Add GS lowering pass

  • asahi: Wire up geometry shaders

  • asahi: Advertise geometry shaders

  • asahi: rm unused deqp debug flag

  • asahi: Don’t use OpenGL clip bit

  • asahi: Plumb clip_halfz bit from RS

  • asahi: Advertise ARB_clip_control

  • asahi: Implement timer queries

  • docs: Mark timer queries as done on asahi

  • asahi: Implement ARB_base_instance

  • nir: Simplify nir_alu_instr_channel_used definition

  • nir/validate: Optimize ssa_srcs set

  • nir/validate: Don’t spam nir_alu_instr_channels

  • nir/validate: Don’t validate out-of-bounds channels

  • nir/validate: Use unlikely for validate_assert

  • nir/validate: Don’t check dimensions in validate_def

  • nir/validate: Drop stale todo

  • nir/validate: Inline validate_ssa_src

  • nir/validate: Split out validate_sized_src

  • nir/validate: Specialize if source validation

  • panfrost: Add an allow_rotating_primitives() helper

  • panfrost: Factor out vertex attribute stride calculation

  • panfrost: Add panfrost_get_{position,varying}_shader() helpers

  • gallium: add pipe_shader_from_nir helper

  • radeonsi: use pipe_shader_from_nir

  • v3d: use pipe_shader_from_nir

  • asahi: use pipe_shader_from_nir

  • vc4: use pipe_shader_from_nir

  • zink: use pipe_shader_from_nir

  • nouveau: use pipe_shader_from_nir

  • panfrost: use pipe_shader_from_nir

  • gallium: drop pipe_shader_state_from_nir

  • mesa/st: collapse tgsi deadcode

  • mesa/st: use pipe_shader_from_nir

  • nir/lower_tex: Add 1D lowering

  • agx: fix 1D texture sampling

  • ac,radv,radeonsi: use common 1D texture lowering

  • nir/format_convert: handle clamping smaller bit sizes

  • nir/lower_idiv: Optimize idiv sign calculation

  • agx: Hotfix for stack_adjust in GS

  • asahi/decode: Decode multiple macOS commands

  • asahi: Quiet clang warning

  • asahi: Add half float type to genxml

  • asahi: Add XML for hw tessellation

  • asahi: Identify Primitive ID frag input

  • asahi: Identify bicubic filtering mode

  • asahi: fix index bias with GS/XFB

  • asahi: Sync heap size

  • asahi: init clear colour between batches

  • asahi: clamp clear colours

  • asahi: handle self blits

  • asahi: bump limits

  • asahi: remove bogus assertion

  • asahi: be robust about null xfb

  • asahi: fix dirty tracking fail with point sprites

  • asahi: handle null PBE

  • asahi: Be robust with arrays of images

  • asahi: fix imageSize of null image

  • asahi: rm compact image atomic descriptors

  • asahi: use 2D descriptors for cubes

  • asahi: defer texture packing to draw-time

  • ail: handle >4GiB textures

  • asahi: return GL_OOM for excessive image sizes

  • asahi: fix meta usc builder allocation

  • asahi: implement xfb stream queries

  • asahi: fix output to non-rast streams

  • asahi: bump glsl version

  • asahi: minify when blitting for transition

  • asahi: blit with the old format when transitioning

  • asahi: flush before resource transition

  • agx: Fix flatshading of matrices

  • asahi: fix xfb of pointsize when not drawing points

  • asahi: defeature quads

  • asahi: Rotate tri fans based on provoking vtx

  • asahi: use GS for first-provoking fans

  • asahi: Early out for GS + rast discard

  • asahi: Implement draw parameters

  • agx: wire up texture_samples/image_samplers

  • asahi: advertise ARB_shader_texture_image_samples

  • asahi: fix layout transitions with arrays

  • asahi: use correct target packing PBE

  • asahi: choose staging bind better

  • asahi: fix destroy_query leaving dangling references

  • asahi: add agx_push macro

  • asahi: collapse unreachable condition

  • asahi: use agx_push

  • asahi: remove dead declarations

  • asahi: rm unnecessary uniform upload for GS

  • asahi: make UB easier to see

  • asahi: force GS for indirect prim gen query

  • asahi: rework GS input assembly

  • asahi: Implement multidraw indirect

  • asahi: move heap alloc to first use

  • asahi: double depth bias

  • asahi: add static assert

  • agxdecode: fix stack smash with border colour

  • asahi: Support L/A/I formats for texture buffers

  • asahi: fix tri fan enum

  • asahi: rework cf binding xml

  • asahi: add xml for flatshading fans

  • agx: fix VARYING_SLOT_COL0 getting flatshaded

  • agx: Avoid scratch mem with tri strip w/ adjacency

  • agx: rework libagx linking a bit

  • asahi: Unroll GS/XFB primitive restart on the GPU

  • asahi: Lower edge flags

  • asahi: assert hw invariant

  • asahi: rewrite pointsize handling

  • agx: remove spurious z/s writes in force early-z shaders

  • agx: handle force early-z + discard

  • agx: note that sample_mask runs occlusion queries

  • agx: allocate varying slot if writing viewport only

  • agx: report if we have a nonzero viewport

  • asahi: allow empty scissor box

  • asahi: add XML for multiple viewports

  • asahi: Implement ARB_viewport_array

  • asahi: handle some components/offsets in GS lowering

  • asahi: prepare gs copy shaders for compact clip/cull

  • asahi: handle compact clip/cull in gs component gather

  • asahi: Implement ARB_cull_distance

  • asahi: add more BGR formats

  • asahi: fix dupe rgb65 formats

  • asahi: fix pbe swizzling

  • asahi: fix integer RT clamping

  • agx: fix fp64 lowering options

  • agx: Lower 64-bit I/O to 32-bit

  • agx: don’t produce split of immediate

  • asahi: fix size calculation for 2d msaa arrays

  • asahi: allow more format reinterpretation

  • asahi: respect render condition for compute

  • asahi: wire up hardware gl_PrimitiveID

  • asahi: clamp draw count for mdi

  • gallium: fix util_clamp_color type confusion

  • gallium: add PIPE_IMAGE_ACCESS_DRIVER_INTERNAL

  • nir/validate: allow bias on nir_texop_lod

  • asahi: Implement lod queries

  • vtn: fuse OpenCL mad if we can

  • asahi: fix eMRT + background load interaction

  • ail: add is_level_compressed query

  • ail: use is_level_compressed

  • ail: add ail_is_level_twiddled_uncompressed

  • asahi: do not use compression blits for uncompressed levels

  • agx: allow bindful arrays if not clamping

  • asahi: don’t format convert with staging blits

  • asahi: implement arrays as 2d for internal images

  • asahi: respect last_block

  • asahi: allow compressed image stores in blits

  • asahi: fix image_mask with unbind num trailing

  • asahi: add compute blitter

  • asahi: add and use batch_is_compute helper

  • asahi: fix get_batch with compute batches

  • asahi: allow multiple compute dispatches in a batch

  • asahi: drop custom mipmap generate

  • asahi: set data_valid on first draw

  • asahi: fix data valid tracking

  • asahi: reduce transfer map flushing with staging blits

  • asahi: do not stall for writers with invalid mips

  • asahi: implement blit-based resource_copy_region

  • asahi: fix snorm staging blits

  • asahi: use copy region for decompression

  • asahi: fix scissor arrays

  • asahi: disable compute-based blitter for now

  • agx: use more mem->tex barriers even on g13g

  • agx: fix early-z + discard together

  • asahi: fix set_sampler_views

  • asahi: fix max tex sizes

  • agx: optimize fcmp like fcmpsel

  • agx: wire up some ballots

  • agx: lower votes to ballots

  • agx: implement query_levels

  • agx: skip scoreboard bit in builder for !wait

  • agx: make vec widths explicit in IR

  • agx: validate post-RA

  • agx: rm silly todo

  • agx: rm outdated comment

  • agx: add index size helper

  • agx: trust in agx_index size

  • agx: mv agx_read/write_regs to validator

  • agx: use custom assert when packing

  • agx: use mov imm for pcopies

  • agx: allow phis with 16bit imms

  • agx: prepare for immediates in phis

  • agx: handle imm inlining into phis

  • asahi: rework compute emptiness tracking

  • asahi: stub qbo on the cpu

  • asahi: implement xfb overflow queries

  • agx: const fold after discard lowering

  • agx: fix xfb of invalid comp

  • agx: fix xfb of invalid var

  • asahi: bump vertex shader outputs

  • asahi: rm pointless multisample key bit

  • asahi: rm layered bit from shader key

  • asahi: implement point sprites w/o shader key

  • asahi: rm unused blend enable bit

  • asahi: rm logicop enable bit

  • asahi: rm nr_cbufs from key

  • asahi: rm blend->store from shader key

  • asahi: rm vbuf.count from key

  • asahi: rm agx_vbufs wrapper

  • asahi: invert program_point_size

  • asahi: divide by xfb stride for xfb draws

  • asahi: disable fp16 cbuf cap

  • asahi: add missing GS line strip (+adj) handling

  • asahi: link libagx before lowering mem access widths

  • asahi: cl-ify some xfb logic

  • asahi: factor out libagx_map_vertex_in_tri_strip

  • asahi: rotate xfb’d tri strips

  • asahi: inline something silly

  • asahi: plumb get_ubo_size

  • asahi: make txf robust properly

  • asahi: fix passthrough GS with poly modes

  • asahi: add missing tib alignment check

  • agx: optimize split(64-bit uniform)

  • agx: expand agx_index

  • agx: fix 64-bit phis with inlined immediates

  • agx: add unit test for pcopy lowering bug

  • agx: require min alignment for load/store vectorize

  • asahi: fallback some resource copies

  • asahi: don’t canonicalize nans/flush denorms when copying

  • agx: unit test split uniform opt

  • agx: clang-fmt

  • nir,zink: Redefine flat_mask in terms of I/O locations

Andrew Gazizov (4):

  • venus: Add use_guest_vram capset to enable guest-based blob alloc

  • venus: Use vk_object_id as blob_id for guest_vram device memory alloc

  • venus: Tighten the conditions for guest_vram device memory alloc

  • venus: Make sure that guest allocated blobs from hostmem are mappable

Anthony Roberts (1):

  • glsl: Use unsigned instead of enum type in ir_variable_data

Antoine Coutant (1):

  • clc: retrieve libclang path at runtime.

Antonio Gomes (14):

  • rusticl, meson: Move libc functions to their own crate

  • rusticl, meson: Add gl/egl/glx bindings

  • iris: Fixups in resource_get_handle and resource_from_handle

  • mesa/st: Add new data to mesa_glinterop

  • mesa/st, dri2, wgl, glx: Modify flush_objects interop func to export a fence_fd

  • rusticl: Add xplat helpers to dynamic link interop functions

  • rusticl/device: Function to check for gl interop support

  • rusticl/device: Enable gl_sharing only if create_fence_fd is implemented

  • rusticl: Add functions to create CL ctxs from GL, and also to query them

  • rusticl/format: Add conversion table for GL->CL

  • rusticl: Create CL mem objects from GL

  • rusticl: Add support for cube maps

  • rusticl: Flush objects just before importing them

  • rusticl: Advertise cl_khr_gl_sharing extension

Anuj Phogat (1):

  • intel/l3: Adjust URB weight calculation for gfx12.5+.

Asahi Lina (12):

  • asahi: Fix CDM Launch/Barrier naming

  • asahi: Add extra CDM barrier bit for G13X

  • asahi: Move USC cache flush to agx_batch_init_state

  • asahi: Add more memory barrier opcodes

  • asahi: Add extra barrier for texture atomics on G13X

  • ail: Fix miptree offset generation for compressed textures

  • ail: Add explicit specification of mip level strides

  • ail: Fix tile size & strides for compressed textures

  • asahi: Add .editorconfig for CL files

  • asahi: Implement BO alignment

  • agx: Fix packing of stack map/unmap

  • agx: Add scoreboarding to stack instructions

Bas Nieuwenhuizen (11):

  • radv: Add DGC preprocessing barrier support.

  • radv: Add compute DGC preprocessing support.

  • radv: Add some initial graphics DGC preprocessing support.

  • radv: Add implementation of cmd buffers for a sparse binding queue.

  • radv: Remove the sparse binding queue from coherent images.

  • radv: Move sparse binding into a dedicated queue.

  • nir: Add nir_static_workgroup_size helper.

  • nir: Add pass for clearing memory at the end of a shader.

  • radv: Add option to clear LDS at the end of a shader.

  • radeonsi: Add support to clear LDS at the end of a shader.

  • radv: Use correct writemask for cooperative matrix ordering.

Benjamin Lee (14):

  • nak: make sm available in builders

  • nak: Legalize a bunch of instructions for SM50

  • nak: add IADD instruction for SM50

  • nak: implement ST* and LD* on SM50

  • nak: add ATOM{G,S} encoding for SM50

  • nak: add carry register file

  • nak: move iadd64 construction to a builder method

  • nak: use carry register file for IADD2

  • nak: make as_imm_not_{i,f}20 helper methods public

  • nak: implement SHL and SHR on SM50

  • nak: implement IMUL for SM50

  • nak: encode Dst::None as RZ on SM50

  • nak: implement SHFL on SM50

  • nak: implement VOTE on SM50

Boris Brezillon (74):

  • pan/genxml: Fix “{Last,First} Heap Chunk” field position

  • panfrost: Fix format_minimum_alignment() for v6-

  • pan/bo: Make sure we catch refcnt underflows

  • pan/genxml: Fix ‘Shader Program’ descriptor definition on v9 and v10

  • pan/decode: Print the resource table label

  • pan/decode: Make CSF decoding more robust to NULL pointers

  • pan/decode: Fix the pan_unpack() call for JUMP instruction unpacking

  • panfrost: Flag the right shader when updating images

  • panfrost: Kill unused panfrost_batch::polygon_list field

  • panfrost: Emit attribs in panfrost_update_state_3d() on bifrost/midgard

  • panfrost: Emit image attribs for compute in panfrost_update_shader_state()

  • panfrost: Rename panfrost_vtable::context_init

  • panfrost: Inline pan_emit_tiler_heap()

  • panfrost: Inline pan_emit_tiler_ctx()

  • panfrost: Count draws at the batch level

  • panfrost: Express the per-batch limit in term of draws

  • panfrost: Count the number of compute jobs at the batch level

  • panfrost: Make panfrost_has_fragment_job() public

  • panfrost: Stop using the scoreboard to check the presence of draws/compute

  • panfrost: Store the fragment job descriptor address in the batch

  • panfrost: Emit the fragment job from panfrost_batch_submit()

  • panfrost: Move the panfrost_emit_tile_map() call around

  • panfrost: Get rid of unused in_sync parameter in panfrost_batch_submit[_ioctl]()

  • panfrost: Get rid of the out_sync parameter in panfrost_batch_submit_jobs()

  • panfrost: Get rid of unused fb parameter passed to panfrost_batch_submit_jobs()

  • panfrost: Add a submit_batch() hook to panfrost_vtable

  • panfrost: Store the index pointer in panfrost_batch

  • panfrost: Stop passing vertex attribute arrays around

  • panfrost: Store varying related fields in panfrost_batch

  • panfrost: Use u_reduced_prim() to do the is_line check

  • panfrost: Move JM specific fields to their own struct

  • panfrost: s/panfrost_emit_vertex_tiler_jobs/jm_push_vertex_tiler_jobs/

  • panfrost: Move the JM-specific bits out of emit_fragment_job()

  • panfrost: Rename several job emission helpers

  • panfrost: Factor out the point-sprite shader update logic

  • panfrost: Factor out the vertex count logic

  • panfrost: Re-order things in panfrost_direct_draw()

  • panfrost: Move all JM-specific bits out of panfrost_direct_draw()

  • panfrost: Use batch->tls.gpu to store the compute TLS descriptor

  • panfrost: Move JM-specific bits out of panfrost_launch_grid_on_batch()

  • panfrost: Move JM specific bits out of panfrost_launch_xfb()

  • panfrost: Drop the vertex_count argument passed to panfrost_batch_get_bifrost_tiler()

  • panfrost: Rename panfrost_batch_get_bifrost_tiler()

  • panfrost: s/panfrost_emit_shader/jm_emit_shader_env/

  • panfrost: s/panfrost_emit_primitive/jm_emit_primitive/

  • panfrost: Rename JM-specific batch submission helpers

  • panfrost: s/preload/jm_preload_fb/

  • panfrost: s/init_batch/jm_init_batch/

  • panfrost: Prepare things for the common/JM cmdstream split

  • panfrost: Move JM helpers to their own source file

  • panfrost: Add a JOBX() macro to simplify job-frontend selection

  • panfrost: Fix multiplanar YUV texture descriptor emission on v9+

  • panfrost: Don’t leak NIR compute shaders

  • panfrost: s/pan_scoreboard/pan_jc/

  • panfrost: Rename pan_cs.{c,h} into pan_desc.{c,h}

  • panfrost: Make pan_afbc_compression_mode() per-gen

  • panfrost: Restrict job chain helpers to JM hardware

  • panfrost: Restrict job descriptor emission to JM hardware

  • util/hash_table: Use FREE() to be consistent with the CALLOC_STRUCT() call

  • util/hash_table: Don’t leak hash_u64_key objects when the entry exists

  • util/hash_table: Don’t leak hash_key_u64 objects when the u64 hash table is destroyed

  • panfrost: Abstract kernel driver operations

  • pan/kmod: Add a backend for the panfrost kernel driver

  • panfrost: Avoid direct accesses to some panfrost_device fields

  • panfrost: Avoid direct accesses to some panfrost_bo fields

  • panfrost: Back panfrost_device with pan_kmod_dev object

  • panfrost: Add a VM to panfrost_device

  • panfrost: Back panfrost_bo with pan_kmod_bo object

  • panfrost: Introduce a PAN_BO_SHAREABLE flag

  • panvk: Pass PAN_BO_SHAREABLE when relevant

  • panfrost: Flag BO shareable when appropriate

  • panvk: Fix tracing

  • panvk: Fix access to uninitialized panvk_pipeline_layout::num_sets field

  • panfrost: Clamp the render area to the damage region

Boyuan Zhang (4):

  • gallium/pipe: define hevc max slices number

  • frontend/va: add support for multi slices reflist

  • radeonsi: add new interface to handle multi slice reflist

  • radeonsi/vcn: add new logic for hevc multi slices reflist

Brian King ((MEDIA)) (1):

  • d3d12: Add constraint_set1_flag support

Caio Oliveira (90):

  • anv: Fix leak when compiling internal kernels

  • intel/compiler: Remove unused parameter from brw_nir_adjust_payload()

  • intel/compiler: Take more precise params in brw_nir_optimize()

  • intel/compiler: Remove unused parameter from brw_nir_analyze_ubo_ranges()

  • intel/compiler: Clarify the asserts in nir_load_workgroup_id lowering

  • intel/compiler: Rework opt_split_sends to not rely/modify LOAD_PAYLOAD

  • intel/compiler: Re-enable opt_zero_samples() for Gfx7+

  • intel/compiler: Re-enable opt_zero_samples() in many cases for Gfx12.5

  • intel/compiler: Remove is_tex()

  • intel/compiler: Use linear allocator in parts of brw_schedule_instructions

  • intel/compiler: Remove reference to brw_isa_info from schedule_node

  • intel/compiler: Allocate all schedule_nodes at once

  • intel/compiler: Use array to iterate the scheduler nodes

  • intel/compiler: Add only available instructions to scheduling list

  • intel/compiler: Extract scheduling related basic functions

  • intel/compiler: Cache issue_time information

  • intel/compiler: Remove virtual calls from scheduler

  • intel/compiler: Move FS specific fields to fs_instruction_scheduler

  • intel/compiler: Merge child/latency arrays in schedule_node

  • intel/compiler: Tidy up code in scheduler related to reads_remaining

  • intel/compiler: Move earlier scheduler code that is not mode-specific

  • intel/compiler: Separate schedule_node temporary data

  • intel/compiler: Make scheduler classes take an external mem_ctx

  • intel/compiler: Reuse same scheduler for all pre-RA scheduling modes

  • intel/compiler: Clear up block instructions before re-adding them

  • intel/compiler: Simplify allocation of NIR related arrays

  • intel/compiler: Prefer ctor/dtors in some Google Tests

  • intel/compiler: Don’t use fs_visitor::bld in tests

  • intel/compiler: Don’t use fs_visitor::bld in fs_reg_alloc

  • intel/compiler: Don’t use fs_visitor::bld in thread payload classes

  • intel/compiler: Add a few more helpers to fs_builder

  • intel/compiler: Allow dumping CFG to a specific FILE*

  • intel/compiler: Sort lists of succs and preds in CFG dump output

  • intel/compiler: Add a few tests to opt_predicated_break

  • anv/xe2+: Use Region-based Tessellation redistribution

  • iris/xe2+: Use Region-based Tessellation redistribution

  • intel/compiler: Refactor program exit in intel_clc

  • intel/compiler: Use single variable instead of dynarray

  • intel/compiler: Fix memory leaks in intel_clc

  • intel/compiler: Remove the linking step in intel_clc

  • intel/compiler: Remove unused headers

  • intel/compiler: Move NIR emission code to brw_fs_nir.cpp

  • intel/compiler: Make a NIR intrinsic emission functions static

  • intel/compiler: Make more functions in NIR conversion static

  • intel/compiler: Make functions for NIR control flow conversion static

  • intel/compiler: Make setup functions of NIR emission static

  • intel/compiler: Make non-intrinsic NIR conversion functions static

  • intel/compiler: Make NIR atomic conversion functions static

  • intel/compiler: Make NIR resources helpers static

  • intel/compiler: Move nir_ssa_value into a local structure

  • intel/compiler: Move remaining NIR conversion fields to nir_to_brw_state

  • intel/compiler: Stop using fs_visitor::bld field in NIR conversion

  • intel/compiler: Annotate and use nir_to_brw_state::bld

  • intel/compiler: Don’t use fs_visitor::bld in remaining places

  • intel/compiler: Remove fs_visitor::bld

  • intel/compiler: Make fs_visitor not depend on fs_builder

  • intel/compiler: Make fs_builder include fs_visitor and not the other way

  • intel/compiler: Add ctor to fs_builder that just takes the shader

  • intel/compiler: Create and use nir_to_brw() function

  • intel/compiler: Use reference instead of pointer for nir_to_brw_state

  • intel/compiler: Use reference instead of pointer for fs_visitor

  • compiler/glsl: Reduce scope of is_anonymous

  • clover: Remove usage of glsl_type C++ helpers

  • compiler/types: Add a few more helpers to get builtin types

  • intel/compiler: Use C helpers to access builtin types

  • compiler: Remove C++ static member pointers to builtin types

  • intel/compiler: Use glsl_type C helpers

  • r600/sfn: Use glsl_type C helpers

  • nouveau: Use glsl_type C helpers

  • nir: Use glsl_type C helpers

  • mesa: Use glsl_type C helpers

  • lima: Use glsl_type C helpers

  • compiler/types: Add a few more glsl_type C helpers

  • glsl: Use glsl_type C helpers

  • compiler/types: Remove glsl_type C++ helpers

  • compiler/types: Use a typedef for glsl_type

  • intel/cmat: Add pass to lower cooperative matrix to subgroup operations

  • intel/dev: Add cooperative matrix configuration information

  • anv: Implement VK_KHR_cooperative_matrix

  • util: Add a way to set the min_buffer_size in linear_alloc

  • spirv: Use linear_alloc for parsing-only data

  • spirv: Use value_id_bound to set initial memory allocated

  • intel/fs: Only allocate acp_entry if we are adding one

  • intel/fs: Use linear allocator in opt_copy_propagation

  • intel/fs: Use linear allocator in fs_live_variables

  • anv: Don’t print warnings for GRL kernel compilations

  • intel/compiler: Use INTEL_DEBUG=cs to ask for brw_compiler output

  • nir: Disable -Wmisleading-indentation when compiling with GCC

  • ci: Add Werror=misleading-indentation to debian-clang

  • intel/compiler: Fix rebuilding the CFG in fs_combine_constants

Casey Bowman (1):

  • anv: Override vendorID for Diablo IV

Chia-I Wu (14):

  • radv: fix vkCmdCopyImage2 for emulated etc2/astc

  • radv: stop using vk_render_pass_state::render_pass

  • vulkan, tu, pvr: remove vk_render_pass_state::render_pass

  • radv: fix image view extent override for astc

  • radv: minor clean up to image view extent override

  • ac: be careful with stencil_offset override

  • radv: disable TC-compat htile on GFX9 in some cases

  • radv: fix VkDrmFormatModifierProperties2EXT for multi-planar formats

  • radv: fix VkSubresourceLayout2KHR for multi-planar formats with modifiers

  • radv: fix a typo in radv_image_view_make_descriptor

  • radv: fix asserts for radv_init_metadata

  • radv: convert a check in radv_get_memory_fd to assert

  • vk/util: ignore unsupported feature structs

  • Revert “vk/util: ignore unsupported feature structs”

Chris Spencer (7):

  • meson: Add option to ignore artificial Android limitations

  • android.mk: Add option to pass arbitrary parameters to meson

  • anv/android: Only limit advertised Vulkan version in strict mode

  • radv/android: Only limit advertised Vulkan version in strict mode

  • v3dv/android: Only limit advertised Vulkan version in strict mode

  • vn/android: Only limit advertised Vulkan version in strict mode

  • vulkan/android: Only limit advertised extensions in strict mode

Christian Gmeiner (13):

  • agx: Re-index nir defs to reduce memory usage

  • ci/etnaviv: Update ci expectation

  • etnaviv: rs: Call etna_rs_gen_clear_surface(..) when needed

  • etnaviv: Mark etna_rs_gen_clear_surface(..) private

  • docs: Update etnaviv extensions

  • etnaviv: Update headers from rnndb

  • etnaviv: Add static_assert(..) to catch memory corruption

  • isaspec: Add bool_inv type to print inverted bools

  • etnaviv: Add isaspec support

  • etnaviv: disassembler: Switch to isaspec

  • mesa: Drop not used program_written_to_cache

  • nir/opt_peephole_select: handle speculative ubo loads

  • pan/mdg: Use nir_builder for load_sampler_lod_parameters_pan

Colin Marc (1):

  • vulkan video: correctly set SPS VUI bits

Connor Abbott (32):

  • util/rb_tree: Fix editorconfig

  • util/rb_tree: Add augmented trees and interval trees

  • freedreno/ci: Remove minetest trace

  • v3d/ci: Remove minetest trace

  • vk,lvp,tu,radv,anv: Add common vk_*_pipeline_create_flags() helper

  • vk/graphics_state: Support VK_KHR_maintenance5

  • vk/graphics_state, tu: Rewrite renderpass flags handling

  • vk/graphics_state: Support VK_EXT_attachment_feedback_loop_dynamic_state

  • vk/graphics_state: Add vk_pipeline_flags_feedback_loops helper

  • tu: Assume no raster-order attachment access with NULL DS/blend state

  • tu: Fix order of rasterizer_discard check

  • tu: Make sure copies to half-float formats are bit exact

  • tu: Fix getting VkDescriptorSetVariableDescriptorCountLayoutSupport

  • ir3/ra: Don’t swap killed sources for early-clobber destination

  • nir: Add quad vote intrinsics

  • amd: Implement quad_vote intrinsics

  • nir/subgroups: Add option to lower Boolean subgroup reductions

  • amd: Enable boolean subgroup lowering

  • tu: Fix re-emitting VS param state after it is re-enabled

  • tu: Don’t use pipeline layout to emit shared const enable

  • tu: Rework dynamic offset handling

  • tu: Make filling out tu_program_state not depend on the pipeline

  • tu: Move shader linking to tu_shader.cc

  • freedreno/afuc: Handle store instruction on a5xx

  • freedreno/afuc: Add separate “SQE registers”

  • freedreno/afuc: Use SQE registers for call stack

  • freedreno/afuc: Add syntax for pre-increment addressing

  • freedreno/afuc: Decode (sdsN) modifier

  • freedreno: Update more control/pipe registers for a7xx

  • freedreno/afuc: README updates for a7xx

  • freedreno/afuc: Fix gen autodetection for a7xx

  • ir3/legalize: Fix helper propagation with b.any/b.all/getone

Corentin Noël (10):

  • mesa/bufferobj: ensure that very large width+offset are always rejected

  • virgl: fill the array_size value when using PIPE_TEXTURE_CUBE

  • virgl/texture: Align destination box to block depth

  • mesa/ffvs: Use gl_state_index16 in helpers directly

  • gallivm: Initialize indir_index to NULL before use

  • gallivm/lp_bld_nir_aos: Use TGSI instead of PIPE enum

  • mesa: Use a switch for state_iter and be more precise about its type

  • frontends/va: Remove wrong use of ProfileToPipe

  • virgl: Only send the same amount of data than declared in pipe_sampler_state

  • virgl: Assert build_id_note before dereferencing it

Daniel Almeida (33):

  • nak: derive From<OpFoo> for Op through a proc macro

  • nak: make Instr::new() generic

  • nak: compiler: add From<T:Into<Op>> for Instr

  • nak: compiler: replace Instr::new(..) with OpFoo {}.into()

  • nak: Heap-allocate Instrs

  • nak: Do not allocate vectors needlessly in optimization passes

  • nak: add support for floor, ceil and trunc

  • nak: run nir_lower_frexp and nir_opt_algebraic_late

  • nak: more lowerings

  • nak: change ishl data type to I32

  • nak: add support for nir_op_isign

  • nak: Add support for nir_op_bitcount

  • nak: add support for nir_op_bitfield_reverse

  • nak: add support for findmsb,findlsb

  • nak: add support for packhalf2x16_split

  • nak: add support for nir_op_unpack_half_2x16_split_{x|y}

  • nak: add support for atomic cmpxchg on images

  • nak/sm50: rewrite encode_iadd2 to not use encode_alu()

  • nak: sm50: rewrite fsetp to not use encode_alu

  • nak: sm50: Rewrite fmnmx to not use encode_alu

  • nak: sm50: rewrite fmul to not use encode_alu

  • nak: sm50: rewrite fset to not use encode_alu

  • nak: sm50: rewrite iabs to not use encode_alu

  • nak: sm50: convert sel to not use encode_alu()

  • nak: sm50: convert i2f to not use encode_alu()

  • nak: sm50: rewrite encode_f2f to not use encode_alu()

  • nak: convert encode_imad to not use encode_alu()

  • nak: sm50: rewrite encode_popc to not use encode_alu()

  • nak: sm50: rewrite encode_prmt to not use encode_alu()

  • nak: sm50: remove encode_alu() and friends

  • nak/sm50: remove ALUSrc and friends

  • nak/sm50: remove *fmod* calls from iabs

  • nak: sm50: fix ineg legalization

Daniel Schürmann (24):

  • nir/lower_subgroups: optimize reductions with cluster_size == 1

  • nir: optimize open-coded quadVote* directly to new nir_quad intrinsics

  • aco: delete instruction selection for boolean subgroup operations

  • nir: remove info.fs.needs_all_helper_invocations

  • nir/gather_info: add missing wide subgroup operations

  • nir: add info.fs.require_full_quads

  • aco: enable helper lanes if shader->info.fs.require_full_quads

  • amd: rename max_wave64_per_simd -> max_waves_per_simd

  • aco: rename max_wave64_per_simd -> max_waves_per_simd

  • radv: fix number of physical SGPRs on GFX10+

  • aco: remove VCCZ and EXECZ register handling

  • nir/opt_loop: move loop control-flow optimizations into separate pass

  • treewide: replace calls to nir_opt_trivial_continues() with nir_opt_loop()

  • nir: remove nir_opt_trivial_continues()

  • nir: remove redundant passes from nir_opt_if()

  • nir/opt_loop_cf: generalize removal of “trivial” continues

  • aco: fix should_form_clause() for memory instructions without operands

  • aco: form clauses for LDS instructions

  • aco: add new post-RA scheduler for ILP

  • aco: refactor and speed-up dead code analysis

  • nir/opt_move_discards_to_top: don’t schedule discard/demote across subgroup operations

  • nir/gather_info: fix enumeration of wide subgroup intrinsics

  • aco: give spiller more room to assign spilled SGPRs to VGPRs

  • aco/insert_exec_mask: Fix unconditional demote at top-level control flow.

Daniel Stone (7):

  • ci: Try really hard to print final result string

  • ci/radeonsi: Occlusion queries are flaky on stoney

  • ci: Fix trivial typo in ARTIFACTS_BASE_URL

  • panfrost/ci: Remove Vulkan expectations from G57

  • panfrost/ci: Add environment variable to suppress warnings

  • panfrost/ci: Skip broken image copy tests

  • ci: Re-enable Collabora farm

Danylo Piliaiev (15):

  • tu: Fix reading of stale (V)PC_PRIMITIVE_CNTL_0

  • tu/a7xx: Zero out A7XX_VPC_PRIMITIVE_CNTL_0 in 3d blits

  • tu/a6xx: Exclude REG_A6XX_TPL1_UNKNOWN_B602 from reg stomping

  • tu/a7xx: Fix occlusion queries on pre-A740 GPUs

  • tu: Always print startup failure messages

  • tu: Return error when GPU is unsupported

  • freedreno/devices: Support Adreno 725

  • tu: Add a725 workaround dispatch at the start of each cmdbuf

  • freedreno/devices: Separate device definition into base + gen features

  • freedreno,tu,ir3: Pass fd_dev_info into ir3_compiler_create

  • freedreno,tu: Add env vars to modify fd_dev_info

  • freedreno: Add a644 support

  • freedreno/devices: Update a690 magic regs from WSL blob

  • turnip: Disable UBWC for D/S images on A690

  • freedreno: Disable UBWC for D/S images on A690

Dave Airlie (38):

  • vulkan: update video headers

  • vulkan/video: add support for h264 encode to common code

  • vulkan/video: add h265 encode support

  • vulkan/video: add h264 nal enum

  • vulkan/video: add a nal_unit lookup for hevc

  • util: add a bitstream encoder for video stream headers.

  • vulkan/video: add h264 level idc convertor utility

  • vulkan/video: add a h265 level translator.

  • vulkan/video: add h264 headers encode

  • vulkan/video: add h265 header encoders.

  • nak: fix backtrace crash running computeheadless

  • nak: make ipa encoding match the order in codegen gv100

  • nak: do perspective divide for interp none as well

  • nvk/xfb: set correct counter buffer for writing stream out counters.

  • nvk/nil: allow storage on VK_FORMAT_A2B10G10R10_UINT_PACK32

  • nvk: fix transform feedback with multiple saved counters.

  • nvk/nak/xfb: handle skipping properly when setting xfb_attr.

  • nvk: drop unneeded shader type conversion function

  • nvk/nak: fix regression with shf changes on sm70

  • intel/compiler: move gen5 final pass to actually be final pass

  • vulkan/video: drop encode beta checks and rename EXT->KHR

  • gallivm: handle llvm 16 atexit ordering problems.

  • intel/compiler: fix release build unused variable.

  • intel/compiler: revert part of “Move earlier scheduler code that is not mode-specific”

  • llvmpipe: fix caching for texture shaders.

  • gallivm/sample: refactor first/last level handling and use level_zero_only.

  • gallivm/sample: add some num_samples vs level zero only support

  • gallivm/sample: make the load_mip helper useful outside this file.

  • gallivm/lp: reduce size of lp_jit_texture.

  • gallivm/lp: reduce image descriptor size.

  • gallivm/lp: merge sample info into normal info

  • gallivm/lp: move sampler index around to reduce struct

  • lavapipe: bump .maxResourceDescriptorBufferRange

  • intel/compiler: reemit boolean resolve for inverted if on gen5

  • radv: don’t emit cp dma packets on video rings.

  • radv/video: refactor sq start/end code to avoid decode hangs.

  • radv: don’t submit empty command buffers on encoder ring.

  • gallivm: passing fp16_split_fp64 to fp16 lowering.

Dave Stevenson (2):

  • gallium: Add more TinyDRM drivers to the list of kmsro drivers

  • gallium: Add udl (DisplayLink) to the list of kmsro drivers

David Heidelberg (53):

  • ci/docs: add coreutils

  • ci: bump tags

  • ci/zink: reduce premerge testing on a618 to ~ 12 minutes

  • ci: hide Mesa install phase

  • ci: drop clover from release builds and remove rusticl build

  • ci: simplify debian-rusticl-testing definition

  • ci: drop mingw and wine from the x86_64 build container

  • ci: always cleanup pip and cargo leftovers

  • ci: bashify scripts, use arrays

  • ci: drop debootstrap, unused

  • ci/panfrost: run T860 traces as intended (nightly job)

  • ci/venus: reduce pre-merge to fit under 15 min

  • ci/alpine: do not store apk cache

  • ci/wine: move wine configuration into rootfs where is wine available

  • Revert “ci/wine: move wine configuration into rootfs where is wine available”

  • ci/lava: add wine into the amd64 ephemeral container packages

  • ci/zink: restore full premerge testing on Adreno 618

  • ci: fixup section names

  • ci/nouveau: define a kernel and dtb, so we can fetch it from external sources

  • ci: inject gfx-ci/linux S3 artifacts without rebuilding containers

  • ci/zink: disable nheko trace, as it sometimes crashes

  • gitlab: make commit more commit-like formatted

  • ci: tag sanity, rustfmt and clang-format job as a “placeholder” job

  • ci/traces: drop the freedoom-phase2-gl-high.trace

  • ci: disable Anholt farm

  • ci/freedreno: disable a660 as it’s down now

  • Revert “ci/freedreno: disable a660 as it’s down now”

  • ci: bump kernel to 6.6.4

  • docs: drop unused manual optimizations override

  • ci/freedreno: mark unvanquished-lowest trace as flaky and skip

  • ci/freedreno: switch Adreno 630 boards back to 6.4 kernel

  • ci/freedreno: increase fraction for Vulkan testing

  • ci/tu: add another failing pipeline strip draw

  • ci/freedreno: extend timeout for full runs

  • ci/freedreno: re-enable two Adreno 618 tests

  • ci/freedreno: timestamp-get no longer fails on Adreno

  • ci/freedreno: downgrade a618_piglit to 6.4 kernel

  • ci/freedreno: fail introduced by ARB_post_depth_coverage

  • rusticl: add freedreno alias for RUSTICL_ENABLE

  • ci/freedreno: more issues showed up on a618, let’s use 6.4

  • ci/austriancoder: separate HW definition from SW

  • ci/freedreno: downgrade whole Adreno 6xx series, incl. zink-a618 jobs

  • ci/broadcom: separate HW definition from SW

  • ci: skip EGL functional color_clears tests for Wayland

  • ci/lava: separate HW definitions from SW

  • ci/google: re-enable farm

  • ci/zink: update piano trace

  • ci/radeonsi: disable VA-API testing on raven

  • ci: enable ci-deb-repo for libdrm 2.4.119 (and others in the future)

  • ci/alpine: update to latest to get libdrm 2.4.119

  • ci: bump Fedora and Android libdrm2 to 2.4.119

  • ci/rootfs: add libdrm also inside the rootfs

  • ci/deqp: uprev deqp-runner for Linux too to 0.18.0

David Rosca (19):

  • frontends/va: Map decoder and postproc surfaces for reading

  • radeonsi/vce: Implement destroy_fence vfunc

  • radeonsi/uvd: Implement destroy_fence vfunc

  • radeonsi/uvd_enc: Implement destroy_fence vfunc

  • radeonsi/uvd_enc: Fix leaking session info buffer

  • Revert “radeon/radeon_vce: fix out of target bitrate in CBR mode (H.264)”

  • radeonsi/vce: Tweak motion estimation params for better quality

  • radeonsi/vce: Add VUI parameters in output bitstream

  • radeonsi/uvd_enc: Add VUI parameters in output bitstream

  • radeonsi: Fix offset for linear surfaces on GFX < 9

  • gallium/auxiliary/vl: Fix coordinates clamp in compute shaders

  • gallium/auxiliary: Fix coordinates clamp in util_compute_blit

  • gallium/auxiliary/vl: Scale dst_rect x0/y0 when rendering chroma plane

  • gallium/auxiliary/vl: Support interleaved input in deinterlace filter

  • Revert “frontends/va: Alloc interlaced surface for interlaced pics”

  • gallium/auxiliary: NIR blit_compute_shader

  • gallium/auxiliary/vl: NIR compute shaders

  • util/rbsp: Fill bits twice if reading more than 16 bits

  • radeonsi/vcn: Fix H264 slice header when encoding I frames

Dennis Bonke (1):

  • mesa: add managarm support

Dmitry Baryshkov (9):

  • freedreno/regs/mdp_common: change BPC1 -> BPC4

  • freedreno/regs/mdp_common: fix BPC comments

  • freedreno/regs: add mdp_fetch_mode enum

  • freedreno/drm: fallback to default BO allocation if heap alloc fails

  • ir3: fix shift amount for 8-bit shifts

  • ir3/a6xx: fix ldg/stg of ulong2 and ulong4 data

  • freedreno/drm: notify valgrind about FD_BO_NOMAP maps

  • freedreno/drm: don’t crash in heap allocator when run under valgrind

  • freedreno/drm: don’t crash for unsupported devices

Dudemanguy (1):

  • vulkan/wsi/wayland: fix wl_event_queue memory leak

Dylan Baker (3):

  • docs: add release notes for 23.2.1

  • docs: Add sha256 sum for 23.2.1

  • meson: add wrap for libdrm

Echo J (2):

  • nvk: Set HOST_CACHED_BIT for the GTT type

  • vulkan: Remove nonexistent output in vk_synchronization_helpers target

Eric Engestrom (236):

  • VERSION: bump to 24.0

  • docs: reset new_features.txt

  • docs: update calendar for 23.3.0-rc1

  • ci/rpi4: group all spec@ext_image_dma_buf_import@ext_image_dma_buf_import-sample_* together

  • ci/rpi4: add spec@ext_image_dma_buf_import@ext_image_dma_buf_import-sample_yvyu to the list of known failures

  • ci/zink+radv: add another flake on polaris

  • ci: drop confusing fake `rules`, `if` and `when` on the list of rules strings

  • docs/ci: allow sanity job to be missing

  • ci: don’t run sanity in Marge pipelines

  • ci: add `.never-post-merge-rules` to avoid re-running pre-merge jobs after merging

  • broadcom: use `.never-post-merge-rules` for all rpi tests

  • ci/radeonsi: add another flake

  • rpi4/ci: add more known dEQP-EGL.functional.*.*_context.gles*.other failures

  • rpi4/ci: move `spec@!opengl 1.1@depthstencil-default_fb-drawpixels-24_8 samples=2` from fails for flakes after an UnexpectedPass

  • rpi4/ci: remove `spec@!opengl 1.1@depthstencil-default_fb-drawpixels-32f_24_8_rev samples=2` from fails as it’s a flaky test and already marked as such

  • Revert “ci: backport two mesh/task query fixes for VKCTS”

  • ci/build-deqp: stop ignoring failures while fetching patches

  • ci/build-deqp: split deqp version into a variable

  • ci/build-deqp: move mkdir earlier

  • ci/build-deqp: print more detailed information about what deqp version is running

  • ci: bump image tags to rebuild deqp

  • ci/rules: add missing clang-format files to what needs containers to build

  • broadcom/ci: merge gl test lists to use a single deqp instance

  • broadcom/ci: fix list indentation

  • broadcom/ci: split broadcom-common manual rules to .broadcom-common-manual-rules

  • vc4/ci: add manual variant of .vc4-rules

  • v3dv/ci: add manual variant of .v3dv-rules

  • v3d/ci: add “full run” variant of v3d-rpi4-gl:arm64 as a manual job

  • v3dv/ci: add “full run” variant of v3dv-rpi4-vk:arm64 as a manual job

  • vc4/ci: add piglit “full run” variant of vc4-rpi3-gl:arm32 as a manual job

  • rpi4/ci: skip more timing out tests in the dEQP-VK.ssbo.layout.* group

  • zink+radv/ci: simplify deqp config

  • zink+radv/ci: ensure renderer is “zink on radv”

  • ci: restore sanity (aka. Revert “ci: don’t run sanity in Marge pipelines”)

  • gitlab_gql: strip newline at the end of the token file

  • ci_run_n_monitor: compile target_jobs_regex only once

  • ci/gitlab_gql: stop re-compiling regex now that all users pre-compile it

  • v3d/ci: run manual jobs in daily pipeline

  • radeonsi/ci: document new failures and flakes

  • ci: disable lima farm as it appears to be down

  • radv/ci: add navi21 flakes

  • radv/ci: add vega10 flakes

  • radv/ci: add polaris10 flakes

  • radv+zink/ci: add polaris10 flakes

  • radv+zink/ci: add navi10 flakes

  • bin/gitlab_gql: resolve sha locally to be able to use things like `HEAD`

  • gitlab_gql: make `--rev` optional, defaulting to `HEAD`

  • bin/gitlab_gql: fix command in example

  • bin/gitlab_gql: only get the pipeline when a pipeline is needed

  • v3d/ci: add new failures

  • bin/gitlab_gql: only allow a single `--print-*` argument per invocation

  • bin/gitlab_gql: rename get_job_final_definition() to print_…() since that’s what it actually does

  • bin/gitlab_gql: deduplicate fetch_merged_yaml() logic between print branches

  • bin/gitlab_gql: give a better name to the --print-job-manifest argument value than PRINT_JOB_MANIFEST

  • ci/valve-infra: ensure the correct farm picks up the job

  • docs: update calendar for 23.3.0-rc{2,3,4} and add another release candidate

  • util/xmlconfig: drop default SYSCONFDIR & DATADIR values

  • lima: drop unused lima_get_absolute_timeout()

  • intel/ci: fix gl/vk dependencies in hsw jobs

  • intel/dev: use libdrm.h wrapper to support builds without libdrm

  • ci_run_n_monitor: require user to add an explicit `.*` at the end if jobs like `*-full` are wanted

  • amd/ci: avoid re-running all the test jobs when changing the expectations for only one of them

  • egl/dri2: increase NUM_ATTRIBS to fit all the attributes

  • asahi: use util_resource_num() instead of open-coding it

  • ci/piglit: specify only the traces file in the job config

  • amd/ci: track changes to the traces config file as well

  • ci: fix kdl commit fetch

  • ci: uprev deqp-runner from 0.16.1 to 0.18.0

  • ci/deqp-runner: turn paths in errors into links

  • docs: update calendar for 23.0.0-rc5

  • docs: add another -rc

  • ci: use released version of meson

  • lp: make sure 0xff is unsigned before shifting it past signed int range

  • intel/perf: fix regex escaping

  • intel/ci: fix .hasvk-manual-rules

  • docs: update calendar for 23.3.0

  • docs/calendar: add 23.3.x releases

  • bin/python-venv: detect python version change

  • ci: disable opengl & gles in debian-vulkan build

  • radv/ci: add navi21-aco flake

  • bin/gen_release_notes: fix regex raw string

  • bin/python-venv: fix venv folder check

  • bin/gen_release_notes: include removed ‘new_features.txt’ in commit

  • docs: add release notes for 23.3.0

  • docs: add sha256sum for 23.3.0

  • docs: fix release date for 23.3.0

  • turnip: fix typo in comment

  • ci_run_n_monitor: allow picking a pipeline by its MR

  • amd/ci: radeonsi is gl, not vk

  • v3dv: update symbols that have become aliases for newer ones

  • v3dv: drop duplicate flag

  • radv: update symbols that have become aliases for newer ones

  • pvr: update symbols that have become aliases for newer ones

  • anv: update symbols that have become aliases for newer ones

  • hasvk: update symbols that have become aliases for newer ones

  • amd/ci: fix yaml indentation

  • amd/ci: split common amd files list from radeonsi files list

  • amd/ci: limit radv jobs to radv + aco files changes

  • nvk: update symbols that have become aliases for newer ones

  • vk/runtime: update symbols that have become aliases for newer ones

  • vk/wsi: update symbols that have become aliases for newer ones

  • vk/util: update symbols that have become aliases for newer ones

  • vk/overlay-layer: update symbols that have become aliases for newer ones

  • venus: update symbols that have become aliases for newer ones

  • venus: fix typo in comment

  • amd/ci: reuse .radeonsi-rules in .radeonsi-vaapi-rules

  • nvk: use `||` instead of `|` between bools

  • radeonsi/ci: update vangogh piglit expectations

  • freedreno/ci: add flake seen on a630

  • freedreno/ci: add more flakes seen on a630

  • freedreno/ci: add more a630 flakes

  • v3d: drop leftover from “move v3d_tiling to common”

  • radeonsi/ci: track changes to `vpelib`

  • turnip: update symbols that have become aliases for newer ones

  • util/blob: fix trivial typo

  • ci: explain what we mean by the various types of pipelines

  • ci: turn comment into code in `sanity` job rules

  • ci: identify merge request pipelines using `$CI_PIPELINE_SOURCE == merge_request_event` instead of `$CI_COMMIT_BRANCH` being missing

  • ci: rename is-pre-merge-for-marge to is-merge-attempt to be clearer

  • ci: drop containers, builds, and tests from post-merge pipeline

  • ci: add pipeline for direct pushes to main

  • ci: give an explicit priority to the scheduled nightly pipelines

  • ci: clean up pre-merge and fork pipelines rules

  • ci: make sure pre-merge pipelines have the same jobs as merge pipelines

  • ci: improve comments

  • ci: take microsoft farm offline

  • ci: fix rules for formatting checks

  • zink/ci: fix yaml indentation

  • zink/ci: use variable to avoid repeating the list

  • zink/ci: expand first (and only) level of folders in the list of files

  • zink/ci: run only the relevant jobs when changing the ci expectations

  • panfrost/ci: fix yaml indentation

  • panfrost/ci: run only the relevant jobs when changing the ci expectations

  • freedreno/ci: fix yaml indentation

  • freedreno/ci: run only the relevant jobs when changing the ci expectations

  • intel/ci: fix yaml indentation

  • intel/ci: deduplicate common intel files rules

  • intel/ci: expand first level of common intel files

  • intel/ci: anv changes should only trigger anv jobs

  • intel/ci: hasvk changes should only trigger hasvk jobs

  • intel/ci: run only the relevant jobs when changing the ci expectations

  • docs/calendar: add 24.0 branchpoint and release schedule

  • etnaviv/ci: fix yaml indentation

  • etnaviv/ci: expand first level of files in src/etnaviv/

  • etnaviv/ci: run only the relevant jobs when changing the ci expectations

  • broadcom/ci: avoid running the rpi4 jobs when changing the rpi3 expectations, and vice-versa

  • vk/update-aliases.py: drop dead --check-only

  • vk/update-aliases.py: allow specifying the files we want to update

  • vk/update-aliases.py: handle “no match” grep call

  • vk/update-aliases.py: sort files when informing the user of the matches

  • vk/update-aliases.py: simplify addition of other concatenated prefixes

  • vk/update-aliases.py: handle more concatenated prefixes

  • vk/update-aliases.py: enforce correct list order

  • vk/update-aliases.py: only apply renames for the vulkan api (not vulkansc)

  • v3dv/ci: only trigger on relevant changes

  • a630/ci: add another flake

  • freedreno/ci: move hang-y a630 jobs from pre-merge to nightly

  • spirv: add missing build dependency

  • ci/b2c: drop passthrough of unset CI_JOB_JWT

  • ci/b2c: stop ignoring errors in before_script

  • ci/b2c: fix indentation of comment and after_script: list

  • ci/b2c: drop unused B2C_EXTRA_VOLUME_ARGS

  • ci/b2c: tags are mandatory

  • ci/b2c: drop support for harbor.freedesktop.org

  • ci/b2c: drop unused --volume and --mount-volume

  • ci/b2c: always define job_volume_exclusions

  • ci/b2c: always define cmdline_extras

  • ci/b2c: use with:write instead of manually doing open;write;close

  • ci/b2c: export B2C_TEST_SCRIPT

  • ci/b2c: use envvars directly instead of converting them back and forth into cli args

  • ci/b2c: import all variables starting with `B2C_`

  • ci/b2c: rename B2C_TEST_SCRIPT to B2C_CONTAINER_CMD to match the automatic import

  • ci/b2c: identify dut by its id instead of its tags

  • docs: add release notes for 23.3.1

  • docs: add sha256sum for 23.3.1

  • docs: update calendar for 23.3.1

  • ci: deduplicate constructing the ARTIFACTS_BASE_URL

  • bin/gitlab_gql: fix --print-merged-yaml when --rev != HEAD

  • bin/gitlab_gql: print merged yaml as yaml instead of a python dict

  • v3d/ci: add flake

  • ci: fix indentation

  • ci: run every test when changing the build

  • docs: drop `:` in title

  • radv/ci: add flake

  • docs: document how to build the docs

  • vulkan/wsi: fix build when platform headers are installed in non-standard locations

  • ci/build: drop redundant meson/build.sh from jobs that already inherit from .meson-build

  • radv/ci: add flake on raven

  • ci: add nvk to the clang build

  • ci: disable collabora farm as it is currently offline

  • ci: fix farm restore pipelines

  • meson: always define {,DRAW_}LLVM_AVAILABLE one way or the other

  • docs: add release notes for 23.3.2

  • docs: add sha256sum for 23.3.2

  • docs: update calendar for 23.3.2

  • meson: update expat wrap

  • meson: update libarchive wrap

  • meson: update libxml2 wrap

  • meson: update zlib wrap

  • meson: use `allow_fallback` instead of manually listing the deps and what they provide

  • ci/containers: use build-libdrm.sh in debian/android

  • Revert “meson: add wrap for libdrm”

  • zink: update symbols that have become aliases for newer ones

  • zink/requirements: update feature and property names that have been promoted

  • docs/backport-mr: fix invalid nested formatting

  • docs: fix list whitespace

  • docs: mention that python package `packaging` is required on python 3.12+

  • lvp: update symbols that have become aliases for newer ones

  • egl: only accept APIs that are compiled in

  • ci: split & reuse debian version identifier

  • ci: convert several `find | xargs` to `find -exec`

  • ci/deqp: set default platform to `default` instead of glx, to also support wayland

  • docs: add release notes for 23.3.3

  • docs: add sha256sum for 23.3.3

  • docs: update calendar for 23.3.3

  • docs: close the 23.2 cycle

  • VERSION: bump for 24.0.0-rc1

  • .pick_status.json: Update to 4fe5f06d400a7310ffc280761c27b036aec86646

  • .pick_status.json: Mark 0557f0d59c5b22a8a934900ddc91f7a6057e146f as denominated

  • ci: make sure we evaluate the python-test rules first

  • .pick_status.json: Update to ff84aef116f9d0d13440fd13edf2ac0b69a8c132

  • .pick_status.json: Update to 10e2dbb63b9d1f8f35c4fc3f570cd19b3fc03b43

  • ci: fix job dependency error in MRs for bin/ci/* scripts

  • VERSION: bump for 24.0.0-rc2

  • ci/deqp: ensure that in `default` builds, wayland + x11 + xcb are all built

  • .pick_status.json: Update to d2b08f9437f692f6ff4be2512967973f18796cb2

  • .pick_status.json: Update to d0a3bac163ca803eda03feb3afea80e516568caf

  • .pick_status.json: Update to 90939e93f6657e1334a9c5edd05e80344b17ff66

  • .pick_status.json: Update to eca4f0f632b1e3e6e24bd12ee5f00522eb7d0fdb

  • VERSION: bump for 24.0.0-rc3

  • .pick_status.json: Update to b75ee1a0670a3207dfd99917e4f47d064a44197f

  • .pick_status.json: Update to 4cd5b2b5426e8d670fc3657eee040a79e3f9df1e

  • util: rename __check_suid() to __normal_user()

  • tree-wide: use __normal_user() everywhere instead of writing the check manually

  • util: simplify logic in __normal_user()

  • util: check for setgid() as well in __normal_user()

Eric R. Smith (1):

  • panfrost: fix panfrost drm-shim

Erico Nunes (6):

  • v3dv: Rework to remove drm authentication for wsi

  • lima/ci: update piglit ci expectations

  • Revert “ci: disable lima farm as it appears to be down”

  • panvk: Support modifiers for Wayland WSI

  • ci: lima farm is down

  • Revert “ci: lima farm is down”

Erik Faye-Lund (34):

  • docs: prepare for hawkmoth

  • docs: remove breathe/doxygen stuff

  • docs: improve readability of c-signatures

  • util: remove unused lut

  • panfrost: allow packing formats outside of pan_format.c

  • panfrost: bypass format-table for null-textures

  • panfrost: pass blendable formats to pan_pack_color

  • panfrost: store blendable_formats in panfrost_device

  • panfrost: look at correct blendable format version

  • panfrost: use perf_debug instead of open-coding

  • mesa/ffvs: use unreachable instead of assert

  • docs: apply permanent redirect

  • panfrost: do not open-code panfrost_has_fragment_job()

  • ci: opt-out panfrost from clang-format

  • panfrost: minify dimensions when converting modifiers

  • util/format: document NONE swizzle

  • lavapipe: do not use NONE-swizzle

  • panfrost: do not handle NONE-swizzle

  • d3d12: do not handle PIPE_SWIZZLE_NONE from sampler-view

  • zink: do not handle PIPE_SWIZZLE_NONE

  • meson: work around meson 0.62 issue

  • mesa/main: remove unused Log2 variants of width/height/depth

  • mesa/main: remove unused ClassID

  • mesa/main: use _mesa_is_zero_size_texture-helper

  • mesa/main: remove unused function

  • mesa/st: use _mesa_is_zero_size_texture-helper

  • zink: update profile schema

  • zink: use KHR version of maint5 features

  • panfrost: document ci failure

  • mesa/st: do not require render-target support for texture-only exts

  • mesa/st: do not check for emulated format

  • mesa: actually check for EXT_color_buffer_float support

  • mesa/main: require EXT_color_buffer_float for ES 3.2

  • mesa: check for float-format support

Etaash Mathamsetty (1):

  • driconf: add a workaround for Rainbow Six Siege

Faith Ekstrand (663):

  • nir: Add a lower_first_invocation_to_ballot option to lower_subgroups

  • nir: Add a lower_read_first_invocation option to lower_subgroups

  • nir/lower_bit_size: Fix subgroup lowering for floats

  • nir/lower_bit_size: Handle vote_feq/ieq separately

  • nir/lower_bit_size: Use u_intN_min/max()

  • nir: Split nir_lower_subgroup_options::lower_vote_eq into two bits

  • nir: Return b2b ops from nir_type_conversion_op()

  • nir/lower_bit_size: Use b2b for boolean subgroup ops

  • nir: add deref follower builder for casts.

  • nir: Handle wildcards with casts in copy_prop_vars

  • nir: Use nir_builder to insert movs

  • nir: Add asserts to nir_phi_builder_value_set_block_def

  • vc4: Stop assuming glsl_get_length() returns 0 for vectors

  • v3d: Stop assuming glsl_get_length() returns 0 for vectors

  • nir/lower_io_to_vector: Only call glsl_get_length() on arrays

  • nir/types: Support vectors in glsl_get_length()

  • nir: Handle array-deref-of-vec in vars_to_ssa

  • nir: Handle array-deref-of-vec in var split passes

  • nir/validate: Allow array derefs on vectors on function/shader_temp

  • nvk: Force all mappable BOs into GART pre-Maxwell

  • nvk: Fix nvk_heap_free() for contiguous heaps

  • nvk: Drop a bogus assert

  • nvk: Assert no storage images on Kepler

  • nir: Optimize boolean ieq/ine with an immediate

  • nouveau: Add initial headers and meson for the new compiler

  • nak: Copy the optimization loop from Intel

  • nak: Add a bunch of shader lowering code in NIR

  • nak: Add initial stubs for rust code

  • nvk: Run shaders through NAK

  • nak: Add the core IR

  • nak: Add Rust bindings for NIR

  • nak: Add initial translation from NIR

  • nak: Add a copy-prop pass

  • nak: Add a dead-code pass

  • nak: Add a util library

  • nak: Add a trivial register allocator

  • nak: Add a lowering pass for VEC and SPLIT instructions

  • nak: Add a lowering pass for ZERO sources and destinations

  • nak: Add bitset infrastructure

  • nak: Add encoding for a few instructions

  • nak: Encode program headers

  • nak: Header stuff

  • nak: Lower system values to a new load_sysval_nak intrinsic

  • nak: Implement load_sysval_nv as S2R

  • nak: Implement load_ubo

  • nak: Implement load/store_global

  • nak: Zero out the .w component of descriptors

  • nak: Add an instruction fuzzing tool

  • nak: Implement iadd and ishl

  • nak: Add a pass for computing instruction dependencies

  • nak: Implement 32-bit logic ops

  • nak: Add support for instruction predicates

  • nak: Implement integer comparisons

  • nak: Implement bcsel

  • nak: Rework ALU instruction encode

  • nak/meson: Use bindgen dependencies

  • nak: Add nak_compiler_create/destroy

  • nvk: Pass an actual nak_compiler to nak_compile_shader()

  • nak: Plumb the SM through to nak::Shader

  • nak: Encode load/store correctly on SM80

  • nak: Rework instruction encoding

  • nak: Implement boolean logic ops

  • nak: Lower 8 and 16-bit types

  • HACK: Support old meson

  • nak: Use Instr::num_srcs/dsts() less

  • nak: Get rid of meta instructions

  • meson: Pull in syn from crates.io

  • nak: Add SrcAsSlice and DstAsSlice traits

  • nak: Add a SrcModsAsSlice trait

  • nak: Use a different inner struct type for each opcode

  • nak: Use Src::Zero for load_const(0)

  • nak: Handle zeroes at emit time

  • nak: Implement i2f

  • nak: Implement fadd

  • nak: Rework integer compare ops

  • nak: Implement float comparisons

  • nak: Implement nir_op_b2f32

  • nak: Implement unary float and integer ops

  • nak: Allow iadd3 to take an immediate in srcs[2]

  • nak: Implement fsign

  • nak: Rework ALUSrc in emit code

  • nak: Rework source modifiers

  • nak: One of the predicates in IADD3 is a destination

  • nak: Implement Display for SSAValue

  • nak: Make Dst its own type

  • nak: Add modifier propagation

  • nak: Implement basic control-flow

  • nak: Move nak_compiler to nak_private.h

  • nak: Add a nir_shader_compiler_options to nak_compiler

  • nvk: Pull the NIR options from NAK

  • nak: Implement b2i32

  • nak: Implement iadd64

  • nak: Implement phis

  • nak: Add a union-find implementation

  • nak: Lower global access to scalars as needed

  • nak: Print names of missing instructions

  • nak: Implement unpack_64_2x32_split_*

  • WIP: nak: Rework the barrier assignment pass

  • nak: Add an SSAValueAllocator struct

  • nak: Pass an SSAValueAllocator through to map methods

  • nak: Handle fadd funnyness in the emit code

  • WIP: nak: Add a legalization pass

  • nak: Rename Imm to Imm32

  • nak: Add separate True and False source types

  • nak: Handle phis with non-SSA sources

  • nak: Support both destinations in PLOP3

  • nak: Drop the special cases for single-component vec/split

  • nak: Don’t emit MOVs for overlapping vec and split src/dst

  • HACK: nak: Lower iadd64 again

  • nak: Add a parallel copy instruction with lowering

  • nak: Use OpParCopy for OpVec and OpSplit lowering

  • nak: Get rid of the BitSet and BitSetMut traits

  • nak: Rename BitSetView to BitView

  • nak: Add a BitSet struct

  • nak: Add an SSAComp struct

  • nak: Rework dead-code

  • nak: Rework phis

  • nak: Add a space to the end of vec and split arg lists

  • nak: Add a liveness analysis pass

  • nak: Add a non-trivial register allocator

  • nak: Improve the dependency tracker

  • nak: Handle token re-use in dep tracking

  • nak: Implement nir_op_i(eq|ne) for booleans

  • nak: Fold [P]Lop3 sources

  • nak: Predicates default to true

  • nak: Implement nir_op_[iu](min|max)

  • nak: Implement nir_op_fmul

  • nak: Implement nir_op_(fmin|fmax)

  • nak: Implement nir_op_u2f

  • nak: Implement nir_op_vecN

  • nak: Implement MuFu and a bunch of float unops

  • nak: Move nak_sysval_attr_addr/sysval_idx higher in the file

  • nak: Implement input interpolation

  • nak: Handle multiple vector destinations in RA

  • nak: Use immediate offsets for load/store_global

  • nak: Implement OpFSOut with an OpParCopy

  • nak: Implement f2[iu]32

  • nak: Wire up ffma

  • nak: Add more legalization

  • nak: Implement right-shifts

  • nak: Implement nir_op_[iu]mul[_high]

  • nak: Enable nir_lower_idiv

  • nak: Add a NIR texture lowering pass

  • nak: Use more core NIR texture lowering

  • nak: Wire up texture ops

  • nak: Simplify the FromVariants proc macro

  • nak: Simplify the (Srcs|Dsts)AsSlice proc macro

  • HACK: spirv: Add a MESA_SPIRV_DUMP_PATH environment variable

  • nak: Add a NAK_DEBUG environment variable

  • nvk: Drop printing of NAK shaders

  • nvk: Pass NAK flags through to shader cache UUIDs

  • nak: Add a debug flag to assign worst-case instruction deps

  • nak: Rework vector handling

  • nak: Legalize vector sources

  • nak: Add a use tracker to RA

  • nak: Much more believable try_find_unused_reg_range()

  • nak: Implement nir_op_[iu]mul_2x32_64

  • Revert “HACK: nak: Lower iadd64 again”

  • nak: Implement nir_op_ixor

  • nak: Implement undef instructions

  • nak: Implement image load/store

  • nak: Wire up OpLd and OpSt for local and shared

  • nak: Implement nir_intrinsic_load/store_scratch

  • nak: Add a smarter new_lop2 helper

  • nak: Improve RA failure messages

  • nak: Legalize OpShf

  • nak: Only put actually live SSA values in the ra.live_in sets

  • nak: Legalize more stuff

  • nak/nir: Lower image size and samples to txq

  • nak: Improve [FI]SETP encoding

  • nak: Legalize Op[FI]Setp

  • nak: Don’t allow r255 in texture or surface ops

  • nak: sin() and cos() require we divide by 2pi

  • nak: Add F2F and implement fquantize16

  • nak: Implement barriers

  • nvk: Plumb num_barriers through from NAK

  • nak: Implement load/store_shared

  • nak: Integers don’t have abs() source modifiers

  • nak: Add a mechanism for decorating sources with types

  • nak: Decorate sources with types

  • nak: Only divide FS inputs by .w for smooth interpolation

  • nak: Rework source modifiers a bit

  • nak: Add a Src::supports_src_type() helper

  • nak: Rework copy-prop to use source type decorations

  • nak: Implement nir_intrinsic_global_atomic_*

  • nak: Implement nir_intrinsic_shared_atomic_*

  • nak: Implement global/shared_atomic_comp_swap

  • nak: Implement image atomics

  • nak: Fix the 2nd predicate on LOP3

  • nak: Optimize OpLop3 and OpPLop3

  • nak: DCE things with constant false predicates

  • nak: Rework source modifiers instructions a bit

  • nak: Fold fsat into FAdd/FFma/FMul

  • nak: Delete unused imports and dead code

  • nak: Add accum predicates to Op[FI]Setp

  • nak: Add a Pred struct and move the enum to PredRef

  • nak: Fix multisampled texturing

  • nak: Legalize everything

  • nak: Rework cbufs a bit

  • nak: Implement indirect UBO loads

  • nak: Implement nir_op_b2b1 and nir_op_b2b32

  • nak: Follow memcpy semantics with OpParCopy

  • nak: Work in terms of bits for type sizes

  • nak: Add a builder

  • nak: Use the builder in some lowering passes

  • nak: Compute liveness in reverse block order

  • nak: Rework liveness to add next-use information

  • nak: Add a PerRegFile helper struct

  • nak: Record register pressure in liveness

  • nak: Initialize RA with only live registers

  • nak: Use num_regs instead of max_reg in RA

  • nak: Use pcopy.push() in RA

  • nak: Rework RA a bit

  • nak: Add some documentation for SSA values

  • nak: Print to stderr

  • nak/ra: Pass a PerRegFile num_regs into the allocator

  • nak: Allocate the minimum number of GPRs.

  • nak: Separate the CFG from liveness

  • nak: Break guts of liveness into traits

  • nak: Require Rust 1.70.0

  • nak: Handle dead destinations in RA

  • nak: Make calc_max_live a function of the Liveness trait

  • nak: Bring back bitset-based liveness

  • nak: Add num_gprs and tls_size to Shader

  • nak: Accurately set num_gprs

  • nak: Add a RegFileSet struct

  • nak: Add more SSA iterator options

  • nak: Add a new VecPair type

  • nak/nir: Add more helpers

  • nak: Emit if branches in the predecessor block

  • nak: Add a more awesome CFG data structure

  • nak: Store the blocks in the CFG

  • nak: Base liveness on CFG indices

  • nak: Add loop detection to the CFG

  • nak: Add a phi allocator

  • nak: Refactor nak_assign_regs a bit

  • nak: Use u32 for register indices

  • nak: Rework map_instrs()

  • nak: Add a new OpCopy instruction for parallel copy lowering

  • nak: Use the builder for the legalize pass

  • nak: Use OpCopy in legalize

  • nak: Use more OpCopy

  • nak: Add a Mem register file

  • nak: Handle RegFile::Mem in parallel copy lowering

  • nak: Allow DCE on functions

  • nak: Restructure liveness construction

  • nak: Add interference helpers

  • nak: Add a dominance check to CFG

  • nak: Add helpers to BasicBlock to get phis

  • nak: Add a to-CSSA pass

  • nak: Add an SSA repair pass

  • nak: Union find

  • nak/ra: Drop the pointless AssignRegs struct

  • nak/ra: Handle parallel copies as a special case

  • nak/ra: Don’t free killed for OpPhiSrcs

  • nak: Expose LiveSet for incremental liveness tracking

  • nak: Add a RegFileSet filter to NextUseLiveness::for_function()

  • nak: Add more NextUseLiveness helpers

  • nak: Add a spilling pass

  • nak: Use the correct number of GPRs on Turing+

  • nak: Spill registers before RA

  • nak: Add a debug flag to test spilling

  • nak: Implement shader clock

  • nak/ra: Improve coalescing

  • nak/spill: Tweak the construction of S sets

  • nak: Document spilling and RA

  • nak: Add an alloc_vec() to SSAValueAllocator

  • nak: Move all the IADD3 insanity to a new OpIAdd3X opcode

  • nak/legalize: Fix too many IADD3 source modifiers

  • nak: Disable lower_image_size_to_txs for NAK

  • nak: IMAD also has a destination predicate

  • nak: Remap GLSL_SAMPLER_DIM_SUBPASS and SUBPASS_MS to 2D and MS

  • nak: Fix instruction ordering in nak_ir.rs

  • nak: Rename OpBFind to OpFlo

  • nak: Implement Index[Mut] for RegTracker

  • nak: Use the right number of predicates in RegTracker

  • nak: Rework the barrier insert pass

  • nak: Rework calc_delay.rs

  • nak: Re-work Instr::get_latency()

  • nak: Emit FS_OUT before EXIT

  • nvk: Use sysvals for fragcoord etc. with NAK

  • nak: Handle flat FS inputs

  • nak: Add support for centroid and sample interp modes

  • nak: Use load_interpolated_input for frag_coord

  • nak: Properly handle OpFSOut in RA and liveness

  • nak: Handle empty OpFSOut

  • nak/nir: Several FS output fixes

  • nak: Implement load_sample_id and load_sample_mask_in

  • nak: Implement discard and demote

  • nak: Set TLS size properly in the shader header

  • nvk,nak: Plumb through the zs_self_dep key bit

  • nak: Use count_attribute_slots for FS input var sizes

  • nak: Pull sm, num_gprs, and tls_size into a ShaderInfo struct

  • nak: Stash a ShaderInfo in ShaderFromNir

  • nak: Rework FS outputs again

  • nak: Re-plumb compute shader info

  • nak: Plumb more FS info through to the C API

  • nvk/nak: Translate our new FS flags from NAK to nvk_shader

  • nak: Saturate depth writes

  • nak: Add support for gl_FrontFace

  • nak/nir: Fix helper invocations

  • nak/nir: Use nir_shader_intrinsics_pass for FS inputs

  • nak: Handle interpolate_at_offset

  • nak: Take components into account in load_*input

  • nak: Plumb uses_kill through from nak_from_nir

  • nak/nir: Plumb the FS key into lower_fs_input_intrin

  • nak/nir: Move frag_coord/sample_pos lowering to FS input lowering

  • nak/nir: Fix sample vs. pixel input interpolation

  • nak/nir: Add a load_frag_w helper

  • nak/nir: Interpolate gl_PointCoord

  • nak/nir: Return one sample for gl_SampleMaskIn[0] when sample shading

  • nak: Fold source modifiers in legalize

  • nak: Provide more detail when printing IR after passes

  • nak: Handle modifiers in dedup_srcs() in opt_lop()

  • nvk: Add a helper for lowering system values to root table loads

  • nvk: Lower more draw system values

  • nak: Take component into account in store_output

  • nak: Fix printing of OpASt

  • nak: Move NIR enum translation out of nak_sph.rs

  • nak: rustfmt fixes

  • nak: Simplify I/O gathering

  • nvk: Set clip/cull_enable for NAK shaders

  • nak: Run simple liveness data-flow bottom-up

  • nak/bitset: Add a helper for modifying in-place

  • nak: Don’t allocate bitsets in liveness data-flow

  • nak: Handle non-constant I/O offsets

  • nouveau/parser: Dump SET_STREAM_OUT_CONTROL_* properly

  • nak: Translate XFB info

  • nvk: Plumb through XFB info from NAK

  • nak: Add a Label struct for branch targets

  • nak: Add OpNop which can have a label

  • nak: Break indirect offset encoding into a helper

  • nak: Allow encoding Dst::None

  • nak: Add barrier instructions

  • nak/builder: Return the instruction from push_*()

  • nak: Implement NIR control barriers

  • nak: Implement From for SrcRef for more types

  • nak: Add enums for sysvals and attributes

  • nak: Plumb clip/cull enables through nak

  • nak/nir: Lower tessellation and geometry I/O

  • spirv: Fix locations for per-patch varyings

  • nak: NVIDIA calls them tessellation init shaders

  • nak: Rework OpALd and OpASt a bit

  • nak: Set per patch attribute count both places in the SPH

  • nak: Handle location_frac for FS outputs in nak_from_nir.rs

  • nak: Add lowering for per-vertex I/O

  • nak: Implement more attribute I/O

  • nak/nir: Lower load_primitive_id

  • nak,nvk: Plumb through tessellation info

  • nak: Implement load_tess_coord

  • nak: Fix lowering for patch_vertices_in

  • HACK: Only emit OpBar in compute shaders

  • nak/nir: Use count_vec4_slots instead of count_attribute_slots

  • nak: Add NIR lowering for attribute I/O

  • nak/nir: Lower system values before lowering I/O

  • nak: Use nak_nir_lower_vtg_io

  • nak: Fix a bunch of warnings

  • nak: Fix opt_out

  • nak/bitset: Improve set_words()

  • nak/bitset: Add an is_empty() helper

  • nak/bitset: Fix next_set()

  • nak/sph: Round tls_size up to a multiple of 16

  • nak: Fix repair_ssa() for back-edges

  • nak: Fix parallel copy handling in spilling

  • nak: Fix to_cssa()

  • nak/nir: Don’t lower 1-bit phis

  • nak: Support encoding -Zero

  • nak: Fix fneg to do fadd(-0, x)

  • nak: Rename lower_vec_split() to lower_ineg()

  • nak: Use Src::From<u32> and Src::From<bool>

  • nak: A quick rustfmt fix

  • nak: Upgrade to more modern meson

  • nak: Add some #[allow(dead_code)]

  • nak: Drop some unused helpers

  • nak: Get rid of dead code warnings in RegFileSet

  • nak: Get rid of warnings in nak_sph.rs

  • nak: Drop the final calc_max_live() after GPR spilling

  • nak: Don’t print a range for one register

  • nir: Add nvidia barrier intrinsics

  • nak/nir: Add a pass for adding convergence barriers

  • nak: Add OpBreak

  • nak: Handle control-flow barriers

  • nak: Use barriers for re-convergence

  • nak: Remove unnecessary control barriers

  • nak: Call nir_lower_subgroups()

  • nak: Use nir_shader_intrinsics_pass for system values

  • nak: Lower subgroup_id and num_subgroups

  • nak/nir: Allow boolean vote_ieq

  • nak/nir: Zero-pad subgroup masks

  • nak: Implement vote and ballot

  • nak: Fix the encoding of OpShfl

  • nak: Implement read_invocation and shuffle_*

  • nak: Allow 1-component image load/store

  • nak: Emit CCtl in barriers with acq/rel semantics

  • nak: Use strong ordering for Image load/store

  • nak: Use the simplified BAR.SYNC encoding

  • nak: Emit MemBar before Bar

  • nak: Insert an OpNop after OpBar

  • nak: Document a bit in encode_lds()

  • nvk: Enable subgroups features

  • nak: Rely on Rust 1.73 for next_multiple_of() and div_ceil()

  • nak: Require meson 1.3.0 and clean up a couple bits

  • meson: Set build.rust_std

  • ci: Bump container images for NAK dependencies

  • ci: Add syn to --force-fallback-for

  • ci: Update the python env for ci_run_n_monitor.py

  • nvk: Default to NAK on Turing+

  • nvk: Stop asserting 11-bit storage image handles

  • nvk: Free NAK shaders

  • nak: Fix copy-prop for OpPLop3 sources

  • nak: Drop OpAtomCas in favor of OpAtom with atom_op == CmpExch

  • nak: Make ALD/AST.PHYS a boolean

  • nak: Make encode_sm75 a method of Shader

  • nak: Plumb the nak_compiler through to lower_fs_input_intrin

  • nak: Rework FS input interpolation

  • nvk: Only advertise VK_KHR_shader_terminate_invocation if using NAK

  • nvk: Handle load_first_vertex in nvk_nir_lower_descriptors()

  • nak/nir: Lower indirect FS inputs

  • nvk: Only lower outputs to temporaries

  • nvk: Add a codegen helper for nir_shader_compiler_options

  • nvk: Move a bunch of codegen-specific lowering to helpers

  • nvk: Move the optimization loop to the nvk_codegen.c

  • nvk: Move the guts of nvk_compile_nir() to nvk_codegen.c

  • nvk: Move even more lowering into nvk_codegen.c

  • nvk: Use nak_fs_key instead of rolling our own

  • nak: Rename TLS to SLM

  • nak: Properly prefix nak_xfb_info

  • nak: Move clip, cull, and XFB into a nak_shader_info.vtg

  • nak: Add a writes_layer bit to nak_shader_info::vtg

  • nak: Handle the num_gpr offsetting inside nak

  • nvk: Use nak_shader_info natively

  • nak: Enable SM70 for Volta

  • nak: Stop passing undefs to ipa_nv

  • nak: Support dumping shader assembly as part of compile

  • nvk: Don’t set pipeline->base.type manually

  • nvk: Implement VK_KHR_pipeline_executable_properties

  • nvk: Drop nouveau_ws_bo_new_tiled()

  • nvk: Rework error handling in nouveau_ws_bo_new() and from_dma_buf()

  • nvk: Handle VMA allocation failure

  • nvk: Add a separate VMA heap for BDA capture/replay

  • nvk: Implement bufferDeviceAddressCaptureReplay

  • nvk: Advertise VK_KHR_synchronization2

  • nvk: Set the right API version in the ICD json files

  • nak: Add the predicate destination to OpShfl

  • nak: Add builder helpers for a few ops

  • nak: Use c == 0x0 for shuffle_up

  • nak: Lower scan/reduce in NIR

  • nak: Implement quad ops

  • nvk: Advertise the rest of the subgroup ops

  • nak: Rework reg and SSA value printing

  • nak: Make most Display stuff lower-case

  • nak: Rework opcode printing to use a new trait

  • nak: Implement DisplayOp on Op instead of Display

  • nak: Default InstrDeps::delay to 0

  • nak: Only write deps.delay when set

  • nak: Align instructions when printing

  • nak: Display memory access bits with the “.” prefix

  • nak: Make MemAddrType a part of MemSpace

  • nak: Display memory type at the end for load/store ops

  • nak: Rework printing of texture and image dims

  • nak: Two more print fixes

  • nak: gl_FragCoord and gl_PointCoord are screen-space interpolated

  • nvk/codegen: Fragment shader builtins are noperspective

  • nvk: Wire up MESA_VK_VERSION_OVERRIDE

  • nvk: Limit shader stages to supported stages

  • nak: Run rustfmt

  • nak: Only insert barriers around ifs if they actually re-converge

  • vulkan: Default override patch version to VK_HEADER_VERSION

  • nvk: Advertise Vulkan 1.1 on Turing+

  • nak: Drop the PrmtSelection stuff

  • nak: Add a builder helper for OpPrmt

  • nak: Rework OpPrmt a bit

  • nak: Implement nir_op_extract_*

  • nak: Fix int8/16 lowering

  • nak: Add base support for 8 and 16-bit types

  • nak: Implement more int/float conversions

  • nak: Implement integer conversions

  • nak: Handle non-DW-aligned UBO loads

  • nvk: Enable 8 and 16-bit integer types

  • nak: Implement scan/reduce on booleans

  • nak/nir: Handle CBuf alignment rules

  • nak: Revert “nak: Handle non-DW-aligned UBO loads”

  • nvk: Use the copy engine for CmdFillBuffer

  • nvk: Use the copy engine for NVK_DEBUG=zero_memory

  • nvk: Stop initializing the 2D engine

  • vulkan: Move vk_synchronization2 to vk_synchronization

  • vulkan: Add some auto-generated synchronization helpers

  • vulkan: Add helpers for pipeline stage flags

  • vulkan: Add helpers for access flags

  • nvk: Move Begin/EndTransformFeedback to nvk_cmd_draw.c

  • nvk: Rework transform feedback stalling

  • nvk: Implement vkCmdPipelineBarrier2 for real

  • nvk: Drop unnecessary per-draw/dispatch cache maintenance

  • nvk: Drop MME_DMA_SYSMEMBAR before indirect draw/dispatch

  • nak: Drop a bunch of SET_REFERENCE from the pre-Turing paths

  • nvk: Advertise VK_EXT_subgroup_size_control

  • nil: Add support for filling out linear texture headers

  • nouveau: Rename nvidia-headers to headers

  • nouveau: Move headers/classes to headers/nvidia/classes

  • nak: Run rustfmt again

  • nak: Fix integer roll-over when we have a u64vec4

  • nak: Set .64/.32 on CSSR as needed

  • nak/nir: Don’t use nir_lower_bit_size on 64-bit values

  • nak: Implement 64-bit ineg

  • nak: Natively implement 64-bit shifts

  • nak: Lower isign in NIR

  • nak: Rework printing of comparisons

  • nak: Implement 64-bit comparisons

  • nak: Don’t ask NIR to lower [iu]mul64_2x32

  • nak: Use the right source types for I2F, F2I, and F2F

  • nak: Fix encoding of 64-bit F2I, I2F, and F2F

  • nak: Implement b2i64

  • nak/nir: Don’t lower 64-bit conversions

  • nvk: Advertise shaderInt64

  • nvk: Advertise VK_EXT_shader_subgroup_ballot/vote

  • nak/nir: Handle non-32-bit data in lower_scan_reduce

  • nvk: Advertise KHR_shader_subgroup_extended_types

  • nvk: Advertise VK_KHR_shader_atomic_int64

  • nak/nir: Trim image load/stores based on format

  • nak: Lower 64-bit image load/store

  • nak: Handle 64-bit image atomics

  • nil: Add R64_SINT and R64_UINT formats

  • nvk: Don’t disable non-texturable formats

  • nvk: Implement VK_EXT_shader_image_atomic_int64

  • nak: Simplify Src::is_predicate()

  • nak: Replace OpBMov with OpBClear

  • nak: Fix scheduling for control barriers

  • nak: Add a barrier register file

  • nak: Add back OpBMov with better semantics

  • nak: Add support for spilling barriers

  • nak: Take num_barriers from RA

  • nak: Make barriers SSA-friendly

  • nak: Force RA to allocate bar_in/out to the same register

  • nak: Add a barrier propagation pass

  • dxil: Use mesa_prim consistently

  • glsl: Properly remap GL_* to MESA_PRIM

  • intel/vec4: Use MESA_PRIM_* instead of GL_*

  • nir: Return a mesa_prim from gs_in_prim_for_topology

  • compiler: Fix a comment

  • radeonsi: Drop an unnecessary cast

  • nvk: Advertise VK_EXT_scalar_block_layout

  • nak: Advertise subgroupBroadcastDynamicId

  • nak: Add a B32 source type

  • nak: Rework the OpIAdd3/OpIAdd3X split

  • nak/legalize: Handle the src0/1 source mod condition for OpIAdd3X

  • nak: Legalize immediates with source modifiers

  • nak: Implement uadd_sat

  • nak: Implement usub_sat

  • nvk: Implement VK_EXT_texel_buffer_alignment

  • spirv: Plumb variable alignments through to NIR

  • nir: Respect variable alignments in lower_vars_to_explicit_types

  • nak: rustfmt

  • nak: Restructure for better module separation

  • ci: Also rustfmt binaries

  • nir: Split has_[su]dot_4x8 bits into regular and _sat versions

  • nir: Lower [su]dot_4x8_[ui]add_sat to [su]dot_4x8_[ui]add

  • microsoft: Stop claiming dot_4x8_sat support

  • nak: Rework printing of int/float types and rounding modes

  • nak: Wire up DP4

  • nvk: Advertise KHR_shader_integer_dot_product

  • nak: Split legalize into per-SM functions

  • nak: Initial WIP SM50 backend

  • nak: Rework set_src_imm20 in nak_encode_sm50

  • nak: Rewrite SM50 encode_fadd to not use encode_alu

  • nak: Rename LogicOp to LogicOp3

  • nak: Use OpLop2 and OpPSetP pre-SM70

  • nak: Rework the SM50 encoding of isetp

  • nak: Add SM50 encodings for ALD and AST

  • nak: Only split texture destinations on Volta+

  • nak: Rework nvfuzz for SM50

  • nak/nv50: Rewrite the encoding of OpShf

  • nak/sm50: Wire up tex ops

  • nak: Rewrite the SM50 encoding of OpF2I

  • nak/sm50: Rewrite the encoding for OpIMnMx

  • nak: Implement FS input interpolation on SM50

  • nak/sm50: Rewrite the encoding for OpMov

  • nak: Drop the SM50 encoding of BREV

  • nak/sm50: Add better helpers for encoding sources with modifiers

  • nak/sm50: Stop using ALUSrc for IADD2

  • nak/sm50: Drop src_mod_has* in favor of core helpers

  • nak: Clean up compiler warnings

  • nak: Add barriers on Volta

  • nak/nvfuzz: Add an SM parameter

  • nak: Drop the fmnmx from Builder

  • nak: Add an ftz bit to a bunch of float ops

  • nak: Plumb through float controls

  • nvk: Advertise VK_KHR_shader_float_controls

  • nak: Plumb through float controls for fset[p]

  • nak: Plumb through float controls for frnd[p]

  • nak: Add dnz bits to OpFMul and OpFFma

  • nak: Audit remaining FTZ/DNZ bits on sm70+

  • nak: Audit sm50 for FTZ/DNZ bits

  • nak: Clean up instruction printing a bit

  • nak: Rework barrier handling a bit

  • nvk: Make NVK_DEBUG=push an alias for push_dump

  • nvk: s/device/dev in nvk_descriptor_set_layout.c

  • nvk: Plumb a physical device into descriptor_stride_align_for_type

  • nvk: Add a nvk_min_cbuf_alignment() helper and use it

  • nvk: Add an NVK_MIN_TEXEL_BUFFER_ALIGNMENT #define

  • nak: Reduce minStorageBufferAlignment

  • nvk: Simplify alignment limit plumbing

  • nvk: CBuf alignment reduces to 64B on Turing

  • nvk: Throw Tegra behind NVK_I_WANT_A_BROKEN_VULKAN_DRIVER

  • nvk: Rework the way we set up memory heaps/types

  • nir: Add a new has_fmulz_no_denorms flag

  • nak: Set .ftz on f32 ops by default

  • nak: Implement fmulz and ffmaz

  • nvk: Enable NAK by default for Volta

  • nak: Don’t set both FTZ and DNZ at the same time

  • nvk: Implement VK_EXT_multi_draw

  • nak: Add a delay of 2 cycles for barriers

  • nak: Rework the dependency pass

  • nak: Handle negative cbuf offset immediates

  • nak/sm50: Fix immediate encodings

  • nak/sm50: Fix legalization of OpIAdd

  • nak/sm50: Add legalization and encoding for OpLdc

  • nvk/nir: Add cbuf analysis to nvk_nir_lower_descriptors()

  • nvk/nir: Lower UBO loads to load_ubo when we have a cbuf

  • nvk: Add a cbuf_bind_map to nvk_shader

  • nvk: Stash descriptor set sizes

  • nvk: Rework push_indirect to take an address

  • nvk: Set MME_DATA_FIFO_CONFIG on device init

  • nvk: Don’t flush descriptors in BeginConditionalRendering

  • nvk: Upload cbufs based on the cbuf_map

  • nvk: Add debug flags to the physical device

  • nvk: Enable cbufs

  • nvk: Use ENUM_PACKED for enums instead of PACKED

  • nir: Scalarize bounds checked loads and stores

  • nak: Switch to //-style comments

  • nak: Plumb shader model into instruction latency queries

  • nak: Handle minimum execution latencies in the dep tracker

  • nvk: Advertise VK_KHR_vulkan_memory_model

  • nvk: Use render->color_att_count for color write enables

  • nvk: Support extendedDynamicState3ColorWriteMask

  • nak: Move the copy detection part of opt_copy_prop to a helper

  • nak: Fix copy-prop for fp64

  • nak: Copy propagate and constant fold OpPrmt

  • nak: Make OpAtom::cmpr a GPR source

  • nak: Pass SrcTypes around instead of RegFile in legalize

  • nak/sm70: Allow src2 of 3src ops to be an immediate

  • nak: OpDAdd doesn’t have saturate

  • nak: Rework encoding of ALU instructions on SM70+

  • nak: Add the rest of the double-precision ops

  • nak: Split fmul/ffma handling from fmulz/ffmaz

  • nak: Wire up 64-bit nir_op_fadd/ffma/fmul and comparisons

  • nak: Fix nir_op_f2f64

  • nak: Implement b2f64

  • nak/nir: Set nir_lower_io_lower_64bit_to_32 for varyings

  • meson: Update our rust dependencies

  • nak: Fix encoding of dsetp with RZ on SM70+

  • nak: Implement 64-bit nir_op_fsign

  • nak/sm50: Add encoding and legalization for dadd/dfma/dmul/dsetp

  • nak/sm50: Fix encoding of f20 immediates

  • nak/sm50: Fix encoding of iadd with imm32

  • nak/sm50: Properly legalize OpSel and drop an assert

  • nak/sm50: Add DMnMx and use it for fp64 fmin/fmax

  • nir/lower_doubles: Add lowering for fmin/fmax/fsat

  • nak/nir: Lower a bunch of fp64

  • nvk: Advertise shaderFloat64

  • nvk: Free shaders created by codegen

  • nvk: Unref shaders on pipeline free

  • nvk: Don’t ignore ExternalImageFormatInfo

  • nak: Fix TCS output reads

Felix DeGrood (3):

  • anv: remove CS_FLUSH from query regression

  • driconf: add Dying Light 2 to Intel XeSS workaround

  • driconf: add Witcher3 to Intel XeSS workaround

Felix bridault (1):

  • radv: use 32bit va range for sparse descriptor buffers

Florian Weimer (1):

  • meson: C type error in strtod_l/strtof_l probe

Francisco Jerez (70):

  • intel/l3/gfx11+: Add tile cache partition to intel_l3_config struct.

  • intel/l3: Define helper for obtaining the size of an L3 partition in KB.

  • intel/l3: Set up L3FullWayAllocationEnable config if ALL partition has over 126 ways.

  • intel/dg2: Import L3 cache configurations.

  • intel/mtl: Import L3 cache configurations.

  • intel/xehp+: Add TBIMR-related genxml definitions.

  • intel/xehp+: Import algorithm for TBIMR tiling parameter calculation.

  • intel/xehp+: Add dynamic state flags controlling whether TBIMR is enabled during 3D primitives.

  • intel/xehp+: Define driconf option for selectively disabling TBIMR.

  • iris/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation.

  • anv/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation.

  • anv/xehp+: Enable TBIMR in generated draw calls.

  • intel/xehp: Adjust TBIMR performance chicken bits.

  • intel/xehp+: Adjust TBIMR batch size based on slice count.

  • intel/xehp+: Use TBIMR tile box check in order to avoid performance regressions.

  • intel/xehp: Enable TBIMR by default.

  • intel/eu/xe2+: Add support for 10-bit SWSB representation on Xe2+ platforms.

  • intel/fs/xe2+: Add comment reminding us to take advantage of the 32 SBID tokens.

  • intel/fs/xe2+: Teach SWSB pass about the behavior of double precision instructions.

  • intel/fs/xe2+: Handle extended math instructions as in-order in SWSB pass.

  • intel/eu/xe2+: Add definition for size of GRF space on Xe2.

  • intel/fs/xe2+: Don’t special case SEL_EXEC in inferred_exec_pipe().

  • intel: Improve N-way pixel hashing computation to handle pixel pipes with asymmetric processing power.

  • intel/compiler: Add max_polygons FS compilation parameter.

  • intel/compiler: Add multipolygon dispatch fields to brw_wm_prog_data.

  • intel/compiler: Add polygon count statistic to brw_compile_stats.

  • intel/fs: Add separate constructor of fs_visitor for fragment shaders.

  • intel/fs: Map all GS input attributes to ATTR register number 0.

  • intel/fs: Map all VS input attributes to ATTR register number 0.

  • intel/fs: Map all TES input attributes to ATTR register number 0.

  • intel/fs: Assert fs_reg::nr is always zero for ATTR registers in geometry stages.

  • intel/fs: Consider ATTR registers with different fs_reg::nr as belonging to disjoint register spaces.

  • intel/fs: Provide component index explicitly to interp_reg().

  • intel/fs: Pass builder to per_primitive_reg().

  • intel/fs: Fix fs_reg::component_size() to handle two-dimensional register regions.

  • intel/fs: Rework layout of FS vertex setup data in ATTR file to support multi-polygon dispatch.

  • intel/fs: Don’t copy-propagate ATTR registers in multi-polygon FS shaders when invalid.

  • intel/compiler: Don’t change types for copies from ATTR file.

  • intel/fs/gfx12+: Don’t set nir_divergence_single_prim_per_subgroup option for fragment shaders.

  • intel/fs/gfx12: Don’t consider multipolygon PS to have packed dispatch.

  • intel/fs: No need to copy null destinations in lower_simd_width.

  • intel/fs: Fix PS thread payload setup for depth_w_coef_reg.

  • intel/fs/gfx12: Implement multi-polygon format of back/front-facing flag in PS payload.

  • intel/fs/gfx12: Implement multi-polygon format of render target array index in PS payload.

  • intel: Add debug flag for enabling dual-SIMD8 fragment shader dispatch.

  • intel/compiler: Attempt to build dual-SIMD8 variant of fragment shaders on gfx12+ platforms.

  • intel/genxml: Add 3DSTATE_PS definitions needed for dual-SIMD8 dispatch on Gfx12+.

  • intel/gfx12: Enable SIMD8 dispatch in 3DSTATE_PS for FS multipolygon dispatch.

  • iris/gfx12: Hook up dual-SIMD8 fragment shader dispatch.

  • anv/gfx12: Hook up dual-SIMD8 fragment shader dispatch.

  • intel/fs/xe2+: Stop building SIMD8 compute-like shaders (CS/BS/TS/MS).

  • intel/fs/xe2+: Stop building SIMD8 fragment shaders.

  • intel/fs/xe2+: Stop building SIMD8 shaders for geometry stages (VS/TCS/TES/GS).

  • intel/eu/xe2+: Add helpers for constructing registers in 512b units.

  • intel/fs/xe2+: Implement PS thread payload register offset setup.

  • intel/fs/xe2+: Fix for new layout of X/Y pixel coordinates in PS payload.

  • intel/fs/xe2+: Update uses of pixel/sample mask from PS thread payload.

  • intel/fs/xe2+: Update location of sample ID fields in PS payload.

  • intel/fs/xe2+: Update poly info PS payload for new multi-polygon dispatch format.

  • intel/fs: Add support for vector payload values to fetch_payload_reg().

  • intel/fs/xe2+: Enable new format of barycentrics in PS payload.

  • intel/fs/xe2+: Update for new layout of vertex setup data in PS payload.

  • intel/fs/xe2+: Implement support for multi-polygon vertex setup data in PS payload.

  • intel/fs/xe2+: Implement layout of mesh shading per-primitive inputs in PS thread payloads.

  • intel/fs: Plumb shader instead of compiler to get_lowered_simd_width() and friends.

  • intel/fs/xe2+: Lower SIMD width of instructions that access ATTR file from SIMD2x8/4x8 FS.

  • intel: Add debug flags for enabling Xe2+ multipolygon fragment shader dispatch modes.

  • intel/fs/xe2+: Attempt to build quad-SIMD8 and dual-SIMD16 FS variants on Xe2+ platforms.

  • intel/xe2+: Implement fragment shader dispatch state setup.

  • intel/compiler/xe2: Don’t disassemble non-existent fields.

Frank Binns (4):

  • pvr: rename some more instances of ‘reserved’ to ‘carveout’ for consistency

  • include/drm-uapi: add pvr_drm.h

  • pvr: Add powervr winsys implementation

  • pvr: alloc WSI memory via GPU when there isn’t a valid display FD

Friedrich Vock (24):

  • aco: Update printed block kinds

  • vulkan: Don’t use set_foreach_remove when destroying pipeline caches

  • radv/ci: Update skips comments

  • ac/gpu_info: Manually compute L3 size for Navi33

  • radv: Enable compute dispatch tunneling

  • radv,vtn,driconf: Add and use radv_rt_ssbo_non_uniform workaround for Crysis 2/3 Remastered

  • radv/rt: Initialize unused children in PLOC early-exit

  • radv/rt: bsearch inlined shaders

  • radv/rt: Free traversal NIR after compilation

  • radv,aco: Convert 1D ray launches to 2D

  • radv/rt: Move per-geometry build info into a geometry_data struct

  • radv/rt: Acceleration structure updates

  • radv/rt: Add workaround to make leaves always active

  • radv: Fix shader replay allocation condition

  • nir: Make is_trivial_deref_cast public

  • nir: Handle casts in nir_opt_copy_prop_vars

  • util: Provide a secure_getenv fallback for platforms without it

  • vulkan: Use secure_getenv for trigger files

  • aux/trace: Guard triggers behind __normal_user

  • vtn: Use secure_getenv for shader dumping

  • mesa/main: Use secure_getenv for shader dumping

  • radv: Use secure_getenv in radv_builtin_cache_path

  • radv: Use secure_getenv for RADV_THREAD_TRACE_TRIGGER

  • util/disk_cache: Use secure_getenv to determine cache directories

GKraats (1):

  • i915G: show correct number of needed ALU instructions at errmess

Ganesh Belgur Ramachandra (9):

  • radeonsi: Fix clear-render-target shader for 1darrays in NIR

  • radeonsi: “create_dma_compute” shader in nir

  • radeonsi: “create_fmask_expand_cs” shader in nir

  • radeonsi: “get_blitter_vs” shader in nir

  • asahi: fixes prevailing ‘-Werror=maybe-uninitialized’ issue

  • radeonsi: enable nir pass for 64 bit operations

  • radeonsi: add comments for unpack_2x16* utility functions

  • radeonsi: convert “create_query_result_cs” shader to nir

  • radeonsi: convert “gfx11_create_sh_query_result_cs” shader to nir

Georg Lehmann (28):

  • aco, radv: vectorize f2f16 if rounding mode is rtz

  • aco: force uniform result for LDS load with uniform address if it can be non uniform

  • aco: stop using cstdint

  • aco: namespace aco_opcode

  • aco: deduplicate instr_class definition

  • aco: deduplicate Format definition

  • aco: don’t CSE v_permlane across exec

  • aco: use null operand for SOPK s_waitcnt

  • aco: fix detecting sgprs read by SMEM hazard

  • aco/tests: add some missing scc defs

  • aco/tests: use correct operand size for some 64bit ops

  • aco: use lm for carry out in vsub32

  • aco: add missing scc def for SALU quad broadcast

  • aco/gfx10+: don’t use v_cmpx with VCC def

  • aco: use correct operand size for int tg4 wa

  • aco: add src/def count and size for all ALU opcodes

  • aco: validate ALU operands and defs

  • aco/sched: treat p_dual_src_export_gfx11 like export

  • aco: don’t optimize DPP across more than one block

  • aco: add test for post-ra DPP clobbered in linear cfg

  • aco: optimize 32bit fsign by using fmulz with Inf

  • aco: shrink buffer stores with undef/zero components

  • aco/gfx12: implement broadcast dmask shrink behavior

  • aco: apply packed fneg commutatively

  • aco: fix applying input modifiers to DPP8

  • aco: clean up fneg/fabs combining

  • aco: apply fneg/fabs to VOP3P

  • aco: stop scheduling at p_logical_end

George Ouzounoudis (9):

  • nvk: Move SET_BLEND_STATE_PER_TARGET to graphics state initialization

  • nvk: Support extendedDynamicState3ColorBlendEnable

  • nvk: Support extendedDynamicState3ColorBlendEquation

  • nvk: Support extendedDynamicState3SampleMask

  • nvk: Support extended dynamic state for alpha to coverage/one

  • vulkan: Fix dynamic graphics state enum usage

  • nvk: Support extended dynamic state for rasterization stream

  • nvk: Remove pipeline state setting functions

  • nvk: Support extended dynamic state for tessellation domain origin

Gert Wollny (15):

  • virgl: Use host reported limits for max outputs

  • r600: Add callbacks for get_driver_uuid and get_device_uuid

  • r600: Add experimental get_compute_state_info

  • r600: Link with libgalliumvl, when enabling rusticl this is needed

  • r600/sfn: Fixup component count only if intrinsic has it

  • r600/sfn: Allow skipping backend shader optimization for a subset of shaders

  • r600/sfn: keep workgroup and invocation ID registers for whole shader

  • r600/sfn: Fix usage of std::string constructor

  • r600/sfn: Don’t try to re-use iterators when the set is made empty

  • zink: Don’t pass a blend state when we have full ds3 support

  • r600: lower dround_even also on hardware that supports fp64

  • virgl: Use better reporting for mirror_clamp features

  • radv: Fix compilation with gcc-13 and tsan enabled

  • nir/lower_int64: Fix compilation with gcc-13 and tsan enabled

  • nir/builder: Fix compilation with gcc-13 when tsan is enabled

Giancarlo Devich (1):

  • nir: Workaround MSVC internal compiler error in ARM64 build

Guilherme Gallo (19):

  • ci/bin: Use iid instead of SHA in gitlab_gql

  • ci/bin: Do not forget to add early-stage dependencies

  • ci/bin: Refactor create_job_needs_dag

  • ci/lava: Use project_name instead of hardcoded `mesa`

  • ci/lava: Fix imports formatting

  • ci/lava: Refactor UART definition building blocks

  • ci/lava: Create LAVAJobDefinition

  • ci/lava: Make SSH definition wrap the UART one

  • ci/lava: Enable SSH by default in fastboot devices

  • ci/lava: Add unit tests covering job definition

  • ci/bin: Fix find_dependency function calls

  • ci/bin: Replace AIOHTTPTransport with RequestsHTTPTransport

  • ci/bin: gql: make the query cache optional

  • ci/bin: gql: Log the caching errors

  • ci/bin: gql: Implement pagination

  • ci/bin: gql: Improve queries for jobs/stages retrieval

  • ci/bin: Fix gitlab_gql methods that use needs DAG

  • ci/bin: Fix mypy errors in gitlab_gql.py

  • ci/bin: Print a summary list of dependency and target jobs

Haihao Xiang (1):

  • anv: Fix typo in transition_color_buffer

Hans-Kristian Arntzen (2):

  • radv/radeonsi: Forward correct GPU instance to umr.

  • wsi/x11: Add workaround for Detroit Become Human.

Helen Koike (3):

  • ci/zink: add spec@ext_timer_query@time-elapsed to flakes

  • ci/ci_run_n_monitor: abort when target gets skipped

  • ci: fix python-test dependency error on merge requests

Hyunjun Ko (2):

  • vulkan/video: fix a typo

  • anv/video: fix out-of-bounds read

Iago Toral Quiroga (13):

  • v3d,v3dv: fix MMU error from hardware prefetch after ldunifa

  • v3d: implement support for PIPE_CAP_NATIVE_FENCE_FD

  • broadcom: fix scheduling dependencies for SETMSF instruction

  • v3dv: disallow image stores on VK_KHR_DISPLAY surfaces

  • v3dv: switch timestamp queries to using BO memory

  • broadcom: disable perquad tmu loads after discards

  • broadcom: lower null pointers

  • v3dv: implement VK_KHR_shader_terminate_invocation

  • v3dv: implement VK_EXT_shader_demote_to_helper_invocation

  • v3dv: expose VK_EXT_subgroup_size_control

  • broadcom/compiler: fix incorrect flags setup in non-uniform if path

  • broadcom/compiler: fix incorrect flags update for subgroup elect

  • broadcom/compiler: be more careful with unifa in non-uniform control flow

Ian Romanick (39):

  • nir/split_vars: Don’t split arrays of cooperative matrix types

  • nir/lower_packing: Don’t generate nir_pack_32_4x8_split on drivers that can’t handle it

  • nir/lower_packing: Add lowering for nir_op_unpack_32_4x8

  • nir/builder: Teach nir_pack_bits and nir_unpack_bits about 32_4x8

  • intel/vec4: Don’t emit an empty ELSE

  • intel/compiler: Add basic CFG validation

  • intel/compiler: Limit scope of cur_endif variable

  • intel/compiler: Delete bidirectional block links in opt_predicated_break

  • intel/compiler: Don’t create extra CFG links in opt_predicated_break

  • intel/compiler: Don’t create extra CFG links when deleting a block

  • intel/compiler: Don’t promote CFG link types when removing a block

  • intel/fs: Don’t add MOV instructions to DO blocks in combine constants

  • intel/compiler: Verify that DO is alone in the block

  • nir: Handle divergence for decl_reg

  • intel/fs/xe2+: Pass correct dispatch_width to fs_generator for geometry-processing stages.

  • intel/cmat: Update get_slice_type for packed slices

  • intel/cmat: Add lowering for cmat_insert and cmat_extract

  • intel/cmat: Enable packed formats for unary, length, and construct

  • intel/cmat: Enable packed formats for binary ops

  • intel/cmat: Enable packed formats for scalar ops

  • intel/cmat: Add lowering for cmat_bitcast

  • intel/cmat: Lower cmat_load and cmat_store

  • intel/compiler: Initial bits for DPAS instruction

  • intel/disasm: Disassembly support for DPAS

  • intel/compiler: Validation for DPAS instructions

  • intel/fs: Fix scoreboarding for DPAS

  • intel/fs: DPAS lowering

  • intel/fs: nir: Add nir_intrinsic_dpas_intel

  • anv: Add anv_physical_device::has_cooperative_matrix

  • anv: Set COMPUTE_WALKER systolic mode enable flag

  • anv: Set PIPELINE_SELECT systolic mode enable flag

  • anv: Lower indirect derefs again after lowering cooperative matrices

  • anv: Select the SIMD mode very early when cooperative matrices are used

  • intel/dev: Advertise integer configs with saturatingAccumulation too

  • intel/dev: Enable VK_KHR_cooperative_matrix on all Gfx9+ GPUs

  • intel/cmat: Generate better code for nir_intrinsic_cmat_insert

  • intel/compiler: Disable DPAS instructions on MTL

  • intel/compiler: Track lower_dpas flag in brw_get_compiler_config_value

  • intel/compiler: Track mue_compaction and mue_header_packing flags in brw_get_compiler_config_value

Italo Nicola (4):

  • panfrost: fix untracked dependency when converting resource modifier

  • gallium: stop calling resource_copy_region for multisampled copy_image

  • panfrost: legalize afbc before blitting

  • panfrost: expose support for EXT_copy_image

Iván Briano (8):

  • anv: use the right vertexOffset on CmdDrawMultiIndexed

  • hasvk: ensure we always reapply pipeline dynamic state in runtime state

  • anv: allow NULL index buffers

  • anv: remove no longer valid assert

  • anv: handle VkBindMemoryStatusKHR on buffer/image memory bind

  • anv: add support for Cmd*DescriptorSet*2KHR

  • anv: move astc_emu to use descriptors2 calls

  • anv: enable VK_KHR_maintenance6

Jan Beich (2):

  • intel: make CLOCK_TAI optional for non-Linux

  • intel: make CLOCK_BOOTTIME optional for non-Linux

Jani Nikula (7):

  • nir: add names to some typedef’d structs/enums

  • nir: drop **< style documentation comments

  • isl: drop **< style documentation comments

  • docs: Add docs/header-stubs/README.rst

  • docs/vulkan: use hawkmoth instead of doxygen

  • docs/nir: use hawkmoth instead of doxygen

  • docs/isl: use hawkmoth instead of doxygen

Janne Grunau (4):

  • gallium: Avoid empty version scripts in pipe-loader

  • gallium: Fix i915 pipe-loader build

  • gallium: Do not create pipe-loader version scripts for disabled drivers

  • asahi: Fix typo in arch check in agx_get_gpu_timestamp

Jesse Natalie (64):

  • microsoft: Disable post-merge CI for Windows

  • d3d12: Only set draw params root parameter index for actual draw params

  • dzn: Implement VK_MSFT_layered_driver

  • wgl: Take pixelformat color channels into account for choosing a PFD

  • winsys/gdi: Handle 4444 and 1010102 texture formats

  • winsys/gdi: Update is_displaytarget_format_supported to reflect reality

  • d3d12: Don’t support displaytargets that can’t be supported by GDI/DXGI

  • dzn: Use vk_properties helper

  • vulkan: Remove no-longer-needed prototypes for ICD entrypoints

  • vulkan: Consolidate common ICD methods

  • vulkan: Support loader interface v7

  • dzn: Fix memory type sorting

  • microsoft/compiler: Set src/dest nir types on image intrinsics when deducing format

  • d3d12: Disable common state promotion for non-simultaneous-access textures

  • d3d12: Initialize shader key swizzle for non-int textures

  • d3d12: Add a fallback for int clears where value can’t be cast to float

  • d3d12: Binding buffers as SSBO/storage image needs to add buffer ranges

  • d3d12: Change memory barrier implementation

  • d3d12: Support ARB_texture_view

  • d3d12: Use format casting for shader images

  • d3d12: GL4.3

  • microsoft/compiler: Bump signature limits for 32 rows of 4 components

  • microsoft/compiler: Don’t declare PS output registers split across variables

  • microsoft/compiler: Don’t use 64-bit types for signature entries

  • microsoft/compiler: When packing fractional inputs, find a row with space for it

  • microsoft/compiler: Stop lowering all I/O to temps

  • d3d12: Fix location_frac_mask bitfield size

  • d3d12: Split dvec3 interpolants into dvec2 and double

  • d3d12: Support enhanced layouts for VS inputs

  • d3d12: Fix GS variant I/O slot counts

  • d3d12: Enable ARB_enhanced_layouts and ARB_texture_mirror_clamp_to_edge

  • d3d12: Reference count queries in a batch

  • d3d12: ARB_query_buffer_object and GL4.4

  • d3d12: PRIMITIVES_GENERATED for stream > 0 should only be an SO query

  • d3d12: Handle cull distance as an XFB target

  • d3d12: Fix MSAA-disabling pass; sample mask should be 0 for helper lanes

  • d3d12: GL4.5

  • nir_lower_mem_access_bit_sizes: Fix write-mask-constrained 3-byte stores as atomics

  • nir: Add a flag to opt_if to prevent fighting with splitting 64bit phis

  • d3d12: Fixes for QBO shaders

  • d3d12: Enable some 4.6 extensions that were already implemented

  • d3d12: GL4.6

  • nir_lower_mem_access_bit_sizes: Fix assert (bit -> byte size)

  • microsoft/compiler: Fix lower_mem_access_bit_size callback result

  • d3d12/driconf: Force on ARB_texture_view for Blender

  • d3d12: Fix multidimensional array ordering

  • d3d12: Fix h264 encoder 32-bit build (uint64_t -> size_t)

  • d3d12: Fix hevc encoder 32-bit build (uint64_t -> size_t)

  • microsoft/clc: Fix image lowering pass to only erase variables at the end

  • microsoft/clc: Fix images with multiple derefs for real

  • microsoft/clc: Add a test which sinks image derefs

  • microsoft/clc: One more image lowering fix

  • compiler/clc: Don’t fail to parse SPIR-V if there’s no kernels

  • microsoft/clc: Flip on capabilities to prevent warning spew

  • microsoft: Whitespace change to trigger CI

  • vulkan/wsi: Convert bit tests to bool with != 0

  • util: Re-implement getenv for Windows

  • d3d12: Add a debug flag to opt out of singleton behavior

  • d3d12: Only destroy the winsys during screen destruction, not reset

  • libgl-gdi: Update wgl test to use a 32bit framebuffer

  • libgl-gdi: Update wgl test to set debug flags needed for tests

  • dzn: Fix 3D to 2D image copies

  • zink: Add ASSERTED to vars that are only used for asserts

  • mesa: Consider mesa format in addition to internal format for mip/cube completeness

Jianxun Zhang (12):

  • intel/isl: Add a debug option to override modifier list

  • intel: Move mod_plane_is_clear_color() into isl

  • intel/vulkan: Report clear color in subresource layout

  • intel/vulkan: Allow modifiers supporting fast clear

  • intel/vulkan: Specify offset when creating aux state tracker

  • intel/vulkan: Import aux state tracking buffer

  • intel/vulkan: Remove private binding on fast clear region

  • intel/vulkan: Use the last 2 dwords of clear color struct

  • intel/vulkan: Correct a comment about an offset in fast clear

  • intel/vulkan: Update comment of a workaround of modifiers

  • intel/vulkan: Add COMPRESSED_CLEAR state in layout translation

  • intel/isl: Add Gfx 12.x RC_CCS_CC into modifier scores

Job Noorman (5):

  • ir3: correctly set bit size for 64b constant @load_ubo

  • nir: add _safe variants of nir_foreach_reg_load/store

  • ir3: lower 64b registers

  • nir: add helper to create cursor after all @decl_regs

  • ir3: lower 64b registers before creating preamble

Jonathan Gray (2):

  • intel/common: add directory prefix to intel_gem.h include

  • zink: put sysmacros.h include under #ifdef MAJOR_IN_SYSMACROS

Jordan Justen (25):

  • intel/l3: Use devinfo->urb.size when cfg urb-size is 0.

  • anv: Add more space for init_render_queue_state() batch (MTL regression)

  • intel/dev/wa: Raise error if mesa_defs.json contains unknown platforms

  • intel/dev: Rename mtl-m to mtl-u

  • intel/dev: Rename mtl-p to mtl-h

  • intel/compiler: Define XE2 compiler enum

  • intel/genxml: Update COMPUTE_WALKER for xe2

  • iris: Set COMPUTE_WALKER Message SIMD field

  • anv: Set COMPUTE_WALKER Message SIMD field

  • intel/genxml: Update INTERFACE_DESCRIPTOR_DATA for xe2

  • anv, iris: Update INTERFACE_DESCRIPTOR_DATA programming for xe2

  • iris: xe2 doesn’t have INTERFACE_DESCRIPTOR_DATA::BarrierEnable

  • intel/genxml: Update 3DSTATE_TE for xe2

  • isl: Add mocs for xe2

  • intel/genxml: Add UNIFIED_COMPRESSION_FORMAT enum for xe2

  • anv, blorp, iris: Update 3DSTATE_PS programming for xe2

  • anv, blorp, iris, intel/genxml: Update 3DSTATE_VS for xe2

  • anv, blorp, iris, intel/genxml: Update 3DSTATE_PS_EXTRA for xe2

  • intel/batch_decoder: Update 3DSTATE_PS decoding for xe2

  • anv, iris, intel/genxml: Update 3DSTATE_GS for xe2

  • anv, iris, intel/genxml: Update 3DSTATE_HS for xe2

  • intel/compiler: Pass max_polygons to copy-prop from fs_visitor.

  • intel/xe2+: Implement brw_wm_state_simd_width_for_ksp() on Xe2+.

  • intel/genxml/gfx125: Move L1_CACHE_CONTROL to enum

  • intel/genxml/gfx125: Move STATE_SURFACE_TYPE to enum

Jordan Petridis (1):

  • Revert “ci: take microsoft farm offline”

Joshua Ashton (2):

  • nvk: Hook up driconf for nvk_instance

  • nvk: Enable KHR_present_id and KHR_present_wait

José Expósito (5):

  • zink: Fix crash on zink_create_screen error path

  • zink: fix dereference before NULL check

  • zink: allow software rendering only if selected

  • zink: initialize drm_fd to -1

  • egl/glx: fallback to software when Zink is forced and fails

José Roberto de Souza (56):

  • anv: Add missing ANV_BO_ALLOC_EXTERNAL flags when calling anv_device_import_bo()

  • intel: Add more information about the PAT entry used

  • intel: Update MTL scanout PAT entry

  • intel: Add a write combining PAT entry

  • anv: Honor memory coherency of the memory type selected

  • anv: Move PAT entry selection to common code

  • anv: Change default PAT entry to WC

  • anv: Calculate mmap mode based on alloc_flags

  • anv: Remove anv_bo flags that can be inferred from alloc_flags

  • iris: Add iris_bufmgr_get_pat_entry_for_bo_flags()

  • intel/common: Add intel_gem_read_correlate_cpu_gpu_timestamp()

  • anv: Reduce ifdefs in anv_GetCalibratedTimestampsEXT()

  • anv: Make use of intel_gem_read_correlate_cpu_gpu_timestamp()

  • intel/common/xe: Reimplement xe_gem_read_render_timestamp() with xe_gem_read_correlate_cpu_gpu_timestamp()

  • anv: Bring back the non optimized version of build_load_render_surface_state_address()

  • intel: Sync xe_drm.h

  • intel: Sync xe_drm.h

  • iris: Change default PAT entry to WC

  • intel: Rename PAT entries

  • intel: Share function to do device query in Xe KMD

  • iris: Check for maximum allowed priority in Xe KMD

  • anv: Rename ANV_BO_ALLOC_SNOOPED to ANV_BO_ALLOC_HOST_CACHED_COHERENT

  • anv: Add support for all possible cached and coherent memory types

  • intel: Add PAT entries for gfx12 and newer

  • intel: Sync xe_drm.h

  • intel: Enable has_set_pat_uapi for Xe

  • iris: Prepare iris_heap_to_pat_entry() for discrete GPUs

  • iris: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs

  • anv: Prepare anv_device_get_pat_entry() for discrete GPUs

  • anv: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs

  • anv: Add heaps for Xe KMD in platforms without LLC

  • intel/dev: Adjust prefetch_size values for Xe2 engines

  • anv: Fix vm bind of DRM_XE_VM_BIND_FLAG_NULL

  • iris: Fix the mmap mode for IRIS_HEAP_DEVICE_LOCAL_PREFERRED

  • intel: Sync xe_drm.h take 2 part 3

  • intel/isl: Set mocs.blitter_dst/src for MTL

  • anv: Fix handling of host_cached_coherent bos in gen9 lp in older kernels

  • anv: Split ANV_BO_ALLOC_HOST_CACHED_COHERENT into two actual flags

  • anv: Promote bos to host_cached+host_coherent in platforms with LLC

  • anv: Avoid unnecessary intel_flush calls

  • intel/genxml/xe2: Update PIPE_CONTROL

  • intel/genxml/xe2: Update PIPELINE_SELECT

  • intel: Sync xe_drm.h final part

  • anv: Remove libdrm usage from Xe KMD backend

  • anv: Add ANV_BO_ALLOC_IMPORTED

  • anv: Replace anv_bo.vram_only by anv_bo.alloc_flags check

  • anv: Assume that imported bos already have flat CCS requirements satisfied

  • intel/isl/xe2: Enable route of Sampler LD message to LSC

  • utils/u_debug: Fix parse of “all,<something else>”

  • anv: Increase ANV_MAX_QUEUE_FAMILIES

  • anv: Drop useless STATIC_ASSERT in anv_physical_device_init_queue_families()

  • anv: Simplify companion_rcs handling

  • anv: Add missing anv_measure_submit() calls in Xe KMD backend

  • anv: Fix anv_measure_start/stop_snapshot() over copy or video engine

  • anv: Call anv_measure_submit() before anv_cmd_buffer_chain_command_buffers()

  • anv: Fix PAT entry for userptr in integrated GPUs

Juan A. Suarez Romero (12):

  • v3d/ci: run V3D GL tests in 64-bits

  • v3d: use kmsro to create drm screen on real hw

  • vc4/ci: comment why piglit is disabled

  • broadcom/ci: separate hidden jobs to -inc.yml files

  • v3d: include the revision in the device name

  • ci/baremetal: make BM_BOOTCONFIG optional

  • ci: do not mount already mounted directories

  • ci/v3d/vc4: remove explicit modules to load

  • ci/v3dv: add new failures

  • ci/v3dv: update results

  • ci/vc4/v3d: remove some flakes

  • ci/v3d: add support for rpi5

Julia Zhang (1):

  • radeonsi: modify binning settings to improve performance

Juston Li (17):

  • venus: add helper function to get cmd handle

  • venus: refactor out common cmd feedback functions

  • venus: support deferred query feedback recording

  • venus: track/recycle appended query feedback cmds

  • venus: append query feedback at submission time

  • venus: switch to unconditionally deferred query feedback

  • venus: sync protocol for VK_EXT_extended_dynamic_state3

  • venus: pipeline fixes for VK_EXT_extended_dynamic_state3

  • venus: enable VK_EXT_extended_dynamic_state3

  • venus: disable unsupported ExtendedDynamicState3Features

  • venus: implement vkGet[Device]ImageSparseMemoryRequirements

  • radv: enable stippledBresenhamLines on GFX9 chips

  • venus: fix query feedback copy sanitize off by 1

  • venus: rename buffer cache to buffer reqs cache

  • venus: use vk_format helper for plane count

  • venus: support caching image memory requirements

  • venus: add LRU cache eviction for image mem reqs cache

Kai Wasserbäch (1):

  • fix: ac/llvm: LLVM 18: remove useless passes, partially removed upstream

Karol Herbst (74):

  • vtn/opencl: always lower to libclc fmod

  • rusticl/device: restrict image_buffer_size

  • rusticl/device: restrict param_max_size further

  • rusticl/mem: properly set pipe_image_view::access

  • zink: support CLAMP_TO_BORDER with unnormalized coords

  • zink: alias nir scratch memory by lowering to common bit_size

  • zink: emit float controls

  • zink: lower fisnormal as it requires the Kernel Cap

  • radv: fix buffers in vkGetDescriptorEXT with size not aligned to 4

  • rusticl/queue: Only take a weak ref to the last Event

  • rusticl/device: restrict const max size to 1 << 26 bytes

  • rusticl/mesa: pass PIPE_BIND_LINEAR in resource_create_texture_from_user

  • rusticl: handle failed maps gracefully

  • zink: validate pointer alignment in resource_from_user_memory

  • zink: handle denorm preserve execution modes

  • zink: deallocate global_bindings array

  • zink: emit MemoryAccess flags for coherent global load/stores

  • rusticl/mesa/screen: do not derefence the entire pipe_screen struct

  • nir: Stop assuming glsl_get_length() returns 0 for vectors

  • ir2: Stop assuming glsl_get_length() returns 0 for vectors

  • nvc0: implement PIPE_CAP_TIMER_RESOLUTION

  • radeonsi: support importing arbitrary resources

  • radeonsi: hack for importing 3D textures

  • rusticl/context: fix importing gl cube maps

  • docs/features: mark rusticl gl_sharing as done

  • rusticl/queue: do not send empty lists of event to worker queue

  • rusticl/queue: fix implicit flushing of queue dependencies

  • rusticl: only support the matching device for gl_sharing

  • rusticl/memory: fix new clippy::needless-borrow warning

  • nir: allow vec derefs on system values

  • vtn: add hack for system values placed in CrossWorkgroup memory

  • rusticl/api: workaround DPCPP fetching clSetProgramSpecializationConstant

  • rusticl: add x11 dependency

  • rusticl/gl: make GLX support optional

  • clc: allow debug flag to be read from other files

  • clc: add dump_llvm debug options

  • nir/opt_preamble: make load_workgroup_size handling optional

  • radeonsi: lower relative shuffle subgroup ops

  • radeonsi: lower 64bit subgroup shuffle to 32 bit

  • clc: add support for cl_khr_subgroup_shuffle and shuffle_relative

  • rusticl: implement cl_khr_subgroup_shuffle and shuffle_relative

  • ci/fedora: bump to meson 1.3.0

  • rusticl: bump meson req

  • rusticl: use rust.proc_macro for proc macros

  • clc: use addMacroDef/Undef instead of -D/-U flags

  • nak: fix some sm checks for volta

  • nir/algebraic: add support for custom arguments

  • nak: add algebraic lowering pass

  • nak: move nir_lower_subgroups into nak_postprocess_nir

  • rusticl/kernel: explicitly set rounding modes

  • radeonsi: fix reg_saved_mask for non graphics contexts

  • clc: add workaround for clang always defining __IMAGE_SUPPORT_ and __opencl_c_int64

  • rusticl: do not warn on empty RUSTICL_DEBUG or RUSTICL_FEATURES

  • rusticl: silence clippy::arc-with-non-send-sync for now

  • rusticl: fix constant and printf buffer size

  • rusticl/nir: add missing nir include

  • rusticl: check rustc version for flags requiring newer rustc/clippy

  • ci: merge debian-rusticl-testing into debian-testing

  • zink: lock screen queue on context_destroy and CreateSwapchain

  • clc: remove code supporting pre llvm-10

  • zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources

  • rusticl: specify buffer bindings explicitly

  • rusticl: add QueueContext to track GPU state

  • rusticl/queue: release bound constant buffer

  • rusticl: use real buffer for cb0 for drivers prefering

  • ci,rusticl: bump meson req to 1.3.1

  • rusticl/meson: generate bindings for LLVM

  • rusticl/program: add LLVM functions to cache timestamp

  • rusticl/llvm: do not include spirv-tools/linker.hpp

  • rusticl/kernel: run opt/lower_memcpy later to fix a crash

  • nir: rework and fix rotate lowering

  • nak/opt_out: fix comparison in try_combine_outs

  • rusticl/kernel: check that local size on dispatch doesn’t exceed limits

  • clc: force fPIC for every user when using shared LLVM

Kenneth Graunke (21):

  • intel/compiler: Delete unused emit_dummy_fs()

  • intel/compiler: Delete unused repclear shader uniform handling

  • intel/compiler: Delete repclear shader’s special case for 1 color target

  • intel/compiler: Drop unused saturate handling in repclear shader

  • intel/compiler: Convert the repclear shader to use send-from-GRF

  • intel/compiler: Assert that FS_OPCODE_[REP_]FB_WRITE is for pre-Gfx7

  • iris: Make an iris_bucket_cache structure and array per heap

  • iris: Make an iris_heap_is_device_local() helper

  • iris: Rename heap_flags -> heap in i915_gem_create

  • iris: Split system memory heap into cached-coherent and uncached heaps

  • iris: Use 64K BOs for the shader uploader

  • iris: Align fresh BO allocations to 2MB in size

  • iris: Ensure virtual addresses are aligned to 2MB for 2MB+ blocks

  • anv: Implement rudimentary VK_AMD_buffer_marker support

  • anv: Drop 3/4 of PPGTT size restriction for sys heap size calculation

  • anv: Don’t report more memory available than the heap size

  • intel/fs: Allow omitting the destination of A64 untyped atomics

  • intel/fs: Drop opt_register_renaming()

  • iris: Initialize bo->index to -1 when importing buffers

  • iris: Don’t search the exec list if BOs have never been added to one

  • iris: Skip mi_builder init for indirect draws

Konstantin Seurer (40):

  • radv: Add RADV_MAX_HIT_ATTRIB_DWORDS

  • radv/nir: Add radv_nir_lower_hit_attrib_derefs

  • radv/nir: Handle boolean hit attribs

  • radv/clang-format: Do not indent C++ modifiers

  • radv: Add radv_nir_lower_hit_attrib_derefs_tests

  • radv/sqtt: Fix tracing acceleration structure commands

  • radv/sqtt: Handle monolithic RT pipelines

  • radv/rt: Use a helper for inlining non-recursive stages

  • radv/rt: Skip null checks for small case counts

  • nir/lower_vars_to_scratch: Remove all unused derefs

  • drm-shim/nouveau: Set nv_device_info_v0::platform

  • drm-shim/nouveau: Expose the 2D engine on NV50+

  • drm-shim/nouveau: Stub mitting ioctls

  • nvk: Do not preserve metadata after lower_load_global_constant_offset_instr

  • radv: Add more offsets acceleration_structure_layout

  • radv/bvh: Stop emitting leaf nodes inside the encoder

  • nir: Optimize fpow with small constant exponents

  • radv: Implement VK_KHR_ray_tracing_position_fetch

  • radv: Make pipeline cache object data generic

  • radv: Don’t store library stack sizes

  • radv: Add more ray tracing data to the cache

  • radv/rt: Skip compiling a traversal shader

  • radv: Skip compiling chit and miss shaders

  • radv/rt: Remove useless assert

  • radv/rt: Use radv_shader for compiled shaders

  • radv/sqtt: Avoid duplicate stage check

  • radv/rt: Repurpose radv_ray_tracing_stage_is_compiled

  • vtn: Remove transpose(m0)*m1 fast path

  • ac/nir: Export clip distances according to clip_cull_mask

  • vtn: Handle DepthReplacing correctly

  • radv/rmv: Fix tracing ray tracing pipelines

  • radv/rt/rmv: Log pipeline library creation

  • radv: Use PLOC for TLAS builds

  • radv: Remove the BVH depth heuristics

  • radv/rt: Lower ray payloads to registers

  • vtn: Allow for OpCopyLogical with different but compatible types

  • ac/llvm: Enable helper invocations for quad OPs

  • lavapipe: Fix DGC vertex buffer handling

  • lavapipe: Mark vertex elements dirty if the stride changed

  • lavapipe: Report the correct preprocess buffer size

Lang Yu (1):

  • radeonsi: emit SQ_NON_EVENT for GFX11_5

Leo Liu (2):

  • gallium/vl: match YUYV/UYVY swizzle with change of color channels

  • radeonsi: fix video processing path without VPE enabled

LingMan (9):

  • rusticl: Show an error message if the build is attempted with an outdated bindgen version

  • rusticl: Show an error message if the version of bindgen can’t be detected

  • rusticl: Directly pass a `&Device` to `Mem::map_image` and `Mem::map_buffer`

  • rusticl: Only put an Arc around PipeScreen where needed

  • rusticl: Avoid repeatedly creating Vecs during Platform initialization

  • rusticl: Turn pointers in enqueue_svm_mem_fill_impl into proper Rust types

  • rusticl: Turn pointers in enqueue_svm_memcpy_impl into slices

  • rusticl/api: Add checking wrappers around `slice::from_raw_parts{_mut}`

  • rusticl: Use the `from_raw_parts` wrappers

Lionel Landwerlin (88):

  • intel/fs: fix dynamic interpolation mode selection

  • anv/meson: add missing dependency on the interface header

  • anv: ensure we reapply always pipeline dynamic state in runtime state

  • intel/fs: Xe2 fix for ExBSO on UGM

  • blorp: handle binding table & surface state allocation failures

  • anv: rename internal heaps

  • anv: deal with state stream allocation failures

  • anv: add max_size argument for block & state pools

  • anv: make sure pools can handle more than 2Gb

  • anv: fail pool allocation when over the maximal size

  • anv: use anv_state_pool_state_address for blorp vertex buffer address

  • anv: fix corner case of mutable descriptor pool creation

  • anv: dynamically allocate utrace batch buffers

  • perfetto/pps-producer: add optimized cpu/gpu timestamp correlation support

  • intel/ds: use improved timestamp correlation if available

  • isl: disable MCS compression on R9G9B9E5

  • intel: fix PXP status check

  • anv: handle protected memory allocation

  • anv: allow creation of protected queues

  • anv: Emit protection + session ID on protected command buffers

  • anv: allow protected GEM context creation

  • anv: enable protected memory

  • intel/fs: fix residency handling on Xe2

  • anv: workaround XeSS for Satisfactory

  • intel/fs: rerun divergence analysis prior to convert_from_ssa

  • intel/nir/rt: fix reportIntersection() hitT handling

  • anv: fix source_hash propagation with libraries

  • anv: fix missing naming for dirty bit

  • anv: fix CC_VIEWPORT pointer dirty after blorp/simple-shaders

  • anv: fix dirty state tracking for 3DSTATE_PUSH_CONSTANT_ALLOC

  • intel/decoder: handle 3DPRIMITIVE_EXTENDED in accumulated prints

  • intel/blorp: move Wa_18019816803 out of blorp code

  • anv: get rid of the duplicate pipeline fields in command buffer state

  • anv/blorp: move helper function about BTI changes to blorp

  • intel/perf: fix querying of configurations

  • intel/fs: fix incorrect register flag interaction with dynamic interpolator mode

  • intel/fs: reuse set_predicate()

  • intel/aux_map: introduce ref count of L1 entries

  • anv: use main image address to determine ccs compatibility

  • anv: track & unbind image aux-tt binding

  • anv: remove heuristic preferring dedicated allocations

  • intel/ds: add trace of buffer markers

  • intel/tools: add hang_replay tool

  • intel/hang_replay: add the ability to pass the context image to sim-drm

  • intel: add error2hangdump tool

  • intel/aubinator_error_decode: bump max buffers to 1024

  • intel/error_decode: map i915 gfx12.5 register names to our names

  • intel/tools: hang viewer/editor

  • anv: add a sampler state pool

  • anv: move descriptor set type selection to earlier

  • anv: make a couple of descriptor function private

  • anv: add missing push descriptor flush on ray tracing pipelines

  • anv: set layout printer

  • anv: use 2 different buffers for surfaces/samplers in descriptor sets

  • intel/hang_replay: fix compile race with generated files

  • intel/tools: 32bit compile fixes

  • vulkan/runtime: retain video session creation flags

  • anv/video: only report matching memory types for protected sessions

  • util/u_printf: add a u_printf_ptr() variant

  • nir: make printf_info (de)serializer available

  • nir/clone: fix missing printf_info clone

  • nir: include printfs from linked shaders

  • nir/divergence: handle printf intrinsic

  • nir/serialize: untangle printf serialization from a particular stage

  • nir: fixup nir_printf intrinsic description

  • anv: fix incorrect queue_family access on command buffer

  • isl: constify isl_device_get_sample_counts()

  • anv: get features after initializing drm

  • anv: switch to use runtime physical device properties infrastructure

  • anv: promote EXT_vertex_attribute_divisor to KHR

  • anv: promote EXT_calibrated_timestamps to KHR

  • isl: drop AUX-TT CCS alignment with INTEL_DEBUG=noccs

  • anv: wait for CS write completion before executing secondary

  • isl: further restrict alignment constraints

  • isl: implement Wa_22015614752

  • intel/fs: fix depth compute state for unchanged depth layout

  • anv: remove ANV_ENABLE_GENERATED_INDIRECT_DRAWS variable

  • anv: fix disabled Wa_14017076903/18022508906

  • intel/aux_map: fix fallback unmapping range on failure

  • anv: hide vendor ID for The Finals

  • anv: fix pipeline executable properties with graphics libraries

  • anv: implement undocumented tile cache flush requirements

  • anv: don’t prevent L1 untyped cache flush in 3D mode

  • anv: add missing alignment for AUX-TT mapping

  • anv: factor out aux-tt binding logic for future reuse

  • anv: rename aux_tt image field

  • anv: retain ccs image binding address

  • anv: fix transfer barriers flushes with compute queue

Louis-Francis Ratté-Boulianne (4):

  • panfrost: factor out method to check whether we can discard resource

  • panfrost: add copy_resource flag to pan_resource_modifier_convert

  • panfrost: add can_discard flag to pan_legalize_afbc_format

  • panfrost: Legalize before updating part of a AFBC-packed texture

Luc Ma (1):

  • loader: Remove a line of unused include

Luca Weiss (1):

  • freedreno: Enable A305B

Lucas Fryzek (2):

  • freedreno/drm: Add more APIs to per backend API

  • gallivm/nir: Load all inputs into indirect inputs array

Lucas Stach (2):

  • etnaviv: drm: don’t update cmdstream timestamp when skipping submit

  • etnaviv: disable 64bpp render/sampler formats

Lynne (1):

  • radv: change queue family order in radv_get_physical_device_queue_family_properties

M Henning (21):

  • nak: Fix a warn(unused_must_use) by calling drop

  • nak: Remove MemScope::Cluster

  • nak: Memory order/scope encodings for Ampere

  • nak: Specify MemScope on MemOrder::Strong

  • nak: Bind nir_intrinsic_access

  • nak: Add MemOrder::Constant

  • nvk: Use load_global_constant for ubo loads

  • nak: Add encodings for cache eviction priorities

  • nak: Set “evict first” from ACCESS_NON_TEMPORAL

  • nak: Request alignment that matches the load width

  • nak: Use nir_combined_align

  • nvk: Fix descriptor alignment offset

  • nak: Provide robustness info to postprocess_nir

  • nak: Call nir_opt_load_store_vectorize

  • nak: Call nir_opt_combine_barriers

  • nak: Call nir_opt_shrink_vectors

  • nak: Clamp negative texture array indices to zero

  • nak: Enable loop unrolling.

  • nak: Print out an instruction count

  • nak: Add a jump threading pass

  • nak: Optimize jumps to fall-through if possible

Marcin Ślusarz (1):

  • anv: fix minSubgroupSize for xe2

Marek Olšák (199):

  • radeonsi: initialize perfetto in the right place

  • ac: add missing gfx11.5 bits

  • ac/gpu_info: adjust attribute ring size for gfx11

  • ac/surface: cosmetic changes

  • ac/surface/tests: cosmetic changes

  • radeonsi: don’t use nir_optimization_barrier_vgpr_amd with ACO

  • radeonsi: inline si_allocate_gds and si_add_gds_to_buffer_list

  • radeonsi: inline si_screen_clear_buffer

  • radeonsi: remove redundant VS_PARTIAL_FLUSH for streamout

  • radeonsi: remove AMD_DEBUG=nogfx

  • radeonsi: rename ctx -> sctx in si_emit_guardband

  • radeonsi: remove and inline si_shader::ngg::prim_amp_factor

  • radeonsi: decrease PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS to 1024

  • radeonsi: cosmetic changes in si_pm4.c

  • radeonsi: split setting num_threads in si_emit_dispatch_packets

  • radeonsi: use si_shader_uses_streamout properly

  • radeonsi: adjust setting PA_SC_EDGERULE once more

  • radeonsi: various isolated cosmetic changes

  • radeonsi: move max_dist for MSAA into si_state_msaa.c

  • radeonsi: cosmetic changes in si_state_viewport.c

  • radeonsi: cosmetic changes in si_state_binning.c, si_state_msaa.c

  • radeonsi: move setting registers at the end of si_emit_cb_render_state

  • ac/gpu_info: split has_set_pairs_packets into context and sh flags

  • ac/gpu_info,llvm: trivial cosmetic changes

  • radeonsi: clean up si_set_streamout_targets

  • radeonsi: upload shaders using a compute queue instead of gfx

  • radeonsi: rewrite PM4 packet building helpers with less duplication

  • radeonsi: move buffered_xx_regs into a substructure

  • radeonsi: rename HAS_PAIRS -> HAS_SH_PAIRS_PACKED

  • radeonsi: rename radeon_*push_*_sh_reg -> gfx11_*push_*_sh_reg

  • radeonsi: rewrite gfx11_*push*_sh_reg helpers

  • radeonsi: restructure blocks in si_setup_nir_user_data

  • radeonsi: restructure blocks in si_emit_graphics_{shader,compute}_pointers

  • radeonsi/gfx11: use PKT3_SET_CONTEXT_REG_PAIRS_PACKED for PM4 states

  • radeonsi: don’t call nir_lower_compute_system_values too many times

  • radeonsi: don’t check DCC compatibility on chips where it’s no-op

  • radeonsi: cosmetic changes in si_emit_db_render_state

  • radeonsi: prettify code around PA_SC_LINE_STIPPLE

  • radeonsi: move emitting VGT_TF_PARAM into gfx10_emit_shader_ngg

  • radeonsi: remove num_params variable from gfx10_shader_ngg

  • radeonsi: move SPI_SHADER_IDX_FORMAT into the preamble (it’s immutable)

  • radeonsi: adjust the total viewport area

  • radeonsi/gfx11: use SET_CONTEXT_REG_PAIRS_PACKED for other states

  • radeonsi/gfx11: don’t set OREO_MODE to fix rare corruption

  • radeonsi: don’t dma-upload shaders on APUs

  • radeonsi/ci: update failures for gfx103

  • st/mesa: disable light_twoside if back faces are culled

  • glsl/nir: return failure from link_varyings if there is a linker error

  • nir: add lowering from FS LAYER input to LAYER_ID sysval

  • nir: return progress from nir_remove_sysval_output

  • ac/nir: add kill_layer flag to VS/GS/NGG lowering

  • st/mesa: set pipe_framebuffer_state::layers for PBO blits

  • radeonsi: clean up si_nir_kill_outputs

  • radeonsi: don’t allocate output space for LAYER/VIEWPORT before TES and GS

  • radeonsi: implement gl_Layer in FS as a system value

  • radeonsi: remove the LAYER output if the framebuffer state has only 1 layer

  • nir: fix gathering TESS_LEVEL_INNER/OUTER usage with lowered IO

  • nir: don’t declare illegal varyings in nir_create_passthrough_tcs

  • nir/print: print PATCH0 and VARn_16BIT names instead of numbers for TCS and TES

  • gallium/docs: make CAP doc order match definition order

  • gallium: add PIPE_CAP_PERFORMANCE_MONITOR for GL_AMD_performance_monitor

  • radeonsi: group equal CAP cases

  • radeonsi: only expose GL_AMD_performance_monitor on gfx7-10.3

  • ac: rename ac_parse_ib.c -> ac_ib_parser.c

  • ac: move the IB parsers into ac_parse_ib.c

  • ac: add an IB parser that gathers context rolls

  • mesa: optimize _mesa_matrix_is_identity

  • mesa: skip checking for identity matrix in glMultMatrixf with glthread

  • mesa: optimize setting the identity matrix

  • glthread: add a marker at the end of batches indicating the end

  • glthread: eliminate push/pop calls in PushMatrix+Draw/MultMatrixf+PopMatrix

  • glthread: add option to put autogenerated marshal structures in the header file

  • glapi: rename primcount -> instance_count in a few Draw functions

  • glthread: use autogenerated marshal structures for custom functions

  • glthread: rework type reduction and reduce vertex stride params to 16 bits

  • glapi: only expose GL_EXT_direct_state_access functions to GL compatibility

  • glthread: don’t do “if (COMPAT)” if the function is not in the GL core profile

  • glapi: only allow deprecated=”” on non-aliased functions

  • glthread: pass struct marshal_cmd_DrawElementsUserBuf into Draw directly

  • mesa: deduplicate glVertexPointer and glNormalPointer vs DSA error checking

  • glthread: add a string table of function names

  • radeonsi/gfx11: fix unaligned SET_CONTEXT_PAIRS_PACKED

  • radeonsi: don’t set non-existent VGT_GS_MAX_PRIMS_PER_SUBGROUP on gfx10

  • radeonsi: change the low-priority compiler queue to normal priority

  • radeonsi: update shaders for blend state only if the shader key changed

  • radeonsi: update shaders for rasterizer state only if the shader key changed

  • radeonsi: clean up setting poly/line/stipple shader key bits

  • radeonsi: rewrite how shader key bits dependent on current_rast_prim are updated

  • radeonsi: rewrite si_get_total_colormask as si_any_colorbuffer_written

  • radeonsi: in bind_{blend,rs}_state, only call 1 update function per if

  • radeonsi/gfx11: skip si_set_streamout_enable because it has no effect

  • radeonsi: execute streamout_begin after cache flushes

  • radeonsi: don’t print the preamble state separately for GALLIUM_DDEBUG

  • radeonsi: replace gl_FrontFacing with a constant if one side is always culled

  • radeonsi: set OOB_SELECT for VBOs in si_create_vertex_elements

  • radeonsi: group most vertex element fields

  • radeonsi/gfx11: prefer Wave64 for PS without inputs for better VALU perf

  • radeonsi/gfx11: disable the shader profile for Medical that forces Wave64

  • radeonsi/gfx11: disable the shader profile for Medical that disables binning

  • radeonsi: clean up how debug flags and shader profiles determine the wave size

  • radeonsi/gfx11: prefer Wave64 for VS/TCS/TES/GS because it’s slightly faster

  • winsys/amdgpu: bypass GL2 for command buffers

  • radeonsi: track NIR progress properly for optimizations in si_get_nir_shader

  • ac,radeonsi: rename pos_inputs -> fragcoord_components

  • nir,radeonsi: add FLAGS into load_vector_arg_amd to record color input usage

  • radeonsi: change the signature of si_nir_lower_ps_color_input

  • radeonsi: gather lowered color inputs for monolithic PS

  • radeonsi: add PS input info into si_shader_binary_info

  • radeonsi: don’t include the PARAM_GEN input in si_shader_info

  • radeonsi: decrease NUM_INTERP if uniform inlining eliminated PS inputs

  • radeonsi: update comments about uniform inlining

  • radeonsi: decrease NUM_INTERP if export formats/colormask eliminated PS inputs

  • util: make BITSET_TEST_RANGE_INSIDE_WORD take a value to compare with

  • radeonsi: merge context_reg_saved_mask and other_reg_saved_mask into a BITSET

  • radeonsi: convert depth-stencil-alpha state to tracked registers

  • radeonsi: convert rasterizer state to tracked registers

  • ac/gpu_info: fix printing radeon_info after adding VPE

  • radeonsi: rework how guardband registers are updated to decrease overhead

  • mesa: fix _mesa_matrix_is_identity

  • mesa: remove some DrawTransformFeedback duplication

  • mesa: remove some DrawElementsInstanced duplication

  • mesa: remove more DrawArrays/Elements duplication

  • mesa: remove non-relevant 16-year-old comment

  • st/mesa: make prepare_(indexed_)draw non-static

  • mesa: inline st_draw_transform_feedback

  • mesa: call st_prepare_(indexed_)draw before Driver.DrawGallium(MultiMode)

  • st/mesa: no need to check index_size in st_prepare_indexed_draw anymore

  • mesa: move index bounds code (st_prepare_indexed_draw) into draw.c

  • cso: do cso_context inheritance how we do it elsewhere

  • cso: inline cso_get_pipe_context

  • mesa: execute an error path sooner in _mesa_validated_drawrangeelements

  • gallium: add typedef pipe_draw_func matching the draw_vbo signature and use it

  • ac/llvm: remove code for converting txd from 1D to 2D because NIR does it

  • ac,radeonsi: require DRM 3.27+ (kernel 4.20+) same as RADV

  • winsys/amdgpu: don’t return a value from cs_add_buffer

  • winsys/amdgpu: cosmetic changes in amdgpu_cs_add_buffer

  • winsys/amdgpu: inline amdgpu_add_fence_dependencies_bo_lists

  • winsys/amdgpu: use inheritance for the cache_entry BO field

  • winsys/amdgpu: use inheritance for the real BO

  • winsys/amdgpu: use inheritance for the sparse BO

  • winsys/amdgpu: use inheritance for the slab BO

  • winsys/amdgpu: move lock from amdgpu_winsys_bo into sparse and real BOs

  • winsys/amdgpu: don’t count memory usage because it’s unused

  • winsys/amdgpu: change real/slab/sparse_buffers to buffer_lists[3]

  • winsys/amdgpu: change amdgpu_lookup_buffer to take struct amdgpu_buffer_list

  • winsys/amdgpu: clean up duplicated code around amdgpu_lookup/add_buffer

  • winsys/amdgpu: return amdgpu_cs_buffer* from add/lookup_buffer instead of index

  • winsys/amdgpu: pass amdgpu_buffer_list* to amdgpu_add_bo_fences_to_dependencies

  • winsys/amdgpu: clean up the rest of the code for cs->buffer_lists

  • winsys/amdgpu: fix amdgpu_cs_has_user_fence for VPE

  • winsys/amdgpu: document BO structures

  • ci: disable the google/freedreno farm because it’s down

  • glthread: add a missing end-of-batch marker

  • mesa: micro-improvements in draw.c

  • st/mesa: restore pipe_draw_info::mode at the end of st_hw_select_draw_gallium

  • mesa: add a pipe_draw_indirect_info* parameter into the DrawGallium callback

  • mesa: enable GL_SELECT and GL_FEEDBACK modes for indirect draws

  • winsys/amdgpu: reduce wasted memory due to the size tolerance in pb_cache

  • gallium/pb_slab: move group_index and entry_size from pb_slab_entry to pb_slab

  • iris,zink,winsys/amdgpu: remove unused/redundant slab->entry_size

  • winsys/amdgpu: rename to amdgpu_bo_slab to amdgpu_bo_slab_entry

  • winsys/amdgpu: stop using pb_buffer::vtbl

  • gallium/pb_cache: remove pb_cache_entry::end to save space

  • gallium/pb_cache: switch time variables to milliseconds and 32-bit type

  • radeon_winsys: add struct radeon_winsys* parameter into fence_reference

  • r300,r600,radeon/winsys: always pass the winsys to radeon_bo_reference

  • winsys/amdgpu: don’t layer slabs, use only 1 level of slabs, it improves perf

  • winsys/amdgpu: add amdgpu_bo_real_reusable slab for the backing buffer

  • winsys/amdgpu: remove now-redundant amdgpu_bo_slab_entry::real

  • winsys/amdgpu: remove va (gpu_address) from amdgpu_bo_slab_entry

  • winsys/amdgpu: don’t use gpu_address to compute slab entry offset in bo_map

  • gallium/pb_buffer: define pb_buffer_lean without vtbl, inherit it by pb_buffer

  • gallium/pb_cache: switch to pb_buffer_lean

  • gallium/pb_cache: remove pb_cache_entry::mgr

  • gallium/pb_cache: remove pb_cache_entry::buffer

  • winsys/radeon: stop using pb_buffer::vtbl

  • r300,r600,radeonsi: switch to pb_buffer_lean

  • winsys/amdgpu: allocate 1 amdgpu_bo_slab_entry per cache line

  • winsys/amdgpu: compute bo->unique_id at pb_slab_alloc, not at memory allocation

  • winsys/amdgpu: rewrite BO fence tracking by adding a new queue fence system

  • winsys/amdgpu: rename amdgpu_winsys_bo::bo -> bo_handle

  • winsys/amdgpu: rename amdgpu_bo_sparse::lock -> commit_lock

  • winsys/amdgpu: rename amdgpu_bo_real::lock to map_lock

  • winsys/amdgpu: remove dependency_flags parameter from cs_add_fence_dependency

  • winsys/amdgpu: implement explicit fence dependencies as sequence numbers

  • winsys/amdgpu: use pipe_reference for amdgpu_ctx refcounting

  • winsys/amdgpu: don’t use amdgpu_fence::ctx for fence dependencies

  • winsys/amdgpu: simplify code using amdgpu_cs_context::chunk_ib

  • radeonsi/ci: add gfx11 flakes

  • glthread: don’t unroll draws using user VBOs with GLES

  • glthread: add proper helpers for call fences

  • gallium/u_threaded_context: use function table to jump to different draw impls

  • mesa,u_threaded_context: add a fast path for glDrawElements calling TC directly

  • gallium/u_threaded: use a dummy end call to indicate the end of the batch

  • gallium/u_threaded: remove unused param from tc_bind_buffer/add_to_buffer_list

  • gallium/u_threaded: keep it enabled even if the CPU count is 1

  • meson: require libdrm_amdgpu 2.4.119

  • winsys/amdgpu: remove amdgpu_bo_real::gpu_address, use amdgpu_va_get_start_addr

  • winsys/amdgpu: remove amdgpu_bo_sparse::gpu_address, use amdgpu_va_get_start_addr

Mario Kleiner (1):

  • v3d: add B10G10R10[X2/A2]_UNORM to format table.

Mark Collins (8):

  • meson: Only include virtio when DRM available

  • meson: Only link libvdrm to Turnip with virtio KMD

  • meson: Update lua wrap to 5.4.6-4

  • freedreno/rddecompiler: Emit explicit scope for CP_COND_REG_EXEC

  • freedreno/rddecompiler: Decode ELSE branches using NOPs

  • freedreno/rddecompiler: Reset buffers after RD_CMDSTREAM_ADDR

  • freedreno/rddecompiler: Print pkt values in hex

  • freedreno/rddecompiler: Add ability to read GPU buffer into file

Mark Janes (7):

  • iris: make shader cache content deterministic

  • anv: make shader cache content deterministic

  • intel: remove workaround for preproduction DG2 steppings

  • intel/dev: improve descriptions of workaround macros.

  • intel/dev: poison macros for workarounds fixed at a stepping

  • intel: remove MTL a0 workarounds

  • intel/dev: update workaround definitions to latest defect status

Mart Raudsepp (1):

  • docs: Fix typo in OpenGL 3.3 support on Asahi

Martin Roukala (né Peres) (12):

  • zink/ci: drop the concurrency of the zink-radv-vangogh-valve job

  • ci/b2c: fix artifact collection

  • radv/ci: fix `vkcts-navi21-valve` execution

  • Revert “ci/deqp-runner: turn paths in errors into links”

  • radv: disable meshShaderQueries on gfx10.3

  • amd/ci: reduce Renoir’s concurrency to 16

  • ci/b2c: fix the `cmdline_extra` variable name

  • ci: disable the valve-kws farm until it can be rebooted

  • Revert “ci: disable the valve-kws farm until it can be rebooted”

  • ci: disable mupuf’s farm

  • ci: disable collabora’s farm which appears to be down

  • Revert “ci: disable mupuf’s farm”

Mary Guillemard (37):

  • venus: skip bind sparse info when checking for feedback query

  • nir: Add AGX-specific doorbell and stack mapping opcodes

  • agx: Add doorbell and stack mapping opcodes

  • agx: Handle doorbell and stack mapping intrinsics

  • asahi: clc: Handle doorbell and stack mapping intrinsics

  • agx: Add stack load and store opcodes

  • agx: Implement scratch load/store

  • agx: Add stack adjust opcode

  • agx: Emit stack_adjust in the entrypoint

  • zink: Check for VK_EXT_extended_dynamic_state3 before setting A2C

  • nak: sm75: Fix panic when encoding MUFU with SQRT and TANH

  • nak: Make PRMT selection a Src

  • nak: Add support for fddx and fddy

  • nak: Add for_each_instr in Shader

  • nak: Gather global memory usage for ShaderInfo

  • nak: Fix ALD/AST encoding for vtx and offset

  • nak: Add a complete wrapper around SPH

  • nak: Collect information to create SPH

  • nak: Remove encode_hdr_for_nir

  • nak: Restructure ShaderInfo

  • nak: Add geometry shader support

  • nak: Ensure we allocate one barrier when using BAR.SYNC

  • nak: Implement VK_KHR_shader_terminate_invocation

  • nak: Move nir_lower_int64 after I/O lowering

  • nak: Pass offset to load_frag_w

  • nak: Rewrite nir_intrinsic_load_sample_pos and implement nir_intrinsic_load_barycentric_at_sample

  • nir: Add a ldtram_nv intrinsic

  • nak: Add more bits discovered in SPH

  • nvk: Implement VK_KHR_fragment_shader_barycentric

  • nvk: Disable flush on each queries and flush at the end

  • nvk: Implement VK_EXT_primitives_generated_query

  • venus: Do not submit batch manually when no feedback is required

  • nak: Fix NAK_ATTR_CLIP_CULL_DIST_7 wrong value

  • nak: sm50: Implement FFMA

  • zink: Force 128 fs input components under Venus for Intel

  • zink: Initialize pQueueFamilyIndices for image query / create

  • zink: Always fill external_only in zink_query_dmabuf_modifiers

Matt Turner (11):

  • r600: Add missing dep on git_sha1.h

  • util: Include stdint.h in libdrm.h

  • util: Provide DRM_DEVICE_GET_PCI_REVISION definition

  • ci/lava: Add firmware-misc-nonfree on amd64

  • intel: Only validate inst compaction if debugging a shader stage

  • iris: Only initialize batch decoder if necessary

  • symbols-check: Add _GLOBAL_OFFSET_TABLE_

  • nir: Fix cast

  • nir/tests: Reenable tests that failed on big-endian

  • util: Add DETECT_ARCH_HPPA macro

  • util/tests: Disable half-float NaN test on hppa/old-mips

Mauro Rossi (3):

  • Android.mk: filter out cflags to build with Android 14 bundled clang

  • Android.mk: disable android-libbacktrace to build with Android 14

  • Android.mk: be able to build radeonsi without llvm

Max R (3):

  • virgl: Implement clear_render_target and clear_depth_stencil

  • ci: Uprev virglrenderer

  • d3d10umd: Fix compilation

Maíra Canal (22):

  • v3dv: implement VK_EXT_multi_draw

  • v3dv: move multisync functions to the beginning of the file

  • v3dv: allow different in/out sync queues

  • v3dv: allow set_multisync() to accept more wait syncobjs

  • drm-uapi: extend interface for indirect CSD CPU job

  • v3dv: check CPU queue availability

  • v3dv: create a CPU queue type

  • v3dv: use the indirect CSD user extension

  • v3dv: occlusion queries aren’t handled with a CPU job

  • drm-uapi: extend interface for timestamp query CPU job

  • v3dv: use the timestamp query user extension

  • drm-uapi: extend interface for reset timestamp CPU job

  • v3dv: use the reset timestamp user extension

  • drm-uapi: extend interface for copy timestamp results CPU job

  • v3dv: use the copy timestamp query results user extension

  • drm-uapi: extend interface for the reset performance query CPU job

  • v3dv: don’t start iterating performance queries at zero

  • v3dv: use the reset performance query user extension

  • drm-uapi: extend interface for copy performance query CPU job

  • v3dv: use the copy performance query results user extension

  • v3d/v3dv: move V3D_CSD definitions to a separate file

  • v3dv: enable CPU jobs in the simulator

Michael Catanzaro (1):

  • util: create parents of disk cache directory if needed

Michael Tretter (1):

  • egl/wayland: fix formatting and add trailing comma

Michel Dänzer (2):

  • gallium/dri: Return __DRI_ATTRIB_SWAP_UNDEFINED for _SWAP_METHOD

  • glx: Handle IGNORE_GLX_SWAP_METHOD_OML regardless of GLX_USE_APPLEGL

Mike Blumenkrantz (48):

  • zink: don’t block large vram allocations

  • vulkan/wsi: unify all the image usage flag caps

  • draw: fix uninit variable false positive

  • zink: add copy box locking

  • tc: add non-definitive tracking for batch completion

  • tc: always track fb attachments

  • tc: add batch usage tagging to threaded_resource

  • tc: use strong refs for fb attachment tracking

  • tc: allow unsynchronized texture_subdata calls where possible

  • zink: handle unsynchronized image maps from tc

  • zink: barrier_cmdbuf -> reordered_cmdbuf

  • zink: assert that transfer_dst is available before doing buf2img

  • zink: rework cmdbuf submission to be more extensible

  • zink: add a third cmdbuf for unsynchronized (not reordered) ops

  • zink: add flag to restrict unsynchronized texture access

  • zink: add locking for batch refs

  • zink: enable unsynchronized texture uploads using staging buffers

  • ci: skip zink vram test

  • ci: bump VVL to 1.3.269

  • zink: emit SpvCapabilitySampleRateShading with SampleId

  • zink: always set VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT for usermem

  • zink: clamp resolve extents to src/dst geometry

  • zink: only emit xfb execution mode for last vertex stage

  • aux/u_transfer_helper: set rendertarget bind for msaa staging resource

  • zink: unset explicit_xfb_buffer for non-xfb shaders

  • mesa/st/texture: match width+height for texture downloads of cube textures

  • zink: add more locking for compute pipelines

  • radv: correctly return oom from the device when failing to create a cs

  • zink: make (some) vk allocation commands more robust against vram depletion

  • zink: check for cbuf0 writes before setting A2C

  • vk/cmd_queue: exempt more descriptor functions from autogeneration

  • vulkan: add wrappers for descriptor ‘2’ functions

  • zink: enforce maxTexelBufferElements for texel buffer sizing

  • zink: always force flushes when originating from api frontend

  • vk/cmd_queue: stop using explicit casts

  • vk/cmd_queue: generate maint6 functions

  • vk/cmd_queue: fix up indentation a little

  • lavapipe: maint6 descriptor stuff

  • lavapipe: maint6

  • zink: fix buffer rebind early-out check

  • zink: ignore tc buffer replacement info

  • vk/cmdbuf: add back deleted maint6 workgraph bits

  • lavapipe: use pushconstants2 for dgc

  • lavapipe: fix devenv icd filename

  • zink: fix separate shader patch variable location adjustment

  • zink: set more dynamic states when using shader objects

  • zink: always map descriptor buffers as COHERENT

  • zink: fix descriptor buffer unmaps on screen destroy

Mohamed Ahmed (4):

  • nvk: Fix GetImageSubResourceLayout for non-disjoint images

  • nil: Add support for linear images

  • nvk: Wire up rendering to linear

  • nvk: Enable linear images for texturing

Molly Sophia (1):

  • tu: Fix KHR_present_id and KHR_present_wait being used without initialization

Nanley Chery (11):

  • iris: Optimize BO_ALLOC_ZEROED for suballocations

  • iris: Zero the clear color before FCV_CCS_E rendering

  • iris: Don’t memset the clear color BO during aux init

  • iris: Simplify get_main_plane_for_plane

  • iris: Simplify a plane count check in from_handle

  • iris: Use helpers for generic aux plane importing

  • iris: Inline import_aux_info

  • iris: Use common res fields for imported planes

  • iris: Delay main and aux resource creation on import

  • isl: Handle MOD_INVALID in clear color plane check

  • iris: Fix lowered images in get_main_plane_for_plane

Neha Bhende (1):

  • ntt: lower indirect tesslevels in ntt

Patrick Lerda (1):

  • glsl/nir: fix gl_nir_cross_validate_outputs_to_inputs() memory leak

Paulo Zanoni (34):

  • anv: don’t forget to destroy device->vma_mutex

  • anv: alloc client visible addresses at the bottom of vma_hi

  • anv/sparse: join multiple bind operations when possible

  • anv/sparse: join multiple NULL binds when possible

  • anv/sparse: also print bind->address at dump_anv_vm_bind

  • intel/genxml: add the Gen12+ TR-TT registers

  • anv/sparse: extract anv_sparse_bind()

  • anv: setup the TR-TT vma heap

  • vulkan: fix potential memory leak in create_rect_list_pipeline()

  • anv/sparse: allow sparse resouces to use TR-TT as its backend

  • anv/sparse: fix limits.sparseAddressSpaceSize when using vm_bind

  • anv/trtt: join L1 writes into a single MI_STORE_DATA_IMM when possible

  • anv/trtt: also join the L3/L2 writes into a single MI_STORE_DATA_IMM

  • anv/sparse: drop anv_sparse_binding_data from dump_anv_vm_bind()

  • anv/sparse: join all submissions into a single anv_sparse_bind() call

  • anv/sparse: pass anv_sparse_submission to the backend functions

  • anv/sparse: add ‘queue’ to anv_sparse_submission

  • anv/trtt: use ‘queue’ from anv_sparse_submission in the backend

  • anv/sparse: move waiting/signaling syncobjs to the backends

  • anv/sparse: process image binds before opaque image binds

  • anv/i915: extract setup_execbuf_fence_params()

  • anv/xe: allow passing extra syncs to xe_exec_process_syncs()

  • anv/trtt: don’t wait/signal syncobjs using the CPU anymore

  • anv/trtt: add struct anv_trtt_batch_bo and pass it around

  • anv/trtt: add support for queue->sync to the TR-TT batches

  • anv/trtt: properly handle the lifetime of TR-TT batch BOs

  • anv: enable sparse by default on i915.ko

  • anv/sparse: don’t support YCBCR 2x1 compressed formats

  • anv+zink/ci: document new sparse failures

  • anv/sparse: reject binds that are not a multiple of the granularity

  • anv/tr-tt: assert the bind size is a multiple of the granularity

  • anv/sparse: check if the non-sparse version is supported first

  • anv/sparse: document USAGE_2D_3D_COMPATIBLE as non-standard too

  • intel/tools: fix compilation of intel_hang_viewer on 32 bits

Pavel Asyutchenko (1):

  • mesa/main: allow S3TC for 3D textures

Pavel Ondračka (17):

  • r300: add late vectorization after nir_move_vec_src_uses_to_dest

  • r300: small adress register load optimization

  • r300: nir fcsel/CMP lowering pass for R500

  • r300: add some more early bool lowering

  • r300: lower flrp in NIR

  • r300: fcsel_ge lowering from lowered ftrunc

  • r300: lower ftrunc in NIR

  • r300: remove backend CMP lowering

  • r300: remove backend LRP lowering

  • r300: mark load_ubo_vec4 with ACCESS_CAN_SPECULATE

  • r300: fix memory leaks in compiler tests

  • ci: uprev mesa-trigger container

  • ci: add r300 RV530 dEQP gles2 CI job

  • r300/ci: add missing kernel url quotes

  • r300/ci: switch to b2c v0.9.11

  • r300/ci: add piglit job

  • r300: fix reusing of color varying slots for generic ones

Peyton Lee (6):

  • frontends, va: add new parameters of post processor

  • amd,radeonsi: add libvpe

  • amd: add new hardware ip for vpe

  • amd, radeonsi: add si_vpe.c with helper functions of VPE lib

  • amd, radeonsi: supports post processing entrypoint

  • winsys, amdgpu, drm: add VPE submission handle

Phillip Pearson (1):

  • radeonsi: use PRIu64 instead of %lu for uint64_t formatting

Pierre-Eric Pelloux-Prayer (23):

  • mesa: restore call to _mesa_set_varying_vp_inputs from set_vertex_processing_mode

  • radeonsi/ci: update failures

  • radeonsi: check sctx->tess_rings is valid before using it

  • Revert “radeonsi: decrease PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS to 1024”

  • egl/wayland: set the correct modifier for the linear_copy image

  • radeonsi: use a compute shader to convert unsupported indices format

  • radeonsi: update guardband if vs_disables_clipping_viewport changes

  • radeonsi/sqtt: fix RGP pm4 state emit function

  • radeonsi/sqtt: clear record_counts variable

  • radeonsi/sqtt: rework pm4.reg_va_low_idx

  • radeonsi/sqtt: use calloc instead of malloc

  • radeonsi/sqtt: reformat with clang-format

  • radeonsi/sqtt: fix capturing indirect dispatches with SQTT

  • radeonsi/winsys: add cs_get_ip_type function

  • radeonsi/sqtt: fix emitting SQTT userdata when CAM is needed

  • radeonsi/sqtt: fix capturing RGP on RDNA3 with more than one Shader Engine

  • radeonsi/sqtt: handle COMPUTE queues as well

  • radeonsi: fix extra_md handling with fmask

  • ac/surface: don’t oversize surf_size

  • radeonsi: compute epitch when modifying surf_pitch

  • Revert “ci/radeonsi: disable VA-API testing on raven”

  • radeonsi: emit cache flushes before draw registers

  • radeonsi: adjust flags for si_compute_shorten_ubyte_buffer

Qiang Yu (35):

  • aco: do not fix_exports when separately compiled ngg vs or es

  • aco: add create_end_for_merged_shader

  • aco: extend max operands in a instruction to 128

  • aco: move end program handling to select_shader

  • aco: stop emit s_endpgm for first stage of merged shader

  • aco: add aco_is_gpu_supported

  • radeonsi: add vs prolog args needed by aco ls vgpr fix

  • radeonsi: fill aco shader info for part mode merged shader

  • radeonsi: enable aco compilation for merged shader parts

  • radeonsi: move use_aco to si_screen

  • radeonsi: move llvm compiler alloc/free into create/destroy funcntion

  • radeonsi: stop llvm context creation when use aco

  • radeonsi: move llvm internal header to si_shader_llvm.h

  • radeonsi: selectively build si llvm compiler create/destroy

  • radeonsi: selectively build llvm compile

  • radeonsi: set use_aco when no llvm available

  • radeonsi: include ac_llvm_util.h when llvm available

  • radeonsi: disk cache remove llvm dependancy when use aco

  • radeonsi: does not call llvm init when no llvm available

  • radeonsi: change compiler name for aco

  • radeonsi: selectively build llvm files

  • meson: be able to build radeonsi without llvm

  • radeonsi: fix piglit image coherency test when use aco

  • aco,radv: add aco_is_nir_op_support_packed_math_16bit

  • radeonsi: only vectorize nir ops that aco support

  • ac/llvm: remove nir_op_*2*mp ops handling

  • nir: add force_f2f16_rtz option to lower f2f16 to f2f16_rtz

  • aco,ac/llvm,radeonsi: lower f2f16 to f2f16_rtz in nir

  • aco: set MIMG unrm for GL_TEXTURE_RECTANGLE

  • aco: handle GL_TEXTURE_RECTANGLE in tg4_integer_workarounds

  • radeonsi: add missing args in spi_ps_input_ena when fbfetch output

  • nir: fix load layer id system_values_read info gather

  • aco: fix set_wqm segfault when ps prolog

  • radeonsi: fix legacy merged LS/ES workgroup size for aco compilation

  • radeonsi: unify elf and raw shader binary upload

Raphaël Gallais-Pou (1):

  • gallium: add sti DRM entry point

Rhys Perry (55):

  • nir: add helpers to skip idempotent passes

  • radv: use NIR_LOOP_PASS helpers

  • aco: add VALU/SALU/VMEM/SMEM statistics

  • aco: collect Pre-Sched SGPRs/VGPRs before spilling

  • radv: call lower_array_deref_of_vec before lower_io_arrays_to_elements

  • radv: skip radv_remove_varyings for mesh shaders

  • radv: disable gs_fast_launch=2 by default

  • aco/tests: fix tests with LLVM 17

  • aco/tests: fix tests with LLVM 18

  • aco: workaround LS VGPR initialization bug in RADV prologs

  • aco: skip LS VGPR initialization bug workaround if the prolog exists

  • radv: set prolog as_ls if has_ls_vgpr_init_bug=true

  • docs: fix RADV_THREAD_TRACE_CACHE_COUNTERS default

  • nir/lower_fp16_casts: correctly round RTNE f64->f16 casts

  • nir/lower_fp16_casts: add option to split fp64 casts

  • radeonsi: use nir_lower_fp16_casts

  • radv: use nir_lower_fp16_casts

  • aco: remove f16<->f64 conversions

  • intel/compiler: use nir_lower_fp16_casts

  • radv: add radv_disable_trunc_coord option

  • radv: enable radv_disable_trunc_coord for vkd3d-proton/DXVK

  • ac/gpu_info: update conformant_trunc_coord comment

  • ac/nir: fix partial mesh shader output writes on GFX11

  • ac/nir: ignore 8/16-bit global access offset

  • ac/nir: fix 32-bit offset global access optimization

  • aco: flush denormals for 16-bit fmin/fmax on GFX8

  • aco: implement 16-bit fsign on GFX8

  • aco: implement 16-bit derivatives

  • aco: implement 16-bit fsat on GFX8

  • aco: simplify v_mul_* labelling slightly

  • aco: insert p_end_wqm before p_jump_to_epilog

  • nir/loop_analyze: skip if basis/limit/comparison is vector

  • nir/loop_analyze: scalarize try_eval_const_alu

  • nir/loop_analyze: fix vector basis/limit/comparison

  • nir/loop_analyze: check min compatibility with comparison

  • nir/loop_analyze: support umin and {u,i,f}max

  • nir/loop_analyze: support loops with min/max and non-add incrementation

  • vulkan/wsi: don’t support present with queues where blit is unsupported

  • vulkan/wsi: fix win32 compilation

  • vulkan/wsi: always create command buffer for special blit queues

  • nir/loop_analyze: remove invariance analysis

  • aco/tests: use more raw strings

  • aco: correctly set min/max_subgroup_size for wave32-as-wave64

  • radv: use CS wave selection for task shaders

  • radv: remove radv_shader_info’s cs.subgroup_size

  • nir: add msad_4x8

  • nir/algebraic: optimize vkd3d-proton’s MSAD

  • aco: implement msad_4x8

  • ac/llvm: implement msad_4x8

  • radv: enable msad_4x8

  • nir: remove sad_u8x4

  • radv: do nir_shader_gather_info after radv_nir_lower_rt_abi

  • nir/lower_non_uniform: set non_uniform=false when lowering is not needed

  • nir/lower_shader_calls: remove CF before nir_opt_if

  • aco: fix labelling of s_not with constant

Rob Clark (34):

  • ci: Only strip debug symbols

  • tu/msm: Fix timeline semaphore support

  • tu/virtio: Fix timeline semaphore support

  • freedreno/drm: Fix race in zombie import

  • freedreno: Fix modifier determination

  • freedreno: Handle DRM_FORMAT_MOD_QCOM_TILED3 import

  • virtio/drm: Split out common virtgpu drm structs

  • freedreno/drm: Simplify backend mmap impl

  • virtio: Add vdrm native-context helper

  • freedreno/drm/virtio: Switch to vdrm helper

  • tu/drm/virtio: Switch to vdrm helper

  • freedreno/a6xx: Assume MOD_INVALID imports are linear

  • freedreno/a6xx: Fix antichamber trace replay assert

  • Revert “ci/freedreno: disable antichambers trace”

  • freedreno/a6xx: Don’t set patch_vertices if no tess

  • freedreno/a6xx: Rework wave input size

  • freedreno/drm: Fix mmap leak

  • freedreno: Always attach bo to submit

  • isaspec: Sort labels with same output

  • freedreno/drm: Fix zombie BO import harder

  • freedreno/a6xx: Fix NV12+UBWC import

  • freedreno: De-duplicate 19.2MHz RBBM tick conversion

  • freedreno: Fix timestamp conversion

  • freedreno: Implement PIPE_CAP_TIMER_RESOLUTION

  • drm-uapi: Sync drm-uapi

  • freedreno/layout: Add layout metadata

  • tu: Add metadata support for dedicated allocations

  • freedreno/drm: Add BO metadata support

  • freedreno: Add layout metadata support

  • ci: More context for color_clear skips for Wayland

  • ci: List specific color_clears skips

  • ci: Add wayland-dEQP-EGL.functional.render.* skips

  • ci: Remove per-driver wayland-dEQP-EGL xfails

  • freedreno/drm/virtio: Fix typo

Robert Foss (3):

  • egl/surfaceless: Fix EGL_DEVICE_EXT implementation

  • egl: Add _eglHasAttrib() function

  • egl/surfaceless: Don’t overwrire disp->Device if using EGL_DEVICE_EXT

Robert Mader (4):

  • util: Add new helpers for pipe resources

  • panfrost: Support parameter queries for main planes

  • vc4/resource: Support offset query for multi-planar planes

  • v3d/resource: Support offset query for multi-planar planes

Rohan Garg (31):

  • intel/compiler: migrate WA 14013672992 to use WA framework

  • blorp,anv,iris: refactor blorp functions into something more generic

  • iris: Wa 16014538804 for DG2, MTL A0

  • iris: pull WA 22014412737 into emit_3dprimitive_was

  • anv: WA 16014538804 for DG2, MTL A0

  • blorp: WA 16014538804 for DG2, MTL A0

  • anv: Refactor loading indirect parameters and filling IDD

  • anv: refactor kernel dispatch to use new common functions

  • intel/dev: Add a bit for when the HW can do a indirect draw/dispatch unroll

  • genxml/12.5: Add the EXECUTE_INDIRECT_DRAW instruction

  • genxml/12.5: Add the EXECUTE_INDIRECT_DISPATCH instruction

  • anv: Emit EXECUTE_INDIRECT_DRAW when available

  • anv: Emit a EXECUTE_INDIRECT_DISPATCH when available

  • iris: Emit a EXECUTE_INDIRECT_DISPATCH when available

  • anv: memcpy the thread dimentions only when they’re on the CPU

  • anv: introduce ANV_TIMESTAMP_REWRITE_INDIRECT_DISPATCH

  • intel/genxml: Add the preferred slm size enum for xe2

  • intel: Set a preferred SLM size for LNL

  • intel/genxml: Update COMPUTE_WALKER_BODY for xe2

  • intel/genxml: Update IDD for new fields

  • blorp: set min/max viewport depths to -FLT_MAX/FLT_MAX when EXT_depth_range_unrestricted is enabled

  • anv: ensure that we clamp only when EXT_depth_range_unrestricted is not enabled

  • anv: enable VK_EXT_depth_range_unrestricted

  • iris: Emit EXECUTE_INDIRECT_DRAW when available

  • intel/compiler: use the proper enum type to store the op

  • intel/compiler: infer the number of operands using lsc_op_num_data_values

  • anv: rename anv_create_companion_rcs_command_buffer to anv_cmd_buffer_ensure_rcs_companion

  • iris,isl: Adjust driver for several commands of clear color (xe2)

  • intel/fs/xe2+: Lift CPS dispatch width restrictions on Xe2+.

  • intel/compiler: Update disassembly for new LSC cache enums

  • anv: untyped data port flush required when a pipeline sets the VK_ACCESS_2_SHADER_STORAGE_READ_BIT

Roland Scheidegger (1):

  • lavapipe: bump image alignment up to 64 bytes

Roman Stratiienko (5):

  • v3d: Don’t implicitly clear the content of the imported buffer

  • u_gralloc: Extract common code from fallback gralloc

  • u_gralloc: Add QCOM gralloc support

  • egl/android: Switch to generic buffer-info code

  • u_gralloc: Add support for gbm_gralloc

Ruijing Dong (12):

  • radeonsi/vcn: vcn4 encoding interface dummy update

  • radeonsi/vcn: preparation for enc intra-refresh

  • radeonsi/vcn: change intra-ref name

  • radonesi/vcn: enable intra-refresh in vcn encoders

  • frontends/va: add intra-refresh in VAAPI interface

  • radesonsi/vcn add qp_map definition

  • frontends/va: add ROI feature

  • radeonsi/vcn: ROI feature implementation

  • radeonsi/vcn: enable ROI feature in vcn.

  • radeonsi/vcn: ROI capability value initialization.

  • frontends/va: remove some TODOs in hevc encoding

  • radeonsi/vcn: update session_info from vcn3 and up.

Ryan Neph (6):

  • virgl: implemement resource_get_param() for modifier query

  • venus: add VN_PERF=no_tiled_wsi_image

  • venus: strip ALIAS_BIT for WSI image creation on ANV

  • venus: reject multi-plane modifiers for tiled wsi images

  • venus: add dri option to enable multi-plane wsi modifiers

  • venus: fix shmem leak on vn_ring_destroy

Sagar Ghuge (24):

  • iris: Disable auxiliary buffer if MSRT is bound as texture

  • iris: Disable CCS compression on top of MSAA compression on ACM

  • isl: Enable MCS compression on ACM platform

  • anv: Write timestamp using MI_FLUSH_DW on blitter

  • anv: Avoid emitting PIPE_CONTROL command for copy/video queue

  • anv: Flush data cache while clearing depth using HIZ_CCS_WT

  • anv: Add comment to copy image code block

  • iris: Init aux map state for compute engine

  • anv,hasvk: Use uint32_t for queue family indices

  • blorp: Handle stencil buffer compression on blitter engine

  • anv: Use RCS cmd buffer if blit src/dest has 3 components

  • intel/compiler: Adjust assertion in lower_get_buffer_size() for Xe2

  • intel/fs: Adjust destination size for image size intrinsic

  • intel/fs: Adjust destination size for global load constant on Xe2+

  • intel/fs: Adjust destination size for load ubo on Xe2+

  • intel/genxml: Add BCS/VD0 aux table base address register

  • anv: Handle video/copy engine queue initialization

  • anv: Invalidate aux map for copy/video engine

  • iris: Handle aux map init for copy engine

  • docs: Document INTEL_COPY_CLASS

  • anv: Enable blitter engine unconditionally on ACM+

  • iris: No need to emit PIPELINE_SELECT on Xe2+

  • anv: No need to emit PIPELINE_SELECT on Xe2+

  • intel/fs: Check fs_visitor instance before using it

Samuel Pitoiset (169):

  • radv: move RADV_DEBUG_NO_HIZ check in radv_use_htile_for_image()

  • radv: implement VK_EXT_image_compression_control

  • radv: advertise VK_EXT_image_compression_control

  • ac/gpu_info: remove bogus assertion about number of COMPUTE/SDMA queues

  • radv: dump the pipeline hash to the gpu hang report

  • radv: fix a synchronization issue with primitives generated query on RDNA1-2

  • ac/registers: allow to parse GCVM_L2_PROTECTION_FAULT_STATUS

  • ac/debug: add a helper to print GPUVM fault protection status

  • radv: use the GPUVM fault protection status helper

  • radv: remove NGG streamout support for RDNA1-2

  • radv: remove unnecessary VS_PARTIAL_FLUSH for NGG streamout

  • ac/nir: remove dead code in nir_intrinsic_xfb_counter_{add,sub}_amd

  • aco: remove dead code in nir_intrinsic_xfb_counter_{add,sub}_amd

  • radv/ci: update list of expected failures/flakes for NAVI31

  • radv: add RADV_DEBUG=nomeshshader

  • radv/ci: enable RADV_DEBUG=nomeshshader for vkcts-navi31-valve

  • radv: bind the non-dynamic graphics state from the pipeline unconditionally

  • radv: adjust binning settings to improve performance on GFX9

  • radv: fix compute shader invocations query on compute queue on GFX6

  • radv: emit COMPUTE_PIPELINESTAT_ENABLE for CS invocations on ACE

  • ci: backport two mesh/task query fixes for VKCTS

  • radv/ci: document one more flake test

  • nir: fix inserting the break instruction for partial loop unrolling

  • radv: add initial VK_EXT_device_fault support

  • radv: advertise VK_EXT_device_fault

  • ci: re-apply two mesh/task query fixes for VKCTS

  • radv: add a helper to determine if it’s possible to preprocess DGC

  • radv: emit individual SET_SH_REG for inlined push constants with DGC

  • radv: optimize emitting inlined push constants with DGC

  • radv: enable DGC preprocessing when all push constants are inlined

  • radv: restore sampling CPU/GPU clocks before starting SQTT trace

  • ac/rgp: update dumping queue event records to the capture

  • radv: add radv_write_timestamp() helper

  • radv: add support for RGP queue events

  • radv: add drirc options to force re-compilation of shaders when needed

  • radv: fix VRS subpass attachment when HTILE can’t be enabled on GFX10.3

  • radv: fix registering queues for RGP with compute only

  • radv: set radv_zero_vram=true for Unreal Engine 4/5

  • radv: fix a descriptor leak with debug names and host base descriptor set

  • radv: add a missing async compute workaround for Tonga/Iceland

  • zink/ci: add a manual job on radv-navi31

  • aco: remove useless nir_intrinsic_load_force_vrs_rates_amd

  • radv: remove redundant check when forcing VRS rates

  • radv: check earlier if a graphics pipeline can force VRS per vertex

  • ac/surface: change tile mode for 3D PRT surfaces with bpp < 64 on GFX6-8

  • radv: re-enable sparseResidencyImage3D on POLARIS10+

  • aco: rename color_exports to exports in create_fs_jump_to_epilog()

  • radv: rename ps_epilog_inputs to colors for PS epilogs

  • radv: add radv_physical_device::emulate_mesh_shader_queries for GFX10.3

  • radv: add support for mesh primitives queries on GFX10.3

  • radv: define new pipeline statistics indices for mesh/task on GFX11

  • radv: bump the pipeline state query size to 14 on GFX10.3

  • radv: do not harcode the pipeline stats mask for query resolves

  • radv: add support for mesh shader invocations queries on GFX10.3

  • radv: rework gfx10_copy_gds_query() slightly

  • radv: make some gang functions non-static

  • radv: add support for task shader invocations queries on GFX10.3

  • radv: enable meshShaderQueries on GFX10.3

  • radv/ci: add missing expected failures for mesh queries on VANGOGH

  • radv: disable TC-compatible HTILE on Tonga and Iceland

  • radv: add missing FDCC_CONTROL bits for GFX1103 R2

  • radv: set radv_invariant_geom=true for War Thunder

  • radv: do not set OREO_MODE to fix rare corruption on GFX11

  • ci: uprev vkd3d-proton to 2.11

  • radv/ci: add new flakes for VEGA10

  • radv: remove useless NIR instructions when emitting IBO with DGC

  • radv: set the stream VA for DGC graphics

  • radv: use an indirect draw when IBO isn’t updated as part of DGC

  • radv: enable DGC preprocessing for IBO

  • radv: fix bogus interaction between DGC and RT with descriptor bindings

  • radv: make sure to prefetch the compute shader for DGC

  • radv: remove radv_pipeline_key::dynamic_color_write_mask

  • radv: simplify creating image views for src resolve images

  • radv: stop performing redundant resolves with the HW resolve path

  • radv: remove unused layers support for the HW/FS resolve paths

  • radv: only re-initialize DCC for one level for the HW resolve path

  • radv: adjust assertions for multi-layer resolves with the HW/FS paths

  • radv: remove never used binds_state for DGC

  • radv: only initialize the VBO reg if VBOs are bound with DGC

  • radv: only initialize the VTX base SGPR if non-zero with DGC

  • radv: add DGC support for mesh shader only

  • radv: advertise VK_EXT_depth_clamp_zero_one

  • radv: update the reset stipple pattern mode

  • radv: change the reset stipple pattern mode for adjacent lines

  • radv: make sure to reset the stipple line state when it’s disabled

  • radv: set combinedImageSamplerDescriptorCount to 1 for multi-planar formats

  • radv: switch to on-demand PS epilogs for GPL

  • radv: remove unused code for compiling PS epilogs as part of pipelines

  • aco: export depth/stencil/samplemask in create_fs_jump_to_epilog()

  • ac/nir: add an option to skip MRTZ exports in ac_nir_lower_ps()

  • radv: determine if MRTZ needs to be exported via PS epilogs

  • radv: prepare the PS epilog key for exporting MRTZ on RDNA3

  • radv,aco: declare PS epilog VGPR arguments for depth/stencil/samplemask

  • radv: determine and emit SPI_SHADER_Z_FORMAT for PS epilogs

  • zink/ci: remove skipped tests from the list of expected failures for NAVI31

  • radv: export MRTZ via PS epilogs when alpha to coverage is dynamic on GFX11

  • radv: enable extendedDynamicState3AlphaToCoverageEnable on GFX11

  • zink/ci: skip more tests that run OOM on NAVI31

  • zink/ci: update list of failures for NAVI31

  • zink/ci: stop running zink-radv-navi31-valve sequentially

  • ci: uprev vkd3d-proton to a0ccc383937903f4ca0997ce53e41ccce7f2f2ec

  • radv: simplify disabling MRT compaction for PS epilogs

  • vulkan: bump headers/registry to 1.3.273

  • radv: promote EXT_calibrated_timestamps to KHR

  • docs: update features.txt for RADV

  • radv: remove useless check for TC-compat CMASK images during fb emission

  • radv: stop clearing FMASK_COMPRESS_1FRAG_ONLY for TC-compat CMASK images

  • vulkan/runtime: promote VK_EXT_vertex_attribute_divisor to KHR

  • radv: advertise VK_KHR_vertex_attribute_divisor

  • radv/ci: remove dEQP-VK.mesh_shader.ext.query.* from the lists

  • radv: emit the task shader in radv_emit_graphics_pipeline()

  • radv: cleanup ac_nir_lower_ps options

  • radv: cleanup gathering PS info with/without PS epilogs

  • radv: cleanup radv_pipeline_generate_ps_epilog_key()

  • radv: add support for MRT compaction with PS epilogs

  • radv: fix binding partial depth/stencil views with dynamic rendering

  • radv: stop asserting some image create info fields

  • radv: remove some declared but unused functions/macros

  • radv: add missing HTILE support for fb mip tail workaround

  • radv: stop checking FMASK for the fb mip tail workaround

  • radv: move emitting the fb mip tail workaround when rendering begins

  • radv: remove radv_get_tess_output_topology() declaration

  • radv: move meta declarations to radv_meta.h

  • radv: move RADV_HASH_SHADER_xxx flags to radv_pipeline.c

  • radv: move radv_image_is_renderable() to radv_image.c

  • radv: move more descriptor related declarations to radv_descriptor_set.h

  • radv: move radv_depth_clamp_mode to radv_cmd_buffer.c

  • radv: move more shader related declarations to radv_shader.h

  • radv: move SI_GS_PER_ES to radv_constants.h

  • radv: move buffer view related code to radv_buffer_view.c

  • radv: move image view related code to radv_image_view.c

  • vulkan: bump headers/registry to 1.3.274

  • vulkan: drop VK_ENABLE_BETA_EXTENSIONS for video encode layouts

  • radv/ci: update CI lists for NAVI10,NAVI31 and RENOIR

  • ci: apply two bugfixes for VKCTS

  • radv: move radv_{emulate,enable}_rt() to radv_physical_device.c

  • radv: make a couple of NIR RT functions as static

  • radv: move radv_rt_{common,shader} files to nir/

  • radv: move radv_BindImageMemory2() to radv_image.c

  • radv: add support for VkBindMemoryStatusKHR

  • radv: rename RADV_GRAPHICS_STAGES to RADV_GRAPHICS_STAGE_BITS

  • radv: add support for version 2 of all descriptor binding commands

  • radv: add support for NULL index buffer

  • radv: advertise VK_KHR_maintenance6

  • radv: disable FMASK for MSAA images with layers on GFX9

  • radv: stop clearing CMASK to 0xcc when FMASK is present on GFX9

  • radv: disable stencil test without a stencil attachment

  • radv: constify a variable in radv_emit_depth_control()

  • radv: remove duplicated si_tile_mode_index() function

  • radv: rename si_make_texture_descriptor() to gfx6_make_texture_descriptor()

  • radv: remove radv_write_scissors()

  • radv: drop si_ prefix from all functions

  • Revert “radv: disable DCC with signedness reinterpretation on GFX11”

  • radv: stop disabling DCC for mutable with 0 formats on GFX11

  • radv: do not program COMPUTE_MAX_WAVE_ID (GDS register) on GFX6

  • radv/winsys: replace ‘<= GFX6’ by ‘== GFX6’

  • radv: query drirc options in only one place

  • radv: move dri options to radv_instance::drirc

  • radv: rework declaring color arguments for PS epilogs

  • Revert “radv/rt: Lower ray payloads to registers”

  • radv: do not issue SQTT marker with DISPATCH_MESH_INDIRECT_MULTI

  • radv: add missing disable_shrink_image_store to the pipeline key

  • radv: move RADV_HASH_SHADER_KEEP_STATISTICS to radv_pipeline_key

  • radv: initialize radv_device::disable_trunc_coord earlier

  • radv: introduce radv_device_cache_key for per-device cache compiler options

  • radv: move all per-device keys from radv_pipeline_key to radv_device_cache_key

  • radv: fix indirect dispatches on the compute queue on GFX7

  • radv: fix indirect draws with NULL index buffer on GFX10

  • radv: fix segfault when getting device vm fault info

Sarah Walker (3):

  • pvr: Update AM62 DSS compatible string to match upstream

  • pvr: csbgen: Add dummy implementation of stream type

  • pvr: Add command stream and static context state layout to rogue_kmd_stream.xml

Sathishkumar S (1):

  • frontends/va: use va interface for jpeg partial decode

Sebastian Wick (1):

  • radeonsi: Destroy queues before the aux contexts

Sergi Blanch Torne (8):

  • ci: disable Collabora’s LAVA lab for maintenance

  • Revert “ci: disable Collabora’s LAVA lab for maintenance”

  • ci: disable Collabora’s LAVA lab for maintenance

  • Revert “ci: disable Collabora’s LAVA lab for maintenance”

  • Revert “ci: disable collabora farm as it is currently offline”

  • ci: disable Collabora’s LAVA lab for maintenance

  • Revert “ac/nir: Export clip distances according to clip_cull_mask”

  • Revert “ci: disable Collabora’s LAVA lab for maintenance”

Shuicheng Lin (1):

  • intel/xe: Correct DRM_XE_EXEC_QUEUE_SET_PROPERTY’s ioctl

Sil Vilerino (76):

  • d3d12: d3d12_video_buffer_create_impl - Fix resource importing

  • d3d12: Allow creating d3d12_dxcore_screen from existing ID3D12Device

  • vl/win32: Add vl_win32_screen_create_from_d3d12_device

  • gallium/auxiliary: Fix pb_bufmgr_slab.c leak

  • pipe: Extend get_feedback with additional metadata

  • pipe: Add PIPE_VIDEO_CAP_ENC_H264_DISABLE_DBK_FILTER_MODES_SUPPORTED

  • pipe: Add PIPE_VIDEO_CAP_ENC_INTRA_REFRESH_MAX_DURATION

  • pipe: Add H264 VUI encode params

  • pipe: Add HEVC VUI encode params

  • pipe: Add max_slice_bytes for H264, HEVC encoding

  • frontend/va: Add log2_max_frame_num_minus4 and log2_max_pic_order_cnt_lsb_minus4 for h264enc

  • frontend/va: Parse VUI H264 parameters

  • frontend/va: Parse VUI HEVC parameters

  • frontend/va: Support VAEncMiscParameterMaxSliceSize

  • meson: add vp9 and av1 codec support options

  • gallium/vl: Check for VP9 and AV1 meson option support flags

  • d3d12: Plumb pipe_h264_enc_picture_desc.dbk.disable_deblocking_filter_idc

  • d3d12: Use log2_max_frame_num_minus4 and log2_max_pic_order_cnt_lsb_minus4 from pipe_pic_params_h264

  • d3d12: Video Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported

  • d3d12: Disable codecs according to meson video-codecs option

  • d3d12: Implement H264 VUI Writer

  • d3d12: Implement HEVC VUI Writer

  • d3d12: Implement Intra Refresh for H264, HEVC, AV1

  • d3d12: Support PIPE_VIDEO_CAP_ENC_H264_DISABLE_DBK_FILTER_MODES_SUPPORTED

  • d3d12: Implement get_feedback with additional metadata

  • d3d12: fix usage of GetAdapterLuid() in mingw/GCC using ABI helper

  • ci: Build d3d12 gallium driver in debian-x86_32

  • pipe: Support inserting new headers on each H264/HEVC IDR frame

  • pipe: Add get_feedback_fence for encode async waiting on pipe_feedback_fence

  • pipe: Add fence_get_win32_handle to get HANDLE from pipe_fence_handle

  • pipe: Add p_video_codec.get_encode_headers for out of band VPS, SPS, PPS

  • pipe: Add PIPE_VIDEO_FEEDBACK_METADATA_TYPE_AVERAGE_FRAME_QP

  • pipe: Add PIPE_VIDEO_CAP_ENC_H264_SUPPORTS_CABAC_ENCODE

  • pipe: Add PIPE_H264_MAX_REFERENCES

  • frontend/va: Add h264 encode ip_period param

  • frontend/va: Add VACodedBufferSegment Average QP metadata

  • frontend/va: Use p_video_codec.get_feedback_fence to report errors on frame submission

  • vl_winsys_win32: call winsys->destroy(winsys) in error conditions

  • d3d12: Implement inserting optional new headers on each H264/HEVC IDR frame

  • d3d12: Do not increase active_seq_parameter_set_id on new SPS. Force PPS on new SPS

  • d3d12: H264 encode - Allow CONSTRAINED_BASELINE profile to be written in headers

  • d3d12: Implement get_feedback_fence for encode async waiting on pipe_feedback_fence

  • d3d12: Implement fence_get_win32_handle to get HANDLE from d3d12_fence

  • d3d12: Only pass texture dimensions to d3d12_video_encoder_update_current_encoder_config_state

  • d3d12: Implement d3d12_video_encoder_get_encode_headers for out of band VPS, SPS, PPS

  • d3d12: Use new pipe h264 encode ip_period param

  • d3d12: max_frame_poc workaround for infinite GOPs

  • d3d12: Fix max slice size and max frame size metadata reporting

  • d3d12: Implement PIPE_VIDEO_FEEDBACK_METADATA_TYPE_AVERAGE_FRAME_QP

  • d3d12: Autodetect d3d12_video_buffer imported handle/resource format and dimensions when not passed

  • d3d12: Implement PIPE_VIDEO_CAP_ENC_H264_SUPPORTS_CABAC_ENCODE

  • d3d12: Detect imported resource buffer unknown format

  • d3d12: Improve error detection and reporting for video encoder

  • d3d12: Fix d3d12_tcs_variant_cache_destroy leak in d3d12_context

  • d3d12: Fix screen->winsys leak in d3d12_screen

  • d3d12: d3d12_create_fence_win32 - Fix double refcount bump

  • d3d12: Fix max reference frames reporting when HW does not support B frame

  • d3d12: Video Encoder - When setting rate control dirty flags take into account rolled back optional configs

  • d3d12: Video Encoder: Support reporting non contiguous NALU, offsets for frontend extraction

  • meson: Add all, all_free (default) options for video-codecs option.

  • d3d12: Fix usage of H264/HEVC specific classes when VIDEO_CODEC_H26XENC not set

  • d3d12: Fix AV1 video encode 32 bits build

  • d3d12: Fix typos in d3d12_video_encoder_bitstream_builder_h264

  • d3d12: Use enc_constraint_set_flags for H264 NALU writing

  • frontends/va: Parse enc_constraint_set_flags from packed SPS

  • d3d12: Check video encode codec cap before checking encode profile/level cap

  • meson: Only build WGL for Windows platform when opengl option is active

  • d3d12: Bump directx-headers dependency to v611.0 for latest video codecs and features

  • d3d12: Remove D3D12_SDK_VERSION checks after bumping directx-headers dependency to v611

  • d3d12: Fix warning C4065 switch statement contains default but no case labels

  • d3d12: Implement Delta QP ROI In h264, hevc and av1 video encode

  • d3d12: Report support for PIPE_VIDEO_CAP_ENC_ROI for Delta QP

  • Revert “d3d12: Only destroy the winsys during screen destruction, not reset”

  • Revert “d3d12: Fix screen->winsys leak in d3d12_screen”

  • d3d12: Fix AV1 Encode - log2 rounding for tile_info section

  • d3d12: Implement cap for PIPE_VIDEO_CAP_ENC_INTRA_REFRESH

Simon Ser (3):

  • egl: extract EGLDevice setup in dedicated function

  • egl: move dri2_setup_device() after dri2_setup_extensions()

  • egl: ensure a render node is passed to _eglFindDevice()

Simon Zeni (2):

  • EGL: sync files with Khronos

  • egl: implement EGL_EXT_query_reset_notification_strategy

Sviatoslav Peleshko (23):

  • nir/loop_analyze: Fix inverted condition handling in iterations calculation

  • anv: Fix MI_ARB_CHECK calls in generated indirect draws optimization

  • nir/loop_analyze: Don’t test non-positive iterations count

  • intel/fs: Don’t optimize DW*1 MUL if it stores value to the accumulator

  • intel/compiler: Add variable to dump binaries of all compiled shaders

  • intel/disasm: Print half-float values instead of placeholder

  • intel/compiler: Set flag reg to 0 when disabling predication

  • intel/disasm: Print src1_len correctly depending on ExDesc type

  • intel/fs: Set group 0 for Wa_14010017096 MOV instruction

  • intel/eu/validate: Validate that the ExecSize is a factor of chosen ChanOff

  • intel/tools/i965_asm: Add SWSB handling

  • intel/tools/i965_asm: Handle HF immediates

  • intel/tools/i965_asm: Handle sync instruction

  • intel/tools/i965_asm: Allow neg and abs modifiers on accumulator register

  • intel/tools/i965_asm: Don’t override flag reg from cond modifier

  • intel/tools/i965_asm: Allow src0 and src2 of ternary instructions to be imm

  • intel/tools/i965_asm: Implement gfx12 and gfx12.5 send/sendc

  • intel/tools/i965_asm: Add dp4a and add3 instructions

  • intel/tools/i965_asm: Don’t set src0 for break and while on gfx12

  • intel/tools/tests: Fix sends indirect argument in gfx9 test

  • intel/tools/tests: Unbreak i965_asm tests

  • intel/tools/tests: Add i965_asm tests for gfx12 and gfx12.5

  • nir: Use alu source components count in nir_alu_srcs_negative_equal

Sylvain Munaut (1):

  • mesa/st, dri2, wgl, glx: Restore flush_objects interop backward compat

Tapani Pälli (34):

  • intel/dev: provide intel_device_info_is_adln helper

  • iris: add required PC for Wa_14014966230

  • anv: add current_pipeline for batch_emit_pipe_control

  • anv: add required PC for Wa_14014966230

  • intel/dev: fix intel_device_info_is_adln check

  • iris: handle tile case where cso width, height is zero

  • anv: skip engine initialization if vm control not supported

  • iris: add data cache flush for pre hiz op

  • anv/drirc: add option to disable FCV optimization

  • drirc: use fake_sparse for Armored Core 6

  • drirc: Set limit_trig_input_range option for Valheim

  • iris: implement Wa_18020335297

  • anv: refactor state emission

  • anv: implement Wa_18020335297

  • iris: implement dummy blit for Wa_16018063123

  • anv: implement dummy blit for Wa_16018063123

  • mesa: lower EXT_render_snorm version requirement

  • anv: use slow clear for small surfaces with Wa_18020603990

  • iris: use slow clear for small surfaces with Wa_18020603990

  • anv/hasvk/drirc: change anv_assume_full_subgroups to have subgroup size

  • drirc: setup anv_assume_full_subgroups=16 for UnrealEngine5.1

  • anv: cleanup, use intel_needs_workaround instead of is_dg2

  • iris: cleanup, use intel_needs_workaround instead of is_dg2

  • iris: use intel_needs_workaround with 14015055625

  • mesa: fix enum support for EXT_clip_cull_distance

  • drirc/anv: disable FCV optimization for Baldur’s Gate 3

  • isl: implement Wa_14018471104

  • iris: use workaround framework for Wa_22018402687

  • anv: use workaround framework for Wa_22018402687

  • anv: check for wa 16013994831 in emit_so_memcpy_end

  • iris: expand pre-hiz data cache flush to gfx >= 125

  • anv: expand pre-hiz data cache flush to gfx >= 125

  • iris: replace constant cache invalidate with hdc flush

  • anv: move *bits_for_access_flags to genX_cmd_buffer

Tatsuyuki Ishi (25):

  • fast_urem_by_const: #ifdef DEBUG an assertion.

  • radv: Fix mis-sizing of pipeline_flags in radv_hash_rt_shaders.

  • radv: Use sizeof(flags) instead of hardcoded size in radv_hash_shaders.

  • aco: Replace aco_vs_input_state.divisors with bitfields.

  • radv: Remove last VS prolog reuse logic.

  • radv, aco: Rework VS prolog key handling.

  • radv, aco: Inline struct aco_vs_input_state.

  • radv: Pre-mask misaligned_mask for VS prolog.

  • radv: Implement helpers for shader part caching.

  • radv: Use shader part caching helpers for VS prolog and PS/TCS epilog.

  • zink: Fix missing sparse buffer bind synchronization.

  • zink: Defer freeing sparse backing buffers.

  • zink: Fix waiting for texture commit semaphores.

  • zink: Remove now unused dead_framebuffers.

  • radv: Remove aspect mask “expansion” for copy_image.

  • radv: Add workaround to allow sparse binding on gfx queues.

  • radv: Enable radv_legacy_sparse_binding for DOOM Eternal.

  • radv/amdgpu: Remove virtual bo dump logic.

  • radv/amdgpu: Separate the concept of residency from use_global_list.

  • radv: Simplify shader config assignment.

  • radv: Move up radv_get_max_waves, radv_get_max_scratch_waves.

  • radv: Precompute shader max_waves.

  • radv: Add layer to skip UnmapMemory for Quantic Dream Engine

  • radv: Recompute max_waves after postprocessing RT config

  • radv: never set DISABLE_WR_CONFIRM for CP DMA clears and copies

Tele42 (1):

  • drirc: enable `vk_wsi_force_swapchain_to_current_extent` for “The Talos Principle VR”

Teng, Jin Chung (1):

  • d3d12: Decode - Adding more supported resolution

Thomas Devoogdt (1):

  • util: os_same_file_description: fix unknown linux < 3.5 syscall SYS_kcmp

Thomas H.P. Andersen (13):

  • docs: update nvk extensions

  • nvk: use nvk_pipeline_zalloc

  • nouveau: drop unused #includes of tgsi_parse.h

  • nvk: VK_EXT_color_write_enable

  • docs: update features.txt for nvk

  • nvk: loop over stages in MESA order

  • nvk: add hashing for shaders

  • nvk: allocatable nvk_shaders

  • nvk: pipeline shader cache

  • nvk: VK_EXT_pipeline_creation_feedback

  • nvk: VK_EXT_pipeline_creation_cache_control

  • nvk: VK_EXT_shader_module_identifier

  • docs: update features.txt for nvk

Thong Thai (1):

  • radeonsi/vcn: remove EFC support for renoir

Timothy Arceri (24):

  • nir: move build_write_masked_stores() to nir builder

  • glsl/nir: implement a nir based lower distance pass

  • glsl: switch to NIR distance lowering pass

  • glsl: remove now unused lower distance pass

  • nir: simplify nir_build_write_masked_store()

  • glsl: drop ir_binop_ubo_load

  • glsl: add nir based lower_named_interface_blocks()

  • glsl: use the nir based lower_named_interface_blocks()

  • glsl: remove GLSL IR lower_named_interface_blocks()

  • nir: add nir_fixup_deref_types()

  • glsl: support glsl linking in nir block linker

  • glsl: use new nir based block linker

  • glsl: remove now unused GLSL IR block linker

  • glsl/st: move has_half_float_packing flag to consts struct

  • glsl/st: move remaining glsl ir lowering to linker

  • mesa/st: drop additional validate_ir_tree() call

  • glsl: combine shader stage loops in linker

  • radeonsi: fix divide by zero in si_get_small_prim_cull_info()

  • glsl: tidy up validation loop in linker

  • glsl: remove some unused linker code

  • glsl: copy precision val of function output params

  • glsl: add additional lower mediump test

  • glsl: move glsl ir lowering out of glsl_to_nir()

  • glsl: add support for inout params to glsl_to_nir()

Timur Kristóf (32):

  • radv: Remove always false tmz variables from SDMA functions.

  • radv: Expose radv_get_dcc_max_uncompressed_block_size function.

  • radv: Implement buffer/image copies on transfer queues.

  • radv: Add temporary BO for transfer queues.

  • radv: Implement workaround for unaligned buffer/image copies.

  • ac: Rename SDMA max copy size macros to reflect SDMA version.

  • ac: Remove CIK prefix from SDMA opcodes.

  • ac: Add sdma_version enum and use it for SDMA features.

  • radv: Use GPU info for determining SDMA metadata support.

  • radv: Use SDMA version instead of gfx_level where possible.

  • radv: disable HTILE/DCC for concurrent images with transfer queue if unsupported.

  • radv: Disable DCC on exclusive images with transfer queue when SDMA doesn’t support it.

  • radv: Disable HTILE on exclusive images with transfer queues when SDMA doesn’t support it.

  • radv: Don’t retile DCC on transfer queues.

  • radv: Implement barriers for transfer queues.

  • radv: Implement vkCmdFillBuffer on transfer queues.

  • radv: Implement vkCmdWriteTimestamp2 on transfer queues.

  • radv: Implement vkCmdWriteBufferMarker2AMD on transfer queues.

  • radv: Implement buffer copies on transfer queues.

  • radv: Implement vkCmdUpdateBuffer on transfer queues.

  • radv: Move SDMA function and struct declarations to a new header.

  • radv: Unify SDMA surface struct for linear and tiled images.

  • radv: Refactor and simplify SDMA surface info functions.

  • radv: Pass radv_sdma_surf from copy functions to SDMA.

  • radv: Use SDMA surface structs for determining unaligned buffer copies.

  • radv: Clean up SDMA chunked copy info struct.

  • radv: Use correct plane and binding index with SDMA.

  • radv: Correct binding index for transfer buffer-image copies.

  • radv: Implement image copies on transfer queues.

  • radv: Implement T2T scanline copy workaround.

  • radv: Expose transfer queues, hidden behind a perftest flag.

  • radv: Correctly select SDMA support for PRIME blit.

Vignesh Raman (5):

  • ci: Add CustomLogger class and CLI tool

  • ci: copy logging script to install

  • ci: bare-metal: poe: Create structured logs

  • ci: bare-metal: cros-servo: Create structured logs for a630

  • ci/freedreno: add FARM variable

Vinson Lee (6):

  • ac/surface/tests: Remove duplicate variable block_size_bits

  • nir: Fix decomposed_prmcnt copy-paste error

  • nvk: Fix tautological-overlap-compare warning

  • etnaviv: Remove duplicate initializers

  • ac/rgp: Fix single-bit-bitfield-constant-conversion warning

  • intel/disasm: Remove duplicate variable reg_file

Violet Purcell (1):

  • gallium: Fix undefined symbols in version scripts

Vitaliy Triang3l Kuzmin (13):

  • r600: Move r600_create_vertex_fetch_shader to r600_shader.c

  • r600: Remove Gallium dependencies in r600_isa

  • r600: Replace R600_ERR with R600_ASM_ERR in shader code

  • r600: Remove Gallium dependencies in r600_asm

  • r600: Split r600_shader.h into common and Gallium parts

  • r600/sfn: Make r600 header include paths relative

  • r600/sfn: Split r600_shader_from_nir into common and Gallium parts

  • r600: Fix outputs typo in print_pipe_info

  • r600: Replace TGSI I/O semantics with shader_enums

  • r600/sfn: Change sampler_index to texture_index in buffer txs

  • r600/sfn: Remove unused sampler reference in emit_tex_lod

  • nir: Don’t skip lower_alu if only bit_count needs lowering

  • vulkan: Fix pipeline layout allocation scope

Vlad Schiller (1):

  • pvr: Fix VK_EXT_texel_buffer_alignment

VladimirTechMan (1):

  • venus/android: Switch to using u_gralloc

Yiwei Zhang (57):

  • venus: use common vk_image_format_to_ahb_format helper

  • venus: use common vk_image_usage_to_ahb_usage helper

  • venus: tiny refactor of device memory report interface

  • venus: avoid modifier prop query in vn_android_get_image_builder

  • venus: use common vk_image as vn_image base

  • venus: use common vk_device_memory as vn_device_memory base

  • venus: use common AHB management and export impl

  • venus: use vk_device_memory tracked export and import handle types

  • venus: use vk_device_memory tracked size

  • venus: use vk_device_memory tracked memory_type_index

  • venus: fix query feedback batch leak and race upon submission

  • zink: apply can_do_invalid_linear_modifier to Venus

  • venus: scrub msaa sample mask only with valid msaa state

  • venus: fix async compute pipeline creation

  • venus: properly initialize ring monitor initial alive status

  • venus: add missing shmem pool fini for cs_shmem pool

  • venus: reduce ring idle timeout from 50ms to 5ms

  • venus: use STACK_ARRAY to prepare for indirect submission

  • venus: enable renderer shmem cache dump for cache debug

  • venus: add ring helper to avoid redundant ring wait requests

  • venus: use instance allocator for ring allocs

  • venus: use instance allocator for indirect cs storage alloc

  • venus: add vn_instance_fini_ring helper

  • venus: refactor instance creation failure path

  • venus: move ring monitor to instance for sharing across rings

  • venus: refactor to add vn_watchdog

  • venus: further cleanup vn_relax_init to take instance instead of ring

  • venus: always set reply command stream to avoid seek

  • venus: make vn_renderer_shmem_pool thread-safe

  • venus: remove command_dropped tracking

  • venus: relax ring mutex

  • venus: move ring shmem into vn_ring

  • venus: move the rest ring belongings into ring

  • venus: move ring submission into ring

  • venus: move the actual ring creation into ring as well

  • venus: add vn_ring_get_id and hide vn_ring internals entirely

  • venus: switch to vn_ring as the protocol interface - part 1

  • venus: switch to vn_ring as the protocol interface - part 2

  • venus: switch to vn_ring as the protocol interface - part 3

  • venus: add vn_gettid helper

  • venus: dispatch background shader tasks to secondary ring

  • driconfig: add a workaround for Hades (Vulkan backend)

  • vulkan/wsi/wayland: ensure drm modifiers stored in chain are immutable

  • venus: clang format fixes

  • venus: split up the pipeline fix description into self and pnext

  • venus: refactor to add pipeline info fixes helpers

  • venus: properly ignore formats in VkPipelineRenderingCreateInfo

  • meson/vulkan/util: allow venus to drop compiler deps

  • venus: make tls hint specific to pipeline creation

  • venus: TLS ring

  • venus: clean up secondary ring

  • venus: allow to retrieve pipeline cache on TLS ring

  • venus: populate oom from ring submit alloc failures

  • vulkan/wsi/wayland: fix returns and avoid leaks for failed swapchain

  • venus: fix pipeline layout lifetime

  • venus: fix pipeline derivatives

  • venus: fix to respect the final pipeline layout

Yogesh Mohan Marimuthu (10):

  • winsys/amdgpu: add _dw to max_ib_size variable for code readability

  • winsys/amdgpu: remove ib_type variable from struct amdgpu_ib

  • winsys/amdgpu: rename struct amdgpu_ib main variable as main_ib everywhere

  • winsys/amdgpu: rename ib variable name to chunk_ib

  • winsys/amdgpu: remove rcs variable from struct amdgpu_ib

  • winsys/amdgpu: move 125% comment to correct line of code

  • winsys/amdgpu: rename requested_size_dw to projected_size_dw

  • winsys/amdgpu: rename ptr_ib_size_inside_ib to is_chained_ib

  • winsys/amdgpu: rename big_ib_buffer,ib_mapped variables in struct amdgpu_ib

  • winsys/radeon: remove unused gpu_address variable from struct radeon_cmdbuf

Yonggang Luo (61):

  • compiler: Implement num_mesh_vertices_per_primitive to match u_vertices_per_prim

  • treewide: Merge num_mesh_vertices_per_primitive and u_vertices_per_prim into mesa_vertices_per_prim

  • nir: remove redundant include of gallium headers

  • nir: #include “util/macros.h” for BITFIELD64_MASK in nir.c

  • compiler,vulkan,drm-shim: Remove unused include directories from meson.build

  • nvk: Should use alignment instead of align

  • microsoft/clc: Using sampler_id instead PIPE_MAX_SHADER_SAMPLER_VIEWS for dxil_lower_sample_to_txf_for_integer_tex

  • microsoft/clc: Use 128 instead of PIPE_MAX_SHADER_SAMPLER_VIEWS

  • microsoft: define enum dxil_tex_wrap to avoid the usage of enum pipe_tex_wrap

  • microsoft: decouple microsoft vulkan driver and compiler from gallium

  • dzn: Fixes -Werror=incompatible-pointer-type

  • d3d12,dzn: Simplify the usage of #include <wsl/winadapter.h>

  • util: Fixes note: the alignment of ‘_Atomic long long int’ fields changed in GCC 11.

  • glsl: move glsl_get_gl_type into glsl/linker_util.h

  • meson/win32: There is no need install OpenGL headers on win32

  • intel: Remove unused ALIGN macro

  • clover: Rename function align to align_vector to avoid conflict with global align

  • treewide: Avoid use align as variable, replace it with other names

  • util,vulkan,mesa,compiler: Generate source files with utf8 encoding from mako template

  • intel: Generate source file with utf-8 encoding from mako template

  • zink: Generate source file with utf-8 encoding from mako template

  • docs: Generate document with utf8 encoding

  • v3dv: Use correct type VkStencilOp in function translate_stencil_op

  • broadcom/compiler: Use correct type pipe_logicop for logicop_func in struct v3d_fs_key

  • broadcom/compiler: remove unused blend in v3d_fs_key

  • broadcom: remove unused headers include

  • osmesa: Make osmesa.h compatible with Windows SDK’s GL.h

  • broadcom/(compiler,common): avoid include of gallium headers in header files

  • broadcom/compiler: remove include of gallium headers from meson.build

  • osmesa: Fixes building osmesa.c on windows

  • meson: Support for both packaging and distutils

  • dzn: Remove #if D3D12_SDK_VERSION blocks now that 611 is required

  • ci/msvc: update flex and bison to winflexbison3

  • ci/msvc: Install graphics tools(DirectX debug layer) easy to stuck, place it at the beginning

  • ci/msvc: Split install vulkan sdk out of choco

  • ci/msvc: Rename vs2019 to msvc

  • ci/msvc: Rename vs to msvc for consistence

  • ci/msvc: Improve msvc init

  • ci/msvc: Remove &windows_msvc_image_tag

  • ci/msvc: Upgrade to vs2022 build tools

  • ci/msvc: Install msvc2019 only from vs2022

  • ci/msvc: Install both msvc2019 and msvc2022

  • ci/msvc: Stick deqp-runner to version v0.16.1

  • ci/msvc: Stick VK-GL-CTS to specific version 56114106d860c121cd6ff0c3b926ddc50c4c11fd

  • ci/msvc: Split the install of rust and d3d out of mesa_deps_test.ps1

  • ci/microsoft: Update the image-tag and image-path for msvc2019/msvc2022

  • treewide: Replace the include of nir_types.h with glsl_types.h

  • compiler/glsl: Move glsl specific _mesa_glsl_initialize_types out and glsl_symbol_table of glsl_types.h

  • intel: Avoid use align as variable, replace it with other names

  • intel: Use ALIGN_POT instead of ALIGN inside macro define

  • intel: Cleanup duplicate ALIGN macro defines

  • intel,crocus,iris: Use align64 instead of ALIGN for 64 bit value parameter

  • amd: Use align64 instead of ALIGN for 64 bit value parameter

  • util,compiler: Avoid use align as variable, replace it with other names

  • panfrost: Avoid use align as variable, replace it with other names

  • glsl: Fixes glcpp/tests with mingw/gcc

  • util: Add align_uintptr and use it treewide to replace ALIGN that works on size_t and uintptr_t

  • nvk: Avoid use align as variable, replace it with alignment

  • nouveau: Use align64 instead of ALIGN for 64 bit value parameter

  • etnaviv/drm: Remove redundant ALIGN macro by #include “util/u_math.h”

  • compiler/spirv: The spirv shader is binary, should write in binary mode

Zhang Ning (2):

  • iris: use helper util_resource_at_index

  • lima: Support parameter queries for PIPE_RESOURCE_PARAM_NPLANES

Zhang, Jianxun (5):

  • intel/genxml: Remove 3DSTATE_CLEAR_PARAMS instruction (xe2)

  • intel/genxml: update 3DSTATE_WM_HZ_OP instruction (xe2)

  • intel/genxml: update 3DSTATE_DEPTH_BUFFER instruction (xe2)

  • intel/isl: update 3DSTATE_STENCIL_BUFFER (xe2)

  • intel/genxml: Add RENDER_SURFACE_STATE for xe2

antonino (4):

  • nir: don’t take the derivative of the array index in `nir_lower_tex`

  • vulkan: use instance allocator for `object_name` in some objects

  • nir/zink: drop NIH helper in favor of `mesa_vertices_per_prim`

  • egl: only check dri3 on X11

daoxianggong (1):

  • zink - Fix for blend color change without blend state change

duncan.hopkins (4):

  • util: Update util/libdrm.h stubs to allow loader.c to compile on MacOS.

  • dri: added build dependencies for systems using non-standard prefixed X11 libs.

  • glx: fix automatic zink fallback loading between hw and sw drivers on MacOS

  • vulkan: added build dependencies for systems using non-standard prefixed X11 libs.

i509VCB (3):

  • asahi,docs: add PBE to hardware glossary

  • asahi: create queue for screen

  • agx: remove internal agx_device queue

jphuang (1):

  • dzn: Change dst image layout according to aspect

llyyr (1):

  • docs: document AMD_DEBUG=noefc and useaco

ratatouillegamer (2):

  • hasvk: Add Vulkan API version override

  • hasvk: Enable hasvk override Vulkan API Version for Brawlhalla