Mesa 24.1.0 Release Notes / 2024-05-22

Mesa 24.1.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 24.1.1.

Mesa 24.1.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
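
Because the reported version depends on the driver, an application usually inspects the GL_VERSION string at run time instead of assuming 4.6. The following is a minimal Python sketch of that check, not Mesa code. The helper names parse_gl_version and gl_version_at_least are hypothetical, and the sample strings in the comments only illustrate the general format (a leading major.minor number, with an "OpenGL ES" prefix on ES contexts); the exact text varies by driver.

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) from a GL_VERSION-style string.

    Illustrative inputs (assumed, not taken from a real driver):
      "4.6 (Core Profile) Mesa 24.1.0"
      "OpenGL ES 3.2 Mesa 24.1.0"
    """
    # The first "digits.digits" token is the API version; anything
    # after it is vendor-specific information.
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("no version number in %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def gl_version_at_least(version_string, major, minor):
    """True if the reported version is >= the requested (major, minor)."""
    return parse_gl_version(version_string) >= (major, minor)
```

Tuple comparison keeps the check correct across major versions, so a reported 3.3 is rejected when 4.6 is required even though its minor number is lower than 6 only by coincidence.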

Mesa 24.1.0 implements the Vulkan 1.3 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
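
The apiVersion value is a packed 32-bit integer, so it has to be decoded before it can be compared against 1.3. The sketch below mirrors the bit layout used by the Vulkan VK_MAKE_API_VERSION macro (variant in bits 29-31, major in 22-28, minor in 12-21, patch in 0-11). It is written in plain Python for illustration; the function names are hypothetical and no Vulkan bindings are involved.

```python
def make_api_version(variant, major, minor, patch):
    """Pack a version the same way VK_MAKE_API_VERSION does."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_api_version(api_version):
    """Split a packed apiVersion into (variant, major, minor, patch)."""
    variant = (api_version >> 29) & 0x7
    major = (api_version >> 22) & 0x7F
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return variant, major, minor, patch


def supports_vulkan(api_version, major, minor):
    """True if the device reports at least the requested major.minor."""
    _, dev_major, dev_minor, _ = decode_api_version(api_version)
    return (dev_major, dev_minor) >= (major, minor)
```

For example, a driver that reports the value produced by make_api_version(0, 1, 2, 0) would fail supports_vulkan(value, 1, 3), which is the situation the paragraph above describes for drivers that do not yet expose 1.3.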

SHA256 checksum

b7eac8c79244806b1c276eeeacc329e4a5b31a370804c4b0c7cd16837783f78b  mesa-24.1.0.tar.xz
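
One way to check a downloaded tarball against the checksum above is sketched here with Python's standard hashlib module. The function names are hypothetical, and the file name is assumed to match the one listed above and to sit in the current directory.

```python
import hashlib

# Checksum published above for mesa-24.1.0.tar.xz.
EXPECTED_SHA256 = (
    "b7eac8c79244806b1c276eeeacc329e4a5b31a370804c4b0c7cd16837783f78b"
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large tarballs are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected):
    """True if the file's SHA256 matches the expected hex string."""
    return sha256_of_file(path) == expected.strip().lower()
```

Calling verify_checksum("mesa-24.1.0.tar.xz", EXPECTED_SHA256) returns True only when the file on disk matches the published digest.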

New features

  • VK_EXT_map_memory_placed on RADV, ANV and NVK

  • VK_KHR_shader_subgroup_rotate on RADV, ANV and NVK

  • VK_KHR_load_store_op_none on RADV, ANV, NVK and Turnip

  • VK_KHR_line_rasterization on RADV, ANV, NVK and Turnip

  • VK_KHR_index_type_uint8 on RADV, ANV, NVK and Turnip

  • VK_KHR_shader_expect_assume on all Vulkan drivers

  • VK_KHR_shader_maximal_reconvergence on RADV, ANV and NVK

  • VK_KHR_shader_quad_control on RADV

  • OpenGL 4.6 on Asahi

  • OpenGL ES 3.2 on Asahi

  • Mali G610 and G310 on Panfrost

  • Mali T600 on Panfrost

  • VK_KHR_shader_subgroup_uniform_control_flow on NVK

  • alphaToOne/extendedDynamicState3AlphaToOneEnable on RADV

  • VK_EXT_device_address_binding_report on RADV

  • VK_EXT_external_memory_dma_buf for lavapipe

  • VK_EXT_queue_family_foreign for lavapipe

  • VK_EXT_shader_object on RADV

  • VK_EXT_nested_command_buffer on NVK and RADV

  • VK_EXT_queue_family_foreign on NVK

  • VK_EXT_image_drm_format_modifier on NVK

Bug fixes

  • anv: unbounded shader cache

  • radv: Crash due to nir validation fail in Enshrouded

  • bisected: turnip: deqp regressions

  • android: sRGB configs no longer exist after !27709

  • [24.1-rc4] fatal error: intel/dev/intel_wa.h: No such file or directory

  • vcn: rewinding attached video in Totem causes [mmhub] page fault

  • When using amd gpu deinterlace, tv bt709 properties mapping to 2 chroma

  • ci: switch from CI_JOB_JWT to id_tokens

  • VCN decoding freezes the whole system

  • [RDNA2] [AV1] [VAAPI] hw decoding glitches in Thorium 123.0.6312.133 after https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28960

  • nvk: Tracker issue for gamescope support

  • nvk: Implement VK_EXT_image_drm_format_modifier

  • WSI: Support VK_IMAGE_ASPECT_MEMORY_PLANE_i_BIT_EXT for DRM Modifiers in Vulkan

  • [Regression][Bisected] EGL/Wayland: QT applications terminated by SIGSEGV (Address boundary error) when using dGPU

  • radv: Enshrouded GPU hang on RX 6800

  • NVK Zink: Wrong color in Unigine Valley benchmark

  • panfrost: T604 issue with using u32 for flat varyings

  • [anv] FINISHME: support YUV colorspace with DRM format modifiers

  • gen9/11 test became flaky: piglit.spec.!opengl 1_4.blendminmax

  • 24.0.6: build fails

  • mesa 24 intel A770 KOTOR black shadow smoke scenes

  • [bisected][regression] kitty fails to start due to `glfwWindowHint(GLFW_SRGB_CAPABLE,true)`

  • r600: bisected 5eb0136a3c561 breaks a number of piglits

  • Graphical glitches in RPCS3 after updating Vulkan Intel drivers

  • [R600] OpenGL and VDPAU regression in Mesa 23.3.0 - some bitmaps get distorted.

  • VAAPI radeonsi: VBAQ broken with HEVC

  • radv/video: 10-bit support

  • radv: vkCmdWaitEvents2 is broken

  • Zink: enabled extensions and features may not match

  • glRasterPos: “Assertion `prog->base_serialized_nir’ failed.” if a shader is loaded from the shader cache

  • radv: mesa-9999/src/amd/vulkan/radv_image_view.c:147: radv_set_mutable_tex_desc_fields: Assertion `(plane->surface.u.gfx9.surf_pitch * plane->surface.bpe) % 256 == 0’ failed.

  • ACO doesn’t hide lds_param_load latencies

  • ACO doesn’t form a VMEM clause for image stores in one case on GFX11

  • r600: Valheim hangs CAYMAN gpu (regression/bisected)

  • r600: Artifacts in Oxygen Not Included around air ducts and pipes (regression, bisected)

  • radv: UMR wave parsing format is outdated

  • radv: GetImageMemoryRequirements2 does not look at VkImagePlaneMemoryRequirementsInfo

  • RADV, regression : Objects randomly appear/disappear on Unreal Engine 4 titles using D3D12 backend on Polaris

  • mesa 23.1.0-rc3 flickering textures/lighting in Unreal 4 games Polaris10

  • ACO tests SIGSEGV in debian-vulkan job with LTO enabled

  • radv: Address binding report for images is incorrect.

  • blorp: avoid dirtying push constants in 3D

  • anv: flaky vkd3d-proton test_buffer_feedback_instructions_sm51

  • FTBFS: commit aaccc25a4dd9ccfc134e51a7e81168334d63a909 broke mesa snapshot build

  • d3d12_screen.cpp:60:10: fatal error: ShlObj.h: No such file or directory

  • r300: crash when compiling some GSK shaders

  • anv: vkd3d-proton test_stress_suballocation failure

  • d3d12: Zwift renders with bad textures/lighting

  • nir_opt_remove_phis breaks divergence analysis

  • intel: Require 64KB alignment when using CCS and multiple engines

  • NVK: Misrendering with Civilization 6

  • radv: RDR2 might need zerovram

  • intel-clc build failure, i think?

  • Issues rendering gtk4 window decorations on v3d on Fedora-40/mesa-24.0

  • clc: Failure when linking with llvm+clang 18.1 (-Dshared-llvm=disabled)

  • LLVM-18 build issue

  • vulkan/wsi/x11: VK_SUBOPTIMAL_KHR is never reported by the swapchain

  • Broken vaapi encoding on Radeon RX 6900XT

  • RUSTICL creating a shared reference to mutable static is discouraged and will become a hard error

  • anv: GPU hang on Assassin’s Creed Valhalla while running benchmark

  • nvk: dota 2 crashes after ~5 seconds in game

  • dzn: conflicting defines with DirectX headers 1.613.0

  • VAAPI: Incorrect HEVC block size reported with radeonsi

  • radv: WWE 2K24 has very quirky DCC issues on RDNA2

  • anv: Dirt 5 crashes at tryCreatingPipelineStateFromCache

  • freedreno: remove headergen2

  • vulkan/wsi: crash in dEQP-VK.wsi.wayland.swapchain.simulate_oom.min_image_count

  • Document that Zink on MoltenVK is not expected to work

  • KiCAD 3D Viewer - rounded pads rendered incorrectly (texture mapping or stencil test error)

  • OpenSCAD rendering incorrect and inconsistent on radeonsi

  • intel/fs: regression on MTL with 64bit values in UBO

  • ci: split debian-build-testing?

  • [freedreno] Black background on SuperTux Kart with postmarketOS and Oneplus 6T

  • [radv] Half-Life Alyx renders solid black for reflective surfaces

  • iris: iris_resource_get_handle returns wrong modifier

  • [RX 7900 XTX] Helldivers 2 cause GPU reset

  • radeon: Crash in radeon_bo_can_reclaim_slab

  • regression/bisected: commit 4e3f3c10e14d8778781c81f39ced659b5ec2f148 broke mesa snapshot build

  • RV530 renders improperly at non 4:3 resolutions.

  • anv: new cooperative matrix failures with CTS 1.3.8.0

  • nvk: Missing implementation of VkImageSwapchainCreateInfoKHR and VkBindImageMemorySwapchainInfoKHR

  • mesa > 23.1.9 [opencl,video_cards_nouveau] fails to build due to missing symbol vl_video_buffer_is_format_supported

  • intel/meson: Make intel_stub_gpu work with `meson devenv`

  • Follow-up from “iris: Fix plane indexing and handling on image import”

  • nvk,nak: Implement shaderStorageImageMultisample

  • nvk,nak: Implement VK_KHR_shader_subgroup_uniform_control_flow

  • `[gfxhub0] no-retry page fault` triggered by `AMD_TEST=testdmaperf` on gfx90c APU

  • nvk: glcts hangs

  • v3d: Line rendering broken when smoothing is enabled

  • PowerVR reports minMemoryMapAlignment of 64

  • RADV: GPU crash when setting ‘RADV_DEBUG=allbos’

  • [intel] mesa ftbfs with time_t64

  • d3d12_resource.cpp:307:49: error: no matching function for call to ‘ID3D12Heap::GetDesc()’

  • radv regression between a337a0c8072d0be487e43c2b7b132e003c6d5a5e and 83f741124b66818053b6b1b2f7e42f5217a27004

  • [build failure] [armhf] - error: #error “_TIME_BITS=64 is allowed only with _FILE_OFFSET_BITS=64”

  • R400 should have native support for sin/cos in VS

  • [radv] Crash when VkGraphicsPipelineCreateInfo::flags = ~0u

  • intel: all workarounds disabled with ATS skus

  • vulkan: GPL now broken

  • Gen4 assertion `force_writemask_all’ failed.

  • src/gallium/auxiliary/rtasm/rtasm_x86sse.c:198:10: runtime error: store to misaligned address 0x7fabba0cd011 for type ‘int’, which requires 4 byte alignment

  • [radv] Holographic projection texture glitch in Rage 2

  • RustiCL: Callbacks are not called upon errors

  • MTL: regressions in vulkancts due to BO CCS allocations

  • zink: spec@ext_external_objects@vk-image-overwrite fail

  • vaapi: radeonsi: surface_region.{x,y} is not honored in processing when source is RGB

  • nvk: Implement VK_EXT_shader_object

  • nvk: Implement VK_EXT_graphics_pipeline_library

  • turnip: UBWC disabled for MSAA

  • KHR-Single-GL46.arrays_of_arrays_gl.AtomicUsage fails on MTL

  • GTF-GL46.gtf42.GL3Tests.texture_storage.texture_storage_texture_as_framebuffer_attachment fails on MTL

  • nvk: Implement VK_KHR_maintenance5

  • [intel][anv][build][regression] - genX_grl.h:27:10: fatal error: grl/grl_cl_kernel.h: No such file or directory

  • RX 6600 VDPAU not recognizing HEVC_MAIN_10 correctly

  • Running an app on another AMD GPU (offload, DRI_PRIME) produces corrupted frames on Wayland.

  • regression in radeonsi since 9aa205668bcbf701f8f694551c284cd8e4cc17a3 (crashes in vbo_save_playback_vertex_list)

  • clang/libclc related Mesa build failures

  • Ninja Install Error

  • anv: add a dri config to enable implicit fencing on external memory interop

  • VDPAU declares a texture as “immutable” without also setting its ImmutableLevels attribute.

  • Segfault in glsl_to_nir.cpp nir_visitor::visit when assigning interface block

  • [rusticl]WARNING: Project targets ‘>= 1.1.0’ but uses feature deprecated since ‘1.0.0’: module rust has been stabilized. drop “unstable-” prefix from the module name

  • RX6600 hardware HEVC video decode fails for VDPAU but works for VA-API. (Can lock up GPU!)

  • Rusticl panics when getting program build logs using opencl.hpp

  • ue5 game issues lighting Rog Ally 7080u (z1e)

  • Missing textures in RoboCop: Rogue City with mesh shaders enabled

  • Intel/anv: Allow pre-compiled shader caches to be reused across multiple devices

  • radv: Multiview PSO forgets to export layer in some cases.

  • -Dintel-rt=enabled fails to build on 32-bit

  • intel: build failures

  • regression/bisected commit 4de62731f4db56360026cbb6a3b8566f86f22466 broke HW acceleration in the Google Chrome

  • i386 intel build failure: meson.build:45:6: ERROR: Unknown variable “prog_intel_clc”.

  • rusticl: clEnqueueFillBuffer (among others) fails on buffers created from GL object.

  • MTL raytracing regression

  • [ANV/DG2] Unexpectedly slow replay of RenderDoc frame capture of Resident Evil 4 Remake

  • zink: flickering artifacts in Selaco

  • [ADL] gpu hang on dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics

  • Turnip spam on non-turnip devices

  • Intermittent compiler failures when building valhall tests

  • panfrost: graphical artifacts on T604 (T600)

  • Dying Light native artifacts on Intel A770

  • r300: Amnesia: The Dark Descent heavy corruption

  • [ANV/DG2] Age of Empires IV fullscreen “banding” artefacts

  • [mtl][anv] dEQP-VK.pipeline.monolithic.depth.format.d32_sfloat.compare_ops.* failures when run multithreaded

  • [mtl][anv] flaky tests in pipeline.monolithic.extended_dynamic_state*stencil_state_face* series

  • Broken colors/dual-source blending on PinePhone (Pro) since 23.1.0

  • r600/sfn: “Indexed locks of kcache banks 14 and 15 are ignored” in the ALU clause documentation

  • turnip: Logarithmic-time subgroup reductions using brcst.active and getlast.w8 instructions on a6xx gen4+

  • GTF-GL46.gtf42.GL3Tests.texture_storage.texture_storage_compressed_texture_data regression

  • microsoft/compiler: Missing globally-coherent logic

  • Regression between 23.0.4 and 23.1.0: texture glitches in osgEarth

  • [Broadcom] Warning when running every OpenGL game on Vulkan using ZINK

  • radeonsi unsynchronized flips/tearing with KMS DRM rendering on 780M

  • radeonsi has an unchecked hard dependency on libelf

  • DR crashes with mesa 24 and rusticl (radeonsi)

  • Piglit tests assert on gen9 with zink

  • vlc crashes when playing 1920x1080 video with Radeon RX6600 hardware acceleration and deinterlacing enabled.

  • [radeonsi] Regression: graphical artifacting on water texture in OpenGOAL

  • Assertion when creating dmabuf-compatible VkImage on Tigerlake

  • Palworld fails to launch on Intel Arc unless “force_vk_vendor” is set to “-1”.

  • panfrost: implement line smoothing

  • r300: backend DCE fails in piglit glsl-vs-copy-propagation-1.shader_test

  • [AMDGPU RDNA3] Antialiasing is broken in Blender

  • MTL: vulkan cooperative matrix tests gpu hang on MTL

  • nvk: Implement VK_KHR_zero_initialize_workgroup_memory

  • Assassin’s Creed Odyssey wrong colors on Arc A770

  • VAAPI: EFC on VCN2 produces broken H264 video and crashes the HEVC encoder

  • etnaviv, modesetting, and glxgears

  • The Finals fails to launch with DX12 on Intel Arc unless “force_vk_vendor” is set to -1.

  • nvk: `VK_KHR_zero_initialize_workgroup_memory` and `VK_KHR_shader_subgroup_extended_types` not marked as complete in features.txt

  • nvk: Implement variableMultisampleRate

  • VA-API CI tests freeze

  • radv: games render with garbage output on RX5600M through PRIME with DCC

  • Warning when use ALIGN over uint64_t and uintptr_t

Changes

Adrian Perez de Castro (1):

  • Revert “egl/wayland: Remove EGL_WL_create_wayland_buffer_from_image”

Agate, Jesse (6):

  • amd/vpelib: Studio Range Handling

  • amd/vpelib: White Screen Fix

  • amd/vpelib: VPT Failing Test Cases

  • amd/vpelib: VPE integration for HLG

  • amd/vpelib: Add PQ Norm to VPE interface

  • amd/vpelib: Refactor norm factor logic

Alan Liu (4):

  • radeonsi/vpe: Add environment variable to set embbuf number

  • radeonsi/vpe: Don’t map and unmap emb_buffer every time in process_frame

  • amd/vpelib: remove unused header file

  • radeonsi/vpe: support vpe 1.1

Alejandro Piñeiro (3):

  • broadcom/compiler: fix coverity warning (uninitialized pointer read)

  • v3dv/bo: use mtx_lock/unlock on cache_init too

  • v3dv: expose VK_EXT_depth_clip_enable

Alexandre Marquet (1):

  • pan/mdg: quirk to disable auto32

Alyssa Rosenzweig (328):

  • nir/lower_ssbo: rewrite

  • nir/lower_blend: return progress

  • nir/lower_io_to_temporaries: return prog

  • nir/lower_clip_cull_distance_arrays: return prog

  • nir: return prog from drawpixels

  • nir/lower_bitmap: return prog

  • nir/lower_alpha_test: rewrite with intrinsics_pass

  • nir/lower_point_size_mov: return prog

  • nir/lower_passthrough_edgeflags: return progress

  • nir/lower_io_arrays_to_elements: return prog

  • nir/lower_flatshade: fix metadata

  • glsl: return progress in point size linking

  • glsl: don’t use NIR_PASS_V

  • glsl: fix metadata in gl_nir_zero_initialize_clip_distance

  • mesa/st: return progress in st_nir_lower_wpos_ytransform

  • mesa/st: use instructions_pass for plane lowering

  • mesa/st: return progress lowering builtins

  • mesa/st: don’t use NIR_PASS_V

  • agx: rm deadcode

  • agx: ingest undefs in the backend

  • agx: stop lowering in opt loop

  • agx: only lower vars to ssa once

  • agx: fix metadata in layer lowering

  • agx: unset silly nir opts

  • agx: return progress from passes

  • asahi: return progress from passes

  • asahi: avoid silly internal NIR_PASS in gs lowering

  • asahi: don’t use NIR_PASS for removing entrypoints

  • asahi: don’t use NIR_PASS_V

  • nir/passthrough_gs: plug leak

  • compiler,gallium: move u_decomposed_prim to common

  • nir/passthrough_gs: flesh out gs_in_prim

  • compiler: add a vs.tes_agx bit

  • asahi: add more uapi stubs

  • asahi: gut macOS related code

  • asahi: lower poly stipple

  • asahi: Implement skeleton for tessellation

  • asahi: fix metadata for images with VS lowered to GS

  • asahi: implement VBO robustness

  • asahi: implement reset queries

  • asahi: enable robustness

  • asahi: fix unbound ssbos

  • asahi: optimize more when linking libagx

  • asahi: decode uniform_high records

  • agx: implement load_subgroup_invocation

  • agx: lower more subgroups

  • agx: introduce ballot pseudo

  • agx: fuse ballot+cmp

  • nir: add active_subgroup_invocation_agx sysval

  • agx: implement active_subgroup_invocation_agx

  • agx: optimize first_invocation

  • agx: optimize vote_eq

  • asahi: fix prim restart unrolling with indirects

  • asahi: delete bogus assertion

  • asahi: plug passthrough tcs leak

  • asahi: rework meta shader infra

  • asahi: plug geometry shader leaks

  • asahi: plug pre-gs leak

  • asahi: plug early_serialized_nir leak

  • asahi: plug so target leak

  • asahi: plug glsl type leak

  • asahi: plug geometry heap leak

  • asahi: fix UB in qbo’s

  • agx: add some more bitop tests

  • Revert “asahi: don’t canonicalize nans/flush denorms when copying”

  • asahi: sync with query mismatches

  • asahi: enable tcs caching

  • asahi: don’t sync for uninitialized buffer

  • asahi: fix valid buffer tracking for SSBO/image/XFB

  • asahi: handle read-only SSBOs

  • asahi: honour discard_whole | persistent

  • agx: only run early tests if needed

  • docs/asahi: fix strided linear note

  • ail: add tests for linear<->twiddled copies

  • ail: port tiling routines to c++

  • ail: use template for tiled memcpy

  • agx: don’t inline imms into stack_store

  • agx: optimize b2x(inot)

  • agx: reassociate bcsel with ior/iand

  • asahi: implement pipeline stats as a checkbox

  • asahi: log geometry shaders separate from xfb

  • asahi: don’t use util_resource_size

  • asahi: fix vbo dirty track

  • asahi: force inline ppp update logic

  • asahi: skip set if tested

  • asahi: rm dead

  • asahi: track bit count, not word count

  • asahi: enable compblit behind dbg flag

  • asahi: allow disk cache with compblit

  • asahi: assert invariant

  • asahi: drop silly else

  • asahi: rewrite queries

  • asahi: split up stage uniform upload

  • asahi: dirty track stage uniforms

  • asahi: collapse stage uniform upload

  • asahi: optimize “no changes” case

  • asahi: optimize no changes descriptor case

  • asahi: move some code into dirty tracking

  • asahi: drop any_draws

  • asahi: fix instance count with indirect draw

  • asahi: collapse if

  • asahi: hoist xfb code

  • asahi: hoist layer id code

  • asahi: rm blank

  • asahi: track batches with incoherent writes

  • asahi: optimize memory_barrier

  • asahi,agx: use intrinsics pass

  • agx: clamp register file based on workgroup size

  • agx: improve scratch size accounting

  • asahi: add has_scratch to shader key

  • agx: set nr_preamble_gprs for preamble scratch

  • asahi: allocate preamble scratch

  • agx: allow 16-bit immediate on stack load/store

  • agx: print register vectors

  • agx: introduce “memory variables”

  • agx: add spill/fill lowering pass

  • agx: unit test memory parallel copies

  • agx: unit test spill/fill lowering

  • agx: add parallel copy printing

  • agx: add =spill debug option

  • asahi: bump max threads per wg

  • asahi: drop xfb hack

  • asahi: allow vertex/geom/tess side effects

  • agx: fix buffer overflow with varying slots

  • asahi,agx: use hw clip distance

  • asahi: fix dirty tracking issue

  • asahi: rip out existing MDI+GS implementation

  • libagx: fix buggy align macro

  • asahi: make GS flatshade_first more dynamic

  • libagx: use native static_assert on host

  • libagx: use real PACKED macro

  • libagx: static assert some sizes

  • libagx: generalize vertex_id_for_topology

  • asahi: simplify IA mode handling

  • asahi: add shader_info::outputs for gs lower

  • asahi: add geometry parameters for separable GS

  • asahi: rework shader stage handling a bit

  • asahi: separate GS from VS

  • asahi: rm arrayed output lowering

  • asahi: allow bindful GS textures

  • asahi: shrink GS key

  • asahi: infer stage in descriptor update

  • asahi: be a bit more methodical with shader stages

  • nir: rm load_vert_id_in_prim_agx

  • asahi: allow lowering bindings after lowering textures

  • asahi: collapse indirection with GS

  • asahi: support stage override in sysval lower

  • asahi: set gs_grid[0] even for direct draws

  • asahi: use load_instance_id in gs lowering

  • asahi: fix vertex out size calc

  • asahi: invert geometry shaders

  • asahi: implement GS disk caching

  • asahi: rm dead

  • asahi: simplify expressions involving xfb

  • asahi: avoid silly psiz writes even with gs

  • asahi: eliminate tri fan %

  • asahi: make provoking vertex dynamic

  • asahi: make gs topology dynamic

  • asahi: support GS in shaderdb

  • asahi: always support ARB_clip_control

  • asahi: make clip_halfz dynamic

  • asahi: rm ia key

  • agx: remove discard -> zs_emit lower

  • agx: rm dead sample count argument

  • agx: call agx_nir_lower_sample_mask earlier

  • agx: rm unused backend nr_samples

  • agx: rm unused opt_ixor_bcsel

  • agx: sink wait_pix

  • asahi: Implement ARB_texture_barrier by decompression

  • asahi: quelch gcc warning

  • agx: rm ridiculous dependency

  • agx: decouple compiler from genxml

  • agx: use #pragma once

  • asahi/lib: use #pragma once

  • ail: use #pragma once

  • asahi: use #pragma once

  • asahi: clean up format table renderability

  • asahi: split out genxml/ directory

  • agx: move SSBO lowering

  • agx: call texture lowering in the driver

  • agx: move texture lowering into lib

  • agx: decouple from libagx

  • asahi: reorder compiler before clc

  • asahi: precompile helper program

  • agx: add “is helper program?” key bit

  • asahi: advertise GL4.6 and ES3.2

  • docs: update for GL4.6 and ES3.2 on asahi

  • vulkan: add vk_index_type_to_restart helper

  • tu: use vk_index_to_restart

  • anv,hasvk: use vk_index_to_restart

  • util/hash_table: add u64 foreach macro

  • util/ralloc: add memdup

  • treewide: use ralloc_memdup

  • panfrost: Add a library to build CSF command streams

  • panfrost: Add support for the CSF job frontend

  • nir/opt_shrink_vectors: hoist alu helpers

  • nir/opt_shrink_vectors: shrink some intrinsics from start

  • util: add _mesa_hash_table_u64_num_entries

  • nir/print: do not print empty lists on intrinsics

  • util/hash_table: add DERIVE macro

  • panfrost: derive ht

  • asahi: derive ht

  • nvk: derive ht

  • radeonsi: derive ht

  • v3d: derive ht

  • glsl_types: derive ht

  • asahi: bump maximum samplers for Blender

  • asahi: allow more samplers for shaderdb

  • asahi: move more code out of agx_preprocess_nir

  • asahi/lib: fix overread with stateful

  • asahi: fix overread with samplers

  • asahi: clarify how unroll index buffers are offsetted

  • asahi: zero more in the unroll path

  • asahi: fix unit mismatch with unroll path

  • asahi: fix stage accounting for meta compute shaders

  • asahi: export build_meta_shader

  • asahi: add flush_query_writers helper

  • asahi: add helper to classify queries

  • asahi: accelerate QBO copies

  • asahi: fix depth bias interactions with points/lines

  • asahi: implement CDM stream linking for GS

  • asahi: be robust against tess batch changes

  • asahi: stop merging VS and TCS

  • asahi: drop TCS key

  • asahi: drop asahi_vs_next_stage

  • libagx: improve static assert message

  • asahi/clc: fix mem leaks

  • agx/opt_cse: alloc less

  • agx: fix stack smash with spilling

  • agx: fix allocating phi sources past the reg file

  • agx: add more asserts

  • agx: add num_successors helper

  • agx: fix 16-bit mem swaps

  • agx: scalarize vector phis

  • agx: allow vector phis to pass validation

  • agx: assert phis don’t have .kill set

  • agx: fix bogus implicit cast with 2d msaa arrays

  • agx: sink harder

  • agx: implement live range splits of phis

  • agx: don’t leak shuffle copies

  • agx: add more iterator macros

  • agx: add temp_like helper

  • agx: add before_function cursor

  • agx: add limit for max sources per non-phi

  • agx: coalesce phi webs

  • agx: try to coalesce moves

  • agx: drop scratch regs for spilling

  • agx: validate phi sources for consistency

  • agx: add SSA reindexing pass

  • agx: add SSA repair pass

  • agx: add Braun-Hack spiller pass

  • agx: switch to Braun-Hack spiller

  • agx: use dense reg_to_ssa map

  • agx: make add_successor public

  • agx: add helpers for multiblock unit tests

  • agx: add tests for SSA repair

  • agx: move spill/fills accounting to shaderdb

  • agx: enable indirect temps

  • agx: generalize remat code

  • agx: implement get_sr remat

  • asahi: use less bindless samplers

  • agx: add more shaderdb stats

  • agx: fix lowering uniforms with abs/neg

  • agx: restrict high uniforms with textures

  • agx: extract “accepts uniform?” ISA query

  • agx: model 64-bit uniform restriction on ALU

  • agx: extract agx_is_float_src

  • agx: promote constants to uniforms

  • agx: compact 32-bit constants

  • agx: test constant compaction

  • agx: implement load_subgroup_id

  • libagx: polyfill glsl ballot()

  • libagx: accelerate restart unroll across a subgroup

  • libagx: accelerate prim restart unroll across wg

  • libagx: deal with silly NIR

  • libagx: parallelize prefix sum over 1024 threads

  • agx: use funop short form

  • agx: split select opt into its own pass

  • agx: vectorize uniform_store

  • agx: start a crude cycle model

  • agx/opt_preamble: improve preamble cost function

  • agx/opt_preamble: restrain ourselves

  • agx/opt_preamble: preamble cycle estimates

  • agx/opt_preamble: improve rewrite cost est

  • docs/asahi: document UVS

  • nir: add offset to load_coefficients_agx

  • nir: add intrinsics for lowered VS outputs

  • asahi: add agx_push_packed

  • asahi: drop =varyings debug

  • asahi: extract agx_cf_binding

  • agx: explicitly assign coeff registers

  • agx: pack indirect CF

  • agx: handle indirect varyings

  • asahi: advertise indirect fs inputs

  • agx: rm unnecessary iter hack

  • agx: pack indirect st_vary

  • agx: inline imm into st_vary

  • asahi: rewrite varying linking

  • asahi: drop now-empty base key

  • asahi: make point size replacement dynamic

  • asahi: stop using GLSL indirect lowering

  • agx/lower_vbo: dce as we go

  • asahi: drop dead linked_so code

  • asahi: use ht derive more

  • asahi: fix _packed USC structs

  • asahi: delete layer id code

  • asahi: don’t set writes_memory for tib spilling

  • agx: optimize out wait_pix in some cases

  • agx: inline sampler states

  • agx: always reserve sampler #0 for txf

  • asahi: fix bit sizes in point sprite lower

  • nir: add samples_log2_agx sysval

  • nir: add export/load_exported_agx intrinsics

  • agx: wire up samples_log2 sr

  • agx: generalize preloaded cache

  • agx: implement exports

  • agx: document non-monolithic ABI

  • asahi: add agx_usc_push_packed helper

  • asahi: constify agx_build_tilebuffer_layout

  • asahi: don’t allocate tib space for gaps

  • nir: add intrinsics for non-monolithic agx shaders

  • agx: drop shader stage assertion

  • asahi: static assert blend key size

  • agx: add agx_shader_part data structure

  • agx: add main_size info

  • asahi: add fast linker

  • asahi/clc: stop padding binaries

  • asahi: switch to VS/FS prolog/epilog system

Amber (3):

  • tu: wideLines support for a7xx.

  • tu: Add MESA_VK_DYNAMIC_RS_LINE_WIDTH to tu_rast_state.

  • tu: re-emit vertex buffer on MESA_VK_DYNAMIC_VI_BINDINGS_VALID dirty.

Amber Harmonia (1):

  • freedreno/common: Fix register stomper ranges for A7XX

Andres Calderon Jaramillo (1):

  • radeonsi: get enc/dec caps from kernel only on amdgpu

Antoine Coutant (2):

  • clc: retrieve libclang path at runtime.

  • drisw: fix build without dri3

Anton Bambura (2):

  • panfrost: Enable Mali-T600

  • docs/panfrost: Document Mali-T600 support

Antonio Gomes (6):

  • mesa/st: Skip querying PCI values in interop_query_device_info if version >= 4

  • rusticl/gl: Bump mesa_glinterop_device_info to version 4

  • gallium: Add new PIPE_CAP_CL_GL_SHARING

  • iris: Set PIPE_CAP_CL_GL_SHARING to true

  • radeonsi: Set PIPE_CAP_CL_GL_SHARING to true

  • rusticl/device: Verify for PIPE_CAP_CL_GL_SHARING when enabling gl_sharing

Arthur Huillet (1):

  • nvk: remove useless MME scratch 26 usage

Asahi Lina (17):

  • asahi: libagx: introduce AGX_STATIC_ASSERT

  • agx: Rename some SRs

  • nir: Add AGX-specific helper opcodes

  • agx: Hook up AGX helper NIR intrinsics

  • agx: Hook up helper intrinsics into CL

  • agx: Add scaffolding to build the helper shader at device init

  • agx: compiler: Add fence_helper_exit_agx barrier

  • agx: compiler: Export scratch size to the driver

  • agx: compiler: Enable stack_adjust

  • asahi: libagx: Move PACKED and GLOBAL macros to libagx.h

  • asahi: cmdbuf: Fix scratch bucket offset/size

  • asahi: Implement scratch allocation

  • asahi: scratch: Add feature to debug core IDs

  • asahi: Hook up scratch

  • asahi: Allocate scratch for shaders

  • asahi: Enable scratch debugging

  • asahi: batch: Trace before waiting for syncobj

Assadian, Navid (2):

  • amd/vpelib: Apply inverse gamut remap to background

  • amd/vpelib: Use uint64 for buffer size

Axel Davy (5):

  • frontend/nine: Fix ff ps key

  • frontend/nine: Fix programmable vs check

  • frontend/nine: Fix missing light flag check

  • frontend/nine: Fix destruction race

  • frontend/nine: Reset should EndScene

Bas Nieuwenhuizen (10):

  • util/disk_cache: Add marker on cache usage.

  • radv: Remove ray_launch_size_addr_amd system value.

  • radv: Add winsys argument to buffer map/unmap.

  • radv/winsys: Use radv_buffer_map wrapper.

  • radv/amdgpu: Use mmap directly.

  • radv: Support for mapping a buffer at a fixed address.

  • radv: Implement reserving the VA range on unmap.

  • radv: Expose VK_EXT_map_memory_placed.

  • radv: Fix differing aspect masks for multiplane image copies.

  • radv: Use zerovram for Enshrouded.

Benjamin Lee (14):

  • nak: support predicate swaps on SM50

  • nak: support predicate sel on SM50

  • nak: fix frnd on SM50

  • nak: implement FSWZADD on SM50

  • nak: implement FLO on SM50

  • nak: fix iabs on SM50 with an explicit i2i op

  • nak: implement rro op on SM50

  • nak: use rro when emitting mufu on SM50

  • nak: implement kill op on SM50

  • nak: implement cs2r op on SM50

  • nak: handle nop ops from NAK IR on SM50

  • nak: fix lod mode encoding for SM50 tld op

  • nak: fix tex offset encoding on SM50

  • nvk: disable shaderResourceMinLod on pre-sm70

Benjamin Tissoires (3):

  • CI: add mr-label-maker.yml config

  • .mr-label-maker.yml: fix wrong label

  • CI: add a test for checking the validity of .mr-label-maker.yml

Biju Das (1):

  • gallium: Add Renesas rzg2l-du DRM entry point

Blisto (1):

  • driconf: set vk_x11_strict_image_count for Atlas Fallen Vulkan

Bob Beckett (2):

  • panfrost: Add an entry for panthor in the renderonly_drivers[] array

  • panfrost: Add the gallium glue to get panfrost loaded when panthor is detected

Boris Brezillon (193):

  • panvk: Fix tracing

  • panvk: Fix access to uninitialized panvk_pipeline_layout::num_sets field

  • panfrost: Kill unused forward declarations in pan_texture.h

  • panfrost: Add a per-gen panfrost_format_from_pipe_format() helper

  • panfrost: Add a per-gen panfrost_blendable_format_from_pipe_format() helper

  • panfrost: Make panfrost_format_to_bifrost_blend() a per-gen helper

  • panfrost: Add panfrost_[blendable]_format_table() helpers

  • panfrost: Move panfrost_is_yuv() to pan_format.h

  • panfrost: Move YUV-debugging out of panfrost_new_texture()

  • panfrost: Stop passing a panfrost_device to panfrost_new_texture()

  • panfrost: Don’t pass a panfrost_device to panfrost_format_supports_afbc()

  • panfrost: Don’t pass a panfrost_device to panfrost_afbc_can_tile()

  • panfrost: Stop passing a panfrost_device to pan_blend_get_internal_desc()

  • panfrost: Stop exposing pan_blend_create_shader()

  • panfrost: Stop passing a panfrost_device to pan_blend_create_shader()

  • panfrost: Stop passing a panfrost_device to pan_inline_rt_conversion()

  • panfrost: Make the pan_blend logic panfrost_device-agnostic

  • panfrost: Get rid of unused panfrost_device arguments in pan_blitter.c

  • panfrost: Pass the tile buffer budget through pan_fb_info

  • panfrost: Pass the sample position array through pan_fb_info

  • panfrost: Pass no_hierarchical_tiling info through pan_tiler_context

  • panfrost: Pass tiler heap info through pan_tiler_context

  • panvk: Inline pan_wls_mem_size()

  • panfrost: Make pan_desc.{c,h} panfrost_device agnostic

  • panfrost: Drop unused panfrost_device forward declaration in pan_shader.h

  • panfrost: Make pan_layout.c panfrost_device agnostic

  • panfrost: Make pan_sample.c panfrost_device agnostic

  • panfrost: Make pan_encoder.h panfrost_device agnostic

  • panfrost: Remove unused header inclusions from pan_blitter.h

  • panfrost: Make pan_blitter.h includable from non per-gen files

  • panfrost: Make pan_blitter.{c,h} panfrost_device agnostic

  • panfrost: Make pan_indirect_dispatch panfrost_device agnostic

  • panfrost: Make pan_pool.h panfrost_{device,bo} agnostic

  • panfrost: Make pan_props.c panfrost_device agnostic

  • panfrost: Make pan_texture.{c,h} panfrost_bo agnostic

  • panfrost: Make pan_desc.{c,h} panfrost_bo agnostic

  • panfrost: Remove unneeded pan_device.h inclusions

  • panfrost: Make panfrost_texfeatures.c panfrost_device agnostic

  • panfrost: Make pan_perf panfrost_device agnostic

  • panfrost: Add a helper to retrieve a panfrost_bo from a pan_kmod_bo

  • panvk: Get rid of unused pdev arguments passed to some meta helpers

  • panvk: Stop passing panfrost_device around in internal meta helpers

  • panvk: Store various physical device properties at the physical_device level

  • panvk: Use vk_device::drm_fd instead of going back to the physical device

  • panvk: Move panfrost_device and panvk_meta to panvk_device

  • panvk: Add a decode context at the panvk_device level

  • panvk: Instantiate our own blitter/blend_shader caches

  • panvk: Add pan_kmod_{vm,dev} objects to panvk_device

  • panvk: Add the concept of private BO

  • panvk: Transition panvk_pool to panvk_priv_bo

  • panvk: Transition panvk_descriptor_set to panvk_priv_bo

  • panvk: Transition panvk_pipeline to panvk_priv_bo

  • panvk: Transition panvk_{image,buffer}_view to panvk_priv_bo

  • panvk: Track blit src/dst using pan_kmod_bo objects

  • panvk: Keep a ref to a pan_kmod_bo in panvk_image

  • panvk: Keep a ref to a pan_kmod_bo in panvk_buffer

  • panvk: Keep tiler_heap and sample_positions BOs at the panvk_device level

  • panvk: Move away from panfrost_{bo,device}

  • panfrost: Move pan_{bo,device}.{c,h} to the gallium driver dir

  • panfrost: Clamp the render area to the damage region

  • panfrost: v4 doesn’t have Blend descriptors

  • panfrost: Pad compute jobs with zeros on v4

  • pan/va: Add missing valhall_enums dep to valhall_disasm

  • pan/kmod: Fix typo in pan_kmod_vm_op_check() helper

  • pan/kmod: Add a PAN_KMOD_VM_FLAG_TRACK_ACTIVITY flag

  • pan/kmod: Reject pre 1.1 panfrost kernel drivers

  • panfrost: Rework the way we compute thread info

  • panfrost: Prepare support for GPU variants

  • pan/perf: Reject panthor kernel driver

  • drm-uapi: Add panthor uAPI

  • pan/kmod: Add a backend for panthor

  • panfrost: Add v10 support to libpanfrost

  • pan/genxml: Various CS related improvements in v10.xml

  • pan/decode: Introduce the concept of usermode queue

  • panfrost: Don’t allocate a tiler heap buffer on v10+

  • pan/genxml: Make sure pan_pack() evaluates ‘dst’ only once

  • panfrost: Relax position result alignment constraint on v10+

  • panfrost: Add arch-specific context init/cleanup hooks

  • panfrost: Add a panfrost_context_reinit() helper

  • panfrost: Add a cleanup_batch() method to panfrost_vtable

  • panfrost: Enable v10 in the gallium driver

  • panfrost: Advertize G610 support

  • panfrost: Advertize G310 support

  • panfrost: Update the release note to mention G310/G610 addition

  • vk/meta: Add the PUSH_DESCRIPTOR_BIT flag when creating blit pipeline layouts

  • vk/meta: Fix base_type selection in build_{clear,blit}_shader()

  • panvk: Fix call ordering in panvk_DestroyDevice()

  • panvk: clang-format the source files

  • panvk: Kill the panvk_pack_color() prototype

  • panvk: Add VKAPI_{ATTR,CALL} specifiers to all panvk-specific entrypoints

  • panvk: Do not handle invalid NULL memory object in BindImageMemory2()

  • panvk: Get rid of unused panvk_image_get_plane_size() helper

  • panvk: Get rid of the custom device lost handling

  • panvk: Fix allocation scope of command buffer sub-objects

  • panvk: Add missing util_dynarray_init() in panvk_cmd_open_batch()

  • panvk: Don’t open-code panvk_cmd_open_batch() in CmdBeginRenderPass2()

  • panvk: Don’t allocate a TEXTURE descriptor in CreateImageView()

  • panvk: s/panvk_event_op/panvk_cmd_event_op/

  • panvk: Allocate descriptor set arrays using vk_multialloc_zalloc()

  • panvk: Don’t pass a device where we don’t need one

  • panvk: Get rid of unused panvk_cmd_buffer fields

  • panvk: Kill panvk_{Create,Destroy}SamplerYcbcrConversion()

  • panvk: Drop panvk_framebuffer

  • panvk: Get rid of panvk_pipeline_cache

  • panvk: Make panvk_buffer_view inherit from vk_buffer_view

  • panvk: Make panvk_device_memory inherit from vk_device_memory

  • panvk: Make pan_AllocateMemory() robust to errors

  • panvk: Add extra checks to panvk_MapMemory()

  • panvk: Implement {Map,Unmap}Memory2KHR

  • panvk: Make panvk_sampler inherit from vk_sampler

  • panvk: Fix GetPhysicalDeviceProperties2() to report accurate info

  • panvk: Get rid of fields we already have in vk_xxx objects

  • panvk: Disable global offset on varying and non-VS attribute descriptors

  • panfrost: Move the image attribute offset adjustment to a NIR pass

  • panvk: Implement dynamic rendering entry points

  • nir: Extend nir_get_io_offset_src_number() to support load_push_constant

  • nir: Extend nir_lower_mem_access_bit_sizes() to support push constants

  • pan/bi: Lower push constant accesses

  • pan/bi: Lower load_push_constant with dynamic indexing

  • pan/bi: Update the push constant count when emitting load_push_constant

  • panvk: Move some macros to panvk_macros.h

  • panvk: Move image related definitions to panvk_image.{h,c}

  • panvk: Move the VkBuffer logic to its own source file

  • panvk: Move the VkBufferView logic to its own file

  • panvk: Move the VkDeviceMemory logic to panvk_device_memory.{c,h}

  • panvk: Move the VkSampler logic to its own file

  • panvk: Move panvk_pipeline definition to panvk_pipeline.h

  • panvk: Move VkImageView logic to its own source files

  • panvk: Move the VkEvent logic to panvk_event.{c,h}

  • panvk: Move panvk_descriptor_{set,pool} definitions to panvk_descriptor_set.h

  • panvk: Move VkDescriptorSetLayout logic to panvk_descriptor_set_layout.{c,h}

  • panvk: Move VkPipelineLayout logic to its own file

  • panvk: Move shader related definitions to panvk_[vX_]shader.{c,h}

  • panvk: Kill panvk_[vX_]cs.{c,h}

  • panvk: Move panvk_{draw,dispatch}_info definitions to panvk_vX_cmd_buffer.c

  • panvk: Move the VkCommandPool logic to panvk_cmd_pool.{c,h}

  • panvk: Move VkQueue logic to panvk_[vX_]queue.{c,h}

  • panvk: Add a panvk_arch_dispatch_ret() variant

  • panvk: Make the device creation/destruction per-arch

  • panvk: Move the VkInstance logic to panvk_instance.{c,h}

  • panvk: Move the VkPhysicalDevice logic to panvk_physical_device.{c,h}

  • panvk: Move panvk_meta definitions to panvk_meta.h

  • panvk: Move panvk_device definition to panvk_device.h

  • panvk: Move the panvk_cmd_buffer definitions in panvk_cmd_buffer.h

  • panvk: Move the panvk_priv_bo logic to panvk_priv_bo.{c,h}

  • panvk: Move panvk_wsi definitions to panvk_wsi.h

  • panvk: Kill panvk_private.h

  • panvk: Make panvk_buffer_view per-gen

  • panvk: Make panvk_image_view per-gen

  • panvk: Make panvk_sampler per-gen

  • panvk: Make panvk_cmd_buffer per-gen

  • panvk: Make panvk_shader per-gen

  • panvk: Make panvk_descriptor_set per-gen

  • panvk: Make panvk_descriptor_set_layout per-gen

  • panvk: Make panvk_pipeline per-gen

  • panvk: Make panvk_queue per-gen

  • panvk: Make panvk_pipeline_layout per-gen

  • panvk: Fix attach-less rendering

  • panvk: Fix the colorAttachmentCount check in begin_rendering_init_fbinfo()

  • pan/bi: Support fragment store_output() with a non-zero offset

  • panvk: Don’t assume VkGraphicsPipelineCreateInfo::pColorBlendState != NULL

  • pan/bi: Allow subpass sampler dims

  • panvk: Fix input attachment support

  • panvk: Fill pan_tls_info::wls::instances

  • panvk: Make sure the sample_pattern is set in the tiler descriptor

  • panvk: We don’t support resolve operations yet

  • pan/bi: Extend bi_emit_texc() to support wider direct tex/sampler idx

  • panvk: Don’t assume pViewportState != NULL

  • panvk: Fix img2buf copies with image X offset not aligned on 16 pixels

  • panvk: Fix has_non_vs_attribute() test in panvk_draw_prepare_vs_attribs()

  • panvk: Make sure we pick a valid wrap_mode_r value for unnormalizedCoordinates

  • panvk: Fix depth/stencil image views

  • panvk: Make sure we have a decode context created when we need one

  • panvk: Don’t advertize vertex_buffer cap on sRGB formats

  • panvk: Swizzle the border color on v7 when the format is BGR

  • panvk: Re-order things in panvk_physical_device_init()

  • panvk: Fill maxCustomBorderColorSamplers

  • panvk: Skip tiler jobs when the vertex shader doesn’t write the position

  • panvk: Make sure we use the proper format for views of depth+stencil images

  • panvk: Abort on fault when PANVK_DEBUG=sync

  • panvk/ci: Make sure we catch GPU faults

  • panvk/ci: Enable dEQP-VK.pipeline.monolithic.*

  • panvk: Add support for KHR_push_descriptor

  • panvk/ci: Re-enable copy_and_blit tests

  • panvk: Stop declaring one push constant array per graphics stage

  • panvk: Pass the push constant array to draw/dispatch calls

  • panvk: Stop lowering push constant loads to UBO loads

  • panvk: Dissociate UBO and push_constant emission

  • nir/lower_blend: Fix nir_blend_logicop() for 8/16-bit integer formats

  • panfrost: do not write outside num_wg_sysval

  • panfrost: Add the BO containing fragment program descriptor to the batch

  • pan/kmod: Fix a syncobj leak in the panthor backend

  • pan/kmod: Make default allocator thread-safe

Boyuan Zhang (5):

  • radeonsi/vcn: only use multi slices reflist when available

  • meson: bump the minimal required vdpau version to 1.4

  • ac/gpu_info: Add vcn dec and enc version query

  • radeonsi/vcn: choose rc_per_pic by encode version

  • radeonsi/vcn: mark rc_per_pic as obsoleted

Błażej Szczygieł (2):

  • gallivm/ssbo: replace run time loop by compile time loop

  • gallivm/ssbo: mask offset with exec_mask instead of building the ‘if’

Caio Oliveira (268):

  • intel/compiler/xe2: Implement instruction compaction for DPAS.

  • intel/compiler: Add couple of tests for fs_combine_constants

  • intel/compiler: Fix rebuilding the CFG in fs_combine_constants

  • intel: Use an intel enum for cmat scope

  • intel/compiler: Enable lower_rotate_to_shuffle in subgroup lowering

  • anv: Advertise VK_KHR_shader_subgroup_rotate

  • iris: Remove unused brw_* includes

  • iris: Remove prototypes for unsupported Gfx versions

  • iris: Remove unused parameter

  • iris: Call blorp_finish() when destroying context

  • crocus: Call blorp_finish() when destroying context

  • intel/compiler: Rename brw_image_param to isl_image_param

  • intel/compiler: Rename BRW_WM_MSAA_* enums to INTEL_MSAA_*

  • intel/compiler: Rename BRW_TESS_* enums to INTEL_TESS_*

  • intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_*

  • intel/compiler: Rename brw_vue_map to intel_vue_map

  • intel/compiler: Rename brw_cs_dispatch_info to intel_cs_dispatch_info

  • intel/compiler: Move disassemble functions to own header file

  • intel/compiler: Include brw_disasm_info.h where it's used

  • intel/compiler: Merge intel_disasm.[ch] into corresponding brw files

  • intel: Rename i965_{asm,disasm} tools to brw_{asm,disasm}

  • intel/blorp: Don’t require specific prog_data type in callback

  • intel/blorp: Remove brw_ prefix when not applicable

  • intel/blorp: Simplify blorp_compile_fs() interface

  • intel/blorp: Simplify blorp_compile_cs() interface

  • intel/blorp: Use a struct to return blorp_compile_*() results

  • intel/blorp: Remove outdated reference in comment

  • intel/blorp: Move brw_blorp_get_urb_length helper

  • intel/blorp: Avoid brw types in blorp_priv.h

  • intel/blorp: Move brw_compiler.h include to where it is needed

  • intel/blorp: Use a Meson dependency for blorp

  • intel: Add missing dependencies on blorp

  • intel/decoder: Move decoder to a separate module

  • intel/compiler: Collect NIR-only passes in intel_nir.h

  • intel/compiler: Rename the passes and files related to intel_nir.h

  • intel/compiler: Rename brw_gfx_ver_enum.h to intel_gfx_ver_enum.h

  • intel: Remove brw_ prefix from process debug function

  • intel/isl: Include compiler generic header

  • anv: Remove lower_atomics from storage image lowering opts

  • iris: Remove no-ops from storage image lowering

  • intel/compiler: Use “intel” prefix for walk_order enum

  • iris: Add stage to iris_compiled_shader

  • iris: Don’t use prog_data to guard 3DSTATE_CONSTANT_* code

  • iris: Reduce dependency on brw_*_prog_data structs

  • iris: Take ownership of prog_data when applying it

  • iris: Use uint32_t instead of brw_param_builtin

  • iris: Move compiler creation to iris_program.c

  • iris: Add IRIS_MAX_* constants to replace BRW_MAX_* usage

  • iris: Add helper to access use_tcs_multi_patch

  • iris: Add helper for indirect_ubos_use_sampler

  • iris: Move iris_get_compiler_options to iris_program.c

  • iris: Include brw_compiler.h only when needed

  • intel/meson: Remove usage of meson.source_root and meson.build_root

  • intel/meson: Fix warning about broken str.format

  • intel/elk: Fork Gfx8- compiler by copying existing code

  • intel/elk: Compile ELK library, tests and tools

  • intel/elk: Remove compiler specific devinfo hash

  • intel/elk: Remove a bunch of files that don’t apply for Gfx8-

  • intel/elk: Use common code in intel/compiler

  • intel/elk: Remove stages not used in Gfx8-

  • intel/elk: Remove DPAS lowering

  • intel/elk: Rename files to use elk prefix

  • intel/elk: Rename header guards

  • intel/elk: Update doxygen-like file comments

  • intel/elk: Rename C++ namespace

  • intel/elk: Rename symbols

  • intel/elk: Don’t include elk_eu_defines.h in elk_nir.h

  • intel/elk: Create separate header for opcodes

  • intel/blorp: Move brw specific code to a separate file

  • intel/blorp: Explicitly include brw_compiler.h header

  • intel/blorp: Add ELK support

  • intel/blorp: Remove Gfx9+ references in elk code

  • intel/decoder: Add ELK support

  • crocus: Use ELK compiler

  • hasvk: Use ELK compiler

  • iris: Rename screen->compiler to screen->brw

  • iris: Use ELK compiler for Gfx8

  • intel/tools: Add ELK support for aubinator

  • intel/tools: Add ELK support for aubinator_error_decode

  • intel/tools: Add ELK support for intel_hang_replay

  • intel/tools: Add ELK support for aubinator_viewer

  • intel/tools: Add ELK support for intel_hang_viewer

  • intel: Use _brw suffix for genX headers that rely on brw

  • intel/meson: Rename libintel_compiler to libintel_compiler_brw

  • intel/tools: Add extra compiler device sha only for Gfx9+

  • intel/elk: Move nir_options to its own c/h file pair

  • intel-clc: Use correct set of nir_options when building for Gfx8

  • intel/elk: Use anonymous namespace in fs_combine_constants

  • intel/elk: Remove tests for Gfx9+

  • intel/brw: Remove assembler tests for Gfx8-

  • intel/brw: Remove EU compaction tests for Gfx8-

  • intel/brw: Remove EU validation tests for Gfx8-

  • intel/brw: Remove pass test cases for Gfx8-

  • intel/brw: Assert Gfx9+

  • intel/compiler: Remove has_render_target_reads from wm_prog_data

  • intel/brw: Remove Gfx8- passes from optimize()

  • intel/brw: Pull opt_copy_propagation out of fs_visitor

  • intel/brw: Pull opt_cmod_propagation out of fs_visitor

  • intel/brw: Pull opt_saturate_propagation out of fs_visitor

  • intel/brw: Pull dead_code_eliminate out of fs_visitor

  • intel/brw: Pull opt_combine_constants out of fs_visitor

  • intel/brw: Pull opt_cse out of fs_visitor

  • intel/brw: Pull bank_conflicts out of fs_visitor

  • intel/brw: Pull peephole_sel out of fs_visitor

  • intel/brw: Pull redundant_halt out of fs_visitor

  • intel/brw: Pull opt_algebraic out of fs_visitor

  • intel/brw: Pull split/compact virtual_grf opts out of fs_visitor

  • intel/brw: Pull opt_split_sends out of fs_visitor

  • intel/brw: Pull opt_zero_samples out of fs_visitor

  • intel/brw: Pull eliminate_find_live_channel out of fs_visitor

  • intel/brw: Pull remove_extra_rounding_modes out of fs_visitor

  • intel/brw: Pull register_coalesce out of fs_visitor

  • intel/brw: Pull lower_constant_loads out of fs_visitor

  • intel/brw: Pull lower_pack out of fs_visitor

  • intel/brw: Pull lower_simd_width out of fs_visitor

  • intel/brw: Pull lower_barycentrics out of fs_visitor

  • intel/brw: Pull lower_logical_sends out of fs_visitor

  • intel/brw: Pull fixup_nomask_control_flow out of fs_visitor

  • intel/brw: Pull lower_integer_multiplication out of fs_visitor

  • intel/brw: Pull lower_sub_sat out of fs_visitor

  • intel/brw: Pull lower_derivatives out of fs_visitor

  • intel/brw: Pull lower_regioning out of fs_visitor

  • intel/brw: Pull fixup_sends_duplicate_payload out of fs_visitor

  • intel/brw: Pull lower_uniform_pull_constant_loads out of fs_visitor

  • intel/brw: Pull lower_find_live_channel out of fs_visitor

  • intel/brw: Pull lower_load_payload out of fs_visitor

  • intel/brw: Use references for a couple of backend_shader passes

  • intel/brw: Simplify OPT macro usage in fs_visitor::optimize

  • intel/brw: Pull fixup_3src_null_dest out of fs_visitor

  • intel/brw: Pull emit_dummy_memory_fence_before_eot out of fs_visitor

  • intel/brw: Pull emit_dummy_mov_instruction out of fs_visitor

  • intel/brw: Pull lower_scoreboard out of fs_visitor

  • intel/brw: Pull optimize() out of fs_visitor

  • intel/brw: Move optimize and small optimizations to brw_fs_opt.cpp

  • intel/brw: Move virtual GRF opts into their own file

  • intel/brw: Move fs algebraic to its own file

  • intel/brw: Move small lowering passes into brw_fs_lower.cpp

  • intel/brw: Move lower_integer_multiplication to its own file

  • intel/brw: Expose flag_mask/bit_mask fs helpers

  • intel/brw: Move lower_simd_width to its own file

  • intel/brw: Move workarounds to a separate file

  • intel/blorp: Remove Gfx8- references in BRW code

  • intel/brw: Move brw_compile_* functions out of vec4-specific files

  • intel/brw: Move type_size_* functions out of vec4-specific file

  • intel/brw: Always use scalar shaders

  • intel/brw: Remove vec4 backend

  • intel/brw: Remove now unused vec4-only opcodes

  • intel/brw: Remove unused legacy shader stages

  • intel/brw: Remove Gfx8- code from disassembler

  • intel/brw: Remove Gfx8- code from assembler

  • intel/brw: Remove Gfx8- code from brw_compile_* functions

  • intel/brw: Remove Gfx8- code from scheduler

  • intel/brw: Remove Gfx8- code from register allocator

  • intel/brw: Remove Gfx8- code from thread payload

  • intel/brw: Remove Gfx8- code from NIR conversion

  • intel/brw: Remove Gfx8- code from lower storage image pass

  • intel/brw: Remove Gfx8- code from lower logical sends

  • intel/brw: Remove Gfx8- code from generator

  • intel/brw: Remove Gfx8- code from backend passes

  • intel/brw: Remove Gfx8- code from EU compaction

  • intel/brw: Remove Gfx8- code from IR performance analysis

  • intel/brw: Remove Gfx8- code from EU emission

  • intel/brw: Remove Gfx8- code from EU validation

  • intel/brw: Remove Gfx8- code from NIR passes

  • intel/brw: Remove Gfx4-5 manual compression selection

  • intel/brw: Remove Gfx8- code from EU codegen helpers

  • intel/brw: Remove Gfx8- code from NIR options

  • intel/brw: Remove Gfx8- code from register type helpers

  • intel/brw: Remove Gfx8- specific EU inst helpers

  • intel/brw: Remove Gfx8- code from inst FC and F macros

  • intel/brw: Replace inst F8 macro with F macro

  • intel/brw: Remove Gfx8- code from inst F20 macros

  • intel/brw: Remove Gfx8- code from inst FD20 and FV20 macros

  • intel/brw: Remove Gfx8- code from inst FI macros

  • intel/brw: Remove Gfx8- code from inst BRW_IA*_ADDR_IMM macros

  • intel/brw: Remove Gfx8- code from inst FFDC, FDC and FD macros

  • intel/brw: Update comments for FK macro

  • intel/brw: Replace inst FF macro with F or F20 macros

  • intel/brw: Remove F16TO32 and F32TO16 opcodes

  • intel/brw: Remove Gfx8- code from builder

  • intel/brw: Remove Gfx8- code from fs_inst

  • intel/brw: Remove Gfx8- code from VUE map

  • intel/brw: Remove Gfx8- code from SIMD lowering

  • intel/brw: Remove Gfx8- code from visitor

  • intel/brw: Remove Gfx8- remaining opcodes

  • intel/brw: Remove MRF type

  • intel/brw: Inline brw_nir_apply_sampler_key code

  • intel/brw: Remove unused attrib workarounds

  • intel/brw: Remove edgeflag_is_last VS parameter

  • intel/brw: Remove Gfx8- fields from *_prog_key structs

  • intel/brw: Remove Gfx8- fields from *_prog_data structs

  • intel/brw: Use a single register set

  • intel/brw: Remove runtime_check_aads_emit

  • intel/brw: Remove automatic_exec_sizes

  • intel/brw: Use fs_visitor instead of backend_shader in various passes

  • intel/brw: Fold fs_instruction_scheduler into instruction_scheduler

  • intel/brw: Change cfg_t to refer to fs_visitor

  • intel/brw: Move dump_* functions into fs_visitor

  • intel/brw: Fold backend_shader into fs_visitor

  • intel/brw: Remove extra stage_prog_data field in fs_visitor

  • intel/brw: Remove brw_shader.h

  • intel/meson: Add dependencies for brw and elk

  • intel/compiler: Remove nir_print_instr hack in disasm_info

  • intel/brw: Use C++ for brw_disasm_info.c

  • intel/brw: Hide the definition of cfg_t et al from C code

  • intel/brw: Use fs_inst in cfg_t

  • intel/brw: Use fs_inst explicitly in various passes

  • intel/brw: Use fs_inst in disasm_annotate()

  • intel/brw: Move functions from backend_instruction into fs_inst

  • intel/brw: Fold backend_instruction into fs_inst

  • intel/brw: Remove typedefs from fs_builder

  • intel/brw: Fold backend_reg into fs_reg

  • intel/brw: Simplify usage of reg immediate helpers

  • intel/compiler: Fix SIMD lowering when instruction needs a larger SIMD

  • intel/elk: Remove split sends

  • intel/elk: Remove DPAS opcode

  • intel/elk: Remove BTD and RT opcodes

  • intel/elk: Remove DP4A opcode

  • intel/elk: Remove ROR and ROL opcodes

  • intel/elk: Remove IADD3 opcode

  • intel/elk: Remove EU compaction logic for Gfx9+

  • intel/elk: Remove encoding for Gfx9+

  • intel/elk: Remove SYNC opcode and SWSB annotations

  • intel/elk: Remove Gfx12 SFIDs and related LSC code

  • intel/elk: Remove Gfx9+ sampler messages and modes

  • intel/elk: Rename symbols for A64 OWord Block R/W messages

  • intel/elk: Remove Gfx9+ dataport messages

  • intel/elk: Remove FB_READ opcodes

  • intel/elk: Remove Gfx12.5 URB message

  • intel/elk: Remove ex_desc and ex_mlen from elk_inst

  • intel/elk: Remove Xe2 logical sends lowering

  • intel/elk: Remove unused sources from ELK_SHADER_OPCODE_SEND

  • intel/elk: Remove unused SEND features

  • intel/elk: Remove validation code for Gfx9+

  • intel/elk: Remove Gfx9+ from nir conversion

  • intel/elk: Remove Gfx9+ from compile/run functions

  • intel/elk: Remove FB_WRITE_LOGICAL_SRC_SRC_STENCIL

  • intel/elk: Remove Gfx9+ from passes

  • intel/elk: Remove Gfx9+ from thread payload

  • intel/elk: Remove Gfx9+ from EU emission

  • intel/elk: Remove coarse pixel handling

  • intel/elk: Remove Gfx9+ from FS generator

  • intel/elk: Remove Gfx9+ from Reg related code

  • intel/elk: Remove Gfx9+ from asm grammar

  • intel/elk: Remove Gfx9+ from disasm

  • intel/elk: Remove Gfx9+ from NIR auxiliary code

  • intel/elk: Remove use_tcs_multi_patch

  • intel/elk: Remove Gfx9+-only passes

  • intel/elk: Remove uses of intel_device_info_is_9lp()

  • intel/elk: Remove remaining Gfx9+ code

  • intel/elk: Remove multi-polygon support

  • intel/elk: Clean up unused code in elk_compiler.h

  • intel/brw: Use hstride instead of stride for accumulator

  • intel/brw: Use helper to create accumulator register

  • intel/brw: Fix validation of accumulator register

  • anv: Enable VK_KHR_shader_maximal_reconvergence

  • intel/tools: Make intel_stub_gpu work when using meson devenv

  • intel/brw: Implement quad_vote_any and quad_vote_all

  • intel/brw: Use predicates for quad_vote_any and quad_vote_all when available

  • anv: Enable VK_KHR_shader_quad_control

  • intel/brw: Handle Xe2 in brw_fs_opt_zero_samples

  • intel/brw: Remove vestiges of sources on IF opcode, only valid on Gfx6

  • intel/brw: Add a src array for the common case in fs_inst

  • intel/brw: Refactor FS validation macros

  • intel/brw: Remove two duplicated validate calls in optimizer

  • intel/brw: Move validate out of fs_visitor

  • intel/brw: Support FIXED_GRF when generating code for CLUSTER_BROADCAST

  • intel/brw: Lower VGRFs to FIXED_GRFs earlier

Casey Bowman (1):

  • anv: Override VendorID for Hitman 3

Charlie Turner (2):

  • amd, radeonsi: Lower minimum supported video dimensions for AV1

  • {vulkan,radv,anv}/video: fix issue in H264 scaling lists derivation

Chia-I Wu (7):

  • radv: fix pipeline stats mask

  • meson: fix a build error

  • radv: hide the sparse queue when radv_legacy_sparse_binding

  • radv: hide the sparse queue on older kernels

  • radv: set VK_SYNC_FEATURE_GPU_MULTI_WAIT

  • aco: fix nir_op_pack_32_4x8 handling

  • radv: fix 2d/3d image copy on compute queue

Chris Rankin (4):

  • vdpau: Declare texture object as immutable using helper function.

  • vdpau: Refactor query for video surface formats.

  • meson: bump the minimal required vdpau version to 1.5

  • frontends/vdpau: Add support for VDPAU AV1 decoding.

Christian Duerr (1):

  • panfrost: Fix dual-source blending

Christian Gmeiner (100):

  • .gitignore: Add .venv folder

  • etnaviv/isa: Add missing dep of encode.py/decode.py calls on isa.py

  • isaspec: encode.py: Include assert.h

  • isaspec: encode.py: Include util/log.h

  • etnaviv: Remove no_oneconst_limit from etna_inst

  • isaspec: encode: Constify encode.type

  • isaspec: encode: Constify bitset_params

  • etnaviv: Remove not used etna_assemble_set_imm(..)

  • etnaviv: Fix how we determine the max supported number of varyings

  • etnaviv: isa: Remove duplicate #instruction-alu-atomic

  • etnaviv: isa: Add dsx and dsy opcodes

  • etnaviv: isa: Add frc opcode

  • etnaviv: isa: Add norm_dp2, norm_dp3 and norm_dp4 opcodes

  • etnaviv: isa: Add bit_extract opcode

  • etnaviv: isa: Correct dp2 opcode

  • etnaviv: isa: Add branch_any opcode

  • etnaviv: isa: Name cond enum value 22

  • etnaviv: isa: Add movai opcode

  • etnaviv: isa: Add bit_rev opcode

  • etnaviv: isa: Add texldb opcode

  • etnaviv: isa: Add texldl opcode

  • etnaviv: isa: Add texldd opcode

  • etnaviv: isa: Remove note about GC3000

  • etnaviv: isa: Add div opcode

  • etnaviv: isa: Reorder instructions

  • etnaviv: isa: Rename reg_group u2 to u

  • etnaviv: isa: Add internal register group

  • etnaviv: isa: Add movar opcode

  • etnaviv: isa: Move {TEX_SWIZ}

  • etnaviv: isa: Correct SRC0_AMODE

  • etnaviv: isa: Correct #instruction-cf-src1-src2 bitset name

  • etnaviv: isa: Correct #instruction-alu-no-dst-maybe-src1-src2 name

  • etnaviv: isa: Correct #instruction-alu-no-dst-has-src0-src1 expr name

  • etnaviv: isa: Combine branch and branch_if

  • etnaviv: isa: Support unary branch instruction

  • etnaviv: isa: Support unary texkill instruction

  • etnaviv: isa: Support multiple encodings for texldl

  • etnaviv: isa: Fix #instruction-tex-src0-src1-src2 bitset

  • etnaviv: isa: Support multiple encodings for texldb

  • isaspec: Remove not used isa_decode_hook

  • isaspec: decode: Hide all the internal ISA details

  • isaspec: decode: Add isa specific functions

  • isaspec: decode: Make isa_decode_bitset(..) private

  • freedreno/isa: Rework meson dependency for libir3decode

  • etnaviv: isa: Rework meson dependency for libetnaviv_decode

  • isaspec: decode: Make isa_bitset arrays static

  • isaspec: decode: Make isa_decode_field(..) private

  • isaspec: decode: Add libisaspec

  • isaspec: decode: Remove generic functions from public interface

  • etnaviv: isa: Define a dontcare bit in atomic instructions

  • etnaviv: isa: Add name attributes

  • etnaviv: isa: Generate c header containing enums

  • etnaviv: isa: Generate opcode enum

  • etnaviv: isa: Add an empty libetnaviv_encode

  • etnaviv: Link against libetnaviv_encode

  • etnaviv: Move struct etna_inst to src/etnaviv

  • etnaviv: isa: Make use of generated enums

  • etnaviv: isa: Add rounding to etna_inst

  • etnaviv: Set dst.use for MOVAR

  • etnaviv: isa: Add encode support

  • etnaviv: isa: Add isa_assemble_instruction(..)

  • etnaviv: Switch to isa_assemble_instruction(..)

  • etnaviv: Move swizzle related macros to src/etnaviv

  • etnaviv: Switch to macros from isa.h

  • etnaviv: Remove isa.xml.h

  • etnaviv: Do not set tex.amode for rounding

  • ci/etnaviv: Remove duplicates

  • ci/etnaviv: Do not skip tex-miplevel piglits

  • etnaviv: Remove offline shader compiler

  • etnaviv: Introduce common etna_core_info

  • etnaviv: drm: Make use of etna_core_info

  • etnaviv: drm: Add etna_gpu_get_core_info(..)

  • etnaviv: Switch to etna_core_info

  • etnaviv: Move hw header to common place

  • etnaviv: Introduce etna_feature enum

  • etnaviv: common: Add feature bitset

  • etnaviv: drm: Initialize etna_core_info based on kernel features

  • etnaviv: Switch to etna_core APIs

  • etnaviv: drm: Query some id values in etna_gpu_new(..)

  • etnaviv: hwdb: Import gc_feature_database from NXP

  • etnaviv: hwdb: Import gc_feature_database from Amlogic

  • etnaviv: hwdb: Import gc_feature_database from ST

  • ci: Install python3-pycparser in build container

  • etnaviv: hwdb: Generate hwdb.h

  • etnaviv: hwdb: Add etna_query_feature_db(..)

  • etnaviv: drm: Make use of hwdb

  • etnaviv: common: Add enum etna_core_type

  • etnaviv: common: Add some limit values

  • etnaviv: hwdb: Fill limits

  • etnaviv: drm: Fill limits

  • etnaviv: Copy values from etna_core_info

  • etnaviv: drm: Remove fallback value for ETNA_GPU_NUM_CONSTANTS

  • etnaviv: Drop not needed check if seamless cube map is supported

  • etnaviv: hwdb: Drop stdint.h dependency

  • nvk: Remove duplicate DRM_NODE_RENDER check

  • meson: Add missing newline at eof

  • etnaviv: Switch to etna_core_disable_feature(..)

  • etnaviv: Fix disabling of features

  • etnaviv: drm: Drop NPU-related params

  • clc: Always use spir for 32 bit

Collabora’s Gfx CI Team (4):

  • Uprev Piglit to e9316bcd12544aaf7e753ce37fe50d64165d9598

  • Uprev Piglit to 2a1c49a81cd9a6bf5d0c3a9b87225be94771ca96

  • Uprev Piglit to 1e631479c0b477006dd7561c55e06269d2878d8d

  • Uprev Piglit to dd6f7eaf82e8dd442da28b346c236141cbcce0b1

Connor Abbott (56):

  • freedreno: Add a7xx crashdump-related registers and enums

  • ir3/ra: Add specialized shared register RA/spilling

  • ir3: Set branchstack earlier

  • ir3: Rewrite (jp) and branchstack handling

  • ir3: Calculate physical edges correctly

  • ir3: Fix comment thinko

  • ir3/ra: Fix bug with collect source handling

  • tu: Add more info to ldg inline uniform path

  • ir3/a7xx: Fix load_global_ir3 with immediate offset

  • ir3: Initial support for pushing globals with ldg.k

  • tu: Follow pipeline compatibility rules for dynamic descriptors

  • tu: Reenable MSAA UBWC on a6xx gen1

  • tu: Enable UBWC for SNORM formats on a740+

  • tu: Enable UBWC for storage images on a7xx

  • vk/graphics_state: Remove bogus assert in CmdSetSampleMaskEXT

  • vk/graphics_state: Add stubs required by VK_EXT_shader_objects

  • freedreno/afuc: Decode (peek) modifier

  • freedreno/afuc: Add missing ALU encode case for bic

  • freedreno/afuc: Bump max instructions for a7xx

  • freedreno/afuc: Fix setbit/clrbit parsing

  • freedreno/afuc: Use left recursion in parser

  • freedreno/afuc: Improve jump table handling

  • freedreno/afuc: Add .align directive

  • freedreno/afuc: Add more general T_IDENTIFIER in lexer

  • freedreno/afuc: Add support for multiple sections when assembling

  • freedreno/afuc: Allow -e option on a7xx

  • freedreno/afuc: Emulate THREAD_SYNC on a660

  • freedreno/afuc: Run entire bootstrap routine

  • freedreno/afuc: Add a7xx test case

  • freedreno/afuc: Add magic control reg values for a740

  • freedreno/afuc: Add section on reassembling firmwares and relocations

  • freedreno/a7xx: Add CP_CCHE_INVALIDATE

  • tu: Implement CCHE invalidation

  • nir/divergence_analysis: Add ir3-specific intrinsics

  • nir/divergence_analysis: Add uniform_load_tears option

  • nir/divergence_analysis: Fix load_view_index divergence in VS

  • ir3: Allow single-predecessor phis

  • ir3: Run divergence analysis at the end

  • ir3: Remove loop shared copy check

  • ir3: Use divergence analysis for (jp) and physical CFG

  • freedreno/afuc: Switch to using the GPU ID in the firmware

  • freedreno/afuc: Add a7xx new-style branch instructions

  • freedreno/afuc: Add initial support for a750

  • freedreno: Make has_ibo_ubwc a7xx specific

  • freedreno,tu: Disable UBWC for storage images on a750

  • ir3/legalize: Fix intra-block state propagation with loops

  • ir3: Rewrite nop insertion

  • docs/android: Fix example meson cross file

  • docs/android: Improve instructions for replacing driver

  • ir3: Don’t use non-contiguous component masks for FS

  • ir3: Don’t pack FS inlocs

  • freedreno/a7xx: Register updates from kgsl

  • ir3: Add scan_clusters_macro to ir3_valid_flags()

  • ir3: Add scan_clusters.macro to is_subgroup_cond_mov_macro()

  • ir3/ra: Don’t demote movmsk instructions to non-shared

  • docs/android: Fix example meson cross file again

Constantine Shablia (6):

  • panvk: implement vkGetBufferDeviceAddress

  • panvk: advertise bufferDeviceAddress

  • vulkan/runtime: fix typo

  • mesa: fix typo

  • pan/bi: fix 1D array tex coord lowering

  • panfrost: report correct MAX_VARYINGS

Corentin Noël (37):

  • zink: Avoid the use of negative array offsets

  • zink: Use memmove when dealing with overlapping memory

  • glsl: Make sure to not cast ir_dereference_variable into ir_variable

  • glsl: Make sure that the variable is a ir_variable before unreferencing it

  • zink: Initialize zink_shader_object

  • zink: Initialize zink_bindless_descriptor to zero on creation

  • zink: Initialize pipe_query_result

  • zink: Do not shadow the variable ret

  • zink: Avoid variable shadowing everywhere

  • zink: Only call reapply_color_write if EXT_color_write_enable is available

  • ci_run_n_monitor: Allow the upstream format to not exist

  • zink: use symbolic values instead of 0

  • zink: do not use undefined stage mask if on missing KHR_synchronization2

  • glsl: Ensure that we are dealing with ir_variable and ir_rvalue

  • venus: sync protocol for VK_EXT_attachment_feedback_loop_layout

  • venus: enable VK_EXT_attachment_feedback_loop_layout

  • zink: Return early if the file descriptor could not have been duplicated/acquired

  • ci: Update virglrenderer and crosvm

  • zink: Make wrap_surface return a zink_ctx_surface directly

  • zink: Use an intermediary variable for create_surface

  • zink: Separate the template from the wrapped surface

  • zink: Return early if the source could not have been acquired

  • zink: Move zink_surface_destroy before zink_create_surface

  • zink: Make sure to not leak anything on surface creation failure

  • zink: Change zink_get_surface to return a zink_surface

  • zink: Add error logging on surface creation failure

  • st_pbo/compute: Use the correct structure type when allocating a specialized key

  • zink: Make sure to initialize all the fields of VkMemoryBarrier

  • dri/kopper: Assume a non-null drawable in flush_frontbuffer

  • zink: Removed unused function

  • zink: Removed unused num_texel_buffers member

  • zink: Removed unused push_valid member

  • zink: Remove ctx from zink_gfx_program

  • ci: Change propagated variables into an array

  • ci: Add VK_DRIVER_FILES passthrough from jobs to tests

  • ci: Allow to pass LIBGL_ALWAYS_SOFTWARE to the guest environment

  • ci: Add zink-venus-lvp job

Daniel Almeida (17):

  • nak/sm50: add support for suld

  • nak/sm50: add support for suatom

  • nak/sm50: add support for isberd

  • nak: sm50: add support for OpOut

  • nak: sm50: fadd: ensure src[0] is in a register

  • nak/sm50: legalize: display instruction on panic

  • nak/sm50: add support for brev

  • nak: sm50: fix some legalization issues

  • nak/sm50: add a memstream abstraction

  • nak/sm50: add an annotate debug flag

  • nak/sm50: support annotations through OpAnnotate

  • nak/sm50: sprinkle OpAnnotate in optimization passes

  • meson,ci: Add the paste crate

  • nil: Add the start of a Rust library

  • nil: Rewrite nil_format in rust

  • nil: Re-implement nil_image in Rust

  • nil: Rewrite the TIC code in Rust

Daniel Schürmann (52):

  • aco/insert_exec_mask: unify exec restore code after divergent control flow

  • aco/insert_exec_mask: replace phi for loop restore mask with explicit copies

  • aco/insert_exec_mask: only create loop phis for exec mask if necessary

  • aco: give spiller more room to assign spilled SGPRs to VGPRs

  • spirv: Fix SpvOpExpectKHR

  • vulkan: enable VK_KHR_shader_expect_assume

  • spirv: Update headers and grammar JSON

  • aco/insert_exec_mask: Fix unconditional demote at top-level control flow.

  • aco/insert_exec_mask: tiny refactor

  • aco: always terminate quads if they have been demoted entirely

  • aco/insert_exec_mask: Reduce latency when switching to WQM.

  • spirv: implement SPV_KHR_maximal_reconvergence

  • aco: enable WQM if demote is used with maximal reconvergence

  • radv: enable VK_KHR_shader_maximal_reconvergence

  • spirv: implement SPV_KHR_quad_control

  • radv: enable VK_KHR_shader_quad_control

  • radv: fix initialization of radv_shader_layout->use_dynamic_descriptors

  • aco: rematerialize constants in every basic block during optimizer

  • aco: reorder code and use namespaces in aco_interface.cpp

  • aco/util: small_vec few additions

  • aco: use small_vec as Block::edge_vec for predecessors and successors

  • aco/spill: refactor SSA repairing

  • aco/spill: don’t allocate extra spill_id for phi operands in add_coupling_code()

  • aco/spill: add spills_entry interferences only when necessary

  • aco/spill: refactor adding spilled vars into separate function add_to_spills()

  • aco/spill: keep live-out variables spilled at branch blocks

  • aco/spill: don’t prefer to spill phis at merge blocks

  • aco/spill: add interferences with variables spilled at loop headers

  • aco/spill: avoid re-spilling loop-carried variables in process_block()

  • aco/spill: avoid re-spilling loop-carried variables in add_coupling_code()

  • aco/spill: keep loop-carried variables spilled at loop headers

  • aco/spill: keep loop-carried variables spilled at merge blocks

  • aco/spill: select more loop-carried variables to be spilled

  • aco/spill: keep loop variables spilled during nested loops

  • aco: use instr_class::branch to identify SOPP branches

  • aco: remove SOPP_instruction::block member

  • aco: unify different SALU types into single struct SALU_instruction

  • aco/builder: use accessor functions instead of casting to subtypes

  • aco: change return type of create_instruction() to Instruction*

  • aco: defer instruction size from aco::Format in create_instruction()

  • aco: remove create_instruction() template parameter

  • aco: move create_instruction() to aco_ir.cpp

  • aco/spill: Fix assertion for nested loops

  • aco/spill: pass live_vars to spill_ctx

  • aco/spill: compute live-in variables from live-out

  • aco/spill: maintain valid live vars at any point

  • aco/spill: use live variables instead of next_use_distances in add_coupling_code()

  • aco/spill: gather information about average use distances

  • aco/spill: use average use distances in process_block()

  • aco/spill: use average use distances in init_live_in_vars() for merge blocks

  • aco/spill: use average use distances to spill loop variables

  • aco/ra: fix kill flags after renaming fixed Operands

Daniel Stone (50):

  • egl: Return BAD_CONFIG when robust access unsupported

  • st/dri: Use correct pipe_resource for GL texture image export

  • dri: Redeclare __DRI_IMAGE_FORMAT_* as PIPE_FORMAT_*

  • st/dri2: Remove __DRI_IMAGE_FORMAT conversion

  • st/dri2: Pass pipe_format to driCreateConfigs

  • st/dri2: Use u_format to get config format information

  • util: Add util_format_get_component_shift

  • st/dri: Remove format tables from driCreateConfigs

  • st/dri: Completely remove mesa_format from config setup

  • st/dri: Add transient HAS_ZS() helper

  • st/dri: Rework depth/stencil format selection

  • st/dri: Use pipe_format for Z/S modes

  • st/dri: Check format properties from format helpers

  • st/dri: Store pipe_format in gl_config

  • egl/wayland: Remove format-query fallback

  • st/dri: Reuse stored renderbuffer format

  • st/dri: Reuse stored texture format

  • dri/kopper: Move format -> FourCC translation up a level

  • dri/kopper: Add translations for sRGB formats

  • dri/kopper: Reorder format tables

  • dri/kopper: Flatten pipe_format/DRIImage/FourCC conversion

  • egl/wayland: Query image FourCC for linear copies

  • egl/wayland: s/DRI_IMAGE_FORMAT/pipe_format/g

  • egl/wayland: Add opaque-equivalent FourCCs

  • egl/wayland: Fix EGL_EXT_present_opaque

  • egl/wayland: Use pipe_format to look up configs

  • egl/wayland: Use FourCC to look up wl_buffer support

  • egl/wayland: Add helper to check server format support

  • egl/wayland: Use helper to look up visual

  • egl/wayland: Eliminate double loop for configs

  • egl/wayland: Simplify alternate-format fallback for configs

  • egl/wayland: Remove WL_SHM_* format listings

  • egl/wayland: Use pipe_format for format names

  • egl/wayland: Remove shift/size masks

  • egl: Fail display creation if no EGLConfigs created

  • egl/wayland: Remove check for EGLConfig presence

  • egl/gbm: Remove check for EGLConfig presence

  • egl/x11: Remove check for EGLConfig presence

  • egl/android: Remove check for EGLConfig presence

  • egl/{surfaceless,device}: Remove check for EGLConfig presence

  • egl: Automatically set EGLConfig ID

  • egl: Use pipe_format for pbuffer configs

  • gbm/dri: Query DRIImage for FourCC directly

  • gbm: Remove hardcoded color-channel data

  • egl/android: Remove hard-coded color-channel data

  • egl/x11: Compare config shifts/sizes locally

  • egl: Remove shifts/sizes from dri2_add_config argument

  • st/dri: Use pipe_format from config directly

  • egl/dri: Use pipe_format instead of DRI_IMAGE_FORMAT

  • egl/wayland: Remove EGL_WL_create_wayland_buffer_from_image

Danylo Piliaiev (37):

  • freedreno/replay: Delete all buffers after each submission

  • freedreno/replay: Correctly free iova on msm backend

  • freedreno/replay: Add WSL backend for Windows

  • ir3: Fix “print” meta instruction synchronization

  • ir3: Add fullsync and fullnop ir3 dbg options for over-syncing

  • freedreno/replay: Make meta “print” instruction take any number of regs

  • tu: Do not print anything on systems without Adreno GPU

  • tu/a7xx: Make A7XX_RB_UNKNOWN_8E06 value configurable per-gen

  • tu: Define and set to zero all SP_*_VGPR_CONFIG regs

  • ir3: Add ldg.k instruction

  • tu/a7xx: Correctly set A7XX_HLSQ_UNKNOWN_A9AE.SYSVAL_REGS_COUNT

  • tu/a7xx: Do not preload shaders, HW does it by default

  • tu: Use SS6_INDIRECT consts upload path for 3d blits

  • turnip,ir3/a750: Implement consts loading via preamble

  • tu: Use SS6_INDIRECT for VS params

  • turnip,ir3/a750: Implement inline uniforms via ldg.k

  • tu/a750: Consider vertex attr buff in gmem allocation

  • freedreno,tu: Move varying interp and varying repl modes to xml

  • freedreno/devices: Update magic regs for a7xx

  • tu: Exclude more a7xx regs from stomping

  • tu: Add workaround for D3D11 games accessing UBO out of bounds

  • tu/a7xx: Write even more magic regs to fix rendering issues on Android

  • tu: Do not emit zero-sized fs params

  • freedreno/a7xx: Fix base_align for non-UBWC depth-stencil

  • tu/autotuner: Use CP_EVENT_WRITE7 for submission fence

  • tu: Update prim restart state when we switch from/to indexed draw

  • tu: Fix dynamic state not always being emitted

  • meson: Correctly get sizeof_pointer with cross-compilers

  • freedreno/devices: Do not write to 8E79 on a750, KGSL has it protected

  • freedreno/replay: Use real queueid for submissions and waits

  • freedreno,tu/a7xx: Add PC_TESS_PARAM_SIZE and PC_TESS_FACTOR_SIZE

  • tu: Update RP state depending on pipeline in first RP draw

  • tu: Emit non-draw-state state at the first draw call

  • freedreno/devices: Add A740v3 from Quest 3

  • util/vma: Add function to get max continuous free size

  • freedreno/replay: Allocated maximum available size for cs overriding

  • ir3: Do not set clip/cull mask if no one writes clip/cull

Dario Mylonopoulos (1):

  • llvmpipe: fixed race condition in lp_rast_destroy that causes a crash on windows

Dave Airlie (46):

  • vulkan/video: drop unused function.

  • vulkan/video: rename some of the parameter tracking structs.

  • vulkan/video: start to wrap the video structs for deep copies.

  • vulkan/video: start deep copying the parameters structures

  • vulkan/video: constify the encoding apis.

  • radv/video: refactor sq start/end code to avoid decode hangs.

  • radv: don’t submit empty command buffers on encoder ring.

  • gallivm: fix coroutines with llvm 18

  • gallivm: passing fp16_split_fp64 to fp16 lowering.

  • nvk: allow 3d compressed textures

  • nvk: mem cannot be null in binding buffers/images.

  • zink: use sparse residency for buffers.

  • vulkan: update registry/includes to 1.3.277

  • vulkan/video: add AV1 decode support to common code

  • radv: fix correct padding on uvd

  • radv: init decoder ip block earlier.

  • radv/uvd: uvd kernel checks for full dpb allocation.

  • radv: don’t submit 0 length on UVD either.

  • egl: don’t bind zink under dri2/3

  • glx/dri3: handle zink fallback if loader picks it.

  • loader: handle picking zink for nouveau for certain GPUs.

  • nouveau/winsys: fix bda heap leak.

  • nvk: fix dri options leak.

  • egl/dri2: if zink is preferred from dri3 skip dri2 paths.

  • radv/video: fix filling out decode operations.

  • radv/video: use vcn ip version in more places.

  • radv: rename it_ptr to it_probs_ptr in advance of adding av1

  • radv/video: use proper struct sizes for decoder structs.

  • radv/video: add VK_KHR_video_decode_av1 support.

  • nvk: free leaked cmd_buffer descriptors state.

  • nvk: only unmap heap bos that were mapped

  • nvk: enable a mappable bar heap when rebar is disabled.

  • radv/video: fix h265 decode with unaligned w/h

  • mesa: reorder st context teardown

  • vulkan/video: copy the profile over for h264 encode.

  • radv/video: export unified queue header/tail functions.

  • radv: add direct cs emit for a dword.

  • radv: add encoder queue support pieces and encoder queries.

  • radv/video: add parameter patching calls.

  • radv/video: add initial support for encoding with h264.

  • radv/video: add h265 encode support

  • radv/video: enable video encoding behind perftest flag

  • radv/video: handle encode control parameters better.

  • radv/video: don’t advertise timestamp bits for decode/encode

  • egl/dri2: don’t bind dri2 for zink

  • radv/video/encode: fix quality params on v2 hw.

David (Ming Qiang) Wu (1):

  • frontends/va: make vlVaSyncSurface blocking

David Heidelberg (57):

  • ci/deqp: uprev deqp-runner for Linux too to 0.18.0

  • ci/lima: update expectations, failing tests are being skipped

  • ci: bump kernel to 6.6.12, modularize i915, add Transparent Huge Pages

  • ci: shorter kernel tag, included Vivante NPU patches

  • ci: disable Valve farm in Keywords

  • ci: bump libdrm to 2.4.120

  • ci/VK-GL-CTS: add patches to fix dEQP-VK.glsl.derivate crashes

  • ci: Valve farm (Keywords location) works again

  • meson: upgrade zlib wrap to 1.3.1

  • util: use crc32_z instead of crc32 and bump zlib dep to 1.2.9

  • ci: bump kernel to 6.6.16 + enable X2APIC

  • ci/freedreno: add fail found by new Piglit

  • ci/etnaviv: update expectations

  • ci: temporarily disable Collabora farm

  • ci: enable Collabora farm

  • ci: re-enable Collabora farm after maintenance

  • ci/intel: decompose anv-tgl-test so we can specify custom devices for TGL

  • ci/intel: add acer-cp514-2h-11{30,60}g7-volteer

  • ci/intel: move machine definition to the intel-tgl-skqp job

  • ci/intel: split asus-cx9400-volteer into acer-cp514-2h-11{30,60}g7-volteer

  • drm-shim: Avoid invalid file and time bits combination

  • intel/tools: avoid invalid time and file bits combination

  • ci/deqp: backport Implement support for the EGL_EXT_config_select_group extension GL-CTS patch

  • ci/freedreno: update expectations comment

  • ci/deqp: add EGL patch for correct suite (GLES, not GL)

  • nine: convert licenses block to SPDX

  • nine: fill missing licenses headers and copyrights

  • nine: drop useless and a bit too long line

  • ci: uprev kernel to 6.6.21

  • ci/freedreno: disable workarounds for Adreno 618, 630, and 660

  • ci/freedreno: mark fails resolved by “drm/msm/gem: Add metadata uapi”

  • ci: reduce irrelevant output to a simple list of libraries

  • util: move gen_zipped_file into generic util and rename to gen_zipped_xml_file

  • ci/r300: implement rules for d3d9 testing

  • ci/svga: add missed test and gl-rules include

  • r300: convert to SPDX license block and fix small typos

  • r300: add missing licence to the r300_public.h

  • r300: add missing copyright header

  • docs: we support EGL 1.5 for a long time

  • ci/amd: meld radv-traces into radv-raven-traces

  • ci/amd: drop old PIGLIT_REPLAY_DESCRIPTION_FILE surpassed by PIGLIT_TRACES_FILE

  • frontend/nine: fix typos

  • r600: update licensing to SPDX header

  • r600: add license header to r600_formats.h

  • r600: add license info to the r600_opcodes.h

  • r600: add license information to the sfn_shader_gs.h

  • r600: fix typos

  • ci: disable sona devices, all devices are offline

  • ci/intel: sona device_type is back online

  • ci: temporarily disable Android test builds

  • ci: disable Igalia farm

  • meson: implement split-debug

  • freedreno/ci: move the disabled jobs from include to the main file

  • ci/deqp: correct EGL_EXT_config_select_group detection

  • egl/x11: Move RGBA visuals in the second config selection group

  • winsys/i915: depends on intel_wa.h

  • subprojects: uprev perfetto to v45.0

David Rosca (31):

  • radeonsi/vcn: Fix H264 slice header when encoding I frames

  • frontends/va: Fix updating AV1 rate control parameters

  • radeonsi/vcn: Don’t reinitialize encode session on bitrate/fps change

  • frontends/va: Only set VP9 segmentation fields when segmentation is enabled

  • frontends/va: Separate QP for I/P/B frames

  • radeonsi/vcn: Use temporal_layer_index to select temporal layer

  • radeonsi/vcn: Implement separate QP for I/P/B frames

  • radv/video: Set maxActiveReferencePictures to 16 for H264/5

  • frontends/vdpau: Fix cdef strengths and lr_unit_shift in AV1 decode

  • frontends/vdpau: Support creating VDP_CHROMA_TYPE_420_16 surfaces

  • radv/video: Fix setting slice QP

  • radv/video: Set correct bitstream buffer size

  • radv/video: Set VBV buffer size and level

  • radv/video: Select temporal layer when encoding each frame

  • radv/video: Set maxSublayerCount to 4 for H265

  • radv/video: Avoid resetting rate control every frame

  • radv/video: Implement per picture type min/max QP

  • radv/video: Set correct bit depth and format for 10bit input

  • radv/video: Check encode profiles and bit depth in capabilities query

  • radv/video: Report maxBitrate in encode capabilities

  • radeonsi/vcn: Allocate session buffer in VRAM

  • radeonsi/vcn: Fix 10bit HEVC VPS general_profile_compatibility_flags

  • radeonsi/vcn: Only enable VBAQ with rate control mode

  • frontends/va: Fix AV1 slice_data_offset with multiple slice data buffers

  • Revert “radeonsi/vcn: AV1 skip the redundant bs resize”

  • frontends/va: Only increment slice offset after first slice parameters

  • radeonsi: Update buffer for other planes in si_alloc_resource

  • frontends/va: Store slice types for H264 decode

  • radeonsi/vcn: Ensure DPB has as many buffers as references

  • radeonsi/vcn: Allow duplicate buffers in DPB

  • radeonsi/vcn: Ensure at least one reference for H264 P/B frames

David Stern (1):

  • vulkan/wsi/x11: Explicitly discard errors from xcb_present_pixmap.

David Tobolik (1):

  • rusticl: implement cl_khr_suggested_local_work_size

Derek Foreman (10):

  • egl/wayland: Fix possible buffer leak

  • loader/wayland: Add named queue fallback

  • egl/wayland: Give names to our Wayland event queues

  • vulkan/wsi/wayland: Give names to our Wayland event queues

  • vulkan/wsi/wayland: Remove confusing comment

  • vulkan/wsi/wayland: Adjust presentation id locking

  • vulkan/wsi/wayland: Use wl_display_dispatch_queue_timeout

  • vulkan/wsi/wayland: More descriptive name for swapchain queue

  • vulkan/wsi/wayland: Fix use after free

  • vulkan/wsi/wayland: Remove unused get_min_image_count_for_mode_group

Dmitry Baryshkov (11):

  • freedreno/drm: don’t crash for unsupported devices

  • freedreno/regs: define the wide bus enable bit in DSI_VID_CFG0

  • freedreno/registers: fix generation dependencies

  • freedreno/registers: add missing copyright imports

  • freedreno/registers: inline mdp4_csc group

  • freedreno/registers: fix WB doffsets array in mdp5.xml

  • freedreno/registers: support processing display headers

  • freedreno/registers: limit the rules schema

  • freedreno/registers: drop unsupported features from schema

  • freedreno/rnn: drop headergen2

  • freedreno/rnn: drop custom aprintf function

Dmitry Osipenko (2):

  • virtio/vdrm: Fix lockup in vdrm_host_sync()

  • iris: Use Mesa internal drm-uapi headers

Dylan Baker (8):

  • intel/vulkan: assume() that we don’t use “ISL_NUM_FORMATS”

  • intel/hasvk: assume() we don’t get ISL_NUM_FORMATS

  • meson: drop intel-cl deprecation of ‘false’

  • meson: rework intel-rt option to be a feature

  • meson: Allow building intel-clc for the host if it can be run

  • intel/brw: track last successful pass and leave the loop early

  • nvk: drop meson version check that is always true

  • nouveau: require cbindgen >= 0.25

Echo J (9):

  • nvk: Set ICD version to 1.3

  • nvk: Implement the VR-related display extensions

  • nak: Rip out a few dead_code statements

  • nvk: Add NVK to the Vulkan device name

  • nvk: Advertise VK_VALVE_mutable_descriptor_type

  • nvk: Implement calibrated timestamps

  • vulkan: Add implicit pipeline caching support

  • nvk: Use implicit pipeline cache

  • nvk: Don’t advertise residencyAlignedMipSize on MaxwellB+

Emma Anholt (2):

  • ci: Add full-run xfails missed in the 1.3.7.0 CTS update.

  • ci: Disable VK full runs that time out since 1.3.7.0 (hasvk, anv-tgl, a630)

Emmanuel Vadot (1):

  • util: Allow kcmp on FreeBSD

Eric Engestrom (282):

  • VERSION: bump to 24.1

  • docs: reset new_features.txt

  • docs: update calendar for 24.0.0-rc1

  • ci: make sure we evaluate the python-test rules first

  • docs: fix syntax highlighting on non-code text snippet

  • docs: fix syntax highlighting on shell commands

  • ci/deqp: ensure that in `default` builds, wayland + x11 + xcb are all built

  • zink+anv/ci: add known failures

  • ci: fix job dependency error in MRs for bin/ci/* scripts

  • nouveau/ci: don’t run nouveau (gl) tests on nvk changes

  • amd/ci: simplify deqp config

  • amd/ci: add flakes seen today

  • docs: update calendar for 24.0.0-rc2

  • zink+radv/ci: drop duplicates flakes lines

  • CODEOWNERS: add myself as a person of contact for CI changes

  • CODEOWNERS: remove myself as a person of contact for a few things

  • radv/ci: sort navi21 flakes

  • amd/ci: add flakes seen today

  • amd/ci: consider much more of dEQP-VK.query_pool.statistics_query.host_query_reset.* to be flaky

  • r300/ci: add flakes

  • ci/deqp: backport fix for zlib.net not allowing tarball download anymore

  • rpi3/ci: update piglit & deqp expectations

  • rpi4/ci: skip more of the dEQP-VK.ssbo.phys.layout.* tests that timeout occasionally

  • rpi3/ci: add flake seen today

  • rpi4/ci: add timeouts seen today

  • rpi5/ci: add flake seen today

  • docs: add release notes for 23.3.4

  • docs: update calendar for 23.3.4

  • docs: add sha256sum for 23.3.4

  • docs: update calendar for 24.0.0-rc3

  • ci_run_n_monitor: drop always-true condition

  • ci_run_n_monitor: allow passing multiple targets

  • ci/deqp: fix default target check when target is not specified

  • ci/deqp: simplify version log dump

  • ci/deqp: avoid storing the huge list of vk tests on android builds

  • ci/deqp: move editable part to the top of the file

  • ci/deqp: split vk and gl builds

  • ci/deqp: drop the implicit DEQP_TARGET; explicitly set `default` in VK builds

  • ci/deqp: only compile EGL tests in GL builds, not VK builds

  • ci/deqp: only compile the test binaries that are relevant to the build

  • ci/deqp: only keep the mustpass lists that are relevant to the build

  • ci: bump the image tags to rebuild all the deqp variants

  • Revert “bin/ci: Add GitLab basic token validation”

  • Reapply “bin/ci: Add GitLab basic token validation”

  • util: rename __check_suid() to __normal_user()

  • tree-wide: use __normal_user() everywhere instead of writing the check manually

  • zink+anv/ci: add a couple more flakes

  • util: simplify logic in __normal_user()

  • util: check for setgid() as well in __normal_user()

  • ci: always skip dEQP-VK.info.device_extensions

  • vk/util: fix ‘beta’ check for physical device features

  • vk/util: fix ‘beta’ check for physical device properties

  • ci: when specifying a driver remove all other ones

  • docs: update calendar for 24.0.0

  • docs: add release notes for 24.0.0

  • docs: add sha256sum for 24.0.0

  • docs/release-calendar: add planned 24.0.x bugfix releases

  • docs: add release notes for 23.3.5

  • docs: update calendar for 23.3.5

  • docs: add sha256sum for 23.3.5

  • v3d-rpi4-gl: reduce the parallelism from 10 to 8

  • docs/calendar: add 24.1 branchpoint and release schedule

  • ci: drop dash in image tags dates

  • ci: enforce maximum image tag length

  • ci: reduce maximum image tags length from 30 to 20

  • ci: explain purpose of the word after the date in image tags

  • panfrost: fix UB caused by shifting signed int too far

  • ci_run_n_monitor: avoid spamming a ton of “new status: created” for all the jobs at the beginning

  • ci: build panvk in debian-vulkan job

  • nouveau/tests: fix null dereference

  • ci: build nvk in debian-vulkan job

  • v3dv/ci: test the WSI on rpi4 and rpi5

  • radv: enable VK_EXT_headless_surface on all platforms except Windows

  • v3dv: enable VK_EXT_headless_surface on all platforms except Windows

  • tu: enable VK_EXT_headless_surface on all platforms except Windows

  • anv: enable VK_EXT_headless_surface on all platforms except Windows

  • hasvk: enable VK_EXT_headless_surface on all platforms except Windows

  • dzn: enable VK_EXT_headless_surface on all platforms except Windows

  • nvk: enable VK_EXT_headless_surface on all platforms except Windows

  • panvk: enable VK_EXT_headless_surface on all platforms except Windows

  • vn: enable VK_EXT_headless_surface on all platforms except Windows

  • lvp: enable VK_EXT_headless_surface on all platforms except Windows

  • pvr: enable VK_EXT_headless_surface on all platforms except Windows

  • ci_run_n_monitor: warn user if they forgot to push the branch

  • ci_run_n_monitor: add some types for gitlab objects

  • ci_run_n_monitor: update job when it goes through enable_job()

  • ci_run_n_monitor: add method to get a pipeline job by its id

  • ci_run_n_monitor: track new job when retrying a job

  • ci_run_n_monitor: refresh job state when starting it

  • gitlab_gql: print error returned by server in --print-merged-yaml

  • ci_run_n_monitor: implicitly include `parallel:` jobs

  • ci_run_n_monitor: print the target regex before adding the X/N bit

  • docs: add release notes for 24.0.1

  • docs: add sha256sum for 24.0.1

  • docs: add release notes for 23.3.6

  • docs: update calendar for 23.3.6

  • docs: add sha256sum for 23.3.6

  • docs: update calendar for 24.0.1

  • ci_run_n_monitor: explain why/when there might be no tracked remote

  • ci_run_n_monitor: allow detached heads as well

  • docs: add release notes for 24.0.2

  • docs: add sha256sum for 24.0.2

  • docs: update calendar for 24.0.2

  • ci_run_n_monitor: fix handling of optional jobs again

  • ci_run_n_monitor: read job logs as utf-8

  • vk/util: trivial cleanups in vk_icd_gen.py

  • vk/util: print a nice error in vk_icd_gen.py when VK_HEADER_VERSION is not defined

  • ci/android: use a specific version of android-cuttlefish

  • ci: document which image tags to bump when touching build-mold.sh

  • ci: uprev mold to the latest release

  • ci/image-tags: move KERNEL_ROOTFS_TAG to group the test images together

  • ci/deqp: only apply the android patches to the android build

  • ci/deqp: build deqp-egl using mold as well

  • ci/deqp: make deqp-egl for android less of a special case

  • ci/deqp: control the GL release independently of VK

  • ci/deqp: control the GLES release independently of GL

  • r300/ci: group tex-miplevel-selection flakes together

  • r300/ci: add another tex-miplevel-selection flake

  • iris/ci: add pbuffer flakes for amly, same as apl and glk

  • panfrost/ci: skip dEQP-GLES31.functional.copy_image.non_compressed.* on t760 as they hang

  • rpi3/ci: update expectations for vc4-rpi3-gl-piglit-full:arm32 2/4

  • freedreno/ci: add another a618 flake

  • zink+anv: update expectations

  • r300/ci: add flakes

  • radeonsi/ci: add vangogh piglit flake

  • zink+radv: update navi31 expectations (one test fixed)

  • softpipe: update expectations

  • ci/deqp: drop zlib url patch

  • ci/deqp: split vk/gl/gles patches

  • ci/deqp-runner: inline never-used DEQP_VARIANT variable

  • ci/deqp: use the proper gl/gles releases for deqp-gl*, deqp-gles*, deqp-egl

  • ci/venus-lavapipe: drop unused DEQP_VER that’s being overwritten by DEQP_SUITE anyway

  • ci/lavapipe: fold `DEQP_VER: vk` and drop .deqp-test-vk

  • docs: delay 24.1 branchpoint by 2 weeks

  • vk/update-aliases: drop VK_ERROR_ prefix substitution

  • ci/deqp-runner: do a release build instead of debug

  • ci/deqp-runner: set android rust target in the caller (debian/x86_64_test-android.sh)

  • ci/deqp-runner: bring “install from crate” & “install from git” to feature parity

  • ci/deqp-runner: update repo url

  • ci/deqp-runner: fix list of image tags to update

  • ci/image-tags: re-generate all the images building deqp-runner

  • docs: add release notes for 24.0.3

  • docs: add sha256sum for 24.0.3

  • ci/deqp: document which build produces which binary

  • ci: include all the src/**/gitlab-ci.yml files

  • nouveau: add missing vl lib

  • nouveau/ci: fix yaml indentation

  • nouveau/ci: only trigger jobs for relevant changes

  • Revert “nouveau: add missing vl lib”

  • ci/deqp: backport fix for dEQP-VK.wsi.direct_drm.* bug

  • vc4/ci: add flake

  • radeonsi/ci: update expected failures

  • r300: mark new fails

  • v3dv/ci: update expectations

  • v3d/ci: mark spec@ext_framebuffer_blit@fbo-blit-check-limits as fixed

  • vc4/ci: add another `spec@!opengl 1.1@depthstencil-default_fb-drawpixels` flake

  • vc4/ci: add another `spec@arb_vertex_buffer_object@vbo-subdata-many draw` flake

  • v3dv/ci: mark the `dEQP-VK.wsi.*.maintenance1.deferred_alloc.*` flakes seen so far as happening on all platforms

  • v3dv/ci: add other flakes seen during nightly run

  • ci: fix shader-db job existence condition

  • v3dv/ci: assume dEQP-VK.wsi.wayland.swapchain.simulate_oom.* have been fixed

  • v3dv/ci: add more flakes

  • v3dv/ci: assume list of dEQP-VK.wsi.*.maintenance1.present_modes.* flakes is the same between xcb & xlib and between rpi4 & rpi5

  • ci: enable MESA_VK_ABORT_ON_DEVICE_LOSS globally

  • ci/deqp-runner: split gl & gles groups to use the correct binary

  • ci/deqp-runner: print deqp-gles version log as well

  • ci: deduplicate converting the current job runtime into %M:%S

  • ci: convert the job start date into a timestamp only once

  • ci: simplify unnecessarily complex printf

  • radv/ci: sort tahiti flakes

  • radv/ci: add a bunch of flakes seen recently

  • v3dv/ci: track regression

  • rpi/ci: add flakes

  • radv/ci: add more flakes

  • v3dv/ci: add more flakes

  • docs: update calendar for 24.0.3

  • docs: update calendar for 24.0.4

  • docs: add release notes for 24.0.4

  • docs: add sha256sum for 24.0.4

  • v3dv/ci: another batch of flakes

  • radv/ci: another batch of flakes

  • radv/ci: another batch of flakes

  • radv/ci: dEQP-VK.spirv_assembly.type.vec4.i8.mod_geom Fail -> Crash on tahiti

  • ci: don’t run rustfmt on every core change

  • ci_run_n_monitor: explain how to pass multiple targets without having to use regexes

  • rpi/ci: another batch of flakes

  • docs: mesa also implements gles 3.0+

  • docs/egl: various wording improvements

  • ci: take kws farm offline

  • ci: restore kws farm

  • radv/ci: simplify tahiti flakes list

  • ci: fold .test-check into its only user, python-test

  • ci: run python-test when editing the CI itself

  • ci: run python-test automatically only in merge pipelines

  • docs/macos: drop reference to former github mirror

  • docs/nir: vec4 reference

  • docs/envvars: fix reference

  • docs/isl: fix references to ISL_AUX_USAGE_CCS_*

  • docs/isl: stop trying to link to classic drivers code

  • docs/isl: VK_FORMAT_xxx_PACKEDn is not a real format, don’t try to link to it

  • docs/isl: fix enum references

  • docs: fix inline c identifier reference -> inline code

  • isl: fix inline c identifier reference -> inline code

  • nir: add missing stdint include

  • docs/anv: fix envvar documentation

  • docs/nvk: fix envvar documentation

  • ci: mark vmware farm as offline

  • ci: add missing rule to disable vmware farm

  • ci: raise the log level threshold of spirv logs

  • docs/envvars: document some vulkan loader env vars

  • docs: replace references to the deprecated VK_ICD_FILENAMES with the new VK_DRIVER_FILES

  • docs: replace references to the deprecated VK_INSTANCE_LAYERS with the new VK_LOADER_LAYERS_ENABLE

  • docs/zink: format the envvar value as code instead of plain text

  • meson: add VK_DRIVER_FILES to devenv, alongside the old VK_ICD_FILENAMES

  • ci: drop unused VK_ICD_FILENAMES passthrough from jobs to tests

  • ci: use the new VK_DRIVER_FILES env var

  • ci/deqp: backport fix for dEQP-VK.pipeline.*.render_to_image.*.huge.*

  • ci: fix nightly build

  • ci: fix nightly build (v2)

  • ci/llvmpipe: make sure manual jobs don’t auto-retry

  • ci/llvmpipe: fix out of date fails list

  • ci/lavapipe: fix out of date fails list

  • ci/lavapipe: skip test that sometimes times out

  • ci: add nightly full run of llvmpipe

  • ci: add nightly full run of lavapipe

  • gallium/dri: reuse existing meson variables

  • meson: regroup glvnd lines to get an easier-to-review diff in the next commit

  • meson: turn `glvnd` option into a feature

  • ci: explicitly disable glvnd to avoid regression when making it auto

  • meson: auto-enable glvnd when libglvnd is installed

  • mr-label-maker: include */gitlab-ci-inc.yml in GitLab CI changes

  • mr-label-maker: be explicit about the various CI files

  • docs: add release notes for 24.0.5

  • docs: update calendar for 24.0.5

  • docs: add sha256sum for 24.0.5

  • ci: delete mistaken duplicate llvmpipe-{fails,skips}.txt

  • etnaviv: avoid re-defining prog_python

  • egl: drop dead dri2_dpy param in dri2_wl_visual_idx_from_config()

  • lavapipe: add 1 new failure and 1 new timeout since CTS uprev to 1.3.8.0

  • vk/overlay-layer: drop unused imports

  • vk/overlay-layer: fix None checks

  • vk/overlay-layer: simplify print and make it more readable

  • docs/rusticl: add an intro explaining what Rusticl is

  • wsi/x11: drop unused param in x11_present_to_x11_sw()

  • radv: initialize a couple of variables

  • util: simplify loop logic in util_format_get_first_non_void_channel()

  • util/futex: replace double-cast check with a simple sign check

  • docs/ci: explain how gitlab considers “changes” when pushing on a fork branch

  • rpi5/ci: sort flakes

  • rpi5/ci: add flakes from last night’s run

  • rpi4/ci: sort flakes

  • rpi4/ci: add new flakes from last night’s run

  • radeonsi/ci: update vangogh expectations after piglit uprev

  • llvmpipe/ci: update expectations after piglit uprev

  • VERSION: bump for 24.1.0-rc1

  • .pick_status.json: Update to 4660ee1deaace6457bf5fbf3fc8810e4a2453cb5

  • ci: fix container rules on release branches and tags

  • .pick_status.json: Update to 84632dce93f44e8d88cda47648cfd4cc0958918f

  • .pick_status.json: Update to 8248cc0bf45d0d7558cc3d77a63dcd078a96aa66

  • ci: pass MESA_VK_ABORT_ON_DEVICE_LOSS through to the DUT

  • .pick_status.json: Update to 86281ef15fca378ef48bcb072a762168e537820d

  • .pick_status.json: Update to 47f6e24ad5dfcb59dd1511800aee8c56b4f8fee4

  • meson: simplify `-gsplit-dwarf` compiler argument check

  • meson: move tsan-blacklist.txt to build-support with the other build support files

  • VERSION: bump for 24.1.0-rc2

  • .pick_status.json: Update to 603982ea802b3846e91a943b413a7baf430e875d

  • .pick_status.json: Update to 569c2fcf952a3ec13ddf77c0058e769bf68f3aaf

  • .pick_status.json: Update to 9666756f603f0285d8a93ef93db1c7ec702b671f

  • .pick_status.json: Update to b8e79d2769b4a4aed7e2103cf0405acc5bdadb86

  • VERSION: bump for 24.1.0-rc3

  • .pick_status.json: Update to 18c53157318d6c8e572062f6bb768dfb621a55fd

  • .pick_status.json: Update to 406dda70e7c9baa59c975eb64025e7c3b210c3bc

  • .pick_status.json: Update to 5502ecd7716045e76f13f007a4aa5f5653c80ecd

  • util/format: add missing null check in util_format_is_srgb()

  • .pick_status.json: Update to d516721cd0cb16d0b601c42c01de0fdcc4ae887b

  • .pick_status.json: Update to aa9244c8f6bfa3fb33cf233104b00fc44fc9459f

  • .pick_status.json: Mark a45f1990860db3a8da6d7251bb627a314dfb8423 as denominated

  • VERSION: bump for 24.1.0-rc4

  • .pick_status.json: Update to b2282e3a571f18b48b8b717ec32da1d0ed93f1b5

  • .pick_status.json: Update to 471ac97a4af751226bc51076130deae252bb481e

  • .pick_status.json: Update to 2487a875527f636565a7b39036690fbf7c5d46db

  • .pick_status.json: Update to 3584fc64828ad2ad4d486572ec915aab8321aadd

Eric R. Smith (13):

  • panfrost: fix panfrost drm-shim

  • panfrost: add lowering pass for multisampled images

  • panfrost: support multi-sampled image load/store

  • panfrost: protect alpha calculation from accessing non-existent component

  • panfrost: make drm-shim work again for panfrost

  • panfrost: make sure blends always have 4 components

  • panfrost: mark indirect compute buffer as read

  • gallium: handle copy_image of depth textures

  • panfrost: fix polygon offset calculation for floating point Z

  • panfrost: fix a GPU/CPU synchronization problem

  • panfrost: mark separate_stencil as valid when surface is valid

  • panfrost: fix an incorrect stencil clear optimization

  • panfrost: add a barrier when launching xfb jobs in CSF

Erico Nunes (2):

  • Revert “ci: lima farm is down”

  • ci: enable shader-db on lima

Erik Faye-Lund (32):

  • panfrost: add support for forcing sample-counts

  • panfrost: pass reduced primitive type instead of points

  • panfrost: add line_smooth shader-key and lowering

  • panfrost: clean up active_prim update

  • panfrost: implement line-smoothing

  • mesa/main: add support for EXT_texture_storage

  • mesa: fix error-handling for ETC2/RGTC textures

  • glapi: move EXT_texture_storage to the right position

  • targets/va: override LIBVA_DRIVERS_PATH in devenv

  • mesa/main: fix _mesa_base_tex_format for BGRA

  • mesa/main: mark GL_BGRA as color-renderable

  • mesa/main: mark GL_BGRA8_EXT as color-renderable

  • mesa/main: work around chrome/firefox bug

  • mesa/main: allow GL_BGRA for FBOs

  • panvk: do not handle illegal null

  • glsl: Make error_value a real ir_rvalue type

  • panfrost: give afbc-packing its own flag

  • panfrost: add driconf infrastructure

  • panfrost: add pan_force_afbc_packing driconf

  • mesa: prefer read-format of RG for snorm

  • gallium: remove always-false parameter

  • panvk: use integers instead of strings

  • panfrost: silence compiler warning

  • panfrost: add tiler-heap driconfs

  • panvk: wire up version-overriding

  • panfrost: implement a driver-specific max-miplevel

  • panfrost: use perf_debug_ctx instead of perf_debug

  • panfrost: perf_debug_ctx -> perf_debug

  • panfrost: use util_debug_message for perf_debug

  • panfrost: do not deref potentially null pointer

  • panfrost: correct first-tracking for signature

  • panvk: avoid dereferencing a null-pointer

Erik Kurzinger (2):

  • wsi/wayland: don’t use explicit sync with sw

  • wsi/x11: support explicit sync

Faith Ekstrand (284):

  • nvk: Add an explicit mapping from shader stages to cbuf bindings

  • nvk: Return an nvk_cbuf_map from nvk_lower_nir()

  • nvk: Use s instead of set_idx in CmdBindDescriptorSets

  • nvk: Rework descriptor set binding

  • nvk: Make dynamic cbuf indices relative to the descriptor set

  • nvk: Handle missing descriptor sets in nvk_nir_lower_descriptors

  • nvk: Invalidate state after secondary command buffers

  • nvk: Set a minimum of one patch control point

  • nak: Disallow gl_FragData and set MRT correctly

  • nak: Add explicit padding to nak_shader_info

  • nvk: Emit SET_ANTI_ALIAS at draw time when no render targets are bound

  • nvk: Move SET_HYBRID_ANTI_ALIAS_CONTROL to draw time

  • nvk: Advertise variableMultisampleRate and EDS3RasterizationSamples

  • nvk: Add a couple more features to features.txt

  • nak: Stop passing --explicit-padding to bindgen

  • nak: Implement nir_op_pack_half_2x16_rtz_split

  • nak: Implement nir_op_ufind_msb_rev

  • nak: Rename OpBrev to OpBRev

  • nak: Implement nir_op_bfm

  • nouveau/mme/fermi: Stop truncating iadd immediates

  • nouveau/mme: Stop using isaspec

  • nvk: Set framebufferIntegerColorSampleCounts

  • nvk: Unref shaders on pipeline free

  • nvk: Add a #define for max shared memory size

  • nvk: Properly configure the min/max shared mem size

  • nvk: Implement VK_KHR_zero_initialize_workgroup_memory

  • nir,spirv: Add support for SPV_NV_shader_sm_builtins

  • nak: Add support for SPV_NV_shader_sm_builtins

  • nvk: Advertise VK_NV_shader_sm_builtins

  • nvk/draw: Map cbuf slots to shaders, not cbuf_maps

  • nak: Refactor shader upload math

  • nvk: Wire up nir_opt_large_constants

  • nak: Enable NIR fuse_ffmaN

  • nak: Legalize OpBMsk

  • nvk: Don’t ignore ExternalImageFormatInfo

  • nvk: Set maxInlineUniformTotalSize

  • nak: Fix TCS output reads

  • anv: Add helpers for getting the surface state from an image view

  • anv: Advertise VK_EXT_attachment_feedback_loop_layout

  • nak: Choose S2R vs CS2R based on sysval index

  • nak: Add a source barrier intrinsic

  • nak: Loop to ensure we get accurate shader clocks

  • nvk: Stop requiring dedicated allocations

  • nvk: Advertise Vulkan 1.3

  • nvk: Do a second submit to check for errors in the sync case

  • nvk: Whitespace fixes

  • nvk: Disable all cbufs in nvk_queue_init_context_draw_state()

  • nvk: Call lower_compute_system_values after zero_initialize_workgroup_memory

  • nak/nir: Stop lowering load_local_invocation_index

  • nil: Set the level offset to 0 in nil_image_for_level

  • nvk: Fix whitespace in nvk_image.c

  • nouveau/winsys: Re-order channel creation

  • nouveau/winsys: Allow only allocating a subset of engines

  • nvk/queue: Pull DRM specific stuff into nvk_queue_drm.c

  • nvk/queue: Refactor the push builder a bit

  • nvk: Move the nouveau_ws_context to nvk_queue

  • nvk: Add an array of queue families to nvk_physical_device

  • nvk/queue: Rework context state init

  • nvk/queue: Only initialize the necessary engines

  • nvk: Use VM_BIND for contiguous heaps instead of copying

  • nvk: Only map heaps that explicitly request maps

  • nvk: Add an upload queue

  • nvk: Add an upload queue to nvk_device

  • nvk: Use the upload queue for shader uploads

  • nvk: Don’t set CONSTANT_BUFFER_SELECTOR with a zero size

  • nvk/heap: Use nvk_heap_bo::addr instead of bo->offset

  • nvk/heap: Rework over-allocation

  • nvk: Convert shader addresses to offsets in nvk_shader.c

  • vulkan: Update XML and headers to 1.3.278

  • nvk: Use nouveau_ws_bo_new_mapped() for descriptors

  • nouveau/winsys: Add a fixed_addr to nouveau_ws_bo_map

  • nvk: Implement VK_EXT_map_memory_placed

  • nvk: Invalidate the texture cache before MSAA resolves

  • nvk: Don’t use WAIT_AVAILABLE in nvk_upload_queue_sync

  • drm-uapi: Sync nouveau_drm.h

  • nouveau/winsys: Add a vram_used query

  • nvk: Add a nouveau_ws_device to nvk_physical_device

  • nvk: Add a hand-rolled nvk_memory_heap struct

  • nvk: Use 3/4 of total system memory for the VRAM heap

  • nvk: Add an available query to nvk_memory_heap

  • nvk: implement EXT_memory_budget

  • nouveau/winsys: Fetch the BAR size from the kernel

  • nvk/heap: Upload shaders on the CPU when we have a map

  • nvk: Upload shaders on the CPU when we have ReBAR

  • nvk: Expose a host-visible VRAM type when we have REBAR

  • nvk: Only expose VK_KHR_present_id/wait when we have WSI

  • nvk: Advertise VK_KHR_incremental_present

  • nil: Add PIPE_FORMAT_R5G5B5A1_UNORM

  • nak: Add writes_point_size to nak_shader_info

  • nvk: Handle missing gl_PointSize in the last geometry stage

  • nvk/copy: Handle VK_REMAINING_ARRAY_LAYERS

  • vulkan/meta: Handle VK_REMAINING_ARRAY_LAYERS in blit and resolve

  • nvk: Use VkPipelineCreateFlags2 flag names

  • nvk: Advertise VK_KHR_maintenance5

  • vulkan: Add a vk_get_subgroup_size() helper

  • vulkan: Move the descriptor set limit to vk_limits.h

  • vulkan: Add runtime code for VK_EXT_shader_object

  • vulkan: Add a vk_render_pass_state_has_attachment_info() helper

  • vulkan: Rework vk_render_pass_state::attachments

  • vulkan: Add a new dynamic state for render pass attachments

  • vulkan: Add a vk_pipeline base struct

  • vulkan: Add push constant ranges to vk_pipeline_layout

  • vulkan: Add a BLAKE3 hash to vk_descriptor_set_layout

  • vulkan: Add generic graphics and compute VkPipeline implementations

  • nvk: Populate vk_descriptor_set_layout::blake3

  • nvk/shader: Refactor some helpers

  • nvk: Move populate_fs_key to nvk_shader.c

  • nvk: Pass an array of descriptor sets to nvk_lower_nir

  • nvk: Move nir_lower_patch_vertices to nvk_lower_nir()

  • nvk: Use vk_render_pass_state::attachments for write masks

  • nvk: Switch to shader objects

  • nvk: Advertise VK_KHR_graphics_pipeline_library

  • nvk: Advertise VK_EXT_shader_object

  • nak: Implement nir_op_iadd3 on SM70+

  • nir: Add an imad opcode

  • nak: Move NAK_FS_OUT_COLOR next to the enum

  • nak: Add support for imad on Volta+ and enable it in simple cases

  • nvk: Advertise a CTS version of 1.3.7.3

  • nvk: Drop the non-conformant warning on Turing-Ada

  • nvk: Don’t print the NVK_I_WANT_A_BROKEN_VULKAN_DRIVER warning in release builds

  • meson: Rename nouveau-experimental to nouveau and build by default on x86

  • vulkan/pipeline: Whitespace fix

  • vulkan/pipeline: Handle fully compiled library shaders properly

  • nvk: Advertise VK_KHR_pipeline_library

  • docs/nvk: Update the conformance status section

  • docs/nvk: Update the NVK_DEBUG docs

  • docs/nvk: Document NAK_DEBUG

  • nil: Enable A8_UNORM for storage buffers

  • vulkan/pipeline: Always init pipeline cache objects

  • nak: Fix printing of OpIsberd

  • nak/sm50: Fix encoding of immediates in OpFFma

  • nak/sm50: Use OpBfe instead of OpBRev for nir_op_find_lsb

  • nak: Support F2I for 8-bit integers on SM50

  • nvk: Return os_page_size for minMemoryMapAlignment

  • nouveau: Import g_nv_name_released.h from NVIDIA OGK

  • nvk: Report official GPU names from NVIDIA when we have them

  • nvk: Use row_stride_B instead of width for render and copies

  • nil: Rework tiling calculations

  • nil: Add a concept of width to tile sizes

  • nil: Add a concept of sliced 3D image views

  • nvk: Use “real” 3D image views

  • nvk/queue: Add a push_bind helper

  • nvk: Refactor opaque image binds

  • nvk/queue: Add support for non-opaque sparse binds

  • nak: Rename resident to fault

  • nak: Plumb is_sparse through from NIR for texture ops

  • nak/nir: Add sparse support to shrink_image_load()

  • nak: Wire up sparse residency for texture ops

  • nil: Fix a typo in a comment

  • nvk: Document the register name for the helper load workaround

  • nvk: Always wait for the FALCON in set_priv_reg

  • nvk: Disable the Out Of Range Address exception

  • nvk: Drop a bunch of dev->pdev and just use pdev

  • nvk: Add and use more cmd_buffer_*_cls helpers

  • nvk: Replace more dev->pdev with nvk_device_physical()

  • nvk: Drop nvk_device::pdev

  • zink: Remove interpolateAtSample() when not multi-sampling

  • nil: Move Z slice offset calculations to a helper

  • nvk: Add a nil_image helper variable in BeginRendering

  • nvk: Manually offset array and Z slices in BeginRendering

  • nil: Advertise support for PIPE_FORMAT_R5G6B5_UNORM

  • nil: Whitespace fix

  • nil: Add support for larger textures on Pascal+

  • nil: Add a helper to view a MSAA image as samples

  • nil: Expose nil_pix_extent_sa()

  • nvk: Use HW generation names instead of chipsets

  • nvk: Stop pretending to handle Intel image intrinsics

  • nvk: Use different descriptor layouts for storage vs. sampled images

  • nvk: Implement shaderStorageImageMultisample

  • zink: Rework sparse texture lowering

  • nvk: Ignore rasterizationSamples when handling sampleShadingEnable

  • nvk: Always set SET_ATTRIBUTE_POINT_SIZE

  • Revert “nvk: Enable VK_KHR_shader_subgroup_uniform_control_flow”

  • nvk: Move the mutableDescriptorType enable

  • nir: Take a nir_def in nir_goto_if()

  • nir/print: Inline print_ssa_use()

  • nir/builder: Correctly handle decl_reg or undef as the first instruction

  • nir: Improve the comment for nir_block::imm_dom

  • nir: Add a sort_unstructured_blocks() helper

  • nir: Validate that unstructured blocks are in reverse PDFS order

  • nir/lower_reg: Remove dead reg_decl intrinsics

  • nir/lower_reg: Support unstructured control-flow

  • nir/repair_ssa: Support unstructured control-flow

  • nir/gather_types: Support unstructured control-flow

  • nir: Mark divergent regs in phis_to_regs_block()

  • nir: Add a lower_terminate_to_demote pass

  • nak: Add a copy_fs_outputs_nv intrinsic

  • nak: Move barrier removal into its own pass

  • nak: Add a condition to bar_break_nv

  • nak/nir: Add a control-flow lowering pass

  • nak: Add more NIR wrappers for walking the NIR CFG

  • nak: Add NIR helpers for jump instructions

  • nak: Add helpers for emitting jumps

  • nak: Handle unstructured NIR

  • nak: Use the new lowering pass on SM70+

  • nak: Remove the old barriers pass

  • nak/nir: Use nir_lower_terminate_to_demote()

  • nvk: Advertise VK_KHR_shader_maximal_reconvergence

  • nvk: Advertise VK_KHR_shader_subgroup_uniform_control_flow

  • nak/nir: Emit nir_intrinsic_ald_nv directly for system values

  • nak/nir: Rename load_interpolated_input

  • nak/nir: Add a load_fs_input helper for flat inputs

  • nak/nir: Emit nir_intrinsic_ipa_nv directly for FS system values

  • nak/nir: Use nir_io_semantics for varyings and attributes

  • nak: Break lower_fs_inputs into its own file

  • nak/nir: Clean up lower_fs_inputs a bit

  • nak: Call nir_lower_io_to_temporaries for FS outputs

  • nak/nir: Use nir_io_semantics for FS outputs

  • nak: Drop lower_io_arrays_to_elements_no_indirects for FS outputs

  • nak: Simplify over-all I/O lowering

  • nak: Don’t write undefined FS outputs

  • nak: Plumb through LDC modes

  • nak: Implement load_ubo with an indirect cbuf index

  • nvk: Support VkBindMemoryStatusKHR

  • nvk: Advertise VK_KHR_maintenance6

  • nir: Delete the rest of the CF list when adding a halt

  • nak: Don’t do a scope break cascade for nir_jump_halt

  • nil: Add a CSV version of the format table

  • nil: Re-organize the format table

  • nil: Switch to using the CSV generated table

  • nil: Drop bogus color formats from non-renderable luminance/alpha formats

  • nil: Remove 2-bit SNORM from the format table

  • nil: Drop unneeded types from formats

  • nvk: Use the page-aligned BO size for the descriptor pool

  • nvk: Use a VMA heap for descriptor memory

  • nvk: Use a linked list for descriptor sets in a pool

  • nvk: Add a _pad field to nvk_cbuf

  • nvk: Delete dead descriptor code

  • nvk: Add a _pad field to nvk_fs_key

  • nvk: Add a bunch of -Wpadded errors

  • vulkan: Add a bunch of -Wpadded errors

  • nouveau: Move .rustfmt.toml from NAK to src/nouveau/

  • nouveau: Use hyphenated arguments to class_parser.py

  • nouveau/headers: Add initial Rust bindings

  • nouveau/headers: Add Rust bindings for texture headers

  • ci: Add cbindgen to the build images

  • nil: Move nil_tic_format to nil_format_table.h

  • nil: Move to a single header file

  • nvk: Stop using nvk_extent4d short names

  • nil: Rename nil_tiling::gob_height_8 to gob_height_is_8

  • nak/bitview: Add a SetField<f32> implementation

  • nil: Delete unused USAGE bits

  • nil: Make the Rust library the primary build target

  • nil: Add Extent/Offset4D::new() helpers

  • nil: Drop the nil_extent/offset4d() helpers

  • nil: Take a format in el_to_B()

  • nil: Enforce units via Rust types

  • nil: cbindgen is required

  • nvk: Improve the unsupported handle type error

  • nvk: Restrict shaderFloat16 to Ampere+ for now

  • nouveau/headers: Move the classes into a submodule for Rust

  • nouveau/headers: Generate Rust for QMDs

  • nak: Add helpers for filling QMDs

  • nvk: QMDs are 64 dwords

  • nvk: Use the NAK helpers to fill QMDs

  • nouveau: Import the hwref headers from Nvidia OGK

  • nouveau/headers: Add the MMU headers to the Rust crate

  • nil: Use the enums from the hwref headers for PTE kinds

  • nil: add s8 pte kind

  • nil: Be more specific about Maxwell in the format table

  • nil: Advertise S8_UINT on MaxwellB+

  • nvk: Hash ycbcr conversions in the descriptor set layout hash

  • nvk: Re-emit sample locations when rasterization samples changes

  • nvk/meta: Restore set_sizes[0]

  • nvk/upload_queue: Only upload one line of data

  • vulkan/wsi: Bind memory planes, not YCbCr planes.

  • nvk: Improve the GetMemoryFdKHR error

  • nouveau/winsys: Take a reference to BOs found in the cache

  • nouveau/winsys: Make BO_LOCAL and BO_GART separate flags

  • nvk: Allow GART for dma-bufs

  • nil: Use the right PTE kind for Z32 pre-Turing

  • nvk: Set color/Z compression based on nil_image::compressed

  • nil: Default to NV_MMU_PTE_KIND_GENERIC_MEMORY on Turing+

  • nvk: Allow VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT

  • drm-uapi: Sync nouveau_drm.h

  • nouveau/winsys: Add back nouveau_ws_bo_new_tiled()

  • nvk: Support image creation with modifiers

  • nvk: Set tile mode and PTE kind on dedicated dma-buf BOs

  • nvk: Implement DRM format modifier queries

  • nvk: Advertise VK_EXT_queue_family_foreign

  • nvk: Advertise VK_EXT_image_drm_format_modifier

  • nvk/wsi: Advertise modifier support

  • zink: Set workarounds.can_do_invalid_linear_modifier for NVK

  • nvk/meta: Save and restore set_dynamic_buffer_start

Felix DeGrood (6):

  • driconf: Change vendorid on Palworld for Intel

  • driconf: Fake vendorid for RDR2

  • mesa-overlay: defer listening to socket until first frame

  • driconf: add SotTR DX12 to Intel XeSS workaround

  • iris: Increase target batch size to 128 KB

  • intel/ds: add pipe control reasons to perfetto flushes

Francisco Jerez (36):

  • intel/fs: Use full 32-bit sample masks when immediate.

  • intel/eu/validate: SEND instructions don’t have immediate encodings on Gen12+.

  • intel/eu/gfx12.5+: Don’t fail validation with ARF register restriction error for indirect addressing.

  • intel/compiler/xe2: Add Xe2 bounds to FF() macro.

  • intel/compiler/xe2: Implement codegen of general instruction controls.

  • intel/compiler/xe2: Implement codegen of 2-source instruction operands.

  • intel/compiler/xe2: Implement codegen of indirect immediates.

  • intel/compiler/xe2: Implement codegen of three-source instructions.

  • intel/compiler: Add assume() checks to brw_compact_inst_(set_)bits().

  • intel/compiler/xe2: Implement codegen of compact instructions.

  • intel/compiler/xe2: Implement instruction compaction.

  • intel/compiler/xe2: Fix for NibCtrl field removal.

  • intel/compiler/xe2: Fix for the removal of most predication modes.

  • intel/compiler/xe2: Add extra flag registers.

  • intel/compiler/xe2: Fix for the removal of AccWrCtrl.

  • intel/ir/xe2+: Add support for 32 SBID tokens to performance model.

  • intel/fs/xe2+: Disable bank conflict mitigation pass for now.

  • intel/eu/xe2+: Translate brw_reg fields in REG_SIZE units to physical 512b GRF units during codegen.

  • intel/fs: Set the default execution group to 0 when not representable by the platform.

  • intel/fs: Emit QUAD_SWIZZLE instructions with WE_all for derivative lowering.

  • intel/fs/xe2+: Allow SIMD16 MULH instructions.

  • intel/brw/xe2: Render target reads have been removed from the hardware.

  • intel/brw/xe2+: Update encoding of FB write descriptor message control.

  • intel/brw/xe2+: Update encoding of FB write extended descriptor.

  • intel/brw/xe2+: Double allowed SIMD width of FB write SEND messages.

  • intel/brw/xe2+: Allow FS stencil output in SIMD16 dispatch mode.

  • intel/brw/xe2+: Allow dual-source blending in SIMD16 mode.

  • intel/blorp/xe2+: Don’t use replicated-data clears.

  • intel/brw/gfx12: Setup PS thread payload registers required for ALU-based pixel interpolation.

  • intel/brw/xe2+: Setup PS thread payload registers required for ALU-based pixel interpolation.

  • iris/xe2+: Disable coherent framebuffer fetch.

  • intel/brw/xehp+: Replace lsc_msg_desc_dest_len()/lsc_msg_desc_src0_len() with helpers to do the computation.

  • intel/eu/xehp+: Don’t initialize mlen and rlen descriptor fields from lsc_msg_desc*().

  • intel/brw/xehp+: Drop redundant arguments of lsc_msg_desc*().

  • intel/fs/gfx20+: Implement sub-dword integer regioning restrictions.

  • intel/fs/gfx20+: Handle subdword integer regioning restrictions in copy propagation.

Frank Binns (3):

  • CODEOWNERS: update Imagination maintainers

  • pvr: fix up some includes

  • pvr: split out device info into per GPU headers

Friedrich Vock (27):

  • radv/rt: Add workaround to make leaves always active

  • radv: Fix shader replay allocation condition

  • nir: Make is_trivial_deref_cast public

  • nir: Handle casts in nir_opt_copy_prop_vars

  • radv/amdgpu: Fix build on BSD

  • winsys/amdgpu: Fix build on BSD

  • util: Provide a secure_getenv fallback for platforms without it

  • vulkan: Use secure_getenv for trigger files

  • aux/trace: Guard triggers behind __normal_user

  • vtn: Use secure_getenv for shader dumping

  • mesa/main: Use secure_getenv for shader dumping

  • radv: Use secure_getenv in radv_builtin_cache_path

  • radv: Use secure_getenv for RADV_THREAD_TRACE_TRIGGER

  • util/disk_cache: Use secure_getenv to determine cache directories

  • radv/rt: Write inactive node data in ALWAYS_ACTIVE workaround

  • radv/rt: Optimize update shader VGPR usage

  • radv,driconf: Enable active AS leaf workaround for Jedi Survivor

  • radv/rt: Handle monolithic pipelines in capture/replay

  • vulkan/runtime: Allow more than 8 DRM devices

  • radv: Set SCRATCH_EN for RT pipelines based on dynamic stack size

  • radv/rt: Fix frontface culling with emulated RT

  • radv/rt: Force active leaves for every updateable accel struct

  • radv,driconf: Remove active accel struct workaround

  • radv: Only enable SEs that the device reports

  • radeonsi: Only enable SEs that the device reports

  • aco/tests: Insert p_logical_start/end in reduce_temp tests

  • aco/spill: Insert p_start_linear_vgpr right after p_logical_end

Ganesh Belgur Ramachandra (1):

  • compiler,glsl: fix warning when -finstrument-functions is used

Georg Lehmann (60):

  • aco: reassign split vector to SOPC

  • aco: stop scheduling at p_logical_end

  • nir: add ballot_relaxed and as_uniform intrinsics

  • aco: implement as_uniform and ballot_relaxed

  • ac/llvm: implement as_uniform and ballot_relaxed

  • nir: add lowering for boolean shuffle

  • radv: lower boolean shuffle

  • radeonsi: lower boolean shuffle

  • aco: remove boolean shuffle isel

  • aco: fix printing dpp8

  • aco: validate v_permlane opsel correctly

  • aco: support v_permlane64_b32

  • aco/gfx11: use v_nop to resolve VcmpxPermlaneHazard

  • aco/gfx11: resolve VcmpxPermlaneHazard for v_permlane64

  • aco: implement rotate

  • radv: enable VK_KHR_shader_subgroup_rotate

  • radv: report rotate subgroup feature bits

  • anv: report rotate subgroup feature bits

  • aco/gfx11+: disable v_pk_fmac_f16_dpp

  • aco: add packed fma dpp note to README-ISA

  • aco: don’t remove branches that skip v_writelane_b32

  • aco/print_ir: don’t use alloca for input modifiers

  • aco: print neg prettier for packed math

  • aco: don’t print hi() for permlane opsel

  • aco: print permlane16 bc/fi

  • aco: print exec/vcc_lo/hi for single dword access

  • aco/gfx11+: limit hard clauses to 32 instructions

  • radv/gfx11+: add rtwave32 perftest option

  • aco: use fmamk/ak instead of fma with inline constant for more VOPD

  • nir: remove rotate scope

  • nir/divergence_analysis: fix subgroup mask

  • aco: create pseudo instructions with correct struct

  • aco/post-ra: rename overwritten_subdword to allow additional uses

  • aco/post-ra: assume scc is going to be overwritten by phis at end of blocks

  • aco: store if pseudo instr needs scratch reg

  • aco/post-ra: track pseudo scratch sgpr/scc clobber

  • aco/ssa_elimination: check if pseudo scratch reg overwrites regs used for v_cmpx opt

  • aco/builder: improve v_mul_imm for negative imm

  • aco/builder: use 24bit mul if low bits of imm are zero

  • aco/optimizer: combine v_mul_i32_i24 and add to mad

  • aco: avoid full 32bit imul for uniform reduce/scan

  • aco: don’t combine mul+add_clamp to mad_clamp

  • aco/ra: use SDWA for 16bit instructions when the second byte is blocked

  • aco/vn: remove instruction hash templates

  • aco: use v1 definition for v_interp_p1lv_f16

  • aco/assembler: add vintrp high_16bit support

  • aco: swap opsel and wait_exp for vinterp

  • aco: support high_16bits FS IO

  • aco/tests: add assembler tests for interp high_16bits

  • aco/gfx9: all non legacy opsel instructions only write 16bits

  • aco: use v_interp_p2_f16 opsel

  • aco: add ra test for hi v_interp_p2_f16

  • radv: sink alu

  • radv: move alu

  • nir: don’t try to optimize exclusive min/max scan to inclusive

  • nir: rename to nir_opt_16bit_tex_image

  • ac/nir: add ac_nir_opt_pack_half

  • radv: use ac_nir_opt_pack_half

  • radv, radeonsi: don’t use D16 for f2f16_rtz

  • zink: use bitcasts instead of pack/unpack double opcodes

George Ouzounoudis (1):

  • vulkan: Fix null pointer dereferencing on sample locations state

Gert Wollny (76):

  • r600: lower dround_even also on hardware that supports fp64

  • virgl: Use better reporting for mirror_clamp features

  • ci: Uprev virglrenderer

  • zink: Factor out create buffer from resource_object_create

  • zink: shorten lifetime of success variable in resource_object_create

  • zink: Factor out create sampler conversion in resource_object_create

  • zink: factor out get_format_feature_flags in resource_object_create

  • zink: factor out get_image_memory_requirement in resource_object_create

  • zink: reduce number of #ifdefs in resource_create_object

  • zink: extract get_export_flags from resource_object_create

  • zink: extract function allocate_bo from resource_create_object

  • zink: redesign the allocation try loop to test all heaps

  • zink: extract function create_image from resource_object_create

  • zink: extract function update_alloc_info from resource_object_create

  • zink: extract update_obj_info from resource_object_create

  • zink: extract debug_resource_mem from resource_object_create

  • zink: drop duplicate assignment to obj->alignment

  • zink: extract allocate_bo_and_update_obj from resource_object_create

  • zink: Move more code to create_image and create_buffer

  • zink: simplify call to get_export_flags

  • zink: remove duplicate arguments and use VkMemoryRequirements locally

  • zink: use enums as return values in resource_object_create

  • radv: Fix compilation with gcc-13 and tsan enabled

  • nir/lower_int64: Fix compilation with gcc-13 and tsan enabled

  • nir/builder: Fix compilation with gcc-13 when tsan is enabled

  • zink: Fix return type and values of create_buffer and create_images

  • zink: extract check_unordered_exec from zink_get_cmdbuf

  • zink: remove duplicate check and assignment in zink_resource_image_needs_barrier

  • zink: extract emit_memory_barrier from zink_resource_image_barrier

  • zink: extract emit_memory_barrier::for_buffer from zink_resource_buffer_barrier

  • zink: extract update_unordered_access_and_get_cmdbuf

  • zink/sync: remove duplicate assignments in UNSYNCHRONIZED case

  • zink: move zink_resource_copies_reset out of exportable_lock

  • zink: remove invalid scope in bo allocation loop

  • r600: handle indirect access to kcache 14 and 15

  • zink/nir_to_spirv: emit ViewportIndex cap also for inputs

  • zink: use only ZINK_BIND_DESCRIPTOR

  • zink: decrease aggressiveness of increasing descriptor data space adaptive

  • zink/nir-to-spirv: Make sure sampleid for InterpolateAtSample is int

  • nir-to-spirv: Cast SSBO input pointer when needed

  • zink: set handle type also for user memory

  • zink: acquire - maybe clear timeout after waiting for presentation fence

  • nir_to_spirv: Allow LOD for external images

  • zink: ctx->last_fence really wants to be a batch_state, so accommodate it

  • zink: another fence that is better off as a batch state

  • ntv: remove store_def_raw

  • ntv: remove store_ssa_def

  • ntv: pass def->index to store_def

  • ntv: simplify increasing the number of dest components for sparse tex

  • zink/ntv: introduce structure using the source params

  • zink/ntv: extract get_tex_srcs

  • zink/ntv: use new struct to pass texture parameters

  • zink/ntv: extract find_sampler_and_texture_index

  • zink/ntv: simplify evaluation of sampled_type

  • zink/ntv: extract get_texture_load

  • zink/ntv: extract get_texop_dest_type

  • zink/ntv: Extract move_tex:proj_into_coord

  • zink/ntv: replace if-chain with switch in emit_tex

  • zink/ntv: extract picking the image to load from

  • zink/ntv: extract emit_tex_readop as function that reads texture pixel data

  • zink/ntv: pull result out of cases and use a common store_def

  • zink: if AcquireNextImageKHR fails with VK_NOT_READY or VK_TIMEOUT retry

  • meson: Add blacklist when compiling with tsan

  • futex: disable futexes when compiling with tsan

  • util/u_queue: read fence->signalled locked with TSAN

  • tsan-blacklist: ignore race when reading lp_fence signalled status

  • llvmpipe: Don’t emit certain debug code when TSAN is enabled

  • tsan-blacklist: Ignore race in get_max_abs_timeout_ns

  • tsan-blacklist: suppress two race conditions in TC

  • r600/sfn: Add array element parent also to array

  • r600/sfn: Use dependencies to order barriers and LDS/RAT instructions

  • r600/sfn: call nir_lower_doubles explicitly

  • r600/sfn: when emitting fp64 op2 groups pre-load values

  • r600/sfn: Don’t put b2f64 conversion into ALU group

  • zink/kopper: Wait for last QueuePresentKHR to finish before acquiring for readback

  • mesa/st: don’t use base shader serialization when uniforms are not packed

Guilherme Gallo (25):

  • ci/lava: Turn the r8152 issue check into a counter

  • ci/lava: Detect r8152 issue during boot phase

  • ci/lava: Detect hard resets during test phase

  • bin/ci: Don’t submit jobs on integration test

  • ci/lava: Ignore DUT feedback messages

  • ci/lava: Fix the integration test

  • bin/ci: Propagate the token to GitlabGQL

  • bin/ci: Move get_token_from_default_dir to common

  • bin/ci: Refactor read_token function

  • bin/ci: Add GitLab basic token validation

  • ci/lava: Broader R8152 error handling

  • radv+zink/ci: Update xfiles based on nightly run

  • radv/ci: Update xfiles based on nightly run

  • v3d/ci: Update xfiles based on nightly run

  • freedreno/ci: Update xfiles based on nightly run

  • etnaviv/ci: Update xfiles based on nightly run

  • r300/ci: Update xfiles based on nightly run

  • ci/a618: Rebalance a618-limozeen jobs

  • ci/a618: Add zink-tu-a618-full

  • ci/lava: A few formatting cleanups

  • ci/lava: Introduce unretriable exception handling

  • ci/lava: Don’t run jobs if the remaining execution time is too short

  • ci/lava: Fix how exception entry in structured log

  • ci: Add S3 id_token for all jobs

  • ci: Use id_tokens for JWT auth

Gurchetan Singh (6):

  • mesa/util: Check __ANDROID__ when detecting Android

  • mesa/util: add <linux/fcntl.h>

  • mesa/util: use DETECT_OS_ANDROID in anon_file.c

  • mesa/vulkan: use a simpler path for header in trampoline gen

  • mesa/vulkan: use DETECT_OS_ANDROID

  • vk_image.c: #ifndef _WIN32 –> DETECT_OS_LINUX + DETECT_OS_BSD

Haihao Xiang (1):

  • anv: Fix typo in transition_color_buffer

Hannes Mann (2):

  • gallium/pipe: Add contiguous planes per-surface attribute

  • frontends/va: Only export one handle for contiguous planes

Hans-Kristian Arntzen (20):

  • wsi/x11: Remove unused vk_alpha in get_dri3_modifiers.

  • wsi/x11: Compare modifiers before signalling SUBOPTIMAL.

  • wsi/x11: Add drirc option to ignore SUBOPTIMAL.

  • wsi/x11: Add workaround for Detroit Become Human.

  • wsi/x11: Rewrite implementation to always use threads.

  • wsi/x11: Implement VK_EXT_swapchain_maintenance1.

  • wsi/x11: Keep track of multiple presentation requests.

  • wsi/x11: Make chain->status atomic.

  • wsi/wl: Refactor out code to update current present ID.

  • wsi/wl: Improve fallback for present_wait.

  • wsi/common: Allow KHR_present_wait on WL.

  • wsi/x11: Disable vk_xwayland_wait_ready by default on most drivers.

  • wsi/x11: Rewrite logic for how we consider minImageCount.

  • radv: export multiview in VS/TES/GS for depth-only rendering

  • wsi/wl: Fix deadlock in dispatch_queue_timeout.

  • wsi/wayland: Replace surface pilfer with retired bool.

  • wsi/wayland: Init outstanding list earlier.

  • wsi/x11: Return OUT_OF_DATE on sw resize.

  • vulkan/runtime: Check correct callback list for binding report.

  • radv: Store range rather than bo_size in VkBuffer/VkImage.

Helen Koike (6):

  • ci/ci_run_n_monitor: move get_gitlab_pipeline_from_url() to gitlab_common

  • ci/ci_gantt_chart: add tool to analyse pipeline execution time

  • ci/ci_gantt_chart: add timeout vertical line

  • ci/ci_gantt_chart: add option to save output to a file

  • ci/ci_gantt_chart: show duration on hover

  • ci/ci_post_gantt: add script that posts gantt to Marge’s messages

Hsieh, Mike (4):

  • amd/vpelib: add new tf enum and add flag for geometric scaling

  • amd/vpelib: skip gamma remap and cs conversion when geometric scaling

  • amd/vpelib: geometric scaling fix

  • amd/vpelib: Add UID for 3d Lut and control logic

Hyunjun Ko (3):

  • anv/video: fix out-of-bounds read

  • anv/video: fix scan order for scaling lists on H265 decoding.

  • anv/video: Fix to set correct offset and size for parsing h265 slice header.

Iago Toral Quiroga (49):

  • broadcom/compiler: fix incorrect flags setup in non-uniform if path

  • broadcom/compiler: fix incorrect flags update for subgroup elect

  • broadcom/compiler: add new SFU instructions in V3D 7.x

  • broadcom/compiler: don’t move subgroup reduction instructions above setmsf

  • broadcom/compiler: support subgroup ballot

  • broadcom/compiler: support subgroup shuffle

  • broadcom/compiler: support subgroup vote

  • broadcom/compiler: support subgroup quad

  • v3dv: expose more subgroup features on V3D 7.x

  • broadcom/compiler: be more careful with unifa in non-uniform control flow

  • broadcom/compiler: implement non-compute TSY barrier

  • broadcom/compiler: support subgroup reduction operations from fragment shaders

  • v3dv: allow subgroup operations in fragment shaders

  • broadcom/compiler: fix lane selection for subgroups in fragment shaders

  • v3d,v3dv: fix BO allocation for shared vars

  • v3dv: fix subpass clear with draw call for multi-layered framebuffers

  • v3dv: always set view index before drawing

  • v3dv: fix copying v3dv_end_query_info into primaries from secondaries

  • v3dv: refactor checking and adding pending jobs

  • v3dv: add a helper to constrain clip window to render area

  • v3dv: add helper to check if we need to use a draw for a depth/stencil clear

  • v3dv: add helper to build a render pass for dynamic rendering

  • v3dv: add a helper to setup a framebuffer for dynamic rendering

  • v3dv: add a vk_render_pass_state to pipelines

  • v3dv: don’t assume that pipelines have a render pass

  • v3dv: implement vkCmdBeginRendering and vkCmdEndRendering

  • v3dv: implement dynamic rendering resume/suspend

  • v3dv: rename SECONDARY job type to INCOMPLETE

  • v3dv: fix resume address patching for secondary command buffers

  • v3dv: handle render pass continue flag with dynamic passes

  • v3dv: also emit subpass clears with secondary command buffers

  • v3dv: enable VK_KHR_dynamic_rendering

  • broadcom/ci: skips for tests that don’t check for extension support correctly

  • broadcom/ci: add new expected test failures

  • broadcom/ci: add a test that fails only in CI

  • broadcom/ci: add skips for unsupported features

  • v3dv: fix image creation when exceeding maxResourceSize

  • v3d: implement fix for GFXH-1602

  • broadcom/compiler: fix workaround for GFXH-1602

  • v3dv: require multisync kernel

  • v3dv: drop single sync kernel interface

  • v3dv: add a v3dv_job_clone helper

  • v3dv: fix job pointers from cloned CLs

  • v3dv: store the offset of the BRANCH instruction in a CL

  • v3dv: fix job suspend with command buffer simultaneous use flag

  • broadcom/compiler: enable perquad with uses_wide_subgroup_intrinsics

  • v3d/simulator: size counter_values array correctly on V3D 7.x

  • broadcom/ci: document external causes for some CTS 1.3.8 failures

  • v3dv: fix VK_KHR_vertex_attribute_divisor

Ian Romanick (54):

  • nir: Minor clean up in nir_alu_srcs_negative_equal

  • intel/compiler: Disable DPAS instructions on MTL

  • intel/compiler: Use u_foreach_bit64 in brw_get_compiler_config_value

  • intel/compiler: Track lower_dpas flag in brw_get_compiler_config_value

  • intel/compiler: Track mue_compaction and mue_header_packing flags in brw_get_compiler_config_value

  • intel/fs: Fix shift counts for 8- and 16-bit types

  • intel/rt: Don’t directly generate umul_32x16

  • intel/compiler/xe2: Update get_sampler_lowered_simd_width

  • intel/fs: Move opcode modification before the switch that emits srcs

  • intel/compiler/xe2: Use new sample_*_mlod messages

  • nir: Pack texture LOD and array index to a single 32-bit value

  • intel/compiler/xe2: Emit texture instructions w/ combined LOD and array index

  • intel/compiler/xe2: Set SIMD mode for sampler messages

  • nir: Add documentation for subgroup_.._mask

  • intel/fs: Delete stale comment in nir_intrinsic_ballot implementation

  • nir: Mark nir_intrinsic_load_global_block_intel as divergent

  • intel/fs: Enable nir_opt_uniform_atomics in all shader stages

  • intel/fs: Use constant of same type to write flag

  • intel/fs: Add fast path for ballot(true)

  • nir: Initial framework for optimizing uniform subgroup operations

  • intel/fs: Use nir_opt_uniform_subgroup

  • nir: Optimize uniform iadd, fadd, and ixor reduction operations

  • nir: Optimize uniform vote_all and vote_any

  • i915: Fix value returned for PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS

  • intel/brw: Silence “statement may fall through” warning

  • intel/brw: Correctly dump subnr for FIXED_GRF in INTEL_DEBUG=optimizer

  • intel/compiler: Enforce 64-bit RepCtrl restriction in eu_validate

  • intel/brw: Integer multiply w/ DW and W sources is not commutative

  • intel/brw: Combine constants for src0 of integer multiply too

  • intel/brw: Combine constants for src0 of POW instructions too

  • intel/brw: Avoid a silly add with zero in assign_curb_setup

  • intel/fs: Don’t allow 0 stride on MOV destination

  • intel/brw/xe2: Correctly disassemble RT write subtypes

  • intel/brw: Fix handling of accumulator register numbers

  • intel/brw: Allow SIMD16 F and HF type conversion moves

  • intel/brw: Remove last vestiges of could_coissue

  • intel/brw: Clear write_accumulator flag when changing the destination

  • intel/brw: Use enums for DPAS source regioning

  • nir: intel/brw: Change the order of sources for nir_dpas_intel

  • intel/brw/xe2+: DPAS must be SIMD16 now

  • intel/brw/xe2+: Use phys_nr and phys_subnr in DPAS encoding

  • intel/brw/xe2: Update brw_nir_analyze_ubo_ranges to account for 512b physical registers

  • intel/brw/xe2: Update uniform handling to account for 512b physical registers

  • intel/compiler: Ensure load_barycentric_at_sample and load_interpolated_input remain together

  • intel/brw: Don’t call nir_opt_remove_phis before nir_convert_from_ssa

  • intel/elk: Don’t call nir_opt_remove_phis before nir_convert_from_ssa

  • intel/brw: Delete stray nir_opt_dce

  • intel/elk: Delete stray nir_opt_dce

  • intel/brw/xe2+: Implement Wa 22016140776

  • intel/brw/xe2+: Only apply Wa 22016140776 to math instructions

  • intel/brw: Fix handling of cmat_signed_mask

  • nir: intel/brw: Remove cmat_signed_mask from dpas_intel intrinsic

  • intel/brw: Fix optimize_extract_to_float for i2f of unsigned extract

  • intel/elk: Fix optimize_extract_to_float for i2f of unsigned extract

Isaac Marovitz (1):

  • asahi: Add >16 Sampler Access for Ryujinx

Iván Briano (10):

  • anv: flush query clears for all gens

  • anv, hasvk: pMutableDescriptorTypeLists can be out of range on pool creation

  • compiler/types: fix serialization of cooperative matrix

  • intel/cmat: fix stride calculation in cmat load/store

  • nir/algebraic: avoid double lowering of some fp64 operations

  • nir/lower_doubles: preserve sign of zero if we are asked to

  • nir/lower_doubles: preserve NaN when asked to do so

  • anv, hasvk: check requirements for USAGE_INPUT_ATTACHMENT properly

  • anv: check requirements for VK_IMAGE_USAGE_FRAGMENT_SHADING_RATE

  • anv: fix casting to graphics_pipeline_base

JCWasmx86 (1):

  • meson: Fix invalid kwarg name

Jan Beich (1):

  • util: mimic KCMP_FILE via KERN_FILE on DragonFly and FreeBSD

Jani Nikula (1):

  • docs: fix doc build ‘intel/dev/intel_device_info_gen.h’ file not found

Javier Martinez Canillas (2):

  • clc: silence a warn_unused_result

  • gallium: Add ssd130x to the list of kmsro drivers

Jesse Natalie (115):

  • ci/windows: Update WARP to 1.0.9 NuGet

  • mesa: Consider mesa format in addition to internal format for mip/cube completeness

  • ci/windows: Rev Vulkan SDK and piglit

  • d3d12: Set up spirv-as and fix expectations

  • microsoft/compiler: Declare shader model 6.8 / validator 1.8

  • microsoft/compiler: Handle comparison bias/gradient sampling

  • dzn: Add a debug option to enable experimental shader models

  • microsoft/compiler: Add feature flags for new comparison sampling ops

  • dzn: Implement maintenance3 VariableDescriptorCountLayoutSupport

  • dzn: Fix enhanced barrier layout for depth blits

  • dzn: Handle VkBindImageMemorySwapchainInfoKHR

  • dzn: Disable depth/stencil for partial binding from dynamic rendering

  • spirv2dxil: Fix the spirv2dxil command line tool

  • spirv2dxil: Handle aliasing/overlapping UBO/SSBO variables

  • util: Detect arm64ec as aarch64 (and x86_64)

  • glsl: Work around MSVC arm64 optimizer bug

  • dzn: Don’t set view instancing mask until after the PSO

  • dzn: Fix path passed to CreateDeviceFactory

  • d3d12: Fix path passed to CreateDeviceFactory

  • microsoft/compiler: Use double pack/unpack instead of int for reduce ops on doubles

  • dzn: Add a stencil blit fallback

  • dzn: Add missing condition to immutable sampler init loop

  • dzn: Add missing blit source barriers for enhanced barriers

  • microsoft/compiler: Respect ACCESS_COHERENT in UAV variable data

  • microsoft/compiler: Add a pass for promoting ACCESS_COHERENT on loads/stores

  • spirv2dxil: Lower the Vulkan memory model and coherent loads/stores

  • dzn: Add missing handling of VK_PIPELINE_STAGE_2_DRAW_INDIRECT_BIT

  • dzn: Add barrier to copy source for DispatchIndirect copies

  • dzn: Support non-static samplers for meta

  • dzn: Add a debug flag for forcing off native view instancing

  • dzn: Don’t resolve for RESOLVE_MODE_NONE

  • dzn: Use correct format for depth/stencil resolves

  • dzn: Use blits for all non-averaging resolves

  • microsoft/compiler: Only use simplified subgroup ID algorithm for compute

  • d3d12: Subgroup ballot

  • microsoft/compiler: Relax assert for SPIR-V barriers

  • spirv2dxil: Remove dead branches early during shader compilation

  • spirv2dxil: Trivial fixes for tessellation shaders

  • dzn: Simultaneous-access is mutually exclusive with MSAA

  • dzn: Fix tessellation shader insertion into PSO desc

  • dzn: Add a driconf option to disable dzn for specific apps and use it for RDR2

  • microsoft/compiler: For emulating scan, ensure all threads are active when reading cross-lane

  • microsoft/compiler: Fix wave size control for SM6.6+

  • microsoft/compiler: Fix wave size control for SM6.8+

  • wgl: Support a single-buffered winsys framebuffer

  • wgl: Flush frontbuffer when calling swapbuffers on single-buffered fb

  • wgl: Add no-gdi-single-buffered and gdi-double-buffered PFDs

  • wgl: Enable WGL_ARB_pixel_format_float

  • wgl: Add HDR pixel formats

  • winsys/d3d12: Support single-buffered mode

  • d3d12: Support R16G16B16A16_FLOAT display targets

  • microsoft/compiler: Fix SM6.6 non-bindless handle annotation for UAV counter

  • dzn: Fix conditions for barrier in texture-converting copy case

  • wgl: Check for stw_device->screen before trying to destroy it

  • spirv2dxil: Set push constant register space to nonzero

  • microsoft/compiler: Remove deref load/store/atomic ops that statically go out of array bounds

  • microsoft/compiler: Remove code after discard/terminate in later optimization steps

  • wgl: Initialize DEVMODE struct

  • d3d12: Point sprite lowering pass needs to handle arrays

  • nir_lower_tex_shadow: For old-style shadows, use vec4(result, 0, 0, 1)

  • spirv2dxil: Support buffer_device_address

  • dzn: Support bufferDeviceAddress

  • wgl: Delete unused context param to swap

  • wgl: Check for null before dereferencing ctx in swap

  • nir_tests: Add /bigobj when compiling with MSVC

  • dzn: Include vulkan_core.h instead of vulkan.h in the device enum header

  • dzn: Initialize memoryTypeBits for querying properties on imported handles

  • microsoft/compiler: domainLocation component index needs to be i8

  • microsoft/compiler: Disable GS streams workaround for validator 1.8

  • ci/windows: Update DirectX-Headers, Agility SDK, zlib, DXC, and WARP

  • ci/debian: Update DirectX-Headers

  • nir: Handle ptr_as_array for build_deref_follower

  • microsoft/compiler: Don’t store static-indexing handles that are dynamically emitted

  • microsoft/clc: When possible, compute a part-constant “pointer” value for kernel inputs

  • microsoft/compiler: Simplify code emitting CL globals

  • clc: Move libclc helpers back to microsoft/clc

  • microsoft/clc: Add linkage capability to libclc build to silence warning

  • microsoft/clc: Adjust order of UAV binding assignment

  • microsoft/clc: Install clon12compiler

  • wgl: The default swap interval is supposed to be 1

  • d3d12: Fix d3d12_lower_triangle_strip if multiple vars are in a single location

  • microsoft/compiler: When sorting variables, put unused variables last

  • microsoft/compiler: Move kill-unused/undefined varying pass from spirv to common

  • microsoft/compiler: Simplify I/O component type enum handling

  • microsoft/compiler: Expect front-facing var as an input

  • microsoft/compiler: Improve linking helpers

  • microsoft/compiler: Don’t duplicate work from gather_info in var sorting

  • d3d12: Move some lowering passes to pre-variant

  • d3d12: Lower uniforms to UBO by nir options

  • d3d12: Minor logging improvements

  • d3d12: Fix var splitting pass writemasks

  • d3d12: Explicitly add tess factor vars to tess signatures

  • d3d12: Forward front-facing for passthrough GS

  • d3d12: Capture always_active_io in varying data

  • d3d12: Use TES inputs rather than VS outputs for TCS variant key

  • d3d12: Add primitive ID sysval to input bitmask (for GS in)

  • d3d12: Gather info less and before the final compilation steps

  • d3d12: Remove variables instead of adding them for linking

  • d3d12: Don’t compile useless variants during shader creation

  • microsoft/compiler: Add a fractional var mask for variable sorting

  • d3d12: Set fractional var masks

  • d3d12: Add a debug flag for loading WinPixGpuCapturer.dll

  • ci/windows: Bump Agility SDK to 1.613.2 for ExecuteIndirect validation fix

  • microsoft/compiler: Handle base vertex/instance sysvals as DXIL intrinsics

  • spirv2dxil: Support passing first vertex / base instance to DXIL backend

  • spirv2dxil: Output more specific metadata for whether draw sysvals are needed

  • dzn: Delete dzn structs for indirect draw args and use D3D ones

  • dzn: Query options21

  • dzn: Understand whether first-vertex and base-instance are needed for a pipeline

  • dzn: Update pipeline cache params to take all options into account

  • dzn: Rework indirect drawing keys for shaders and command signatures

  • dzn: Add a hash table of command signatures with non-default strides

  • dzn: Don’t copy app indirect args if we don’t need to

  • glsl: Use a stable attr sort for VS in / FS out

  • d3d12: Include <shlobj.h> with lowercase name

Job Noorman (43):

  • tu: support l1 dcache size on musl

  • ir3: fix setting shared flag on parallel copy arguments

  • ir3: optimize read_first.macro to a mov

  • ir3: fix printing of brcst.active and quad_shuffle

  • ir3: optimize subgroup operations using brcst.active

  • ir3: set reconvergence for scan_clusters.macro

  • ir3: add disassembly for flat.b

  • ir3: update a0/a1 users when cloning instructions

  • ir3: fix alignment of spill slots

  • ir3: validate instruction block pointer

  • ir3: add terminators to blocks

  • ir3: fix instruction count before kill_sched

  • ir3: print branch sources

  • ir3: remove OPC_B and brtype from cat0

  • ir3: remove comp1/2 from cat0

  • ir3: allow liveness calculation for different register types

  • ir3: allow finding SSA uses for a subset of uses

  • ir3: implement RA for predicate registers

  • ir3: validate no registers are invalid after RA

  • ir3: integrate predicates into RA validation

  • ir3: optimize bitwise ops that can directly write predicates

  • ir3: insert predicate conversions after their source

  • ir3: fold negations into cmps.ne zero

  • nir: add search helper is_only_used_by_if

  • ir3: fold and/or and negations into branches

  • freedreno/ci: Update pixmark-piano-v2 hash

  • ir3: fix freeing incorrect register in loops

  • ir3: fix returning false instead of NULL

  • freedreno/registers: fix installation of schema

  • zink: print shaderdb info via debug message callback

  • ir3: calculate SSA uses at the start of predicates RA

  • ir3: fix finding uses of reloaded defs in predicates RA

  • ir3-disasm: run clang-format

  • ir3-disasm: remove unused #includes

  • ir3-disasm: add options to specify GPU by chip ID or name

  • ir3-disasm: add option to disassemble hex number

  • freedreno,computerator: support initialization of buffers

  • ir3: remove unnecessary tessellation epilogue

  • ir3: model predt/predf without sources

  • ir3: add support for precolored sources in predicate RA

  • ir3: add support for predication

  • freedreno/drm-shim: add a730, a740, and a750

  • freedreno/drm-shim: remove duplicate entry for a630

Jonathan Gray (4):

  • intel/dev: update DG2 device names

  • intel/dev: update DG2 device names

  • intel/dev: update DG2 device names

  • intel/dev: 0x7d45 is mtl-u not mtl-h

Jonathan Marek (1):

  • tu/a750: Basic a750 support

Jordan Justen (26):

  • intel/dev/common: Add xe2 support to get_l3_list()

  • intel/dev: Add ARL platform enums

  • intel/dev: Add intel_device_info_is_mtl_or_arl()

  • intel/l3: Define l3 config for ARL

  • iris: Extend MTL modifiers to ARL devices

  • intel/i915: ARL also supports the set-PAT uapi

  • intel/dev: Define engine prefetch for ARL

  • isl: Define MOCS for ARL

  • isl: Handle ARL in isl_drm_modifier_get_score()

  • intel/compiler: Lower DPAS instructions on ARL except ARL-H

  • anv/drirc: Extend option to disable FCV optimization to ARL

  • anv/query: Follow MTL code paths on ARL

  • intel/dev: Add device info for ARL

  • intel/compiler: Set branch shader required-width as 16 for xe2

  • intel/compiler: Implement nir_intrinsic_load_topology_id_intel for xe2

  • intel/compiler: Verify SIMD16 is used for xe2 BTD/RT dispatch

  • intel/dev: Add 2 additional ADL-N PCI ids

  • intel/compiler: Adjust fs_visitor::emit_cs_terminate() for Xe2

  • intel/dev: Adjust device strings for ATS-M devices

  • intel/dev: Add ATS-M PCI ID for Data Center GPU Flex 170G

  • intel/compiler/fs: Restore SIMD32 restriction for ray_queries on Xe2

  • intel/compiler: nib_ctrl no longer exists on Xe2+

  • intel/dev/mesa_defs.json: Add LNL WA entries

  • intel/dev: Add 0x56be and 0x56bf DG2 PCI IDs

  • intel/dev: Change ATS-M 0x56c2 string from 170G to 170V

  • intel/brw: Avoid getting a stride of 0 for nir_intrinsic_exclusive_scan

Jose Maria Casanova Crespo (7):

  • ci: Adds /usr/local/bin to PATH at piglit-traces.sh

  • v3d: Fix indentation at v3d_flush_jobs_writing_resource

  • v3d: Only flush jobs that write texture from different job submission.

  • v3d: Early return if job is not writing the resource

  • v3d: Implement GL_ARB_texture_barrier

  • broadcom/compiler: needs_quad_helper_invocation enable PER_QUAD TMU access

  • ci: re-enable Igalia farm

Joshua Ashton (34):

  • winsys/amdgpu: Hook up guilt to amdgpu_ctx_set_sw_reset_status

  • winsys/amdgpu: Limit usage of query_reset_state2

  • radv/amdgpu: Handle -ENODATA and -ETIME from cs_submit

  • radv: Mark device loss if QueueSubmit failed immediately

  • radv: Remove check_status

  • radv/amdgpu: Remove ctx_query_reset_status

  • radv: Add radv_get_tdr_timeout_for_ip helper

  • radv: Ensure vkGetQueryPoolResults returns in finite-time

  • android: Use system = ‘android’ in crossfile

  • meson: Enable zink in gallium_drivers by default

  • meson: Enable d3d12 in gallium_drivers by default on Windows

  • anv: Enable EXT_swapchain_maintenance1

  • v3dv: Enable EXT_swapchain_maintenance1

  • lavapipe: Enable EXT_swapchain_maintenance1

  • v3dv: Enable EXT_swapchain_colorspace

  • lavapipe: Enable EXT_swapchain_colorspace

  • wsi: Pass wsi_drm_image_params to wsi_configure_native_image

  • wsi: Pass wsi_drm_image_params to wsi_configure_prime_image

  • wsi: Add explicit_sync to wsi_image_info

  • wsi: Add explicit_sync to wsi_drm_image_params

  • build: Add linux-drm-syncobj-v1 wayland protocol

  • wsi: Track if timeline semaphores are supported

  • wsi: Add acquired member to wsi_image

  • wsi: Track CPU side present ordering via a serial

  • wsi: Get timeline semaphore exportable handle types

  • wsi: Add common infrastructure for explicit sync

  • ci: Bump wayland-protocols version to 1.34

  • ci: Bump DEBIAN_BASE_TAG for now

  • meson: Update wayland-protocols wrap to 1.34

  • meson: Bump wayland-protocols requirement to 1.34

  • wsi: Implement linux-drm-syncobj-v1

  • tu: Expose VK_EXT_surface/swapchain_maintenance1

  • radv: Enable KHR_video_queue if encode is enabled

  • radv: Properly initialize imageCreateFlags in GetPhysicalDeviceVideoFormatPropertiesKHR

José Expósito (2):

  • zink: add render-passes HUD query

  • meson: Update proc_macro2 meson.build patch

José Roberto de Souza (88):

  • intel/isl/xe2: Disable route of Sampler LD message to LSC

  • anv: Fix PAT entry for userptr in integrated GPUs

  • intel/genxml/xe2: Remove L3ALLOC

  • intel/dev: Reduce usage of intel_device_info_compute_system_memory()

  • intel: Make memory heaps consistent between KMDs

  • anv: Fix calculation of syncs required in Xe KMD

  • iris: Avoid read of uninitialized value in blorp_clear_stencil_as_rgba()

  • iris: Fix return of iris_wait_syncobj()

  • iris: Wait for drm_xe_exec_queue to be idle before destroying it

  • intel/common: Add functions to handle async vm bind

  • anv: Start to use intel_bind_timeline

  • iris: Start to use intel_bind_timeline

  • anv: Switch to truly asynchronous VM binding in Xe KMD

  • iris: Switch to truly asynchronous VM binding in Xe KMD

  • intel: Fix intel_get_mesh_urb_config()

  • anv: Drop include to common/i915/intel_gem.h

  • intel/common: Fix location of C++ support macro in intel_gem.h

  • intel: Remove circular dependency between intel/dev and intel/common

  • intel/common: Add intel_engines_supported_count()

  • anv: Use intel_engines_supported_count()

  • iris: Use intel_engines_supported_count()

  • intel: Sync i915_drm.h

  • intel/common: Implement i915_engines_is_guc_semaphore_functional()

  • intel: Sync xe_drm.h

  • intel/common: Implement xe_engines_is_guc_semaphore_functional()

  • iris: Fix iris_batch_is_banned() check

  • anv: Use DRM_XE_VM_BIND_OP_UNMAP_ALL to unbind whole bos

  • docs/anv: Add recommended GuC firmware version

  • iris: Set (EXEC_OBJECT_SUPPORTS_48B_ADDRESS | EXEC_OBJECT_PINNED) in a single place

  • iris: Remove iris_bo::kflags

  • iris: Move i915 set and get tiling uAPI calls to i915 specific code

  • iris: Remove more i915_drm.h includes from common code

  • intel: Move intel_define.h to i915/intel_define.h

  • intel/common: Remove more i915_drm.h includes from common code

  • intel/tools/error_decode: Add function to try to open error dump file

  • intel/tools/error_decode: Simplify error message handling

  • intel/tools/error_decode: Add support to search for Xe KMD error dumps

  • intel/tools/error_decode: Detect and split error dump file parsing by KMD

  • intel: Sync xe_drm.h

  • anv/xe: Add VMs to error dump

  • iris/xe: Add VMs to error dump

  • intel/tools/error_decode: Move code that can be shared between i915 and Xe error decoders

  • intel/tools/error_decode: Parse Xe KMD error dump file

  • intel/tools: Fix compilation in 32 bits

  • intel/nullhw: Fix 32bits compilation warnings

  • iris: Add IRIS_HEAP_DEVICE_LOCAL_CPU_VISIBLE_SMALL_BAR heap type

  • iris: Force lmem cpu accessible for bos with clear-color

  • iris/xe: Consider pat_index while unbinding the bo

  • anv: Call flush_pipeline_select_gpgpu() for compute engines in compute code paths

  • anv: Skip cmd_buffer_emit_bt_pool_base_address() in blitter and video engines

  • intel: Drop pre-production steppings

  • anv: Fix Xe KMD userptr unbind

  • intel/dev: Nuke ‘ver == 10’ check

  • intel/dev: Nuke display_ver

  • intel: Enable Xe KMD support by default

  • iris: Set BO_ALLOC_NO_SUBALLOC when allocating bo for slab

  • anv: Replace the 2 sparse booleans by 1 enum

  • anv: Set VK_QUEUE_PROTECTED_BIT during queue families initialization

  • anv: Set VM control to true in Xe KMD

  • intel/tools/error_decode: Fix parsing in Xe decoder

  • intel/tools/error_decode: Add function to print batch in Xe decoder

  • intel/tools/error_decode: Parse HW context in Xe decoder

  • iris: Move tiling_to_modifier() implementation to i915 folder

  • iris: Remove i915_drm.h include from iris_indirect_gen.c

  • intel/decoder: Fix binding table pointer entry being marked as invalid

  • anv: Set STATE_COMPUTE_MODE mask bit when zeroing compute mode

  • intel/genxml: Add more instdone registers

  • intel/genxml/gfx125: Fix definition of INTERFACE_DESCRIPTOR_DATA::Thread group dispatch size

  • intel/genxml/xe2: Update definition of INTERFACE_DESCRIPTOR_DATA

  • anv: Create protected engine context when i915 supports vm control

  • anv: Remove protected memory types from default_buffer_mem_types

  • intel/tools/error2hangdump: Print out_filename when failed to open it

  • intel/tools/error2hangdump: Replace drm_i915_gem_engine_class by intel_engine_class

  • intel/tools: Move Xe KMD error decode functions to a separated file

  • intel/tools: Move ascii85_decode_char() to error_decode_lib

  • intel/tools: Move more Xe KMD error decode functions to error_decode_xe_lib

  • intel/tools/error2hangdump: Move code that will be shared with Xe parser to error2hangdump_lib

  • intel/tools/error2hangdump: Move i915 parser to a function

  • intel/tools/error2hangdump: Add Xe KMD support

  • anv: Add missing ANV_BO_ALLOC_INTERNAL

  • iris: Add comments to BO_ALLOC flags

  • iris: Avoid creation of slabs and cache buckets of lmem heaps in integrated gpus

  • iris: Avoid allocation of not needed iris_bucket_cache

  • intel/tools/aubinator_error_decode: Move definition of option_color to header

  • intel/decoder: Add intel_print_group_custom_spacing()

  • intel/tools: Parse INSTDONE registers in Xe KMD error dump

  • intel: Sync xe_drm.h

  • intel/dev: Read GFX IP version during runtime

Juan A. Suarez Romero (32):

  • Revert “v3d: use kmsro to create drm screen on real hw”

  • v3d: show warning on creating a v3d screen on real hw

  • v3d/vc4/ci: reset the list of timeout tests

  • Revert “v3d: show warning on creating a v3d screen on real hw”

  • broadcom/simulator: protect simulator BO rallocs with mutexes

  • v3d/ci: run OpenGL 3.1 tests

  • v3dv/ci: increase timeout for full jobs in 30min

  • ci: disable Igalia farm

  • Revert “ci: disable Igalia farm”

  • Revert “ci: disable Igalia farm”

  • v3d/ci: update expected results

  • v3d/ci: update expected list

  • Revert “v3d/ci: update expected list”

  • vc4/ci: update expected list

  • v3d/ci: add new failures

  • v3dv/ci: update expected list

  • v3dv/ci: remove crashes from expected list

  • v3d,v3d: use new simulator

  • v3dv: disable Early Z for multisampled 16-bit depth buffers

  • v3d: disable Early Z for multisampled 16-bit depth buffers

  • broadcom/compiler: fix SFU check for 7.1

  • v3dv: mark some promoted extensions as supported

  • v3d: add load_fep_w_v3d intrinsic

  • v3d: fix line coords with perspective projection

  • compiler,gallium: move u_reduced_prim to common

  • v3dv: assume that rasterization state can be NULL

  • v3dv: enable smooth line rendering

  • broadcom/ci: add new expected failures

  • v3d: configure polygon mode when enabled

  • broadcom/ci: update expected results

  • v3dv/ci: update expected list

  • nir/lower_clip: update inputs/outputs read/written bitmask

Juston Li (13):

  • venus: refactor query feedback cmds

  • venus: acquire mutex when recycling query feedback cmds

  • venus: free query batches for VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT

  • venus: add comments for query feedback batch free list

  • venus: recycle linked query feedback immediately during submission

  • venus: handle empty resolved query feedback list

  • venus: fix image reqs cache store locking

  • venus: extract cache hash/equals functions into common

  • venus: image format properties cache

  • venus: move feedback on empty last batch to prior batch

  • venus: fix VkDeviceGroupSubmitInfo cmd counts from feedback

  • venus: extend device format prop cache with VkFormatProperties3

  • Revert “zink: store last pipeline directly for zink_gfx_program::last_pipeline”

Kai Wasserbäch (2):

  • fix(FTBFS): clc: adapt to new LLVM 19 DiagnosticHandlerTy

  • fix(FTBFS): clover: adapt to new LLVM 19 DiagnosticHandlerTy

Karol Herbst (131):

  • nak/algebraic: merge run and main function

  • nak/algebraic: write code to an output file

  • clc: use spirv triple starting with llvm-17

  • clc: add support for the native spir-v backend

  • rusticl/kernel: run opt/lower_memcpy later to fix a crash

  • rusticl/kernel: add a few comments in regards to pass ordering

  • rusticl/kernel: no need to reset the scratch size anymore

  • nir/printf: remove treat_doubles_as_floats

  • clc: require LLVM-14

  • clc: merge blocks handling optional features

  • clc: require LLVM-15

  • nir: rework and fix rotate lowering

  • rusticl/program: rework debug logging option

  • rusticl/spirv: do not attempt to parse spirv after failed link

  • rusticl/spirv: use bool::then_some inside SPIRVBin::link

  • rusticl/program: add clc_validator_options helper function

  • rusticl/program: add debug option to validate internal spirvs

  • nak/opt_out: fix comparison in try_combine_outs

  • nak: simplify phi_dsts

  • nak: make it compile with clippy

  • rusticl/meson: use rust_abi instead of rust_crate_type

  • rust/spirv: fix clippy lint on unneeded late initialization

  • rusticl/kernel: check that local size on dispatch doesn’t exceed limits

  • nak/meson: specify rust flags globally and allow some clippy lints

  • nak: fix clippy::extra_unused_lifetimes warnings

  • nak: fix clippy::mem_replace_with_default warnings

  • nak: fix clippy::useless_conversion warnings

  • nak: fix clippy::needless_lifetimes warnings

  • nak: fix clippy::needless_borrow warnings

  • nak: fix clippy::while_let_loop warnings

  • nak: fix clippy::match_like_matches_macro warnings

  • nak: fix clippy::needless_return warnings

  • nak: fix clippy::redundant_closure warnings

  • nak: fix clippy::unwrap_or_default warnings

  • nak: fix clippy::manual_while_let_some warnings

  • nak: fix clippy::clone_on_copy warnings

  • nak: fix clippy::single_match warnings

  • rusticl/util: add a wrapper around “thread-safe” C types

  • rusticl/mesa/device: convert to ThreadSafeCPtr

  • rusticl/mesa/screen: convert to ThreadSafeCPtr

  • rusticl/mesa: add thread-safe wrapper for pipe_image_views

  • rusticl/context: store SVM pointers as usize

  • rusticl/gl: mark GLCtxManager as Send + Sync

  • rusticl/mem: make Mem Send/Sync by storing mapping ptrs as usize

  • rusticl/program: mark NirKernelBuild as Send and Sync

  • meson: remove opencl-external-clang-headers option and rely on shared-llvm

  • clc: force fPIC for every user when using shared LLVM

  • nir/lower_cl_images: record image_buffers and msaa_images

  • rusticl/mem: properly handle buffers

  • rusticl/mem: support GL_TEXTURE_BUFFER

  • rust/api: add RustTypes enum

  • rusticl/util: support nested structs in offset_of!

  • rusticl/api: allow CLObjectBase to be placed anywhere

  • rusticl/icd: move get_ref_vec_from_arr into the Rusticl type

  • rusticl/icd: move refcnt() and get rid of needless atomic ops

  • rusticl/icd: move retain() and release()

  • rusticl/icd: move get_arc_vec_from_arr and rename it

  • rusticl/icd: fold leak_ref into its only consumer

  • rusticl/icd: move get_ref()

  • rusticl/device: deduplicate devices with sorting

  • rusticl/icd: move from_arc() and rename it

  • rusticl/event: drop from_cl_arr and use arcs_from_arr

  • rusticl/icd: move get_arc() and rename it

  • rusticl/icd: split Arc part out of CLObject into new trait

  • rusticl/device: get rid of pointless Arc overhead

  • rusticl/icd: actually allow dispatching CL types

  • rusticl/mem: split into Buffer and Image

  • rusticl/mem: use pattern matching in is_parent_buffer

  • rusticl/mem: move fill methods into concrete types

  • core/memory: drop Arc for &Arc<Queue> function parameters

  • rusticl/mem: move map methods into concrete types

  • rusticl/mem: move shadow sync methods into concrete types

  • rusticl/mem: split unmap into Buffer and Image versions

  • rusticl/mem: move copy and write buffer impls into Buffer

  • rusticl/mem: split read_to_user_rect into Buffer and Image versions

  • rusticl/mem: split write_from_user_rect into buffer and image

  • rusticl/mem: move copy_to_rect into Buffer

  • rusticl/mem: split copy_to into Buffer and Image

  • rusticl/mem: split Buffer::copy_to into Buffer and Image versions

  • rusticl/mem: split Image::copy_to into Buffer and Image versions

  • rusticl/mem: get rid of pixel_size

  • rusticl/mem: move tx_image into Image

  • rusticl/mem: fold tx_raw into tx

  • rusticl/image: call tx on the parent buffer directly

  • rusticl/mem: move is_parent_buffer into Image

  • rusticl/mem: move tx into Buffer

  • rusticl/mem: remove get_res

  • rusticl/mem: move comment describing how mapping works

  • rusticl/mem: reorganize Image::map

  • rusticl/mem: move MemBase::map into the users

  • rusticl/mem: move tx_raw_async methods into Buffer and Image

  • rusticl/mem: move Buffer and Image specific fields into the subtypes

  • rusticl/mem: reorganize MemBase::from_gl a little

  • rusticl/mem: move pipe_image_host_access into Image

  • rusticl/kernel: recalculate scratch and shared memory after opts

  • rusticl/program: fix CL_PROGRAM_BINARIES for devs with no builds

  • meson/rusticl: import rust instead of unstable-rust

  • clc: include opencl-c.h for extensions needing it

  • meson: do not pull in clc for clover

  • intel: Only build shaders with anv and iris

  • zink: lower unaligned memory accesses

  • rusticl/context: complete conversion of SVM pointers to usize

  • rusticl/memory: store host_ptr as usize

  • rusticl/memory: make closures Send and Sync

  • rusticl/event: make EventSig Send + Sync

  • rusticl/spirv: mark SPIRVBin as Send and Sync

  • rusticl/kernel: make it Send and Sync

  • rusticl/icd: verify all cl classes are Send and Sync

  • rusticl/meson: remove -Aclippy::arc-with-non-send-sync flag

  • rusticl/kernel: make builds private

  • rusticl/event: we need to call the CL_COMPLETE callback on errors as well

  • rusticl/kernel: assign sampler locations before DCEing variables

  • rusticl/device: support query_memory_info to retrieve available memory

  • drm-uapi: Sync nouveau_drm.h

  • nvk: use c.get_supported_arguments for compiler flags

  • nouveau: import libdrm_nouveau

  • nouveau: call glsl_type_singleton_init_or_ref earlier

  • nouveau/drm: drop immediate parameter from nouveau_pushbuf_new

  • nouveau/drm: rely on nouveau_pushbuf::channel being always set

  • nouveau/drm: drop unused chan argument from nouveau_pushbuf_kick

  • nouveau/drm: remove nouveau_client::id

  • rusticl/util: make create_pipe_box independent of pipe_box’s field types

  • meson: fix link failure with llvm-18

  • rusticl/program: handle -cl-no-subgroup-ifp

  • nouveau: fix potential double-free in nouveau_drm_screen_create

  • nir: fix nir_shader_get_function_for_name for functions without names.

  • rusticl: use stream uploader for cb0 if preferred

  • rusticl/icd: remove CLObject

  • event: break long dependency chains on drop

  • rusticl/mesa/context: flush context before destruction

  • nir/lower_cl_images: set binding also for samplers

Kenneth Graunke (76):

  • iris: Don’t return timestamps modulo 36-bits

  • intel/dev: Fix typo (ajust -> adjust)

  • iris: Implement query_memory_info() on discrete cards

  • intel/nir: Pass devinfo and prog_data to brw_nir_lower_cs_intrinsics

  • intel: Add driver support for hardware generated local invocation IDs

  • intel: Use hardware generated compute shader local invocation IDs

  • driconf: Advertise GL_EXT_shader_image_load_store on iris for SVP13

  • iris: Implement INTEL_DEBUG=heaps

  • intel/fs: Don’t include sync.nop in instruction count statistics

  • intel/fs: Don’t rely on CSE for VARYING_PULL_CONSTANT_LOAD

  • intel/brw: Delete enum brw_urb_write_flags

  • intel/brw: Delete more unused defines

  • intel/brw: Delete legacy SFIDs

  • intel/brw: Delete SIMD4x2 URB opcodes

  • intel/brw: Delete more unused compression stuff

  • intel/brw: Delete SINCOS

  • intel/brw: Delete constant_buffer_0_is_relative

  • intel/brw: Delete compiler->supports_shader_constants

  • intel/brw: Delete enum gfx6_gather_sampler_wa

  • intel/brw: Delete brw_wm_prog_key::line_aa

  • intel/brw: Delete unnecessary brw_wm_prog_data fields

  • intel/brw: Delete some swizzling functions

  • intel/brw: Delete brw_eu_util.c

  • intel/brw: Change unit tests to use TEX_LOGICAL instead of TEX

  • intel/brw: Delete SHADER_OPCODE_TXF_CMS[_LOGICAL]

  • intel/brw: Delete SHADER_OPCODE_TXF_UMS

  • intel/brw: Allow CSE on TXF_CMS_W_GFX12_LOGICAL

  • intel/brw: Delete legacy texture opcodes

  • intel/brw: Mark FIND[_LAST]_LIVE_CHANNEL as not writing the flag

  • intel/brw: Replace CS_OPCODE_CS_TERMINATE with SHADER_OPCODE_SEND

  • intel/brw: Avoid copy propagating any fixed registers into EOTs

  • intel/brw: Handle SHADER_OPCODE_SEND without src[3] in copy prop

  • intel/brw: Add assertions that EOT messages live in g112+

  • intel/brw: Copy the smaller payload in fixup_sends_duplicate_payload

  • intel/brw: Make register coalescing obey the g112-g127 restriction

  • intel/brw: Call constant combining after copy propagation/algebraic

  • intel/brw: Remove SIMD lowering to a larger SIMD size

  • intel/brw: Unindent code after previous change

  • iris: Fix tessellation evaluation shaders that use scratch

  • intel/brw: Emit better code for read_invocation(x, constant)

  • iris: Remove suballocation in iris_flush_resource()

  • iris: Eliminate prototype introduced in the previous patch

  • ra: Add debug functions for printing spill costs and benefits

  • intel/fs: Avoid generating useless UNDEFs for every SSA def

  • intel/brw: Split out 64-bit lowering from algebraic optimizations

  • intel/brw: Don’t consider UNIFORM_PULL_CONSTANT_LOAD a send-from-GRF

  • intel/brw: Eliminate top-level FIND_LIVE_CHANNEL & BROADCAST once

  • intel/brw: Fix check for 64-bit SEL lowering types

  • intel/brw: Assert that min/max are not happening in 64-bit SEL lowering

  • intel/brw: Use correct execution pipe for lowering SEL on DF

  • intel/brw: Unify DF and Q/UQ lowering for MOV

  • Revert “intel/brw: Don’t consider UNIFORM_PULL_CONSTANT_LOAD a send-from-GRF”

  • intel/brw: Fix opt_split_sends() to allow for FIXED_GRF send sources

  • intel/brw: Fix register coalescing’s LOAD_PAYLOAD dst offset handling

  • intel/brw: Fix destination stride assertion in copy propagation

  • intel/brw: Allow changing types for LOAD_PAYLOAD with 1 source

  • intel/brw: Delete brw_fs_lower_minmax

  • anv, hasvk: Save the original instance ID

  • anv, hasvk: Move multiview remapping loop below output stores

  • anv, hasvk: Fix nir_lower_multiview to re-emit outputs before EmitVertex

  • intel/brw: Stop checking mlen on math opcodes in CSE pass

  • intel/brw: Rearrange fs_inst fields

  • intel/brw: Fix generate_mov_indirect to check has_64bit_int not float

  • intel/brw: Fix lower_regioning for BROADCAST, MOV_INDIRECT on Q types

  • intel/brw: Update comments for indirect MOV splitting

  • intel/brw: Don’t mention gfx7 limitations in shuffle comments

  • intel/brw: Drop dead CHV checks.

  • intel/brw: Drop align16 support in brw_broadcast()

  • intel/brw: Drop gfx7 scratch message setup code

  • intel/brw: Delete if_depth_in_loop

  • intel/brw: Delete fs_visitor::vgrf helper

  • intel/brw: Drop default size of 1 from bld.vgrf() calls

  • intel/brw: Use SHADER_OPCODE_SEND for coherent framebuffer reads

  • intel/brw: Replace FS_OPCODE_LINTERP with BRW_OPCODE_PLN

  • intel/brw: Make an fs_builder::SYNC helper

  • isl: Set MOCS to uncached for Gfx12.0 blitter sources/destinations

Konrad Dybcio (1):

  • freedreno/registers: Add some HWCG regs

Konstantin (8):

  • util/printf: Include stdio.h

  • util/printf: Expose util_printf_prev_tok

  • ac/debug: Handle the output of recent umr versions

  • radv/debug: Canonicalize shader addr

  • radv: Canonicalize addresses in radv_find_shader

  • radv/debug: Try to find unbound shaders

  • radv/debug: Dump descriptor binding information

  • ac/parse_ib: Always print the value of the whole register

Konstantin Seurer (105):

  • nak/repair_ssa: Remap PHI sources as well

  • ac/llvm: Enable helper invocations for quad OPs

  • radv: Vectorize load_global_constant

  • lavapipe: Fix DGC vertex buffer handling

  • gallivm: Use saturating fpto*i conversions

  • lavapipe: Mark vertex elements dirty if the stride changed

  • lavapipe: Report the correct preprocess buffer size

  • radv: Implement NIR debug printf

  • llvmpipe: Stop refcounting sample functions

  • llvmpipe: Compile sample functions on demand

  • radv/rt: Use doubles inside intersect_ray_amd_software_tri

  • llvmpipe: Fix building with llvm11

  • nir/print: Don’t print shared_size twice

  • nir/print: Rename workgroup-size to workgroup_size

  • radv/radix_sort: clang-format

  • radv: Reduce the amount of radv_device_to_handle calls

  • radv: Make radv_write_user_event_marker non-static

  • radv: Emit user events during acceleration structure builds

  • radv: Skip unused acceleration structure build paths

  • radv/sqtt: Set SeparateCompiled for monolithic RT pipelines

  • radv/sqtt: Handle ray tracing pipelines with no traversal shader

  • radv/rt: Lower ray payloads like hit attribs

  • radv/rra: Rename rra_chunk_type to rra_chunk_version

  • radv/rra: Use memcpy for chunk descriptions

  • radv/rra: Remove useless variable

  • radv/rra: Refactor error handling

  • radv/rra: Dump basic ray history tokens

  • docs: Document RADV_RRA_TRACE_HISTORY_SIZE

  • radv/rra: Implement ahit/isec counters

  • amd/common: Use the correct register table for GFX10_3

  • radv: Wire up ac_gather_context_rolls

  • zink: Always set mfence->submit_count to the fence submit_count

  • Revert “zink: always force flushes when originating from api frontend”

  • llvmpipe: Use full subgroups when possible

  • gallivm: Consider the initial mask when terminating loops

  • lavapipe: Advertise VK_KHR_shader_maximal_reconvergence

  • ci: Update llvmpipe trace checksums

  • ac/parse_ib: Add and use print_addr

  • ac/parse_ib: Dump the ADDR field of PKT3_SET_BASE

  • ac/parse_ib: Annotate addresses with UAF/OOB info

  • ac/parse_ib: Handle 32bit PKT3_DISPATCH_INDIRECT addrs

  • ac/parse_ib: Handle more packets

  • radv/rra: Avoid reading past the ray history buffer

  • radv/meta: Add shader - device mapping for radv_build_printf

  • vulkan/cmd_queue: Implement CmdBuildAccelerationStructuresKHR

  • lavapipe: Implement VK_KHR_acceleration_structure

  • lavapipe: Add ray traversal code

  • lavapipe: Implement VK_KHR_ray_query

  • lavapipe: Advertise VK_KHR_deferred_host_operations

  • lavapipe: Advertise VK_KHR_acceleration_structure

  • lavapipe: Advertise VK_KHR_ray_query

  • lavapipe/ci: Document ray query failures

  • docs: Document lavapipe ray tracing features

  • vulkan: Implement DebugMarkerSetObjectNameEXT

  • radv/rt: Implement RADV_DEBUG=shaderstats

  • radv/rt: Add radv_ray_tracing_stage_info

  • radv/rt: Fixup constant args

  • aco: Only fix used variables to registers

  • radv/rt: Avoid passing unused data to the next stage

  • radv/rt: Inline constant trace_ray srcs into the traversal shader

  • radv/rt: Inline constant information about ray flags

  • radv/rt: Fix raygen_imported condition

  • zink: Handle aoa derefs of images

  • ac: Annotate context rolls

  • ac/parse_ib: Replace the parameter list with ac_ib_parser

  • ac/parse_ib: Implement annotations

  • radv: Add support for IB annotations

  • radv: Add an IB annotation layer

  • ac: Improve context roll readability

  • radv: Use radv_buffer_map for parsing IBs

  • radv/rt: Use 32-bit offsets for load_sbt_entry

  • radv: Skip more acceleration structure build markers

  • radv/printf: Use fprintf instead of printf

  • nir/print: Fix printing booleans with bit_size>1

  • nir/serialize: Encode data for temporaries

  • nir: Add lavapipe ray tracing intrinsics

  • llvmpipe: Fix function call handling

  • lavapipe: Add lvp_spirv_to_nir

  • lavapipe: Make lvp_shader_init non-static

  • lavapipe: Make lvp_create_pipeline_nir non-static

  • lavapipe: Lower mem_constant variables

  • lavapipe: Defer binding compute state

  • lavapipe: Remove unused ray tracing variables

  • lavapipe: Add more ray tracing helpers

  • lavapipe: Pass lvp_ray_flags into lvp_aabb_intersection_cb

  • lavapipe: Use the pipeline type in get_pcbuf_size

  • lavapipe: Inline fill_ubo0

  • lavapipe: Add an api_stage parameter to update_pcbuf

  • lavapipe: Fix a memory leak in lvp_push_internal_buffer

  • lavapipe: Implement VK_KHR_ray_tracing_pipeline

  • lavapipe: Implement KHR_ray_tracing_maintenance1

  • lavapipe: Implement VK_EXT_pipeline_library_group_handles

  • lavapipe: Implement VK_KHR_ray_tracing_position_fetch

  • radv: Destroy leaf_updateable_pipeline

  • lavapipe: Handle accel struct queries in handle_copy_query_pool_results

  • lavapipe: Implement ray_tracing_maintenance1 queries

  • lavapipe: Do not use NIR_PASS during lowering

  • lavapipe: Handle multiple planes in GetDescriptorEXT

  • lavapipe: Explicitly support ycbcr formats

  • Revert “gallivm/ssbo: mask offset with exec_mask instead of building the ‘if’”

  • radv: Handle all dependencies of CmdWaitEvents2

  • nir/print: Do not access invalid indices of load_uniform

  • radv: Fix radv_shader_arena_block list corruption

  • radv: Remove arenas from capture_replay_arena_vas

  • radv: Zero initialize capture replay group handles

Krzysztof Kurek (1):

  • panfrost: fix shift overflow in `bi_fold_constant`

Leo Liu (2):

  • radeonsi: fix video processing path without VPE enabled

  • ac/gpu_info: Fix broken UVD firmware query

Lepton Wu (1):

  • llvmpipe: Set “+64bit” for X86_64

Lin, Ricky (1):

  • amd/vpelib: Rename the parameters of init vpe function

Lionel Landwerlin (186):

  • anv: fix disabled Wa_14017076903/18022508906

  • intel/aux_map: fix fallback unmapping range on failure

  • anv: hide vendor ID for The Finals

  • intel/decoder: make vertex data decoding optional

  • intel/decoder: don’t ignore BT entries at offset 0

  • intel/genxml: add CCS_INSTDONE register

  • intel/genxml: add GAM done register description

  • intel/hang_viewer: add aux-tt view

  • anv: export descriptor flushing functions

  • anv: fix include guards

  • anv: fix missing header

  • anv: move generated draw flush helper to its own file

  • anv: move draw commands to their own file

  • anv: move compute/ray-tracing commands to their own file

  • anv: rename video command file

  • nir/alu_srcs_negative_equal: bail earlier if possible

  • nir/comparison_pre_tests: update expectations

  • anv: using a single struct for kernel upload

  • anv: fix pipeline executable properties with graphics libraries

  • isl: add print helpers for debug

  • anv: implement undocumented tile cache flush requirements

  • anv: reorder anv_astc_emu.c

  • anv: remove unused perfetto declarations

  • anv: rename layers entrypoints

  • anv: add BO flag for internal driver allocations

  • anv: track total state stream allocated blocks from the pool

  • anv: track imported ray tracing pipeline groups

  • anv: initial RMV support

  • vulkan/runtime: handle new image layout

  • anv: don’t prevent L1 untyped cache flush in 3D mode

  • anv: promote EXT_index_type_uint8 to KHR

  • anv: promote EXT_line_rasterization to KHR

  • anv: promote EXT_load_store_op_none to KHR

  • anv: add missing alignment for AUX-TT mapping

  • intel/ds: track predication of blorp operations

  • vulkan/runtime: add helper to query attachment layout

  • anv: ensure consistent layout transitions in render passes

  • anv: add check that in-renderpass barriers apply to attachments

  • anv: handle image feedback loop usage

  • anv: implement VK_EXT_attachment_feedback_loop_dynamic_state

  • anv/hasvk: don’t report error when intel_get_device_info_from_fd fails

  • anv: factor out aux-tt binding logic for future reuse

  • anv: rename aux_tt image field

  • anv: retain ccs image binding address

  • anv: fix transfer barriers flushes with compute queue

  • vulkan/runtime: handle new dynamic states for attachment remapping

  • docs/features: drop gen8+/gen9+ on Anv

  • docs/features: synchronize new features for Anv

  • vulkan/multialloc: bump max number to 16

  • vulkan/runtime: rework VK_KHR_dynamic_rendering_local_read state tracking

  • anv: reduce cache flushing for indirect commands on Gfx12.5+

  • anv: don’t unmap AUX ranges at BO delete

  • isl: printout sparse usage

  • isl: add a no-aux-align usage flag

  • anv: move ALLOC_HOST_CACHED_COHERENT as define

  • anv: use address helper to compute address u64 value

  • intel/aux_map: add BSpec reference

  • intel/aux_map: add helper to compute offset in aux data

  • anv: re-introduce BO CCS allocations

  • intel/dev: fix missing dependency on generated packing headers

  • anv: factor out post submit queue debug code

  • intel/fs: indent lowering code to make it more readable

  • intel/fs: rerun divergence prior to lowering non-uniform interpolate at sample

  • anv: fix incorrect flushing on shader query copy

  • meson: add a new option to enable intel-clc without building RT shaders

  • intel/compiler: make default NIR compiler options visible

  • intel-clc: move ISA generation to its own function

  • intel/clc: add ability to output NIR

  • intel-clc: print text input

  • genxml: enable opencl code generation

  • genxml: generate opencl packing headers

  • genxml: remove NDEBUG_UNUSED

  • intel/ds: new tracepoints for generated commands

  • meson: add option to install intel-clc

  • ci: build a host version of mesa for cross builds

  • anv: rewrite internal shaders using OpenCL

  • intel/shaders: add iris variant of indirect draws generation shader

  • intel/shaders: enable gfx8 support

  • iris: make binding table shifting values available outside iris_state.c

  • iris: make KSP helper available outside iris_state.c

  • iris: make URB programming available outside iris_state.c

  • iris: factor out index buffer emission

  • iris: add an option to not emit draw parameters

  • iris: enable generated indirect draws

  • meson: enforce build of intel-clc with anv/iris

  • anv: remove redundant asserts

  • anv: don’t allocate aux padded BOs with host pointers

  • anv: fix buffer marker cache flush issues on MTL

  • anv: enable query clear/copy using shaders on MTL/ARL

  • anv: fixup push descriptor shader analysis

  • anv: factor out descriptor buffer flushing

  • anv: reenable ANV_ALWAYS_BINDLESS

  • anv: remove unused definition

  • anv: fix Wa_16013994831 macros

  • anv: fix emission of Wa_14015055625

  • genxml: generate opencl temporary variables with private qualifier

  • intel/clc: lower temp function/shader variables together

  • intel/clc: workaround LLVM17 opaque pointers

  • anv: disable Wa_16013994831

  • ci/anv: add more testing for optimization paths

  • intel/ci: bump anv/tgl fraction to 6

  • intel/nir: only consider ray query variables in lowering

  • anv: limit depth flush on dynamic render pass suspend

  • anv: add missing generated file dep

  • anv: optimize push descriptor updates

  • anv: add new heap/pool for descriptor buffers

  • anv: create new helper for small allocations

  • anv: add a second dynamic state heap for descriptor buffers

  • anv: move aux-tt to general state pool

  • anv: allocate slice_hash for descriptor buffer

  • anv: allocate border colors for descriptor buffers

  • anv: allocate fsr states for descriptor buffer

  • anv: implement data write entry points for EXT_descriptor_buffer

  • anv: compute a sampler hash based on parameters

  • anv: add embedded sampler parameters in descriptor set layout hash

  • intel/fs: add plumbing for embedded samplers

  • nir: add additional flag to resource_intel for embedded samplers

  • anv: add embedded sampler support

  • anv: add new helper to update binding table pool offset

  • anv: add descriptor set layout support for descriptor buffers

  • anv: add pipeline/shader support for descriptor buffers

  • anv: handle push descriptor writes with descriptor buffers

  • anv: implement descriptor buffer binding

  • anv: disable mutable combined image/sampler in descriptor buffer

  • anv: expose VK_EXT_descriptor_buffer

  • anv: fix non matching image/view format attachment resolve

  • anv: fix helper usage for CmdUpdateBuffer()

  • anv: remove some wrapping around mmap

  • anv: add support for VK_EXT_map_memory_placed

  • anv: delay internal shader upload to when needed

  • anv: fix companion command buffer initialization

  • anv: fix incorrect ISL usage in buffer view creation

  • anv/iris/blorp: use the right MOCS values for each engine

  • anv: try to keep the pipeline in GPGPU mode when buffer transfer ops

  • anv: don’t copy the null descriptor from the GPU memory

  • intel/fs: fixup sampler header message

  • anv: return unsupported for FSR images on Gfx12.0

  • intel/fs: remove some unused send helpers

  • anv: ignore descriptor alignment for inline uniforms

  • intel/fs: bump max simd size of some messages for xe2

  • anv: track embedded sampler counts in layouts

  • anv: allocate pipeline bindings tables dynamically on the heap

  • anv: avoid partially compiled warning with GPL

  • blorp: handle a few allocation failure cases

  • anv: fix invalid border color free

  • anv: fix block pool allocation failure

  • anv: fix temporary state pool allocation failures

  • anv: fix bitfield checks in gfx runtime flushing

  • anv: fix query clearing with blorp compute operations

  • blorp: add support for cached dynamic states

  • anv: reduce blorp dynamic state emissions

  • anv: optimize emission of dynamic state with blorp

  • anv: fix protected memory allocations

  • anv: pull surface state copies for secondary in one loop

  • anv: disable protected content around surface state copies

  • anv: disable generated draws in protected command buffers

  • anv: update protection fault property

  • anv: fix incorrect blorp dynamic state heap usage

  • intel/fs: printout a couple of more late compile steps

  • intel/fs: fixup instruction scheduling last grf write tracking

  • anv: add missing data flush out of L3 for transform feedback writes

  • anv: mark descriptors & pipeline dirty after blorp compute

  • isl: set NullPageCoherencyEnable for depth/stencil sparse surfaces

  • anv: only check patch_control_points changes in runtime flush

  • anv: increase maxResourceDescriptorBufferRange on DG2+

  • anv: reuse vk_common_GetImageSubresourceLayout

  • anv: move all format props checks to anv_get_image_format_properties()

  • drirc: rename hasvk only option

  • vulkan: track compression control flags on vk_image

  • anv: implement VK_EXT_image_compression_control

  • anv: disable capture replay with descriptor buffer

  • anv: remove useless dynamic state allocation for samplers

  • anv: add capture/replay support for image with descriptor buffers

  • anv: add capture/replay support for buffer with descriptor buffers

  • anv: add a new reserved pool for capture/release

  • anv: enable shader border color capture/replay

  • anv: enable capture/replay with descriptor buffers

  • anv: disable dual source blending state if not used in shader

  • intel/brw: fixup wm_prog_data_barycentric_modes()

  • anv: fixup alloc failure handling in reserved_array_pool

  • anv: fix leak of custom border colors

  • anv: fix ycbcr plane indexing with indirect descriptors

  • brw: add more condition for reducing sampler simdness

  • anv: fix push constant subgroup_id location

  • nir/divergence: add missing load_printf_buffer_address

  • anv: use weak_ref mode for global pipeline caches

Louis-Francis Ratté-Boulianne (5):

  • panfrost: factor out method to check whether we can discard resource

  • panfrost: add copy_resource flag to pan_resource_modifier_convert

  • panfrost: add can_discard flag to pan_legalize_afbc_format

  • panfrost: Legalize before updating part of a AFBC-packed texture

  • panfrost: Add AFBC packing support for RG formats

Luc Ma (1):

  • gallium/u_blitter: Fix a few uninitialized fb_state

Luca Bacci (1):

  • meson,windows: Use relative paths in Vulkan ICD manifest files

Lucas Fryzek (19):

  • egl/wayland/sw: don’t invert y `wl_surface_damage_buffer`

  • drisw/winsys: Flip y coordinate when creating pipe boxes

  • drisw: clamp damage region to texture bounds

  • llvmpipe: explicitly reject (most) yuv formats

  • gallium: Add dmabuf arg to memory fd allocation API

  • llvmpipe: Implement dmabuf handling

  • drisw: reuse kopper image extension vtable if modifiers/dmabuf is supported

  • llvmpipe: conditionally export PIPE_CAP_DMABUF

  • lavapipe: support VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT

  • lavapipe: EXT_external_memory_dma_buf

  • llvmpipe: make it possible to import and bind unbacked resources

  • lavapipe: include drm_fourcc.h

  • lavapipe: check drm modifier info during image create

  • lavapipe: EXT_image_drm_format_modifier

  • venus/ci: Add patch for modifiers test to check import/export bits

  • lp: Wrap udmabuf usage in HAVE_LIBDRM ifdef

  • ci/lp: Remove ext buffer YUV tests from fails

  • llvmpipe: Only return null resource handle when dt is not mapped

  • llvmpipe: Only use udmabuf if header is found

Lucas Stach (18):

  • etnaviv: disable 64bpp render/sampler formats

  • etnaviv: track resource sharing

  • etnaviv: only add shared resources to implicit flush list

  • etnaviv: implicitly update shared texture resources

  • etnaviv: don’t use int filter for depth textures

  • etnaviv: tex_desc: emit texture comparator

  • etnaviv: fix fixpoint conversion of negative values

  • ci/etnaviv: update expectations

  • etnaviv: fix depth writes without testing

  • etnaviv: rs: take src dimensions into account when increasing height alignment

  • etnaviv: use correct blit box sizes when copying resource

  • etnaviv: fix separate depth/stencil clears

  • etnaviv: trigger TS derivation after slow clear

  • etnaviv: split TS and non-TS RS clear commands

  • etnaviv: ci: update expectation with fixed depth/stencil clears

  • etnaviv: rs: treat depth-only clear to X8Z24 surfaces as full clear

  • ci/etnaviv: update expectation after piglit uprev

  • etnaviv: flip the switch on MSAA support

Luigi Santivetti (1):

  • pvr: return the OS page size for minMemoryMapAlignment

Lynne (1):

  • radv/av1: limit profile and bit depth to supported values

M Henning (10):

  • nvk: Don’t clobber vb0 after repeated blits

  • nak: Remove assert on nir->info.outputs_written

  • nvk: Early-out impossible descriptor allocations

  • nak: Fix ldg/stg/atomg encoding to use globalmem

  • nak: Set fewer bits in writes_color

  • nak: Use undef for unset FSOut components

  • nak: Remove old union_find implementation

  • nak: Rewrite union_find and use it in repair_ssa

  • nak: Count GLOBAL_SIZE_OFFSET in bytes, not words

  • nvk: Don’t use a descriptor cbuf if it’s too large

Manuel Stoeckl (1):

  • util/disk_cache: try getenv(HOME) before getpwuid->pw_dir

Marcin Ślusarz (1):

  • intel/compiler/xe2: fix decoding of sampler simd mode

Marek Olšák (244):

  • gallium/u_vbuf: replace unnecessary dst_index with “i”

  • gallium: remove unbind_trailing_count from set_vertex_buffers

  • cso: don’t unbind vertex buffers when enabling/disabling u_vbuf

  • winsys/amdgpu: merge loops decrementing num_active_ioctls & unreferencing bufs

  • winsys/amdgpu: cosmetic touchups

  • winsys/amdgpu: don’t clear buffer list elements after IB submission

  • winsys/amdgpu: add more fence_reference helpers

  • winsys/amdgpu: don’t clear fence list elements after IB submission

  • winsys/amdgpu: remove misplaced duplicated comment

  • winsys/amdgpu: represent IB_MAX_SUBMIT_DWORDS in bytes

  • winsys/amdgpu: represent max_ib_size_dw in bytes

  • winsys/amdgpu: cosmetic touchups around IB sizes

  • amd: unify NIR options between RADV and radeonsi

  • ac/nir: don’t write TCS outputs to memory if no_varying is set

  • ac/nir: rename clipdist_enable_mask -> clip_cull_dist_mask

  • ac/nir: optimize out multiplications in small line culling

  • ac/nir: simplify code at the beginning of ac_nir_gs_shader_query

  • ac,radeonsi: emulate GS primitive pipeline stat on gfx11 because of culling

  • radeonsi: report more detailed output stats for shader-db

  • radeonsi: expose shader profiles to other .c files

  • radeonsi: don’t use staging uploads for buffers & shaders with all VRAM visible

  • radeonsi: deduplicate gfx10_ngg_get_vertices_per_prim / get_num_vert_per_prim

  • radeonsi: change GS_STATE_PROVOKING_VTX_INDEX to 1 bit PROVOKING_VTX_FIRST

  • radeonsi: split si_update_ngg_prim_state_sgpr into 2 functions

  • radeonsi: pack GS_STATE_ESGS_VERTEX_STRIDE better to save 2 bits

  • radeonsi: remove no-op additions for viewport0_y_inverted

  • radeonsi: remove unused preloaded instance_divisor_constbuf

  • radeonsi: rename *trivial_vs_prolog -> *trivial_vs_inputs

  • radeonsi/gfx11: clean up MAX_ALLOWED_TILES_IN_WAVE programming

  • radeonsi/ci: update gfx11 flakes

  • radeonsi/gfx11: flush DB before Z/S clear to work around dEQP failures

  • radeonsi: don’t flush CS before and after every blitter invocation

  • mesa,gallium: move the thread scheduler to src/util

  • gallium: rename PIPE_.._PIN_THREADS_TO_L3_CACHE -> .._UPDATE_THREAD_SCHEDULING

  • st/mesa: rename ST_L3_PINNING_DISABLED -> ST_THREAD_SCHEDULER_DISABLED

  • util: add mesa_pin_threads environment variable that sets a static affinity mask

  • glthread: apply the thread scheduling policy when the context is created

  • glthread: apply the thread scheduling policy when a batch executes synchronously

  • gallium/hud: add “csv” option to print values to stdout as CSV

  • nir: remove INTERP_MODE_COLOR

  • nir: relax validation failure for generic TCS outputs with no_varying

  • nir: remove and replace underused option pack_varying_options

  • nir: replace lower_io_variables with a GLSL NIR flag

  • nir: add a lower_mediump_io callback into options

  • nir: add vertex divergence into nir_divergence_analysis

  • winsys/amdgpu: fix a race condition when reading ws->num_buffers

  • winsys/amdgpu: add real buffers of slab entries in the CS thread

  • winsys/amdgpu: change the signature of amdgpu_add_bo_fences_to_dependencies

  • winsys/amdgpu: move code out of amdgpu_add_bo_fences_to_dependencies for reuse

  • winsys/amdgpu: merge 2 loops iterating over slab entries in amdgpu_cs_submit_ib

  • winsys/amdgpu: merge 2 loops iterating over sparse BOs in amdgpu_cs_submit_ib

  • winsys/amdgpu: merge 2 loops iterating over real BOs in amdgpu_cs_submit_ib

  • winsys/amdgpu: skip code checking RADEON_USAGE_SYNCHRONIZED for slabs

  • winsys/amdgpu: simplify amdgpu_do_add_buffer to remove memset

  • winsys/amdgpu: don’t ref/unref slab BOs in amdgpu_cs_submit_ib

  • radeonsi: use num_vertex_buffers instead of ARRAY_SIZE

  • radeonsi/ci: add gfx11 flakes

  • gallium: always set vertex elements before setting vertex buffers

  • gallium/u_blitter: set take_ownership=true for set_vertex_buffers

  • st/mesa: set take_ownership=true for set_vertex_buffers in st_draw_quad

  • gallium/util: add take_ownership parameter into util_draw_vertex_buffer

  • st/mesa: set take_ownership=true for util_draw_vertex_buffer in st_DrawTex

  • st/mesa: set take_ownership=true for set_vertex_buffers in st_pbo_draw

  • gallium/hud: set take_ownership=true for set_vertex_buffers

  • cso: remove CSO_UNBIND_VERTEX_BUFFER0

  • gallium/u_threaded: remove the count=0 path from tc_call_set_vertex_buffers

  • gallium/u_threaded: allow drivers to change tc_call_set_vertex_buffers function

  • gallium: remove take_ownership from set_vertex_buffers, assume it’s true

  • gallium/noop: don’t leak resources due to take_ownership

  • radeonsi,aco: remove the VS prolog

  • gallium/u_threaded: expose helpers for filling set_vertex_buffers externally

  • st/mesa: rename attribs -> arrays in st_atom_array to indicate non-zero strides

  • st/mesa: do (inputs_read & enabled_arrays) outside setup_arrays

  • st/mesa: do (inputs_read & ~enabled_arrays) outside st_setup_current

  • st/mesa: move a piece of _mesa_draw_array_attrib out of the loop in setup_arrays

  • st/mesa: cosmetic touchups in st_atom_array.cpp

  • st/mesa: change the update enum of vertex elements

  • st/mesa: move st_update_functions into st_context

  • st/mesa: constify the pipe_draw_info parameter and remove obsolete comments

  • mesa: inline {Create,Draw}GalliumVertexState callbacks

  • mesa: inline _mesa_set_vao_immutable

  • mesa: add gl_vertex_array_object::NonIdentityBufferAttribMapping

  • util/idalloc: make deleting invalid IDs a no-op

  • mesa: remove unused _mesa_HashTable code

  • mesa: clean up unnecessary _mesa_HashTable locked/unlocked wrappers

  • mesa: re-format main/hash.h, move inlines to the end, some code to main/hash.c

  • mesa: fold _mesa_HashDeleteAll into _mesa_DeleteHashTable

  • mesa: remove _mesa_HashTable::InDeleteAll

  • st/mesa: merge 3 unlikely blocks in _mesa_get_bufferobj_reference

  • st/mesa: remove !obj checking in _mesa_get_bufferobj_reference when it’s useless

  • mesa: fix incorrect _mesa_HashInsertLocked parameter in _mesa_EndList

  • mesa: use util_idalloc_alloc_range for _mesa_HashFindFreeKeyBlock

  • winsys/amdgpu: convert amdgpu_cs.c to .cpp

  • winsys/amdgpu: enable unlimited number of parallel queues for VCN

  • util/idalloc: optimize foreach by tracking the greatest non-zero element

  • mesa: declare _mesa_HashTable::id_alloc as non-pointer

  • mesa: declare _mesa_HashTable inside structures instead of as a pointer

  • mesa: remove isGenName parameter from _mesa_HashInsert

  • mesa: use util_idalloc_foreach for looping in _mesa_HashTable

  • mesa: replace _mesa_HashTable::ht with util_sparse_array for faster lookups

  • d3d12: make DrawTransformFeedback not depend on the vertex buffer offset

  • mesa: don’t use the slow VAO path except for drivers that want to use it

  • st/mesa: add VAO fast path C++ template variants for st_update_array callback

  • st/mesa: optimize st_update_arrays using lots of C++ template variants

  • glthread: re-enable thread scheduling in st/mesa when glthread is disabled

  • glthread: use _mesa_glthread_fence_call() instead of duplicating that code

  • glthread: add no_error variants of glDrawElements*

  • glthread: add no_error variants of glDrawArrays*

  • glthread: remove cmd_size from constant-sized calls

  • glthread: clean up how vertex stride is packed

  • glthread: pack “size” in Pointer calls as 16 bits

  • mesa: deduplicate get_index_size_shift code

  • mesa: deduplicate is_index_type_valid code

  • glthread: pack the primitive type to 8 bits

  • glthread: pack the index type to 8 bits

  • glthread: rewrite glDrawElements call packing

  • glthread: rewrite glDrawArrays call packing

  • glapi: fix type names for glthread and handle all types

  • glthread: sort fixed-sized parameters before returning them

  • glthread: move global marshal_XML.py functions into class marshal_function

  • glthread: precompute fixed_params and variable_params lists

  • glthread: merge 3 blocks conditional on marshal_sync in print_async_body

  • glthread: separate unmarshal function generation into print_unmarshal_func

  • glthread: separate marshal code generation into print_marshal_async_code

  • glthread: remove “if True” from print_marshal_async_code

  • glapi: pass pointer size to python for glthread from meson

  • glthread: pack glVertexAttribPointer calls better

  • glthread: fix multi draws with a negative draw count

  • glthread: pack uploaded user vertex buffers and offsets better

  • glthread: deduplicate batch finalization code

  • glthread: don’t check cmd_size for small variable-sized calls

  • glthread: use marshal_count instead of count for more functions

  • glthread: rewrite glBindBuffer packing

  • glthread: add a packed variant of glDrawElements with 16-bit count and indices

  • glthread: add a packed version of DrawElementsUserBuf

  • glthread: generate packed versions of gl*Pointer/Offset calls

  • amd: update addrlib

  • mesa: deduplicate initialization of gl_pixelstore_attrib

  • mesa: move struct gl_pixelstore_attrib into glthread.h

  • glthread: track glPixelStore(GL_UNPACK_*)

  • glthread: execute small glBitmap asynchronously

  • glthread: execute small glDrawPixels asynchronously

  • glthread: invert _mesa_glthread_has_no_{un}pack_buffer by removing the negation

  • amd/registers: add correct gfx11.x enums for BINNING_MODE

  • radeonsi: disable binning correctly on gfx11.5

  • radeonsi/gfx11: fix programming of PA_SC_BINNER_CNTL_1.MAX_ALLOC_COUNT

  • radeonsi/gfx10.3: add a GPU hang workaround for legacy tess+GS

  • radeonsi: allocate only one set of tessellation rings per device

  • radeonsi/gfx11: program the attribute ring right before draws

  • radeonsi: program tessellation rings right before draws

  • radeonsi/gfx11: program SAMPLE_MASK_TRACKER_WATERMARK optimally for APUs

  • ac: use the gfx11 shadowed register tables for gfx11.5

  • radeonsi/gfx11: add missing DCC_RD_POLICY setting

  • radeonsi: add radeonsi_cache_rb_gl2 option enabling GL2 caching for CB and DB

  • nir/divergence_analysis: change function prototypes

  • nir/divergence_analysis: load_primitive_id is convergent within a primitive

  • nir/divergence_analysis: load_instance_id is convergent within a primitive

  • nir/divergence_analysis: handle derefs of system values

  • nir: print nir_io_semantics::invariant

  • nir: add nir_block::divergent to indicate a divergent entry condition

  • ac/llvm: fix SSBO bounds checking by using raw instead of struct opcodes

  • radeonsi: fix the DMA compute shader

  • radeonsi: don’t test so many wave limits for AMD_TEST=testdmaperf

  • nir: add a utility computing post-dominance of SSA uses

  • nir: add nir_opt_varyings, new pass optimizing and compacting varyings

  • nir/tests: add tests for nir_opt_varyings

  • radeonsi: set the lower_mediump_io callback for GLSL

  • radeonsi: set trivial NIR options for nir_opt_varyings

  • radeonsi: enable uniform propagation for varyings except VP/Energy

  • radeonsi: add test failures due to incorrect tests for nir_opt_varyings

  • st/mesa: get dual slot input info from NIR if IO is lowered

  • st/mesa: lower sysvals slightly sooner

  • st/mesa: skip a few NIR passes that don’t work with lowered IO

  • glsl/linker,st/mesa: enable nir_opt_varyings and lower IO in the linker

  • amd/ci: update stoney results

  • r300: port scanout pitch alignment from the DDX to fix DRI3

  • r300: enable tiling for scanout to fix DRI3 performance

  • radeonsi/ci: run GLCTS, ESCTS, and dEQP from the glcts directory

  • radeonsi/ci: update failures

  • Unbreak Viewperf by reverting “util: use crc32_z instead of crc32 and bump zlib dep to 1.2.9”

  • gallium: use u_box_3d to initialize pipe_box instead of non-designated initializers

  • gallium: increase the size of pipe_box y, height fields to allow bigger textures

  • nir: rename AMD XFB intrinsics to *_gfx11_amd

  • nir,amd: add nir_intrinsic_load_debug_log_desc_amd and its use

  • aco: implement aco_is_gpu_supported using switch statement

  • aco: add a helper printing shader asm by disassembling via LLVM

  • ac/llvm: remove remnants of gfx10 NGG streamout

  • radeonsi: implement the shader debug log from ac_nir_store_debug_log_amd

  • nir/validate: validate interp_mode of load_barycentric_*

  • nir/lower_io: add nir_io_semantics::interp_explicit_strict

  • nir/validate: validate more fields of nir_io_semantics

  • tgsi_to_nir: translate TG4

  • nir/opt_varyings: don’t generate IO with unsupported bit sizes

  • nir/opt_varyings: simplify nir_io_semantics::num_slots of directly-indexed slots

  • nir/opt_varyings: handle load_input_vertex

  • ac/surface: add radeon_surf::thick_tiling

  • ac/nir: allow 16-bit results for resinfo

  • ac/llvm: simplify extracting an element in get_image_coords

  • ac/llvm: add support for 16-bit coordinates (A16) for image (non-sampler) opcodes

  • ac/llvm: allow image loads to return less than 4 components, trim DMASK

  • ac/llvm: remove handling of input and output loads/stores that are lowered

  • ac/llvm: remove unused fields of ac_shader_abi

  • ac/llvm: simplify the optimization barrier and apply it to the whole vector

  • ac: add helper ac_get_ip_type_string to remove duplication

  • nir: add more build helpers

  • nir: allow FP16 in nir_format_linear_to_srgb

  • nir: add nir_intrinsic_optimization_barrier_sgpr_amd

  • nir: change “user_data_amd” sysval from 4 to 8 components

  • nir/use_dominance: set the root as post-dominator of unmovable instructions

  • util: add new format helpers

  • util: import pipe_box and its helpers

  • ac/llvm: fix assertions for texture instructions with 16-bit LOD bias

  • ac/llvm: always trim components of texture instructions, trim DMASK

  • ac/surface: constify and reindent NIR meta address-from-coord function params

  • radeonsi/ci: update gfx11 failures

  • radeonsi/gfx11: don’t prefetch constants in binaries into the instruction cache

  • radeonsi/gfx11: enable DCC fast clears for 8-bit and 16-bit formats

  • radeonsi: use the same nir_lower_subgroups_options as RADV

  • radeonsi: add the radeonsi_optimize_io option into the shader cache key

  • radeonsi: check has_stable_pstate in the winsys

  • radeonsi: move TCS epilog key bits to the key->ge.opt section

  • radeonsi: fix initialization of occlusion query buffers for disabled RBs

  • radeonsi: don’t expose samples_identical and don’t lower FMASK if it’s disabled

  • radeonsi: allow input NIR to use descriptors in image opcodes

  • radeonsi: move blitter resource_copy_region implementation to si_gfx_copy_image

  • radeonsi: move blitter clear_render_target impl into si_gfx_clear_render_target

  • radeonsi: preserve NaNs in draw-based resource_copy_region

  • radeonsi: use simpler UINT fallback formats for draw-based resource_copy_region

  • radeonsi: remove si_use_compute_copy_for_float_formats

  • radeonsi: change allow_flat_shading to make it a single condition

  • radeonsi: don’t call resource_copy_region in pipe->blit

  • radeonsi/gfx11: implement DCC clear to “single” for fast non-0/1 clears

  • radeonsi: disable VRS flat shading for selected 8xMSAA and thick tiling cases

  • radeonsi: don’t use si_get_flush_flags() for flushing images

  • radeonsi: don’t flush CB in si_launch_grid_internal_images if not needed

  • radeonsi: don’t flush CB and DB if there have been no draw calls

  • radeonsi: enable fast FB clears for conditional rendering

  • radeonsi: make clear_render_target clear DCC directly instead of via pipe->clear()

  • radeonsi: don’t add whether NIR is used into the shader key

  • radeonsi: only expose 8 EQAA samples due to shader limitations

  • radeonsi: always run nir_opt_16bit_tex_image

  • radeonsi: use ip_type in debug code instead of hardcoding GFX

  • radeonsi: implement user_data_amd for 5, 6, and 7 components correctly

  • util: shift the mask in BITSET_TEST_RANGE_INSIDE_WORD to be relative to b

Mark Collins (32):

  • tu/kgsl: Fix sync_wait’d FD in kgsl_syncobj_wait

  • tu/a7xx: Update CCU layout logic for A7XX

  • tu: Allow GMEM on A7XX when TU_DEBUG=gmem

  • tu: Set A7XX registers in `tu6_tile_render_begin`

  • tu: Set `CP_THREAD_CONTROL::CONCURRENT_BIN_DISABLE` in A7XX HW init

  • tu: Only set PC/VFD PWR_CNTL regs on A6XX

  • tu: Use `CP_SET_PSEUDO_REG` for A7XX VSC stream regs

  • tu/autotune: Use `CP_EVENT_WRITE7::ZPASS_DONE` on A7XX

  • tu: Set `RB_UNKNOWN_88E4` for A7XX event blits

  • freedreno/devices: Update A7XX tile values

  • tu: Use full size color CCU in sysmem mode

  • tu: Update CCU layout selection logic for separate stencil stores

  • tu: Allow event blit to resolve depth stencil formats

  • tu: Fix 2D blit path for GMEM stores on A7XX

  • tu: Use `Z24_UNORM_S8_UINT_AS_R8G8B8A8` for A7XX GMEM D24S8 blits/clear

  • tu: Disable LRZ properly on A7XX

  • tu: Set RB_CCU_CNTL during HW init on A7XX

  • tu: Fix CP_BLIT sync on A7XX

  • tu: Clear `VSC_UNKNOWN_0D08` on A7XX

  • tu: Add blit cache flushing for input attachments

  • tu: Unconditionally enable GMEM on A7XX

  • fd/replay: Fix wrbuffer name extraction

  • fd/replay: Dump wrbuf into cwd rather than exe directory

  • fd/replay: Clamp dumped wrbuf to buffer size

  • fd/replay: Clear wrbufs after submitting cmdstreams for DRM

  • fd/replay: Add wrbuf support for KGSL/DXG

  • fd/replay: Error when VMA AS allocation fails

  • fd/replay+rddecompiler: Add option to clear wrbufs at start

  • fd/rddecompiler: Disable IR3 cache for replay context

  • fd/decode: Build generate_rd executable rather

  • fd/replay: Use generate_rd as default CS generator

  • fd/decode: Fix “OPTSIONS” typo in help messages

Mark Janes (18):

  • hasvk: add missing linker arguments

  • util: add parson for handling json files

  • intel/dev: specify struct intel_device_info type details in python

  • intel/dev: generate declarations for struct intel_device_info

  • intel/tools: add intel device meson dependencies

  • intel/dev: implement json serialization for intel_device_info

  • intel/dev/tools: add json as an output format for intel_dev_info

  • intel/tools: load json device info in drm_shim

  • intel/dev: improve meson invocation for intel_device_info gen

  • intel/compiler: generate a hash function to use with the shader cache

  • iris: use device info sha in device renderer string

  • anv: use intel_device_info to set device UUID

  • intel/tools: move intel_dev_info to intel/tools

  • intel/tools: add shader compiler hash key to json devinfo format

  • pan/va: Add missing valhall_enums dep to bifrost_tests

  • intel/dev: declare workarounds required by ATSM platforms

  • intel/dev: remove pci revision from shader cache key

  • intel/compiler: drop unused ray-tracing fields from cache hash

Martell Malone (3):

  • nine: r500 under 20 fragments cap is a warning

  • nine: detect emulation fallback of d3d coordinates

  • nine: update verbiage for enduser device messages

Martin Krastev (5):

  • svga/ci: land vmware mesa-ci lava farm

  • svga/ci: workaround vmware farm’s inability to use public DNS 8.8.8.8

  • svga/ci: re-enable vmware farm

  • svga/ci: add two new piglit flakes to svga

  • svga/ci: disable vmware farm

Martin Roukala (né Peres) (10):

  • radeonsi/ci: update vangogh’s expectations after piglit uprev

  • zink/ci: update navi31’s expectations after piglit uprev

  • zink/ci: update polaris10’s expectations after piglit uprev

  • radv/ci: switch vkcts-polaris10 from mupuf to KWS’ farm

  • radv/ci: add a vkcts-tahiti job

  • radv/ci: add a vkd3d-tahiti job

  • ci/b2c: rename .b2c-test-{vk,gl} to .b2c-x86_64-test-{vk,gl}

  • ci/b2c: rename .deqp-test-valve into .b2c-deqp-test

  • ci/b2c: allow setting the DTB to be used

  • ci/valve: remove the traces runner

Mary Guillemard (37):

  • nouveau: nvidia_header: Add AMPERE_A in vk_push_print

  • nouveau: nvidia_header: Add TURING_COMPUTE_A and AMPERE_COMPUTE_A in vk_push_print

  • nouveau: nvidia_header: Add AMPERE_COMPUTE_B in vk_push_print

  • nouveau: nvidia-headers: Add compute array parsing to class_parser.py

  • nouveau: nvidia-headers: Add nv_push_dump tool

  • nouveau: mme: Add a dumper

  • agx: Add more bitops in agx_bitop_table

  • agx: Remove and/or/xor pseudo ops

  • agx: Fuse not into and/or/xor

  • agx: Add a bitop optimizer pass

  • pan/bi: assert indices when offsets are present in bi_emit_tex_valhall

  • pan/lib: Remove variables in blitter

  • pan/bi: Rework indices for tex on Valhall

  • pan/bi: Rework indices for image on Valhall

  • pan/bi: Rework indices for attributes on Valhall

  • pan/bi: Lower ubo table in indices for Valhall

  • panfrost, pan/lib: Move pan_resource_table to panfrost

  • nvk: Always copy conditional rendering value before compare

  • drm-shim: Add io region handling in mmap

  • panfrost: Add support for Panthor in drm-shim

  • docs: Document Mali-G610 in drm-shim section

  • panfrost: group up stubbed params in drm-shim

  • nouveau: Add support for TERT opcodes in vk_push_print

  • nouveau: Fix NINC TERT handling in vk_push_print

  • nak: Support unaligned swizzles in 8/16 bits vec srcs

  • nak: move folding logic to Src::fold_imm

  • nak: Add F16 and F16v2 sources

  • nak: Improve copy propagation pass to handle F16

  • nak: Add 16-bits float operations

  • nvk: Advertise shaderFloat16

  • nvk: Allow various alu op to be vectorized for 2xfp16

  • nak: Allow SHF to use immediate encoding for shift

  • panvk: Return os_page_size for minMemoryMapAlignment

  • panvk: Fix driver UUID not being filled

  • panvk: Move to vk_properties

  • panvk: Advertise VK_KHR_driver_properties

  • nak: Pass has_mod to all form of src2 requiring it

Mary Strodl (2):

  • rusticl: set OCL_ICD_VENDORS as directory, not file

  • NirShader: don’t fail on null constant_buffer

Matt Turner (5):

  • util: Add DETECT_ARCH_HPPA macro

  • util/tests: Disable half-float NaN test on hppa/old-mips

  • meson: Limit intel_vk_rt to x86_64

  • anv/drirc: Add option to control implicit sync on external BOs

  • intel: Build float64 shader only for Vulkan

Matthew Waters (1):

  • teximage: allow glCopyTex{Sub}Image[123]D into R/RG textures with OpenGL ES 2.0

Max R (11):

  • d3d10umd: Fix compilation

  • winsys/gdi: Handle R8G8B8 formats

  • winsys/gdi: Custom acquisition of hDC

  • d3d10umd: Use flush_frontbuffer for Present

  • virgl: Fix compilation on MSVC

  • virgl: Fix crash when no VE bound

  • virgl: Implement PIPE_QUERY_GPU_FINISHED

  • virgl: Allow importing resources without known templ

  • virgl: Pass cmd_buf to flush_frontbuffer

  • d3d10umd, meson: Allow naming d3d10umd DLLs

  • d3d10umd: Rename d3d10sw target to d3d10umd

Michel Dänzer (2):

  • egl/wayland: Flush after blitting to linear copy

  • wsi/wayland: Dispatch event queue in wsi_wl_swapchain_queue_present

Mike Blumenkrantz (313):

  • vk/cmdbuf: add back deleted maint6 workgraph bits

  • lavapipe: use pushconstants2 for dgc

  • lavapipe: fix devenv icd filename

  • zink: fix separate shader patch variable location adjustment

  • lavapipe: delete extra descriptor buffer layout validation

  • zink: use local screen variable in surface creation

  • zink: hook up maint6

  • zink: use maint6 for multi-layer compressed surface creation

  • zink: set more dynamic states when using shader objects

  • lavapipe: KHR_dynamic_rendering_local_read

  • zink: always map descriptor buffers as COHERENT

  • zink: fix descriptor buffer unmaps on screen destroy

  • lavapipe: RM2024 extension promotions

  • zink: add a tu flake

  • zink: prune dmabuf export tracking when adding resource binds

  • zink: fix sparse bo placement

  • zink: zero allocate resident_defs array in ntv

  • zink: move sparse lowering up in file

  • zink: run sparse lowering after all optimization passes

  • zink: add back (safe) optimizations after sparse lowering

  • zink: split out sparse_residency_code_and lowering

  • mesa: plumb errors through to texture allocation

  • zink: adjust swizzled deref loads by the variable component offset

  • nir/lower_io: fix handling for compact arrays with indirect derefs

  • zink: only add arrays to indirect non-tcs variables

  • zink: promote a conditional on gfx shader destroy

  • zink: clamp zink_gfx_lib_cache::stages_present for generated tcs

  • zink: promote gpl libs freeing during shader destroy out of prog loop

  • zink: don’t add VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT for sparse textures

  • zink: add a ci skip

  • ci: bump VVL to snapshot-2024wk06

  • zink: update vvl expectations

  • mesa: check driver format support for certain GetInternalformativ queries

  • zink: always enable glsl_correct_derivatives_after_discard

  • zink: add a750 baseline

  • zink: delete maxDescriptorBufferBindings checks

  • zink: flag the use_img as unsync access with buf2img copies, not the swapchain

  • zink: pre-check formats for samplecount support

  • zink: validate sample count on image create

  • zink: add an assert for dummy fb surface creation

  • zink: compute bo unique_id on use, not creation

  • zink: avoid infinite recursion on (very) small BAR systems in bo alloc

  • egl/x11/swrast: deduplicate ANGLE_sync_control_rate enablement

  • drisw: hook up EXT_buffer_age

  • drisw/egl: delete unused buffer age handling

  • vk/wsi/x11/sw: use swapchain depth for putimage

  • mesa: add more driver support checks for more format queries

  • zink: add checks/compat for low-spec descriptor buffer implementations

  • zink: add a second fence disambiguation case

  • zink: force host-visible allocations for MAP_COHERENT resources

  • zink: hook up KHR_dynamic_rendering_local_read

  • zink: use KHR_dynamic_rendering_local_read

  • ci: make clang-format job warn on failure instead of killing the pipeline

  • zink: handle stencil_fallback in zink_clear_depth_stencil

  • zink: don’t destroy the current batch state on context destroy

  • zink: only scan active batch states for free states if > 1 exist

  • zink: fix longstanding issue with active batch state recycling

  • zink: assert that batch_id is valid in zink_screen_check_last_finished()

  • zink: move flagging rp_changed in zink_update_fbfetch() to caller

  • zink: don’t pre-init dummy fbfetch surface when missing nullDescriptor feature

  • zink: also set null fbfetch surfaces when no fb surface is bound

  • zink: break out null fbfetch init for descriptor buffer

  • zink: create/resize dummy surfaces on-demand

  • zink: start out with 256x256 sized dummy surfaces

  • zink: don’t pre-init null fbfetch info

  • zink: clamp in_rp clears to fb size

  • zink: fix (dynamic rendering) execution of scissored clears during flush

  • zink: fix swapchain readback conditional

  • zink: lock buffer age when chundering swapchain for readback

  • zink: flag acquired swapchain image as readback target on acquire, not present

  • zink: make kopper_swapchain_image::acquired the resource that acquired it

  • zink: add a swapchain readback case for reading differently-acquired image

  • zink: make readback attempts count towards ZINK_READBACK_THRESHOLD

  • zink: update swapchain readback cache on create

  • zink: set and manage a flag indicating that swapchain readback needs updating

  • zink: only update swapchain readback cache on create if necessary

  • zink: only update swapchain readback cache when necessary

  • zink: use new flag to determine whether swapchain readback cache is usable

  • zink: update nv blob baseline

  • zink: add nvk baseline

  • ci: disable clang-format job

  • zink: apply all storage memory masks to control barriers if no modes are specified

  • zink: emit SpvCapabilityImageMSArray for ms arrayed storage images

  • zink: null out bo usage when allocating from slab

  • zink: fix unsynchronized read-mapping of device-local buffers

  • zink: delete unused buffer map conditional

  • zink: force max buffer alignment on return ptrs for mapped staging buffers

  • gallium: add a nboxes param to flush_frontbuffer

  • winsys/sw: propagate nboxes to displaytarget_display()

  • drisw: plumb through a swapBuffersWithDamage interface

  • egl/wayland/sw: move swrast_update_buffers() directly into swapbuffers

  • egl/wayland/sw: move dri2_wl_swrast_commit_backbuffer() directly into swapbuffers

  • egl/wayland: unify back/current swapping between zink and swrast

  • egl/wayland/sw: split out surface attach from dri2_wl_swrast_commit_backbuffer()

  • egl/wayland/sw: call dri2_wl_swrast_attach_backbuffer() before swap

  • egl/wayland/sw: trigger damage from put_image2

  • egl/wayland/sw: move partial->full copy promotion to swapbuffers

  • egl/wayland/sw: fix no-op updating of current backbuffer

  • egl/wayland/sw: pass damage region through from put_image2 to wl_surface_damage

  • egl/wayland/sw: clamp putimage geometry to surface size

  • drisw/xlib: loop over all the boxes in display() hook

  • drisw/winsys: loop over all the boxes in display()

  • drisw: pass all frontend swapbuffer damage rects through

  • egl/kopper: plumb through SwapBuffersWithDamage

  • egl/kopper: advertise EXT_swap_buffers_with_damage only in non-sw mode

  • egl/wayland: split out kopper vtable

  • egl/wayland: add a separate hook for kopper buffer age

  • egl/wayland: split out kopper swapbuffers functions

  • egl/kopper: call swrast buffer age query for kopper+swrast

  • kopper: set drawable buffer age

  • egl/wayland/kopper: actually call kopper swapbuffer functions

  • egl/wayland: split out kopper update_buffers

  • egl/wayland: delete swrast references to zink

  • zink: fix stencil-only blitting with stencil fallback

  • zink: make zink_kopper_present_info public

  • zink: use a slab allocator for zink_kopper_present_info

  • zink: hook up VK_KHR_incremental_present

  • zink: use VK_KHR_incremental_present to propagate damage rects

  • zink: hook up KHR_partial_update

  • vulkan/dispatch_table: add an uncompacted version of the table

  • zink: use uncompacted vk_dispatch_table

  • egl/dri2: use the right egl platform enum

  • glx: only print zink failure-to-load messages if explicitly requested

  • zink: stop enabling EXT_conservative_rasterization

  • lavapipe: bump descriptor buffer address space limits

  • zink: fix PIPE_CAP_MAX_SHADER_PATCH_VARYINGS

  • zink: call CmdSetRasterizationStreamEXT when using shader objects

  • nvk: bump NVK_PUSH_MAX_SYNCS to 256

  • zink: update nvk baseline

  • util/blitter: iterate samples in stencil_fallback

  • mesa: fix CopyTexImage format compatibility checks for ES

  • zink: update nvk baseline with nvk changes

  • driconf: add radv_zero_vram for Crystal Project (1637730)

  • zink: update nv baseline

  • zink: track whether shaders use load_barycentric_at_sample

  • zink: apply zink_shader::uses_sample to fs variant updating

  • zink: destroy batch states after copy context

  • zink: set VkExternalMemoryBufferCreateInfo for opaque fds too

  • zink: simplify vb masking on bind

  • mesa: force rendertarget usage on required-renderable formats

  • zink: try getting sparse page size again without storage bit on fail

  • u/inlines: constify util_res_sample_count()

  • zink: only add STORAGE bit for sparse images based on multisample usage

  • zink: nvk baseline updates

  • zink: set the sparse format usage flags directly based on queried props

  • zink: delete faked_e5sparse

  • zink: rename optimal_key in update_gfx_program_optimal()

  • zink: use the sanitized key in update_gfx_program_optimal()

  • zink: always sync and replace separable progs even with ZINK_DEBUG=noopt

  • zink: add even more strict checks for separate shader usage

  • zink: be even stricter with shader object usage about blocking invalid usage

  • zink: remove stale comments for DRLR usage

  • zink: add a pass to strip out multisample storage image ops

  • zink: don’t deref swapchain image array with UINT32_MAX

  • zink: handle image_deref_samples when stripping MS image instrs

  • zink: iterate all the modes when doing separate shader fixups

  • mesa/st: add ‘base_serialized_nir’

  • mesa/st: add is_draw_shader param to st_finalize_nir

  • mesa/st: when creating draw shader variants, use the base nir and skip driver opts

  • mesa/st: use sanitized shader keys for feedback draws

  • zink: do io fixup on patch variables too

  • zink: defer present barrier to flush if a clear is pending

  • zink: clamp present region size

  • zink: clamp swapchain renderarea instead of asserting

  • zink: set dynamic rendering color attachment layouts

  • radv: inline radv_device_fault_detection_enabled

  • ci: bump VVL to v1.3.281

  • nir/divergence: add zink intrinsics

  • nir/opt_varyings: update alu type when rewriting src/dest for moved ops

  • zink: only check that CUBE_COMPATIBLE for images doesn’t subtract flags

  • zink: don’t use set_foreach_remove with dmabuf_exports

  • zink: make descriptor pool creation more robust

  • zink: fix shaderdb pipeline compile

  • zink: delete some ntv dead code

  • zink: always sort io variables by location after re-creating them

  • zink: use outputs_written mask to detect edge flag usage

  • zink: update xfb info after lower_to_scalar

  • zink: run scan_nir before variable rework

  • zink: apply component offset for CLIP/CULL DIST1 location derefs

  • zink: manually calc clip/cull distance sizes

  • zink: add a helper to detect clip/cull dist locations

  • zink: always use shader sizes for clip/cull dist variables

  • zink: fix generated variable expansion

  • zink: check for arrayness rather than tess io vars for indirect array vars

  • zink: track a mask of arrayed io locations on shaders

  • zink: call gather_info during shader creation

  • zink: always check patch io during rework_io_vars

  • zink: don’t clobber indirect array reads with missing components

  • zink: fix io slot calculation for vertex inputs in add_derefs

  • zink: fix add_derefs case for compact arrays

  • zink: only use location_frac for deref array indexing for compact variables

  • llvmpipe: fix DRAW_USE_LLVM=0

  • nir/lower_wpos_ytransform: move new value load to start of function, reuse

  • nir/lower_wpos_ytransform: reuse input zw components for fragcoord rewrite

  • nir/lower_wpos_ytransform: update comment to reflect variable usage

  • nir/lower_wpos_ytransform: scalarize emit_wpos_adjustment

  • nir/lower_wpos_ytransform: fix for lowered io

  • glsl: handle xfb resources for spirv before running varying opts

  • mesa: clamp binary pointer in ShaderBinary if length==0

  • gallium: rework PIPE_CAP_POINT_SIZE_FIXED

  • zink: delete some maintenance5 psiz pruning

  • zink: fix add_derefs for partial interp loads of derefs

  • zink: assert that ntv interp handling isn’t doing implicit component expansion

  • egl/x11: disable swapbufferswithdamage for zink without kopper

  • glx/egl: fix LIBGL_KOPPER_DISABLE

  • glsl: set PSIZ bit in outputs_written when injecting a 1.0 psiz write

  • nir/lower_clamp_color_outputs: fix use with lowered io

  • nir/lower_flatshade: break out location checking

  • nir/lower_flatshade: fix with lowered io

  • nir/lower_alpha_test: fix use with lowered io

  • nir/lower_two_sided_color: rework for lowered io

  • nir/lower_drawpixels: fix for lowered io

  • nir/lower_clip_disable: fix for lowered io

  • nir/lower_point_size_mov: rework.

  • nir/lower_point_size_mov: fix for lowered io

  • nir/texcoord_replace: fix scalarized io handling

  • nir/dominance: fix comment

  • drisw: reorder image extensions

  • sw_winsys: add displaytarget_create_mapped

  • winsys/null: implement displaytarget_create_mapped

  • winsys/drisw: implement displaytarget_create_mapped

  • winsys/drisw: implement dmabuf handling

  • lavapipe: add a function for asserting external memory handle types

  • winsys: add WINSYS_HANDLE_TYPE_UNBACKED

  • winsys: add more stride members to winsys_handle

  • lavapipe: EXT_queue_family_foreign

  • lavapipe: rework mem handle type assert to handle dmabuf

  • lavapipe: handle drm image format queries

  • lavapipe: handle drm image imports

  • docs: update lavapipe features

  • nir: add compact_arrays to nir_shader_compiler_options

  • nir/gather_info: fix gathering for compact arrayed builtins

  • zink: set compact_arrays in compiler options

  • microsoft/compiler: set compact_arrays in compiler options

  • lavapipe: don’t clamp index buffer size for null index buffer draws

  • v3d: set use_clipdist_array=true for lower_clip?

  • nir/lower_clip: surgerize for lowered io

  • nir/lower_clip: handle scalarized io

  • zink: block LA formats with srgb

  • llvmpipe: clamp 32bit query results to low 32 bits rather than MIN

  • lavapipe: clamp 32bit query results to low 32 bits rather than MIN

  • agx: set compact_arrays in compiler options

  • v3d: set compact_arrays in compiler options

  • intel: set compact_arrays in compiler options

  • freedreno: set compact_arrays in compiler options

  • glsl: stop using PIPE_CAP_NIR_COMPACT_ARRAYS and check compact_arrays

  • ttn: stop using PIPE_CAP_NIR_COMPACT_ARRAYS and check compact_arrays

  • glsl: move an assert from st_context over to avoid using PIPE_CAP_NIR_COMPACT_ARRAYS

  • mesa: delete LowerCombinedClipCullDistance from consts

  • st/program: stop using PIPE_CAP_NIR_COMPACT_ARRAYS and use compact_arrays

  • nine: stop checking PIPE_CAP_NIR_COMPACT_ARRAYS and use compact_arrays

  • gallium: delete PIPE_CAP_NIR_COMPACT_ARRAYS

  • zink: set indirect io compiler flags

  • zink: set lower_to_scalar

  • zink: rework rework_io_vars

  • zink: set nir_io_glsl_lower_derefs in compiler options

  • zink: add a pass to fix vertex input locations

  • zink: enable opt_varyings with ZINK_DEBUG=ioopt

  • zink: ci updates

  • nir/remove_unused_io_vars: check all components to determine variable liveness

  • ci: kill piano trace globally

  • nir: print i/o variables in location order

  • lavapipe: disable stencil test if no stencil attachment

  • egl: fix defines for zink’s dri3 check

  • egl/android: fix zink loading

  • egl: use os_get_option for MESA_LOADER_DRIVER_OVERRIDE

  • zink: disable buffer reordering correctly on shader image binds

  • nir/print: stop trying to match i/o vars using base/driver_location

  • zink: add ZINK_DEBUG=nopc to completely disable precompilation

  • zink: destroy shaderdb pipelines

  • zink: add VK_PIPELINE_CREATE_CAPTURE_STATISTICS_BIT_KHR for shaderdb

  • brw/lower_a2c: fix for scalarized fs outputs

  • zink: copy shader name when copying shader info

  • zink: run nir_lower_io_to_scalar (mostly) unconditionally and earlier

  • zink: vectorize io loads/stores when possible

  • zink: ci updates

  • zink: prune some piglit cts fails

  • loader: delete unused param from pipe_loader_vk_probe_dri()

  • glx: fix some indentation

  • glx: add an ‘implicit’ param to createScreen

  • glx: pass implicit load param through allocation

  • dri: plumb an ‘implicit’ param through createNewScreen interfaces

  • gbm: plumb an ‘implicit’ param through device creation

  • frontends/dri: plumb an ‘implicit’ param through screen init

  • pipe-loader: plumb a flag for implicit driver load through screen creation

  • zink: don’t print error messages when failing an implicit driver load

  • glx: silence more implicit-load zink errors

  • mesa/st: don’t use serialized_nir for cached shaders

  • zink: make NOREORDER mode context-based

  • zink: disable command reordering for compute-only contexts

  • nir: store variable names to io instrs during io lowering

  • nir/lower_io_to_scalar: preserve variable names when splitting io

  • nir/clone: preserve intrinsic name field across clones

  • nir/print: print io instr->name if available

  • zink: preserve/merge variable names when generating new variables

  • glthread: check for invalid primitive modes in DrawElementsBaseVertex

  • zink: reconstruct features pnext after determining extension support

  • zink: prune zink_shader::programs under lock

  • zink: fully wait on all program fences during ctx destroy

  • kopper: fix bufferage/swapinterval handling for non-window swapchains

  • zink: slightly better swapinterval failure handling

  • kopper: don’t set drawable buffer age

  • zink: clean up accidental debug print

  • egl/x11: disable dri3 with LIBGL_KOPPER_DRI2=1 as expected

  • zink: add a batch ref for committed sparse resources

  • u_blitter: stop leaking saved blitter states on no-op blits

  • freedreno/replay: use inttypes format string for 64bit

  • frontends/dri: only release pipe when screen init fails

  • frontends/dri: always init opencl_func_mutex in InitScreen hooks

  • zink: clean up semaphore arrays on batch state destroy

  • egl/dri2: fix error returns on dri2_initialize_x11_dri3 fail

  • nir/lower_aaline: fix for scalarized outputs

  • nir/linking: fix nir_assign_io_var_locations for scalarized dual blend

Mike Hsieh (1):

  • amd/vpelib: Add param check for geometric scaling and refactor

Mohamed Ahmed (14):

  • nil: change image_level_size() to take tiling in account

  • nil: Add helper function to get tile size in pixels

  • nil: Add helpers for conversion from pixel values to tiles

  • nil: Expose tiling_extent_B()

  • nil: Add support for sparse resident images

  • nvk: add sparse queries

  • nvk: enable sparse residency features

  • nak: wire up shader resource residency intrinsics

  • nak: wire up sparse image loads

  • nvk: advertise shader resource residency

  • nil: Add a nil_image::compressed bit

  • nil: Add some helpers for DRM format modifiers

  • nil: Support creating images with DRM modifiers

  • nvk: enable rendering to DRM_FORMAT_MOD_LINEAR images

Mykhailo Skorokhodov (2):

  • egl/wayland: Fix sRGB format look up for config

  • ci/lima: expect fail of window_8888_colorspace_srgb on wayland

Nanley Chery (13):

  • iris: Don’t memset the extra_aux memory range

  • iris: Don’t memset CCS on integrated gfx12

  • iris: Enable pass-through state init for gfx12 CCS

  • isl: Pick a better initial state for zeroed MCS

  • iris: Copy main ISL surf when reallocating in place

  • iris: Report the correct modifier for Tile4 images

  • iris: Use resource_get_param in resource_get_handle

  • intel/isl: Remove inconsistency when choosing Tile64

  • intel/isl: Remove inconsistency when encoding Tile64

  • intel/isl: Remove a CCS_D check from gfx12+ code

  • intel/isl: Enable a 64KB alignment WA for flat-CCS

  • intel/isl: Use Tile64 to align images for CCS WA

  • intel/isl: Disable miptails to align LODs for CCS WA

Neil Armstrong (1):

  • freedreno: Add a750 clock gating control related registers

Nikita Popov (1):

  • Pass no-verify-fixpoint option to instcombine in LLVM 18

Oskar Viljasaar (3):

  • vulkan/properties: Start looping from the next member in GPDP2

  • tu: Use common physical device properties infrastructure

  • compiler/types: Fix glsl_dvec*_type() helpers

Patrick Lerda (16):

  • glsl/nir: fix gl_nir_cross_validate_outputs_to_inputs() memory leak

  • r300: fix vertex_buffer related refcnt imbalance

  • r300: fix r300_destroy_context() related memory leaks

  • r300: fix memory leaks when register allocation fails

  • r300: fix constants_remap_table memory leak

  • radeonsi/gfx10: fix main_shader_part_ngg_es memory leak

  • r300: enable R400 cos and sin hardware vertex shader opcodes

  • ac/llvm,radeonsi: fix memory leaks triggered by ac_nir_translate() errors

  • r300: fix NIR passes regression

  • r300: fix constants_remap_table memory leak related to the dummy shader path

  • r300: fix r300_draw_elements() behavior

  • panfrost: remove panfrost_create_shader_state() related dead code

  • gallium/auxiliary/vl: fix typo which negatively impacts the src_stride initialization

  • clover: fix pipe_box update regression

  • clover: fix memory leak related to optimize

  • r600: fix vertex state update clover regression

Paul Gofman (3):

  • glsl: allow out arrays in #110 with allow_glsl_120_subset_in_110

  • driconf: add a workaround for Joe Danger 2

  • driconf: add a workaround for Joe Danger

Paulo Zanoni (35):

  • zink: fix bind size handling in buffer_bo_commit()

  • anv/sparse: add an extra step before anv_sparse_bind_resource_memory()

  • anv/sparse: allow binding operations to match the resource size

  • anv+zink/ci: remove recently fixed tests from the crash list

  • anv/sparse: don’t issue a single bind operation per vm_bind ioctl

  • anv/sparse: leave the semaphore waits and signals to the vm_bind ioctl

  • anv/sparse: don’t use the bind_timeline when doing sparse binding

  • anv: change the vm_bind-related kmd_backend vfuncs to return VkResult

  • anv: add an anv_pipe_bits bit to allow invalidating the TLB

  • anv/trtt: invalidate the TLB after writing TR-TT entries

  • anv/trtt: update GFX_TRTT_VA_RANGE for LNL

  • anv: don’t leak device->vma_samplers

  • anv: set shaderFloat64 to true when fp64_workaround_enabled

  • driconf/anv: set fp64_workaround_enabled to DIRT 5

  • anv/xe: don’t leak xe_syncs during trtt submission

  • anv/xe: don’t overwrite the result from vk_sync_wait()

  • vulkan: don’t zero-initialize STACK_ARRAY()’s stack array

  • anv, iris: add missing CS_STALL bit for GPGPU texture invalidation

  • anv: reduce struct anv_image_memory_range from 32 to 24 bytes

  • vulkan: reduce struct vk_object_base by 8 bytes

  • anv/sparse: remove useless isl_surf_get_tile_info() call

  • anv/sparse: remove unnecessary popcount assertions

  • anv/sparse: adjust sparse_bind_image_memory debug messages

  • anv/sparse: remove unused dump_vk_sparse_memory_bind()

  • anv/sparse: replace device->using_sparse with device->num_sparse_resources

  • anv/sparse: rework anv_free_sparse_bindings() error handling

  • anv/xe: extract anv_vm_bind_to_drm_xe_vm_bind()

  • anv/xe: add a ‘flags’ parameter to the vm_bind() kmd_backend function

  • anv/xe: slightly improve error handling for the vm_bind ioctl

  • anv/xe: assert we’re using drm_syncobjs only once

  • anv/xe: de-duplicate xe_exec_fill_sync()

  • anv/xe: rename and refactor xe_exec_fill_sync()

  • anv/sparse: fail the right way in anv_GetDeviceImageSparseMemoryRequirements()

  • anv: const-correct anv_{image,buffer}_is_sparse()

  • isl: add ISL_TILING_64_XE2 to isl_tiling_to_name()

Pavel Ondračka (38):

  • r300: fix reusing of color varying slots for generic ones

  • r300: skip draw if vertex shader does not write gl_Position

  • r300/ci: switch to deqp-runner suite

  • r300/ci: add the KHR gles2 tests

  • r300/ci: move streaming-texture-leak from fails to skips

  • r300: fix writemask for nir_intrinsic_load_ubo_vec4

  • r300: skip backend DCE for vertex shaders

  • r300: remove R3xx/R4xx backend absolute modifier lowering

  • r300/ci: add dEQP on RV380

  • r300: remove backend SLE and SGT support

  • r300: add r300_is_only_used_as_float helper

  • r300: optimize out more modifiers produced later

  • r300: lower comparison ops early in NIR

  • r300: remove SGE, SNE, SLT, SGE lowering in the backend

  • r300: remove the remainder of backend constant folding

  • r300: remove backend support for SUB

  • r300/ci: update piglit fails

  • r300: remove compiler tests

  • r300/ci: add two more observed piglit flakes

  • r300: fix vs output register indexing

  • r300: add explicit flrp lowering

  • ci: install xwayland in x86_64_test-gl

  • ci: build nine in debian-testing

  • ci: build nine tests

  • r300/ci: enable nine tests

  • r300: explicitly check if sin/cos input is already in correct range

  • r300: move sin/cos input fixups to finalize_nir

  • r300: remove some late NIR passes

  • nir/lower_vec_to_regs: always set cursor before inserting decl_reg

  • r300: check for the extra restrictions on presubtract swizzles

  • r300: move presubtract pass later

  • r300: optimize swizzle for inline constants

  • r300: inline unoptimized_ra ntr option

  • r300: get rid of the unused ubo_vec4_max ntr option

  • r300: remove the ntr lower_cmp option

  • r300: move lower_fabb option out of the options struct

  • r300: remove nir_to_rc_options wrapper

  • r300/ci: failures list update

Peyton Lee (6):

  • radeonsi/vpe: remove wait source surface fence and while loop

  • radeonsi/vpe: disable info log

  • radeonsi/vpe: move flush to si_vpe_processor_end_frame

  • radeonsi/vpe: support multi-buffer

  • radeonsi/vpe: pre-allocate stream structure

  • radeonsi/vpe: add support for p010

Philip Rebohle (1):

  • radv: Remove dead shared variables after optimization loop.

Philipp Zabel (6):

  • rusticl: work around reference-to-mutable-static warnings

  • etnaviv: common: Add PIPE_3D feature bit

  • etnaviv: Avoid duplicate query of ETNA_GPU_FEATURES_0 parameter

  • etnaviv: hwdb: Add VIP_V7 and NN_XYDP0 feature bits

  • etnaviv: Add nn_core_version field to etna_specs

  • etnaviv/nn: Extend post-multiplier for v8 architecture

Pierre-Eric Pelloux-Prayer (21):

  • radeonsi: compute epitch when modifying surf_pitch

  • Revert “ci/radeonsi: disable VA-API testing on raven”

  • radeonsi: emit cache flushes before draw registers

  • radeonsi: adjust flags for si_compute_shorten_ubyte_buffer

  • winsys/amdgpu: use syncobj rather than amdgpu fence

  • ac, radeonsi: remove has_syncobj, has_fence_to_handle

  • radeonsi: try to disable dcc if compute_blit is the only option

  • meson: require libelf when radeonsi is built

  • egl/drm: flush before calling get_back_bo

  • radv: don’t remove the blit queue from the device queues

  • winsys/amdgpu: unmap user fence BO before destroy

  • winsys/amdgpu: remove unused amdgpu_fence_is_syncobj

  • wsi/wl: flush connection on swapchain failure

  • mesa: deal with vbo_save_vertex_list::modes being NULL

  • wsi/wl: check wsi_wl_surface’s validity before use

  • egl/wayland: use __DRI_IMAGE_PRIME_LINEAR_BUFFER in get_back_bo

  • winsys/radeon: pass priv instead of NULL to radeon_bo_can_reclaim

  • radeonsi: preserve alpha if needed in kill_ps_outputs_cb

  • amd: fix addrlib regression

  • aco: don’t use python 3.7+ feature in aco_opcodes.py

  • radv: don’t use python 3.9 feature in radv_annotate_layer_gen.py

Qiang Yu (1):

  • radeonsi: split RADEON_USAGE_NEEDS_IMPLICIT_SYNC into CB and DB flags

Ray Smith (2):

  • panfrost: Don’t try to set bifrost blendable format on midgard

  • panfrost: Fix format tables for v4 and v5

Rhys Perry (84):

  • radv: do nir_shader_gather_info after radv_nir_lower_rt_abi

  • nir/lower_non_uniform: set non_uniform=false when lowering is not needed

  • nir/lower_shader_calls: remove CF before nir_opt_if

  • aco: fix labelling of s_not with constant

  • aco: add VOPD format

  • aco: add VOPD statistic

  • aco: refactor schedule_ilp main loop

  • aco: implement VOPD scheduler

  • aco: enable VOPD scheduler

  • aco: fix >8 byte linear vgpr copies

  • aco/tests: fix to_hw_instr.swap_linear_vgpr

  • aco: refactor create_vopd_instruction

  • aco: swap operands to create VOPD instructions

  • aco: turn v_mov_b32 into addition to create VOPD instructions

  • aco: improve printing of VOPD instructions

  • aco/tests: add tests for VOPD operand swapping

  • aco/tests: use raw strings in form_hard_clauses.nsa

  • radv: support minmax filter for more formats

  • aco/ra: don’t initialize assigned in initializer list

  • aco/ra: fix GFX9- writelane

  • aco: don’t combine linear and normal VGPR copies

  • aco/ra: disable p_start_linear_vgpr allocation hint

  • aco: allow p_start_linear_vgpr to use multiple operands

  • aco: require linear vgpr uses to be late kill

  • aco: only allow linear vgpr kills in top-level blocks

  • aco/ra: constify various RegisterFile

  • aco/ra: move parallelcopy creation into helper

  • aco/ra: change get_reg_bounds() helper

  • aco/ra: rework linear VGPR allocation

  • aco/ra: disable live range splitting of linear vgprs

  • aco/ra: emit linear VGPR parallel copy separately

  • aco/tests: add tests for linear VGPR register allocation

  • aco: optimize for purely linear VGPR copies

  • nir/algebraic: don’t create 64-bit min/max/ior if lowered

  • nir/algebraic: remove duplicated iand(ien, ine)/ior(ieq, ieq) patterns

  • nir/algebraic: optimize 64-bit comparisons with zero’d halves to 32-bit

  • nir/lower_int64: allow 64-bit comparisons when lowering minmax

  • nir/search: fix nir_replace_instr() debug code

  • aco: don’t pass constant to is_overwritten_since()

  • radv: don’t advertise DGC with LLVM

  • radv: stop using 5/8 component SSBO stores

  • radv,aco: allow VS prologs to increase VGPR usage

  • aco: don’t reuse misaligned attribute destination VGPRs in VS prologs

  • aco/util: add small_vec

  • radv: use dual_color_blend_by_location with Half-Life Alyx

  • aco/cssa: reset equal_anc_out if merging fails

  • aco/cssa: update comments

  • aco: fix GFX6 buffer_load_dwordx4 opcode number

  • aco: rename opcode->instruction

  • aco: refactor VOPC opcode list

  • aco: use single tuple for all opcode numbers

  • aco: use op()

  • aco: move dot/wmma instructions into VOP3P list

  • aco: unify MIMG opcode lists

  • aco/gfx11: fix scratch ST mode assembly

  • aco: split instruction assembly into functions

  • aco: always emit float mode for merged shaders compiled separately

  • aco: avoid breaking clauses with waitcnt

  • nir: add mqsad_4x8, shfr and nir_opt_mqsad

  • aco: implement mqsad_4x8 and shfr

  • ac/llvm: implement mqsad_4x8 and shfr

  • amd: set has_shfr32=true

  • radv: optimize msad_4x8 to mqsad_4x8

  • radv: memset radv_pipeline_cache_object data

  • nir: add nir_remove_after_cf_node helper

  • aco: remove unreachable merge blocks

  • aco: ensure loop exits exist in NIR

  • aco: save/reset/combine has_divergent_continue in uniform branches

  • nir,aco: add test intrinsics

  • aco/tests: add isel test helpers

  • aco/tests: add control flow tests

  • aco: assume no unreachable blocks

  • aco: don’t include the clause in VMEM_CLAUSE_MAX_GRAB_DIST

  • aco: remove occupancy check in dealloc_vgprs()

  • aco/tests: don’t assume constructor order

  • aco/tests: remove LLVM 11 code

  • radv: cache RT stage info

  • aco: include LDSDIR in latency/etc stats

  • aco: make store clauses more aggressively

  • aco: schedule LDSDIR instructions

  • aco: schedule LDS instructions

  • aco: split vop3p results

  • aco/waitcnt: fix DS/VMEM ordered writes when mixed

  • aco: create lcssa phis for continue_or_break loops when necessary

Rob Clark (31):

  • freedreno/a6xx: fix comment

  • freedreno/registers: Pass full args to dump_c()

  • freedreno/registers: De-duplicate xml_reg_files

  • freedreno/registers: Don’t re-parse files

  • freedreno/registers: Generate copyright comment blurb

  • freedreno/registers: Add basic kernel header support

  • freedreno/registers: A couple newline changes

  • tu/drm/virtio: Fix dmabuf import

  • freedreno/drm: Submit should hold ref to device

  • freedreno/drm: Fix teardown crash harder

  • freedreno/decode: Fix prefetch handling for IB1 crash

  • freedreno: Fix MSAA z/s layout in GMEM

  • freedreno/crashdec: Find potential fault buffers

  • tu: Give suballoc bo’s a name

  • freedreno/a6xx: Add dual_color_blend_by_location

  • freedreno/a6xx: Fix z/s preserving sysmem clear blit

  • freedreno/pps: Don’t re-init perfcntrs

  • freedreno: Add bo usage hints

  • freedreno/drm: Add perfetto memory tracing

  • tu: Add perfetto memory tracing

  • pps: Enable memory traces

  • pps: Config tweaks to avoid losing traces

  • freedreno/registers: Add license header

  • egl/android: Fix gl_config dereference

  • freedreno/drm/virtio: Fix deadlock on exit

  • freedreno+virgl: Add missing driconf

  • freedreno: Update a618 xfails

  • ci: Add deqp fix for pipeline_statistics_3 tests

  • tu: Fix a6xx lineWidthGranularity

  • egl/android: Fix sRGB visuals

  • freedreno/ir3: Fix ldg/stg offset

Robert Beckett (1):

  • vulkan/wsi: fix force_bgra8_unorm_first

Robert Mader (5):

  • crocus: Support offset query for multi-planar planes

  • panfrost: Use pipe resource helper

  • egl: Implement EGL_EXT_config_select_group

  • egl: Implement EGL_MESA_x11_native_visual_id

  • egl/x11: Allow all RGB visuals to match 32-bit RGBA EGLConfigs

Robin Kertels (3):

  • nvk: Enable EXT_nested_command_buffer.

  • nak: Enable lowering rotate to shuffle.

  • nvk: Advertise VK_KHR_shader_subgroup_rotate.

Rohan Garg (35):

  • anv: refactor emit_dynamic_buffer_binding_table_entry

  • isl,blorp,anv: introduce ISL_TILING_64_XE2 for Xe2+ platforms

  • anv: untyped data port flush required when a pipeline sets the VK_ACCESS_2_SHADER_STORAGE_READ_BIT

  • anv: factor out common code for determining surface usage from a VkDescriptorType

  • anv: cleanup duplicate robustness flag calculations

  • anv: add a command streamer stall on Xe2+ when switching pipelines

  • intel/compiler: Xe2+ can do URB load/store with a byte offset

  • anv: drop duplicated 3DSTATE_SLICE_TABLE_STATE_POINTERS emission

  • anv, blorp: Set COMPUTE_WALKER Message SIMD field

  • intel/genxml: update PIPE_CONTROL so that we can decode it on the CCS

  • iris,anv: WA 1509820217 is no impact for Xe2+

  • intel/brw: Use the dimensions supplied in the instruction

  • intel/brw: Cleanup send generation

  • intel/brw: Update written size depending on the LSC message

  • intel/brw: Set the right cache control bits for xe2

  • intel/brw: Adjust src1 length bits for xe2+

  • anv,blorp: implement restrictions from WA 1406738321

  • anv: 3D surfaces have fewer layers for higher miplevels

  • isl: enable CCS for 3D surfaces on gen12.5 and above

  • intel/brw: account for sources when determining if an operation uses half floats

  • intel/brw: Xe2+ can do SIMD16 for extended math on HF types

  • intel/brw: update disassembly for MATH pipe

  • intel/brw: adjust the copy propagation pass to account for wider GRF’s on Xe2+

  • intel/brw: minor rework to deduplicate variable assignment

  • intel/brw: Handle typed surface and atomic messages for xe2+

  • intel/brw: Lower DWORD scattered read writes to lsc

  • intel/eu/validate: Allow SIMD16 for mixed mode float operations on xe2+

  • iris: slow clear higher miplevels on single sampled 8bpp resources that have TILE64

  • intel/blorp: add fast clear rectangle dimensions for single sampled TILE64 CCS surfaces

  • isl: allow CCS on single sampled TILE64 surfaces

  • anv: Enable HiZ on multi-LOD depth buffers.

  • anv: use u_foreach_bit to iterate over the view mask like we do for transition_clear_color

  • anv: formatting fix when printing pipe controls

  • anv: allocate space for generated indirect draw id’s using the temporary allocation helper

  • Revert “iris: slow clear higher miplevels on single sampled 8bpp resources that have TILE64”

Roland Scheidegger (2):

  • auxiliary/draw: fix streamout overflow calculation

  • auxiliary/rtasm: fix unaligned stores

Romain Naour (1):

  • glxext: don’t try zink if not enabled in mesa

Ruijing Dong (6):

  • radeonsi/vcn: data structure av1 enc long term reference.

  • radeonsi/vcn: vcn4 av1 long term ref support

  • frontends/va: get av1 encoding ref frame infos for L0.

  • radeonsi/vcn: add enc surface alignment caps

  • frontends/va: add surface alignment attribute

  • radeonsi/vcn: update to use correct padding size.

Ryan Neph (3):

  • venus: fix shmem leak on vn_ring_destroy

  • virgl: use PIPE_MAX_SAMPLERS in bind_samplers_states

  • venus: reclaim signal semaphore feedback resources for wasteful clients

Sagar Ghuge (28):

  • intel/fs: Track instance id in gs_thread_payload

  • vulkan/runtime: Track VkSharingMode in vk_image

  • anv: Disable compression if we have concurrent sharing mode

  • intel/compiler/xe2: Handle 6-bit message type for Gfx20+

  • intel/compiler: Add texture operation lowering pass

  • intel/compiler: Use nir_tex_src_backend1 to pack LOD and array index

  • nir: Drop intel specific lowering code

  • intel/compiler: Lower texture operation to combine LOD and AI

  • intel/dev: Update max_subslices_per_slice comment

  • intel/compiler: Fix disassembly of URB message descriptor on Xe2+

  • anv: Drop warnings for engine initialization failure

  • anv: Set timestampValidBits to 64bits

  • intel/compiler: Trim vector properly till array index

  • intel/compiler: Adjust sample_b parameter according to new layout

  • intel/compiler: Pack LOD/bias and array index on TG4 messages

  • intel/compiler: Pack texture LOD and offset to a single 32-bit value

  • intel/compiler: Add helper method to decide if header is required

  • intel/compiler: Add gather4_i/l/[_c]/b sampler message

  • intel/compiler: Add texture gather offset LOD/Bias message support

  • nir: Allow nir_texop_tg4 in implicit derivative

  • intel/compiler: Enable packing of offset with LOD or Bias

  • anv: Implement VK_AMD_texture_gather_bias_lod

  • anv/xe: Consider pat_index while unbinding the bo

  • anv: Fix typo in DestinationAlphaBlendFactor value

  • anv: Use appropriate argument format for indirect draw

  • isl: Update isl_swizzle_supports_rendering comment

  • isl: Update shader channel select for missing components

  • intel/compiler: Disassemble mlen/rlen/ex_mlen in units of registers

Saleemkhan Jamadar (1):

  • radeonsi/vcn: set jpeg reg version for gfx 1151

Samuel Pitoiset (419):

  • radv: constify stages in radv_rt_fill_group_info()

  • radv/rt: re-use radv_ray_tracing_stage::sha1 for hashing RT pipelines

  • radv: correctly return VK_ERROR_OUT_OF_DEVICE_MEMORY when mapping a BO fails

  • radv/nir: pass radv_shader_stage to some radv_nir_xxx() functions

  • radv/nir: remove useless struct for nir_shader typedef

  • radv: remove one unused parameter in radv_fill_shader_info_ngg()

  • radv: move radv_pipeline_key::mesh_fast_launch_2 to the per-device cache key

  • radv: add radv_shader_stage_key to radv_shader_stage

  • radv: use radv_shader_stage_key directly with pre-existing fields

  • radv: add optimisations_disabled to radv_shader_stage_key

  • radv: remove unnecessary radv_nir_compiler_options::key

  • radv: remove unused lower_rt_instruction_monolithic_state::key

  • radv: stop passing the pipeline key when compiling compute/rt shaders

  • radv: re-organize radv_pipeline_key

  • radv: add vertex_robustness1 to radv_shader_stage_key

  • radv: introduce radv_graphics_state_key

  • zink/ci: skip more arb_shader_image_load_store.* on Polaris10/Navi10

  • radv: add keep_statistic_info to radv_shader_stage_key

  • radv: add shader_version to radv_shader_stage_key

  • radv: pass radv_shader_stage_key to radv_pipeline_stage_init()

  • radv: make sure to retain shaders key for imported shaders with GPL

  • radv: cleanup radv_generate_pipeline_key()

  • radv: add radv_pipeline_get_shader_key()

  • radv/rt: cleanup radv_parse_rt_stage()

  • radv: hash radv_shader_stage_key

  • radv: stop hashing radv_pipeline_key for compute/rt pipelines

  • radv: remove the pipeline key for compute pipelines

  • radv: remove the pipeline key for ray tracing pipelines

  • radv: remove an extra new line in radv_shader.h

  • radv: pass radv_graphics_state_key to radv_hash_shaders()

  • radv: remove radv_generate_pipeline_key()

  • radv: rename radv_pipeline_key to radv_graphics_pipeline_key

  • radv: delay emitting streamout enable at draw time

  • aco: silent checking if clrxdisasm is available

  • radv: fix indirect dispatches on the compute queue on GFX7

  • radv: fix indirect draws with NULL index buffer on GFX10

  • radv: remove unused parameter to gather_shader_info_mesh()

  • radv: add a per-stage key field for mesh shaders with a task shader

  • vulkan: bump headers/registry to 1.3.276

  • lavapipe: fix build since vulkan spec update

  • vulkan: promote VK_EXT_line_rasterization to KHR

  • vulkan: promote VK_EXT_index_type_uint8 to KHR

  • radv: add a helper for binding the custom blend mode

  • radv: add a helper to get the VGT_GS_OUT value

  • radv: prevent accessing NULL pipelines when emitting VBO with ESO

  • radv: re-emit the TCS epilog when a new TCS is bound

  • radv: enable prologs/epilogs in-memory cache for shader objects

  • radv: add required NV entrypoints for VK_EXT_shader_object

  • radv: initialize default dynamic state when beginning a new cmdbuf

  • radv: add radv_shader_stage::next_stage field

  • radv: add radv_shader_layout::dynamic_offset_count

  • radv: add support for creating/destroying shader objects

  • radv: make some pipeline graphics helpers non-static for ESO

  • radv: add support for binding/emitting shader objects

  • radv: advertise VK_EXT_shader_object on GFX6-8

  • radv: advertise VK_KHR_load_store_op_none

  • radv: promote VK_EXT_line_rasterization to KHR

  • radv: advertise VK_KHR_line_rasterization

  • radv: promote VK_EXT_index_type_uint8 to KHR

  • radv: advertise VK_KHR_index_type_uint8

  • radv: use device->vk.enabled_features instead of iterating twice

  • radv: fix segfault when getting device vm fault info

  • radv/ci: enable RADV_PERFTEST=shader_object for vkcts-polaris10-valve

  • radv: refactor gfx103_pipeline_emit_vgt_draw_payload_cntl()

  • radv: refactor gfx103_pipeline_emit_vrs_state()

  • radv: use the non-emitted graphics pipeline for the needed dynamic states

  • radv: fix the late scissor emission workaround with ESO on GFX9

  • radv: set NGG fields in vgt_shader_key for ESO on GFX10+

  • radv: do not ignore RADV_DYNAMIC_FRAGMENT_SHADING_RATE for ESO on GFX10.3+

  • radv: emit more default states for ESO on GFX10.3+

  • radv: export alpha-to-coverage via MRTZ for ESO on GFX11

  • radv: fix detecting invalid binaries with ESO

  • radv: fix emitting tess domain origin for merged TES+GS on GFX9

  • radv: emit required programming for tess on GFX10+ in radv_emit_hw_vs()

  • radv: rebind mesh/task shaders when restoring meta context

  • radv: determine next stage for mesh/task with ESO

  • radv: ignore unneeded dynamic states with mesh shaders and ESO

  • radv: determine the last VGT api stage with mesh shaders and ESO

  • radv: bind and emit mesh/task shaders with ESO

  • radv: prevent crashes when a task shader is compiled unlinked with ESO

  • radv: init the shader key in radv_shader_stage_init() for ESO

  • radv: add support for VK_SHADER_CREATE_NO_TASK_SHADER_BIT_EXT

  • radv: add a helper to know if device fault detection is enabled

  • radv: refactor dumping GPU hang reports by using chunks

  • radv: add support for keeping GPU hang reports in memory

  • radv: export GPU hang reports through VK_EXT_device_fault

  • radv: enable deviceFaultVendorBinary if RADV_DEBUG=hang is set

  • radv: remove radv_graphics_state_key::dynamic_patch_control_points

  • radv: determine the workgroup size for TCS earlier

  • radv: set the default workgroup size for VS as LS

  • radv: constify radv_device in radv_emit_shader_pointer()

  • radv: check active NIR stages before trying to merge shaders on GFX9+

  • radv: only merge shader info stages if both stages exist on GFX9+

  • radv: rework shader arguments for separate compilation of VS+TCS on GFX9+

  • radv: always mark drawid/base_instance used with ESO

  • radv: force TCS stage for VS as LS compiled separately on GFX9+

  • radv: always emit PGM_RSRC1_HS when emitting the TCS epilog state

  • radv: add support for emitting VS+TCS compiled separately on GFX9+

  • radv: do not allow to enable VK_EXT_shader_object with LLVM

  • radv: add a workaround for mipmaps and minLOD on GFX6-8

  • radv/sqtt: fix describing queue submits for RGP

  • radv: limit maxIndirectCommandsTokenCount to 512

  • radv: remove one indentation level in radv_fill_shader_info_ngg()

  • radv: squash GFX10/GFX10.3 NGG restrictions in the same condition

  • radv: always set GS as NGG if present on GFX11

  • radv: use next_stage to determine the NGG stage

  • radv: check for MESA_SHADER_TESS_EVAL in radv_fill_shader_info_ngg()

  • radv: determine the ES stage earlier when processing binary config

  • radv: determine the workgroup size for GS non-NGG earlier

  • radv: set the default workgroup size for VS/TES as ES

  • radv: change the user SGPR idx of AC_UD_TES_STATE

  • radv: add a new user SGPR for the ESGS ring item size

  • radv/nir: lower esgs_vertex_stride for GS compiled separately on GFX9+

  • radv: rework shader arguments for separate compilation of VS+GS on GFX9+

  • radv: declare streamout buffers for VS+GS compiled separately on GFX9+

  • radv: force GS stage for VS as ES compiled separately on GFX9+

  • radv: add support for emitting VS+GS compiled separately on GFX9+

  • radv/ci: remove VKD3D_CONFIG=dxr11 for navi21/navi31

  • radv: remove unused radv_indirect_command_layout::state_offset

  • radv: only load 3x32-bit elements when emitting draws with mesh shader

  • docs: fix RADV_DEBUG=nonggc description

  • radv: add RADV_DEBUG=nongg_gs for GFX10/GFX10.3

  • radv: add radv_disable_ngg_gs and enable it for Persona 3 Reload

  • radv: fix RGP barrier reason for RP barriers inserted by the runtime

  • radv: force GS stage for TES as ES compiled separately on GFX9+

  • radv: declare streamout buffers for TES+GS compiled separately on GFX9+

  • radv: declare AC_UD_TES_STATE for separate compilation of GS on GFX9+

  • radv: bind the vertex input SGPR only for relevant stages

  • radv: add support for emitting TES+GS compiled separately on GFX9+

  • radv: allow RADV_PERFTEST=shader_object on GFX9/VEGA10

  • radv/ci: enable RADV_PERFTEST=shader_object on VEGA10

  • radv: cleanup radv_shader_combine_cfg_vs_tcs()

  • radv: fix emitting VS prologs for merged shaders compiled separately on GFX10+

  • radv: clear RADV_CMD_DIRTY_SHADERS when resetting the shader object state

  • radv: clear the custom blend mode when resetting gfx pipeline state

  • radv: fix re-emitting DB_RENDER_CONTROL when resetting gfx pipeline state

  • radv: make sure to reset the GS copy shader with ESO

  • radv: fix selecting shader variants with ESO

  • radv: fix setting the rasterized primitive for ESO

  • radv: enable GS_FAST_LAUNCH=2 by default for RDNA3 APUs (Phoenix)

  • radv: only configure {XYZ_DIM,DRAW_INDEX}_REG for mesh shaders if enabled

  • radv: re-enable GS_FAST_LAUNCH=2 by default on GFX11

  • radv: stop using conditional rendering internally when preprocessing DGC

  • radv: disable conditional rendering if enabled when preprocessing DGC

  • radv: pass the ES stage when emitting geometry shader with ESO

  • radv: determine the ES stage for merged NGG shaders compiled separately

  • radv: prefix radv_vgt_shader_key::streamout with ngg

  • radv: set radv_vgt_shader_key::ngg_streamout for ESO

  • radv: determine the number of invocations only for VS/TES as NGG

  • radv: store the number of outputs for VS/TES as NGG

  • radv: use radv_shader_info for computing NGG LDS layout

  • radv: fix a compilation warning in radv_bind_graphics_shaders()

  • spirv: only consider IO variables when adjusting patch locations for TES

  • radv: move mesh_fast_launch_2 to radv_physical_device

  • radv: initialize disk cache slightly later when creating a physical device

  • radv: introduce a per physical device cache key

  • radv: fix binary shaders compatibility with ESO

  • radv: fix indirect dispatches on compute queue with conditional rendering on GFX7

  • radv: remove the union in radv_shader_object

  • radv: fix a big memleak with VK_EXT_shader_object

  • radv: free NIR shaders when creating linked shaders with ESO

  • radv: simplify binding the GS copy shader with ESO

  • radv: rename radv_emit_shaders() to radv_emit_graphics_shaders()

  • radv: simplify emitting VGT_ESGS_RING_ITEMSIZE for ESO

  • radv: re-emit more states when a shader compiled separately is bound

  • radv: only enable emulated mesh/task shader invocations on GFX10.3

  • radv: add support for mesh primitives queries on GFX11

  • radv: add support for task shader invocations queries on GFX11

  • radv: remove a TODO about adding mesh/task queries on GFX11

  • radv: store/reset conditional rendering user info in the helpers

  • radv: add support for conditional rendering on the compute queue with DGC

  • radv: remove unused parameter in gfx10_get_ngg_query_info()

  • radv: do not set gs.has_pipeline_stat_query twice for NGG GS

  • radv: use so.num_outputs to determine if NGG shaders need XFB queries

  • radv: determine NGG query info before linking shader info

  • radv: pass gfx10_ngg_info to gfx10_get_ngg_info()

  • radv: pass radv_shader_info to gfx10_get_ngg_info()

  • radv: determine NGG culling info before NGG info

  • radv: compute NGG scratch LDS base in gfx10_get_ngg_info()

  • radv: compute the total LDS usage in gfx10_get_ngg_info()

  • radv: disable VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 with minmax filter on GFX6

  • radv/ci: enable RADV_PERFTEST=shader_object for vkcts-tahiti-valve

  • radv: clean up MAX_ALLOWED_TILES_IN_WAVE programming

  • radv: add missing RADV_DEBUG_NO_NGG_GS to the physical device cache key

  • radv: fix conditional rendering with direct mesh+task draws and multiview

  • radv: move conditional rendering for compute in radv_cmd_state

  • radv: get the pipeline layout info from the push constant token with DGC

  • radv: add a helper to calculate the compute resource limits

  • radv: add a function to get compute pipeline metadata for DGC

  • radv: add support for VK_PIPELINE_CREATE_INDIRECT_BINDABLE_BIT_NV

  • radv: implement vkGetPipelineIndirectXXX() for DGC

  • radv: implement vkCmdUpdatePipelineIndirectBufferNV()

  • radv: implement indirect compute pipeline binds with DGC

  • radv: handle indirect pipeline binds with scratch and DGC

  • radv: force shader BOs to be local BOs with DGC indirect compute pipelines

  • radv: enable deviceGeneratedComputePipelines

  • radv: fix conditional rendering on compute queue on GFX6

  • radv: add missing conditional rendering for indirect dispatches on GFX6

  • radv: add a helper to emit PKT3_COND_EXEC

  • radv: add a new user SGPR for NGG shaders compiled separately with ESO

  • radv: lower lds_ngg_{gs_out_vertex_base,_scratch_base} with ESO

  • radv: add support for emitting NGG shaders with ESO

  • radv: allow RADV_PERFTEST=shader_object on GFX11

  • radv: enable radv_zero_vram for RAGE2

  • radv: preserve streamout_buffers user SGPR for VS/TES + GS compiled separately

  • radv: always use ace_cs for the gang CS variable

  • radv: refactor emitting the view index for task shaders

  • radv: allocate a 32-bit value for the MEC fw bug with indirect mesh+task earlier

  • radv: stop passing radv_cmd_buffer to draw functions with task shaders

  • radv/ci: remove RT tests from the VANGOGH skip list

  • radv/ci: remove dEQP-VK.robustness.* from the VANGOGH skip list

  • radv: disable NGG in more situations with ESO on GFX10/GFX10.3

  • radv: implement has_vgt_flush_ngg_legacy_bug for ESO

  • radv: allow RADV_PERFTEST=shader_object on GFX10/GFX10.3

  • radv,aco: stop duplicating PS/TCS epilog fields

  • radv: add a helper to emit PS/TCS epilogs

  • radv/ci: enable RADV_PERFTEST=shader_object for VKCTS jobs on GFX10+

  • radv/ci: mark dEQP-VK.shader_object.binding.mesh_swap_task as flake on NAVI21

  • radv: stop using the custom blend mode for PS epilogs

  • radv: re-emit RB+ state with PS epilogs only when the col format changes

  • radv: fix emitting default blend state for PS without epilogs and ESO

  • radv: allow RADV_PERFTEST=shader_object on all GFX9 GPUs

  • radv/ci: enable RADV_PERFTEST=shader_object for RENOIR

  • util/u_debug: fix parsing of “all” again

  • ci: use Linux kernel 6.6 for RADV

  • radv/ci: update list of flakes for VKCTS jobs

  • zink/ci: update list of flakes for RADV jobs

  • ac/nir: fix exporting NGG streamout outputs with implicit PrimId from VS/TES

  • radv: fix determining if PrimId is used for merged shaders compiled separately

  • radv: determine radv_vgt_shader_key::has_ngg_xxx with the last VGT shader

  • radv: rework generating vgt_shader_key for pipelines

  • radv: fix wave32 support with ESO

  • radv: add helpers to bind the GS copy shader and the RT prolog

  • radv: fix RADV_PERFTEST=dmashaders with ESO

  • radv: emit VGT_GS_OUT_PRIM_TYPE as part of the dynamic primitive topology

  • radv: disable binning correctly on GFX11.5

  • radv: fix programming of PA_SC_BINNER_CNTL_1.MAX_ALLOC_COUNT on GFX11

  • radv: program SAMPLE_MASK_TRACKER_WATERMARK optimally for GFX11 APUs

  • radv: add a GPU hang workaround for legacy tess+GS for GFX10.3

  • radv: fix occlusion queries with MSAA and no attachments

  • radv: add radv_force_pstate_peak_gfx11_dgpu and enable it for Helldivers 2

  • zink/ci: enable RADV_PERFTEST=shader_object for polaris10

  • radv: add a workaround for null IBO on GFX6

  • zink/ci: update CI lists

  • radv: always export MRTZ in FS epilogs with ESO on GFX11

  • radv: trigger a new PS epilog when the framebuffer is dirty with ESO

  • zink/ci: allow RADV_PERFTEST=shader_object on NAVI31

  • radv: invalidate L2 metadata for VK_ACCESS_2_MEMORY_READ_BIT

  • radv: make sure to disable NGG culling with TES when the FS stage is unknown

  • zink/ci: enable RADV_PERFTEST=shader_object for NAVI10/VANGOGH

  • radv/rmv: add missing logging when sparse BOs are destroyed

  • radv/rmv: add missing logging when events are destroyed

  • radv/rmv: fix logging of per-queue destroyed BOs

  • radv/rmv: fix logging sparse residency

  • radv/winsys: move BO size to radeon_winsys_bo

  • radv/rmv: remove BO size parameter in radv_rmv_log_bo_allocate()

  • radv: make some create resources helpers static

  • radv/rmv: remove unnecessary is_internal parameter to some helpers

  • radv: add radv_bo_{create,destroy}() helpers

  • radv/rmv: prevent logging BOs allocated in GDS/OA domains

  • radv/rmv: log allocated/destroyed BOs in radv_buffer_{create,destroy}()

  • radv: add radv_bo_virtual_bind() helper

  • radv: fix conditional rendering with mesh+task and multiview (again)

  • radv: remove useless RADV_DEBUG=nomemorycache

  • radv: implement alpha-to-one

  • radv: advertise alphaToOne

  • radv: advertise extendedDynamicState3AlphaToOneEnable with ACO

  • docs: add alpha-to-one features for RADV

  • radv: rename radv_physical_device variables to pdev everywhere

  • radv/winsys: rename gpu_info to pci_ids in the null winsys

  • radv: rename radeon_info variables to gpu_info everywhere

  • radv: rename radv_physical_device::rad_info to info

  • radv: remove radv_device::physical_device

  • radv: remove radv_queue::device

  • radv: remove radv_cmd_buffer::device

  • radv: remove radv_device::instance

  • radv: remove radv_physical_device::instance

  • radv: declare radv_cmd_update_descriptor_xxx() in radv_descriptor_set.h

  • radv: declare format related functions in radv_formats.h

  • radv: pass a radv_physical_device to radv_use_llvm_for_stage()

  • radv: move radv_device_supports_etc() to radv_physical_device.c

  • radv: move some VK_DEFINE_NONDISP_HANDLE_CASTS to radv_descriptor_set.h

  • radv: add radv_sampler.h

  • radv: add radv_event.h

  • radv: add radv_buffer_view.h

  • radv: add radv_buffer.h

  • radv: add radv_video.h

  • radv: add radv_image.h

  • radv: add radv_image_view.h

  • radv: add radv_query.h

  • radv: add radv_perfcounter.h

  • radv: add radv_device_generated_commands.h

  • radv: enable radv_zero_vram for Red Dead Redemption 2

  • vulkan/debug_utils: add a helper for reporting address binding

  • radv: implement VK_EXT_device_address_binding_report

  • radv: advertise VK_EXT_device_address_binding_report

  • radv: move radv_prim_vertex_count to si_cmd_buffer.c

  • radv: move radv_userdata_locations to radv_shader_args.h

  • radv: move radv_shader_{layout,stage} to radv_shader.h

  • radv: add radv_device_memory.h

  • radv: add radv_instance.h

  • radv: add radv_queue.h

  • radv: add radv_physical_device.h

  • radv: add radv_rra.h

  • radv: add radv_device.h

  • radv: add radv_pipeline_cache.h

  • radv: add radv_pipeline.h

  • radv: add radv_pipeline_compute.h

  • radv: add radv_pipeline_rt.h

  • radv: add radv_pipeline_graphics.h

  • radv: add radv_wsi.h

  • radv: add radv_sqtt.h

  • radv: add radv_shader_object.h

  • radv: add radv_spm.h

  • radv: add radv_cmd_buffer.h

  • radv: add radv_rmv.h

  • radv: add radv_cp_reg_shadowing.h

  • radv: add radv_printf.h

  • radv: move radv_get_tdr_timeout_for_ip() to radv_query.h

  • radv: move radv_queue_ring() to radv_queue.c

  • radv: add radv_nir_to_llvm.h

  • radv: add radv_android.h

  • radv: add radv_shader_info.h

  • radv: move CP DMA related code to radv_cp_dma.c/h

  • radv: move more cmd buffer related code to radv_cmd_buffer.c

  • radv: merge radv_write_guardband() with radv_emit_guardband_state()

  • radv: merge radv_write_scissors() with radv_emit_scissor()

  • radv: move radv_get_viewport_xform() to radv_pipeline_graphics.c

  • radv: move radv_create_gfx_config() to radv_device.c

  • radv: move radv_emit_{compute,graphics}() to radv_queue.c

  • radv: move code related to sample positions to radv_device.c

  • radv: rename si_cmd_buffer.c to radv_cs.c

  • radv: remove unused radv_printflike()

  • radv: remove pre-declarations needed for WSI entrypoints

  • radv: remove remaining forward declarations and comments in radv_private.h

  • radv: replace RADV_FROM_HANDLE by VK_FROM_HANDLE

  • radv: add missing endif comment for some headers

  • radv: rename remaining phys_dev occurrences to pdev

  • radv: replace radv_minify() by u_minify()

  • radv: replace align_{u32,u64}() by align{64}()

  • radv: replace align_u32_npot() by ALIGN_NPOT

  • radv: replace radv_float_to_{u,s}fixed() by util_{un}signed_fixed()

  • util: add util_is_aligned()

  • radv: replace radv_is_aligned() by util_is_aligned()

  • radv: move RADV_SUPPORT_CALIBRATED_TIMESTAMPS to radv_physical_device.c

  • radv: move RADV_API_VERSION to radv_instance.h

  • radv: move CLOCK_MONOTONIC_RAW define to radv_physical_device.h

  • radv: move RADV_USE_WSI_PLATFORM define to radv_wsi.h

  • radv: remove radv_private.h

  • radv: make radv_get_vgt_index_size() static

  • radv: move radv_get_user_sgpr() to radv_shader.c

  • radv: move radv_queue_family_to_ring() to radv_queue.c

  • radv: remove old comment in radv_cs.c

  • radv: move radv_printf_data to radv_printf.h

  • radv: make sure the heap budget is less than or equal to the heap size

  • radv: use SPDX-License-Identifier

  • radv: enable VK_EXT_shader_object by default

  • aco: use SPDX-License-Identifier

  • ci: uprev vkd3d-proton to c3b385606a93baed42482d822805e0d9c2f3f603

  • docs: mark VK_KHR_maintenance6 as DONE for RADV

  • radv: determine if the cache is disabled at device creation time

  • radv: add skip_shaders_cache also for compute/rt pipelines

  • radv: stop using a graphics pipeline for generating the graphics key

  • radv/rt: constify device in radv_init_rt_stage_hashes()

  • radv/rt: handle creation feedback like graphics/compute pipelines

  • radv/rt: stop passing pCreateInfo to radv_ray_tracing_pipeline_cache_search()

  • radv/rmv: fix missing image bind logging for WSI images

  • radv: fix missing addr binding report for WSI image binds

  • radv: fix addr binding report for disjoint image binds

  • radv/rmv: fix image binds logging for disjoint images

  • radv: add a helper to set image bindings

  • radv: fix missing unbind report when an image is destroyed

  • radv: fix missing unbind report when a buffer is destroyed

  • radv/rt: remove dead code about intersection shaders in radv_pipeline_get_shader_key()

  • radv: add a helper for hashing pipelines

  • radv: rework and add a helper for hashing a compute pipeline

  • radv: stop ignoring shader stages that don’t need to be imported with GPL

  • radv: add missing SQTT markers when an indirect indexed draw is used with DGC

  • radv/rt: use radv_pipeline_hash_shader_stage()

  • radv/rt: stop computing unused hash for the traversal shader

  • radv: use canonicalized VA for VM fault reports

  • radv: simplify importing pipeline layout with GPL

  • radv: return early when PS is NULL in radv_pipeline_init_blend_state()

  • radv: simplify checking for PS epilogs in radv_pipeline_init_blend_state()

  • radv: remove unused parameter in radv_skip_graphics_pipeline_compile()

  • radv: simplify the check for exporting multiview in the last VGT stage

  • radv/rt: remove unnecessary pipeline parameter to radv_rt_fill_group_info()

  • radv/rt: remove unnecessary pipeline parameter to radv_generate_rt_shaders_key()

  • radv/rt: initialize shader group capture/replay in a separate function

  • radv/rt: rework handle_from_stages to pass hashes directly

  • radv/rt: insert shaders to cache right after they are compiled

  • radv/rt: add radv_rt_pipeline_compile()

  • radv: clear color attachments without exports before compaction

  • ci: uprev CTS to vulkan-cts-1.3.8.0

  • radv/ci: add one more flake since CTS 1.3.8.0 for RENOIR

  • radv/ci: update lists for TAHITI and Zink/Polaris10

  • radv/ci: reduce the parallelism of navi21 to 3

  • radv: fix waiting for occlusion queries on GFX6-8

  • radv: return per plane requirements for disjoint images

  • zink/ci: update CI lists since piglit uprev

  • radv/rt: remove unnecessary param to radv_ray_tracing_pipeline_cache_insert()

  • radv/rt: move radv_ray_tracing_pipeline::sha1 to radv_pipeline

  • radv: use radv_pipeline::sha1 for graphics/compute pipelines

  • radv: rework pipeline cache search helpers

  • radv: add RADV_DEBUG=psocachestats to report per-pipeline cache hits/misses

  • vulkan: pass cmdbuf level to vk_command_buffer_ops::create()

  • radv/amdgpu: do not use IB2 for nested command buffers

  • radv: track if nested command buffers uses indirect draws

  • radv: advertise VK_EXT_nested_command_buffer

  • ac,radeonsi: add helpers to compute the number of tess patches/lds size

  • radv: rework the number of tess patches computation

  • ac: allow to use 64K of LDS for tessellation on GFX9+

  • ci: uprev CTS to 1.3.8.2

  • radv: fix image format properties with fragment shading rate usage

  • radv: remove bogus VkShaderCreateInfoEXT::flags being 0 assert for compute

  • radv: allow 3d views with VK_IMAGE_CREATE_2D_VIEW_COMPATIBLE_BIT_EXT

  • radv: mark some formats as unsupported on GFX8/CARRIZO

  • radv: set image view descriptors as buffer for non-graphics GPU

  • radv: only set ALPHA_IS_ON_MSB if the image has DCC on GFX6-9

Sathishkumar S (5):

  • ac/gpu_info: query the number of ip instance

  • radeonsi/vcn: avoid hard-coding the number of jpeg instance

  • ac/gpu_info: fix regression in vulkan hw decode

  • radeonsi/vcn: use num_instances from radeon_info

  • ac/gpu_info: update multimedia info

Sean Anderson (2):

  • gallium: lima: Don’t round height when checking alignment

  • Add Xilinx ZynqMP KMSRO entrypoint

Sebastian Wick (4):

  • radeonsi: Destroy queues before the aux contexts

  • util: Add timespec_sub_saturate to avoid negative time for deadlines

  • loader/wayland: Add fallback wl_display_dispatch_queue_timeout

  • vulkan/wsi/wayland: Use dispatch_queue_timeout in acquire_next_image

Sergi Blanch Torne (10):

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci: Nightly run expectations update

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • Uprev Piglit to f7ece74a107a2f99b2f494d978c84f8d51faa703

  • ci: kernel stored in a different s3 bucket

  • ci: identify and label S3 buckets

Shih, Jude (4):

  • amd/vpelib: Need a debug flag to support 2tap downscaling

  • amd/vpelib: Add VPE prefix on API to avoid naming conflict

  • amd/vpelib: Solve the downscaling problem for 2 tap

  • amd/vpelib: Solve link error due to missing static for one function

Sil Vilerino (25):

  • d3d12: Do not assume multi-subregion support when querying for driver encode support

  • d3d12: Implement cap for PIPE_VIDEO_CAP_ENC_INTRA_REFRESH

  • d3d12: Clean up H264 video decode interlaced code path

  • frontends/va, d3d12: Fix PIPE_VIDEO_SLICE_MODE_MAX_SLICE_SICE -> PIPE_VIDEO_SLICE_MODE_MAX_SLICE_SIZE typo

  • d3d12: d3d12_video_encoder_negotiate_current_h264_slices_configuration to use correct mode when intra-refresh is on

  • d3d12: Do not use PIPE_BIND_DISPLAY_TARGET for d3d12_video_buffer

  • d3d12: AV1 encode - Configure CQP using qp and new qp_inter parameters

  • d3d12: H264 encode - Update CQP using current frame type as per VA frontend change

  • d3d12: HEVC encode - Update CQP using current frame type as per VA frontend change

  • frontend/va: Support media only post proc without compositor using shaders or surfaces

  • frontend/va: Use get_resources in VaDeriveImage for media only devices without get_surfaces support

  • d3d12: Add partial media, compute, graphics support with CORE and GENERIC feature levels

  • d3d12: Refactor graphics functions from context and blit to separate files

  • d3d12: Add GetDesc wrapper for ID3D12Heap

  • d3d12: Only check D3D12_FEATURE_DATA_PLACED_RESOURCE_SUPPORT_INFO for D3D_FEATURE_LEVEL_1_0_GENERIC

  • d3d12: Video Encode - Add driver workaround for rate control reconfiguration

  • d3d12: Implement PIPE_VIDEO_CAP_ENC_SURFACE_ALIGNMENT

  • d3d12: Bump directx-headers dependency to v613

  • d3d12: Support H264 slice L0/L1 active number override

  • d3d12: Support HEVC slice L0/L1 active number override

  • d3d12: Fix leak of batch->bos on video-only builds

  • d3d12: Fix leak dxil_module::serialized_dependency_table

  • d3d12: Fix util_blitter_destroy destruction ordering

  • vl_win32_screen_create: Take ownership of winsys injected to created d3d12_screen

  • d3d12/ci: Add vainfo with appverifier CI check

Simon Ser (2):

  • egl/wayland: ensure wl_drm is available before use

  • egl/wayland: explain why implicit modifier downgrade is allowed

Stéphane Cerveau (1):

  • vulkan/video: hevc: b-frames can be reference or not

Surafel Assefa (1):

  • radeonsi: Adds return on failure to get plane info

Sviatoslav Peleshko (6):

  • nir: Use alu source components count in nir_alu_srcs_negative_equal

  • anv,driconf: Add sampler coordinate precision workaround for AoE 4

  • driconf: Apply dual color blending workaround to Dying Light

  • anv: Store host-located copy of NULL surface state for faster memcpy

  • dri: Flush the context after flush_resource when creating shareable image

  • anv: Fix descriptor sampler offsets assignment

Tapani Pälli (35):

  • hasvk: remove cmd_buffer_ray_query_globals function decl

  • hasvk: remove gfx9 specific code from emit_sample_pattern

  • hasvk: remove softpin (GFX_VERx10 >= 90) related code

  • hasvk: remove gfx9 specific cs stall from emit_ps_depth_count

  • anv: check for wa 16013994831 in emit_so_memcpy_end

  • iris: expand pre-hiz data cache flush to gfx >= 125

  • anv: expand pre-hiz data cache flush to gfx >= 125

  • iris: replace constant cache invalidate with hdc flush

  • anv: move *bits_for_access_flags to genX_cmd_buffer

  • anv: use workaround framework for Wa_22018402687

  • intel/blorp: add a TODO note about stencil buffer resolve

  • intel: refactor urb configuration, add intel_urb_config

  • intel/common: provide a helper for urb setup comparison

  • blorp/crocus: refactor blorp_emit_urb_config

  • iris: implement Wa_16014912113

  • anv: implement Wa_16014912113

  • blorp: implement Wa_16014912113 callback for drivers

  • anv: flush tile cache independent of format with HIZ-CCS flush

  • anv: revert cache flushing changes for indirect commands

  • intel/blorp: disable use of REP16 independent of format

  • iris: make sure DS and TE are sent in pairs on >= gfx125

  • iris: make sure aux is disabled for external objects

  • anv: make sure aux is disabled for memory objects

  • hasvk: make sure aux is disabled for memory objects

  • crocus: make sure aux is disabled for memory objects

  • anv: use workaround framework for Wa_16013000631

  • anv: setup distribution granularity with Wa_14019166699

  • iris: refactor function that checks primitive id usage

  • iris: setup distribution granularity with Wa_14019166699

  • anv: disable fcv optimization on >= gfx125

  • intel/blorp: remove unused blorp batch flag

  • intel/compiler: add assert for Wa_22017182272

  • anv: add dirty tracking for push constant data

  • iris: change stream uploader default size to 2MB

  • anv: skip gfx push constants alloc optimization on gfx9/11

Tatsuyuki Ishi (8):

  • radv: Recompute max_waves after postprocessing RT config

  • radv: never set DISABLE_WR_CONFIRM for CP DMA clears and copies

  • util: Optimize mesa_hex_to_bytes

  • radv: Add radv_spirv_to_nir_options that summarize early gfx states.

  • radv: Rename cache_search_nir to cache_lookup_nir_handle.

  • radv: Re-sort RADV_PERFTEST env vars in docs

  • radv: Implement NIR caching behind RADV_PERFTEST=nircache.

  • radv: Remove radv_queue::device again

Teng, Jin Chung (1):

  • d3d12: HEVC Encode - Query slice config mode based on user slice setting

Thomas H.P. Andersen (6):

  • nvk: promote VK_EXT_index_type_uint8 to KHR

  • nvk: promote VK_EXT_line_rasterization to KHR

  • nvk: promote load_store_op_none to KHR

  • docs: update features.txt for nvk

  • nvk: support driconf option force_vk_vendor

  • driconf: override vendor id for X4 Foundations on NVK

Thong Thai (2):

  • radeonsi/vcn: remove EFC support for renoir

  • frontends/va/postproc: do not use efc if image is to be translated

Timothy Arceri (78):

  • glsl: add nir version of validate_geometry_shader_emissions()

  • glsl: use nir version of geom stream validation

  • glsl: remove now unused GLSL IR validate_geometry_shader_emissions()

  • glsl: don’t tree graft globals

  • Revert “ci: Enable GALLIUM_DUMP_CPU=true only in the clang job”

  • glsl: add basic params for AMD_gpu_shader_half_float extension

  • glsl: add half float support to the parser

  • glsl: add explicit half float conversion support

  • glsl: update assert to allow for half float support

  • glsl: add glsl_type_is_float_16() helper

  • glsl: add implicit half float conversions

  • glsl: add ubo packing support for half floats

  • glsl: skip conversion of half float back to float for GL queries

  • glsl: add some new helpers for half float builtin functions

  • glsl: add half float angle and trigonometry functions

  • glsl: add half float exponential functions

  • glsl: add f2f16() helper to ir_builder

  • glsl: add half float support for common functions

  • glsl: add support for half float packing functions

  • glsl: add half float geometric functions

  • glsl: add half float matrix functions

  • glsl: add half float vector relational functions

  • glsl: allow half float varyings

  • glsl: add half float interpolation functions

  • glsl: add half float derivative functions

  • glsl: add half float AMD_shader_trinary_minmax functions

  • compiler/types: Add a contains_32bit helper

  • gallium: add PIPE_CAP_FP16 for AMD_gpu_shader_half_float

  • glsl: add missing error check for half float varying

  • nir: allow gather info to handle nir_deref_type_array_wildcard

  • glsl: support array wildcards in lower named interface blocks

  • glsl: split var copies before lowering named interfaces

  • glsl: fix potential crash in expression flattening

  • glsl: move some lowering to the compiler

  • glsl_to_nir: merge function param handling

  • glsl_to_nir: support conversion of struct/array function params

  • glsl_to_nir: support conversion of struct/array function returns

  • glsl_to_nir: support conversion of opaque function params

  • glsl: don’t inline functions in glsl ir

  • nir: add some nir_parameter fields

  • glsl: add missing define to linker_util.h

  • glsl: add nir version of function recursion detection

  • glsl: move function inlining out of glsl_to_nir()

  • glsl: make use of nir recursion detection

  • glsl: implement nir version of lower discard flow

  • glsl: make use of nir lower discard flow

  • glsl: remove now unused glsl ir lower discard pass

  • glsl: make an explicitly safe version of visit_exec_list()

  • glsl_to_nir: never convert instructions after jump

  • glsl: remove unrequired do_lower_jumps() call

  • glsl: move invariant builtin validation to the nir linker

  • nir: add max_array_access data field

  • nir: add implicit_sized_array data field

  • glsl: add resize_tes_inputs() to the nir linker

  • nir: add variable field from_ssbo_unsized_array

  • glsl: don’t remove redefined per vertex block

  • glsl: add nir implementation of block validation

  • glsl: switch to NIR block validation

  • glsl: call new nir resize_tes_inputs() pass

  • glsl: remove now unused resize_tes_inputs()

  • glsl: remove now unused glsl ir block validation

  • glsl: move some linking calls to gl_nir_link_glsl()

  • glsl: switch verify_subroutine_associated_funcs() to nir

  • nir: add subroutine fields to nir_function

  • glsl: move link_assign_subroutine_types() to the nir linker

  • glsl: move check_explicit_uniform_locations() to NIR linker

  • glsl: move mode_string() to helper

  • glsl: add some data members to nir_variable

  • glsl: make validate_intrastage_arrays() usable across files

  • glsl: move cross_validate_uniforms() to the nir linker

  • glsl: use shader info to store gs verts

  • glsl: use info from shader when linking

  • glsl: move validate_{stage}_shader_executable() to the nir linker

  • glsl: remove now unused do_dead_functions()

  • glsl: remove FragDepthLayout field

  • glsl: remove ActiveStreamMask field

  • glsl: remove UsesEndPrimitive field

  • glsl: inline _mesa_copy_linked_program_data()

Timur Kristóf (121):

  • radv: Correctly select SDMA support for PRIME blit.

  • nir: Fix divergence of reductions.

  • nir: Fix divergence analysis of load_patch_vertices_in.

  • nir: Cleanup divergence analysis for mesh shaders.

  • nir: Clean up divergence analysis for TES patch input loads.

  • aco: Eliminate SCC copies when possible.

  • radv: Lower mesh shader draw ID to zero when they have a task shader.

  • radv: Extract input and output stride info to new functions.

  • radv: Use mapped driver locations for determining I/O strides.

  • aco: Allow passing constant operand to is_overwritten_since.

  • radv/llvm: Remove dead code.

  • radv: Allow NGG culling with LLVM.

  • compiler: Add helper for counting tess level components.

  • ac/nir/tess: Always record tess level info and use it at the end.

  • ac/nir/tess: Don’t record mapped tess level location.

  • ac/nir/tess: Split tess factor write into multiple functions.

  • ac/nir/tess: Emit tess factor output independently of whether it can be passed by registers.

  • ac/nir/tess: Refactor how the end of HS is emitted.

  • aco: Use common helper for counting tess level components.

  • aco: Use tess factors when TCS jumps to epilog.

  • radv: Declare tess_lvl_in/out args for TCS epilogs.

  • radv: Always pass tess factors to epilogs in registers.

  • radv, aco: Delete now dead TCS epilog code.

  • nir: Add two new AMD specific tess intrinsics.

  • radeonsi: Implement new intrinsics for monolithic shaders.

  • radv: Copy TES primitive mode to TCS info.

  • radv: Implement new tess intrinsics.

  • radv: Call nir_opt_dead_cf in radv_optimize_nir_algebraic.

  • ac/nir/tess: Emit tess factor stores based on new intrinsics.

  • radv: Completely delete TCS epilogs.

  • radv, aco: Remove the code that jumped to RADV’s TCS epilogs.

  • ac/llvm, radeonsi: Handle tess_rel_patch_id in common code.

  • radeonsi: Put HS output count in TCS offchip layout, not patch data offset.

  • radeonsi: Implement dynamic TCS intrinsics for non-monolithic shaders.

  • radeonsi: Delete TCS epilogs entirely.

  • aco: Delete all TCS epilog code.

  • radeonsi: Add number of VS outputs to TCS output layout.

  • radeonsi: Remove tess bits from VS state.

  • radeonsi: Use one more bit for number of patches in TCS offchip layout.

  • ac/nir/tess: Remove dead code that was meant for epilogs.

  • radv: Add number of LS and HS outputs to tcs_offchip_layout.

  • radv: Change input patch size in TCS offchip layout to match RadeonSI.

  • radv: Change number of patches in TCS offchip layout to match RadeonSI.

  • radv: Include output patch size in TCS offchip layout.

  • radv: Reuse TCS offchip layout between TCS and TES.

  • nir/gather_info: Record per-primitive outputs without variables.

  • nir: Record per-primitive inputs without variables.

  • nir/recompute_io_bases: Sort per-primitive PS inputs last.

  • ac/nir: Introduce ac_nir_calc_io_offset_mapped.

  • ac/nir/tess: Load tess factors from variable when they are passed in registers.

  • ac/nir/tess: Clarify when a TCS output is stored in LDS or VRAM.

  • ac/nir/tess: Return undef when loading an unwritten TCS output.

  • ac/nir/tess: Map TCS LDS IO locations without gaps.

  • ac/nir/tess: Calculate reserved LDS outputs based on IO info.

  • ac/nir/tess: Remove superfluous args for reserved TCS outputs.

  • ac/nir/tess: Clarify when VS-TCS I/O can use registers.

  • radv: Only add extra dword to LS-HS stride when there are LS outputs.

  • radv: Pass key structures to gather intrinsic info.

  • radv: Extract gather_load_vs_input_info function.

  • radv: Slightly refactor gather_intrinsic_store_output_info.

  • radv: Record PS input clip/cull mask instead of number.

  • radv: Use NIR IO semantics to determine GS output info.

  • radv: Add helper for determining per-attribute vertex buffer descriptors.

  • radv: Add helper to determine usage of VS prologs.

  • radv: Remove unused VS input usage mask.

  • radv: Use NIR IO semantics to determine VS input info.

  • radv: Use IO semantic location for shader output info.

  • aco/optimizer_postRA: Remove a check from SCC no-compare optimization.

  • radv: Use NIR IO semantics to determine FS input info.

  • radv: Remove I/O variables after nir_lower_io.

  • radv: Slightly refactor the determination of max_ps_params.

  • radv: Increase maximum allowed PS params for enabling NGG culling.

  • radv: Remove unused gfx_level from gfx10_emit_ge_pc_alloc.

  • ac/nir/ngg: Don’t create dummy output variable for primitive ID.

  • ac/nir/ngg: Use IO semantics for determining instance rate inputs.

  • ac/nir/ngg: Rename confusing driver_location variable in mesh shader lowering.

  • radv: Use NIR IO semantics for VS input location mapping.

  • radv: Don’t set driver locations for mesh shaders.

  • radv: Don’t set driver locations for FS outputs.

  • radv: Don’t set driver locations for last pre-rasterization stage.

  • radv: Keep track of TCS outputs that need LDS.

  • radv: Remove dead code for creating per-patch IO mask.

  • radv: Add radv_gather_unlinked_io_mask to shader info header.

  • radv: Always use fixed I/O locations for TCS outputs in VRAM.

  • radv: Clean up gathering linked I/O info.

  • nir/print: Print per-primitive and explicit strict IO info.

  • nir/recompute_io_bases: Fix per-primitive inputs.

  • nir/gather_info: Clear per-primitive I/O masks at the beginning.

  • nir/lower_io_to_scalar: Support explicit (and per-vertex) FS inputs.

  • nir/lower_io_to_scalar: Support per-primitive outputs.

  • nir/opt_varyings: Allow optimizing primitive ID for MS -> FS.

  • nir/opt_varyings: Support per-primitive I/O.

  • nir/opt_varyings: Fix explicit and per-vertex FS inputs.

  • nir/opt_varyings: Add early return when producer stage is task.

  • nir/opt_varyings: Only propagate constant MS outputs, not other uniforms.

  • nir/opt_varyings: Debug print during relocate_slot.

  • nir/opt_varyings: Fix relocate_slot so it doesn’t mix up 32-bit and 16-bit I/O.

  • nir/opt_varyings: Add workaround for RADV mesh shader multiview.

  • ac/nir/ngg: Remove support for loading mesh shader outputs.

  • ac/nir/ngg: Refactor MS primitive indices for scalarized IO.

  • ac/nir/ngg: Slightly refactor mesh shader cull flag stores.

  • ac/nir/ngg: Use just one IO semantics variable in MS output store.

  • ac/nir/ngg: Refactor update_ms_output_info.

  • ac/nir/ngg: Refactor MS output store into two functions.

  • ac/nir/ngg: Split 16-bit MS output stores by components.

  • ac/nir/ngg: Enable packing 16-bit mesh shader outputs.

  • radv: Run DCE before deleting I/O variables.

  • radv: Only consider interpolated inputs as 16-bit float.

  • radv: Refactor emitting PS input types.

  • radv: Remove superfluous bool arg from slot_to_ps_input.

  • radv: Allow using high 16 bits of PS input slots.

  • radv: Rename per_vertex_shaded_mask to explicit_strict_shaded_mask.

  • radv: Rename LDS related variables in get_tcs_num_patches.

  • radv: Calculate VRAM tess patch size independently of LDS size.

  • ac/nir/tess: Split I/O mapping to two functions.

  • ac/nir/tess: Use LDS IO mapping when loading tess levels from LDS.

  • ac/nir/ngg: Implement packed 16-bit VS/TES outputs in non-dedicated slots.

  • ac/nir/ngg: Implement packed 16-bit GS outputs in non-dedicated slots.

  • ac/nir/lower_legacy_vs: Implement packed 16-bit VS/TES outputs in non-dedicated slots.

  • ac/nir/lower_legacy_gs: Implement packed 16-bit GS outputs in non-dedicated slots.

  • ac/nir/ngg: Fix packing 16-bit MS outputs.

Tomeu Vizoso (20):

  • ci: disable Igalia farm

  • gallium/util: Fix pipe_buffer_copy

  • mesa: Import TensorFlow Lite headers

  • teflon: Initial commit

  • etnaviv: Update headers from rnndb

  • etnaviv: Add a bunch of new params for NPUs

  • etnaviv: Don’t emit boilerplate for compute only contexts

  • etnaviv: Use NN cores to accelerate convolutions

  • etnaviv: Use TP cores to accelerate tensor transformations

  • teflon: Add table with known supported models to docs

  • etnaviv: Don’t init the blitter in compute-only contexts

  • etnaviv/nn: Implement zero run length encoding of weights

  • teflon: Enable convolutions with number of output channels not divisible by 8

  • etnaviv/nn: Ensure tile_y is > 0

  • etnaviv/nn: Fix calculation of remaining out channels

  • etnaviv/nn: Move unused field to its right place in the struct

  • etnaviv/nn: Enable image cache

  • etnaviv/nn: Don’t shortcut ZRL bits calculation

  • etnaviv/nn: Keep track of the sign bit when decrementing to zero

  • etnaviv/nn: Make parallel jobs disabled by default

Tranquillity Codes (1):

  • intel: Skip ioctls for querying device info when hardware is unsupported

Valentine Burley (27):

  • tu: Promote VK_EXT_index_type_uint8 to KHR

  • tu: Promote VK_EXT_load_store_op_none to KHR

  • tu: Promote VK_EXT_line_rasterization to KHR

  • docs: Update features.txt for anv, nvk and tu

  • nvk: Enable VK_KHR_shader_subgroup_uniform_control_flow

  • nvk: Advertise VK_KHR_vertex_attribute_divisor

  • nvk: Reorder device features

  • tu: Implement VK_KHR_map_memory2

  • tu: Advertise VK_KHR_vertex_attribute_divisor

  • tu: Reorder device features

  • nvk: Fix missing implementation of creating images from swapchains

  • nvk: Expose VK_EXT_display_control

  • nvk: Expose VK_EXT_surface/swapchain_maintenance1

  • nvk: Expose VK_EXT_swapchain_colorspace

  • docs/features: Add missing VK_EXT_surface/swapchain_maintenance1 entry

  • tu/rmv: Remove tu_rmv_DebugMarkerSetObjectNameEXT

  • nvk: Trivially expose three VK_GOOGLE extensions

  • tu: Expose VK_KHR_surface_protected_capabilities

  • tu: Trivially expose three VK_GOOGLE extensions

  • docs: Update features.txt for tu

  • docs: Update features.txt and new_features.txt for anv and nvk

  • nvk: Add support for version 2 of all descriptor binding commands

  • tu: Move tu_BindImageMemory2() to tu_image.cc

  • tu: Replace TU_HAS_SURFACE with TU_USE_WSI_PLATFORM

  • tu: Fix missing implementation of creating images from swapchains

  • tu: Replace TU_FROM_HANDLE with VK_FROM_HANDLE

  • drm-shim: Stub syncobj reset ioctl

Vasily Khoruzhick (4):

  • lima: ppir: always use vec4 for output register

  • lima: ppir: use dummy program if FS has empty body

  • lima: gpir: abort compilation if load_uniform intrinsic src isn’t const

  • lima: update expected CI failures

Vignesh Raman (5):

  • ci: Add kmod

  • ci: disable Collabora’s farm due to maintenance

  • Split debian-build-testing job

  • ci: Implement support for replaying ANGLE restricted traces

  • ci: handle missing dri libraries during listing

Vinson Lee (2):

  • intel/disasm: Remove duplicate variable reg_file

  • intel/clc: Fix file descriptor leak

Visan, Tiberiu (1):

  • amd/vpelib: revert SRGB to 709

Vlad Schiller (2):

  • pvr: Implement VK_EXT_memory_budget

  • pvr: Implement VK_KHR_index_type_uint8

Yifan Zhang (2):

  • amd: Add code to enable gfx11.5.1

  • radv: initialize video decoder for GFX11.5.1

Yiwei Zhang (105):

  • venus: avoid redundant layout transition for optimal internal layout

  • venus: populate oom from ring submit alloc failures

  • vulkan/wsi/wayland: fix returns and avoid leaks for failed swapchain

  • venus: ensure object id is unique

  • venus: fix pipeline layout lifetime

  • venus: drop some redundant comment

  • venus: fix pipeline derivatives

  • venus: fix to respect the final pipeline layout

  • venus: allow tls ring submission to utilize the entire ring shmem

  • venus: default to enable GPL

  • venus: force async pipeline create on threads creating descriptor pools

  • venus: use obj handle instead of id in device memory report

  • anv: refactor wsi_memory_allocate_info handling

  • anv: optimize the implicit fencing support of external memory

  • anv: extend implicit fencing support for case requiring implicit write

  • vulkan/util: drop redundant code gen from vk_extensions_gen.py

  • vulkan/runtime: refactor to use DETECT_OS_ANDROID instead of ANDROID

  • v3dv: refactor to use DETECT_OS_ANDROID instead of ANDROID

  • venus: refactor to use DETECT_OS_ANDROID instead of ANDROID

  • hasvk: refactor to use DETECT_OS_ANDROID instead of ANDROID

  • anv: refactor to use DETECT_OS_ANDROID instead of ANDROID

  • radv: refactor to use DETECT_OS_ANDROID instead of ANDROID

  • turnip: refactor to use DETECT_OS_ANDROID instead of ANDROID

  • egl: refactor to use DETECT_OS_ANDROID instead of ANDROID

  • gallium: refactor to use DETECT_OS_ANDROID

  • util: refactor to use DETECT_OS_ANDROID

  • meson: drop -DANDROID

  • venus: update tracepoints to align with later optimizations

  • venus: fix the cmd stride used for qfb recording

  • venus: rewrite fence feedback interception to minimize batches

  • venus: refactor to add vn_cached_storage

  • venus: use vn_cached_storage for vn_queue_submission allocs

  • venus: misc cleanups for queue submission

  • venus: simplify feedback types tracking during submission

  • venus: massive feedback renamings for consistency and clarity

  • venus: refactor to add vn_queue_submission_setup_batch

  • venus: simplify to drop the struct vn_feedback_cmds accessor

  • venus: refactor semaphore feedback

  • venus: add vn_set_temp_cmd helper to initialize feedback batch cmd

  • venus: fix to ensure sfb cmds can get recycled

  • venus: mandate a few venus capsets long required before 1.0

  • venus: sync protocol for VK_KHR_fragment_shading_rate

  • venus: add VK_KHR_fragment_shading_rate

  • vulkan: fix runtime libraries’ dep against generated headers

  • venus: fix ffb batch prepare for a corner case and avoid a memcpy UB

  • vulkan: remove unused wsi_common_entrypoints include and dep

  • vulkan: properly ensure wsi_entrypoints header gen order

  • vulkan: remove header files from lib source files

  • vulkan: refactor the runtime header gen order dependency

  • anv/hasvk: default image_read_without_format to true

  • venus: qfb to track cmd handle directly

  • venus: combine query record and reset

  • venus: massive qfb renamings

  • venus: minor cmd count related refactors

  • venus: drop vn_get_temp_cmd_ptr

  • venus: simplify vn_cmd_reset and apply more code sharing

  • venus: refactor query record recycle

  • venus: rewrite qfb vn_feedback helpers

  • venus: refactor vn_queue_submission_add_query_feedback

  • venus: add vn_queue_submission_get_resolved_query_records

  • venus: optimize to further batch query records

  • venus: roundtrip now belongs to ring

  • venus: minor naming cleanups

  • venus: ensure shmem is attached to renderer before use for guest vram

  • venus: avoid excessive ring notifications

  • venus: further reduce idle timeout from 5ms to 1ms

  • venus: add enum vn_relax_reason

  • venus: avoid constant busy wait for query result waiting

  • venus: deprecate unused perf env vars

  • venus: decorate cmd enqueue macro internals with compiler hints

  • venus: add a more relaxed polling strategy

  • venus: cleanup 2 TODOs from 1.3 support

  • venus: remove obsolete TODOs

  • venus: use STACK_ARRAY to simplify modifier query

  • venus: use STACK_ARRAY to simplify BindBufferMemory2

  • venus: use STACK_ARRAY to simplify BindImageMemory2

  • venus: use STACK_ARRAY to simplify render pass creation

  • venus: use STACK_ARRAY to simplify physical device enumeration

  • venus: use STACK_ARRAY to simplify set layout creation

  • venus: use STACK_ARRAY to simplify sync wait

  • venus: rely on enum vn_descriptor_type for internal trackings

  • venus: move async_set_allocation check outside helpers

  • venus: set alloc to skip earlier for reserved and invalid bindings

  • venus: optimize mutable state restore

  • venus: misc set alloc and cleanup refactors

  • venus: drop vn_should_sanitize_descriptor_set_writes

  • venus: refactor descriptor set update and push

  • venus: use STACK_ARRAY to simplify descriptor set update and push

  • venus: use more relaxed profile for TLS ring seqno wait

  • venus: avoid the redundant template entry

  • venus: fix to drop an extra ;

  • venus: simplify push descriptor update with template

  • venus: optimize set update template data population

  • venus: simplify need and ignore rules for desc image info

  • venus: use STACK_ARRAY to simplify set template update and push

  • venus: clean up legacy descriptor update template bits

  • venus: fix swapchain image memory bind

  • venus: fix VkDeviceGroupSubmitInfo::deviceMask for feedback cmds

  • venus: avoid client allocators for ring internals

  • venus: fix to destroy all pipeline handles on early error paths

  • turnip: msm: clean up iova on error path

  • turnip: msm: fix racy gem close for re-imported dma-buf

  • turnip: virtio: fix error path in virtio_bo_init

  • turnip: virtio: fix iova leak upon found already imported dmabuf

  • turnip: virtio: fix racy gem close for re-imported dma-buf

Yogesh Mohan Marimuthu (6):

  • winsys/amdgpu: sws instead of ws for amdgpu_screen_winsys

  • winsys/amdgpu: rws instead of ws for radeon_winsys

  • winsys/amdgpu: aws instead of ws for amdgpu_winsys

  • winsys/amdgpu: use _destroy_locked() for failure to create winsys

  • winsys/amdgpu: remove tab space

  • winsys/amdgpu: add more comments for winsys create in header file

Yonggang Luo (37):

  • util: Add function util_is_power_of_two_nonzero_uintptr and macro IS_POT_NONZERO

  • asahi,panfrost: Use IS_POT_NONZERO to replace util_is_power_of_two_nonzero for different size

  • treewide: Use util_is_power_of_two_nonzero{64|_uintptr} when needed

  • svga: Cleanup duplicate ALIGN macro defines

  • nouveau: Use align64 instead of ALIGN over input layer_size_B

  • treewide: Use align64 instead of ALIGN for 64 bit value parameter

  • util: Update ALIGN prototype to match align

  • compiler/spirv: The spirv shader is binary, should write in binary mode

  • compiler/spirv: There is not need unqualify const in function vtn_string_literal

  • compiler/spirv: vtn_add_printf_string support for handling OpBitcast

  • zink: Update zink-anv-tgl flakes

  • treewide: Remove vulkan/runtime vulkan/util prefix in include path

  • freedreno/vulkan: Use vk_dynamic_graphics_state_init instead of direct assignment

  • vulkan/runtime: Mark vk_default_dynamic_graphics_state to be private

  • Revert “meson/vulkan/util: allow venus to drop compiler deps”

  • vulkan: allow building venus without libcompiler

  • glx: Remove DEBUG code in xfont.c

  • panfrost/shared: avoid use gallium helper in pan_minmax_cache.*

  • panfrost/meson: remove redundant gallium include from meson files

  • treewide: Replace the invalid usage #if DEBUG with #ifdef DEBUG

  • util: Cleanup strtod.(h|c) by introduce _mesa_get_locale

  • meson: Extract with_mesa_debug and with_mesa_ndebug for later usage

  • meson: Define MESA_DEBUG for later usage

  • treewide: Replace usage of macro DEBUG with MESA_DEBUG when possible

  • meson: Remove the non-used -DDEBUG manually

  • intel/meson: Remove redundant inc_gallium

  • radv: Remove redundant inc_gallium

  • radv: Rename src/amd/vulkan/vk_format.h to src/amd/vulkan/radv_formats.h

  • vulkan: Move vk_format_is_alpha and vk_format_is_alpha_on_msb into vk_format.h from pvr

  • pvr: inline and remove vk_format_get_channel_width

  • pvr: Merge imagination/vulkan/vk_format.h into imagination/vulkan/pvr_formats.h

  • pvr: Add pvr_ prefix for vk_format_* functions in pvr_formats.h

  • util: Fixes futex_wait on win32

  • util: futex_wait use TIME_MONOTONIC on win32 for consistency with other platforms

  • util: Turn futex_wake parameter to int32_t for consistency across platforms

  • broadcom/common: Now “util/box.h” is under src, so remove the FIXME

  • nouveau: Fixes error: unused import: `crate::nvh_classes_cl906f::*`

Yusuf Khan (4):

  • nvk: remove some dead code files

  • nvk: fix valve segfault from setting a descriptor set from NULL

  • crocus: fix potential null pointer dereference if transfer_mapping fails

  • nouveau: Fix crash when destination or source screen fences are null

Zack Rusin (1):

  • svga: Fix instanced draw detection

Zan Dobersek (11):

  • freedreno: add fd_rd_output facilities for gzip-compressed RD dumps

  • tu/msm: fix RD_CHIP_ID size used when dumping RD

  • tu: tu_device should clean up its global bo

  • vulkan/rmv: enable logging miscellaneous internal resources

  • tu: add RMV support

  • freedreno/fdl: avoid overflow in layout size computations

  • tu: fix memory leaks in tu_shader

  • fd: enable prefixing the RD output filename

  • tu/autotune: use SAMPLE_COUNT_END_OFFSET when writing the ending sample count

  • tu: RB_SAMPLE_COUNT_ADDR is also used on a7xx

  • tu/query: improve CP_EVENT_WRITE7::ZPASS_DONE usage

antonino (1):

  • zink: plug leak in `zink_create_quads_emulation_gs`

chyyran (1):

  • util/format/fxt1: include “u_format_pack.h” instead of “util/format/u_format_pack.h”

daoxiang.gong (1):

  • zink - Fix for minLod and maxLod when mipmap filter is disabled

duncan.hopkins (19):

  • compiler/clc: fix compiler issue on MacOS with st_mtim[e] in stat.

  • egl: MacOS platform guard around pthread_condattr_setclock()

  • egl: Added DRI3 code guards.

  • egl: Changed EGLNativeDisplayType size check to make sure it is big enough instead of exactly the same size.

  • gallium/dri: Switch xf86drm.h for util/libdrm.h to allow for the no-op shim to be used.

  • gallium/dri: Added XCB dependency to frontends/dri/libdrm build. Fix header issues with xcb.h being used.

  • util: Updated util/libdrm.h stubs with drmGetMagic()

  • dri: guarded DRI code.

  • glx: Switched DRI2 functions over to use Apple specific alternatives and extension name.

  • meson: relaxed some meson restrictions on MacOS/Apple allowing for wider build support.

  • apple: Extended Apple feature support using GLX_USE_APPLE.

  • apple: Meson defines GLX_USE_APPLE to allow for Gallium drivers to work on MacOS.

  • zink: Fixed header location and compiling issue with [[deprecated]] from newer MoltenVK versions.

  • zink: use portability EXT on Apple.

  • zink: stopped the use of VkFormatProperties3 if the reported API is less than 1.3 or VK_KHR_format_feature_flags2 not present.

  • zink: removed `MESA_PRIM_QUADS` from the supported `PIPE_CAP_SUPPORTED_PRIM_MODES`.

  • zink: Avoid issues when kopper tries using XCB WSI on Apple.

  • zink/apple: added `moltenvk-dir` search to allow MoltenVK to be sourced from brew.

  • zink/apple: update docs to reflect the current status of Zink on macOS.

nyanmisaka (2):

  • frontends/va: Report vendor and device ID through VADisplayPCIID

  • radeonsi/uvd_enc: update to use correct padding size

qbojj (1):

  • vulkan: Fix calculation of flags in vk_graphics_pipeline_state_fill

thfrwn (1):

  • mesa: fix off-by-one for newblock allocation in dlist_alloc