Mesa 26.0.0 Release Notes / 2026-02-11

Mesa 26.0.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 26.0.1.

Mesa 26.0.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
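
Because the reported version varies by driver and context type, an application would normally check it at run time instead of assuming 4.6. The following is a minimal sketch in Python of parsing a GL_VERSION string, which the OpenGL specification defines as a major.minor number, an optional release number, and then vendor-specific text. The function names parse_gl_version and supports are hypothetical, and the sample strings are illustrative, not output captured from any particular driver.

```python
import re

# Matches "<major>.<minor>" optionally followed by ".<release>".
# OpenGL ES strings put an "OpenGL ES" prefix before the number, so the
# pattern is searched for rather than anchored at the start of the string.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?")


def parse_gl_version(version_string):
    """Return (major, minor) parsed from a GL_VERSION string.

    Raises ValueError if the string contains no version number.
    """
    match = _VERSION_RE.search(version_string)
    if match is None:
        raise ValueError(f"no version number in {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required):
    """True if the reported version is at least `required`, a (major, minor) tuple."""
    return parse_gl_version(version_string) >= required


# Hypothetical strings; real values depend on the driver and the context requested.
print(supports("4.6 (Core Profile) Mesa 26.0.0", (4, 6)))           # True
print(supports("4.5 (Compatibility Profile) Mesa 26.0.0", (4, 6)))  # False
```

Comparing (major, minor) tuples avoids the mistake of comparing version strings lexically or as floating-point numbers.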

Mesa 26.0.0 implements the Vulkan 1.4 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
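
The apiVersion value is a packed 32-bit integer, so it has to be decoded before it can be compared against 1.4. The sketch below, in Python, mirrors the bit layout used by the VK_MAKE_API_VERSION and VK_API_VERSION_* macros in the Vulkan headers: variant in bits 31-29, major in bits 28-22, minor in bits 21-12, patch in bits 11-0. The function names are hypothetical, and the value decoded at the end is constructed for illustration, not read from a device.

```python
def make_api_version(variant, major, minor, patch):
    """Pack version fields the same way VK_MAKE_API_VERSION does."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_api_version(api_version):
    """Split a packed apiVersion into (variant, major, minor, patch)."""
    variant = (api_version >> 29) & 0x7    # 3 bits
    major = (api_version >> 22) & 0x7F     # 7 bits
    minor = (api_version >> 12) & 0x3FF    # 10 bits
    patch = api_version & 0xFFF            # 12 bits
    return variant, major, minor, patch


def at_least(api_version, major, minor):
    """True if the packed version is at least major.minor (variant 0 assumed)."""
    _, got_major, got_minor, _ = decode_api_version(api_version)
    return (got_major, got_minor) >= (major, minor)


# Illustrative value: what a driver exposing Vulkan 1.4 with patch 0 would report.
reported = make_api_version(0, 1, 4, 0)
print(decode_api_version(reported))  # (0, 1, 4, 0)
print(at_least(reported, 1, 4))      # True
```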

SHA checksums

SHA256: 2a44e98e64d5c36cec64633de2d0ec7eff64703ee25b35364ba8fcaa84f33f72  mesa-26.0.0.tar.xz
SHA512: d39d190d0a17306f0aa69033e38dd8cf458dbf8da483b768841e2dc681dd670735999b212fbe0b29be839702a20750c87d6587bd925dca10693950830a17cd55  mesa-26.0.0.tar.xz
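
A downloaded tarball can be checked against the digests above before building. The following is a minimal sketch using Python's standard hashlib module; the helper names file_digest and verify are hypothetical, and the file is assumed to be in the current directory under the name listed above.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex, algorithm="sha256"):
    """True if the file's digest matches the expected hex string (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, verify("mesa-26.0.0.tar.xz", "2a44e98e64d5c36cec64633de2d0ec7eff64703ee25b35364ba8fcaa84f33f72") should return True for an intact download; passing "sha512" as the third argument checks the longer digest instead.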

New features

  • VK_KHR_relaxed_block_layout on pvr

  • VK_KHR_storage_buffer_storage_class on pvr

  • VK_EXT_external_memory_acquire_unmodified on panvk

  • VK_EXT_discard_rectangles on NVK

  • VK_KHR_present_id on HoneyKrisp

  • VK_KHR_present_id2 on HoneyKrisp

  • VK_KHR_present_wait on HoneyKrisp

  • VK_KHR_present_wait2 on HoneyKrisp

  • VK_KHR_maintenance10 on ANV, NVK, RADV

  • VK_EXT_shader_uniform_buffer_unsized_array on ANV, HK, NVK, RADV

  • VK_EXT_device_memory_report on panvk

  • VK_VALVE_video_encode_rgb_conversion on radv

  • VK_EXT_custom_resolve on RADV

  • GL_EXT_shader_pixel_local_storage on Panfrost v6+

  • VK_EXT_image_drm_format_modifier on panvk/v7

  • VK_KHR_sampler_ycbcr_conversion on panvk/v7

  • sparseResidencyImage2D on panvk v10+

  • sparseResidencyStandard2DBlockShape on panvk v10+

  • VK_KHR_surface_maintenance1 promotion everywhere EXT is exposed

  • VK_KHR_swapchain_maintenance1 promotion everywhere EXT is exposed

  • VK_KHR_dynamic_rendering on PowerVR

  • VK_EXT_multisampled_render_to_single_sampled on panvk

  • VK_KHR_pipeline_binary on HoneyKrisp

  • VK_KHR_incremental_present on pvr

  • VK_KHR_xcb_surface on pvr

  • VK_KHR_xlib_surface on pvr

  • VK_KHR_robustness2 on panvk v10+

  • VK_KHR_robustness2 on HoneyKrisp

  • VK_KHR_robustness2 on hasvk

  • VK_KHR_robustness2 on NVK

  • VK_KHR_robustness2 on Turnip

  • VK_KHR_robustness2 on lavapipe

Bug fixes

  • 3c5c96fe raster performance regression on doom eternal

  • A commit within the last few days of this writing causes hasvk to only display black.

  • ACO: assertion in insert_exec_mask()

  • ACO: fix a hazard when the number of attributes loaded/consumed don’t match with VS prologs

  • ACO: loading 64-bit attributes can override the fetch index in VS prologs

  • ADL, ANV: Wuthering Waves leads to gpu reset on Alder Lake iGPU

  • After 25.3 update some app windows became glitchy on uhd 620

  • Amnesia: The Bunker (2023) OpenGL graphics glitch on Intel graphics

  • CI: It’s not enough to start build tests to run CI, rustfmt must also be started manually

  • Clarify gallium-rusticl-enable-drivers build option

  • Commit bc1a6b0a4121d09cab70506ad0addf70a18730bf breaks Chromium > Save As

  • Ethos gallium driver does not build on 32bit

  • Firefox crashes in some Gallium drivers since mesa 25.3.0

  • Flat bool variables (GLSL_TYPE_BOOL) are not properly managed

  • FurMark gets glitchy graphics when using Vulkan API on UHD 620 (mesa 25.2.6 and 26.0)

  • GL_AMD_framebuffer_multisample_advanced freezes the whole system

  • GTT memory leak when running OpenGL games/software on an AMD RX 6600 XT

  • Ghost of Tsushima page fault

  • Intel BDW regression due to load_push_data_intel intrinsic

  • Issue with blit from framebuffer using texture view to array texture layer

  • JSON manifest compatibility with multiarch systems

  • KHR-GL46.geometry_shader.limits.max_output_components

  • LLVM crashes when loading specific Minecraft Shaderpacks

  • LLVM instruction selection compilation error

  • LLVMPipe’s `VkPhysicalDeviceAccelerationStructurePropertiesKHR::maxPrimitiveCount` is lower than Vulkan requires.

  • MR 37884 breaks Encoding via VAAPI with FFMPEG on RX 9070XT

  • Main branch cannot be built with Python 3.10 now

  • Missing definition of __builtin_ia32_clflush since “util/cache_ops: Add some cache flush helpers”

  • Penumbra: Overture OpenGL game has graphical glitch for ice

  • Polaris, amdgpu: Application using VCE wedges GPU

  • Properly unlock global_bufmgr_list_mutex on error conditions in iris_bufmgr_get_for_fd()

  • RADV: ANGLE deqp regression

  • RADV: gfx12 RGP trace questionable utilization and time duration

  • RFE: Use _mesa_is_foo(ctx) helpers more

  • RX Vega 64 driver hang when processing a large amount of vertex shaders (OpenGOAL: Jak And Daxter 1)

  • Radv nir lowering seg-faults if given ray query proceed before initialize

  • Regression in Vulkan driver for Intel iGPU.

  • Regression: MSVC fails to build 32 bit binaries

  • RustiCL: fence fd leak on CL-GL interop

  • Shader inputs/outputs for vertex/pixel shaders that have the integer (int) type are broken on RDNA 3 and 4 graphics cards

  • Steam Deck/9060 XT Consistently hang with game demo “Cursemark”

  • Texture matrix stack pops do not seem to always update the texture matrix

  • Transcoding mpeg2video with ffmpeg h264_vulkan on Intel cause Conversion failed!

  • UB in NIR when using reallocated range_minimum_query_table

  • Uniform variable not updated correctly with shared contexts

  • Update Vulkan-Profiles and re-enable zink_check_requirements

  • VkPhysicalDeviceLimits.minMemoryMapAlignment uses hardcoded page size

  • Zink on Android: failed to create dri2 screen

  • [26.0.0~rc1] d3d12_screen.cpp:1165:(.text._ZL31d3d12_interop_query_device_infoP11pipe_screenjPv+0x4b): undefined reference to `d3d12_video_encoder_get_last_slice_completion_fence(pipe_video_codec*, void*, pipe_fence_handle**)’

  • [ANV][BMG] Regression - Flickering objects in Resident Evil Village

  • [ANV][DG2/LNL] SolarBay extreme RT regression

  • [ANV][EXT_debug_utils] descriptor set object_name leak when not calling vkFreeDescriptorSets

  • [ANV][LNL] - Alan Wake II (EGS) - Water surface at the beginning of the game has blocky textures

  • [ANV][LNL] - Detroit: Become Human (1222140) - Flickering horizontal artifacts across the screen

  • [ANV][LNL] - Eternal Strands (1491410) - Colorful graphical aberrations are present whenever a 3D asset is visible.

  • [ANV][PTL] R.E.P.O. GPU Hang

  • [ANV][PTL][DG2] Flickering textures in Assassin’s Creed Valhalla benchmark

  • [ANV][Slab][Low-ram] GPU fragmentation in low-memory device when turning on slab.

  • [BMG] Metro Exodus Enhanced Edition (1449560) - Crash

  • [RADV] [Performance] Indiana Jones TGC - lags very badly sometimes

  • [RADV]: cooperative matrix regression

  • [RADV][ACO][Feature Request] Allow op_sel in v_alignbit_b32 etc in GFX9 and GFX10

  • [RADV][GFX12] increase max image dimensions to 32768

  • [RADV][bisected][regression] - Doom: The Dark Ages (3017860) - Square flickering artifacts around Hebeth

  • [RX 9060 XT / gfx1200] VCN page fault & ring timeout during VAAPI HEVC encode with scale_vaapi

  • [anv] mpv video playback blacks out when resized larger than video resolution

  • [bisected][iris] - Celeste - Lighting artifacts during gameplay

  • [bsw][regression][bisected][hasvk] various crashing tests

  • [radv] - WITCH ON THE HOLY NIGHT (2052410) - Flickering squares on some UI elements with gfx1150/1151

  • [radv] Borderlands 4 triggers a consistent GPU page fault on RDNA2

  • [radv] Regression causes Resident Evil 4 crashes with instruction QA checks in vkd3d-proton

  • [radv] Regression causes glitches in Strange Brigade (Vulkan renderer)

  • [radv] Stuttering with latest mesa git (21 sept) on radv/6900 XT

  • [radv] [feature request] Add an env var to not expose resizable bar to app

  • [radv][bisected][regression] GhostwireTokyo RT gpu hangs with HPLOC commit

  • [regression] [bisected] RuneLite GPU Experimental - GPU crash

  • [venus] Many functions lack the VKAPI_CALL modifier, which results in compilation failure on the Win32 i686 platform.

  • [wsi_common_headless] `VkSurfacePresentModeCompatibilityKHR` is not populated when using `VK_EXT_headless_surface`

  • a8f5ced6 regression on Silent Hill 2 Remake

  • amdgpu: ring gfx_0.0.0 timeout, in vr when opening apps

  • android/driconf: sysprops get truncated

  • anv/intel-brw: enable SIMD32 shaders with ray queries

  • anv: Support VK_KHR_pipeline_binary

  • anv_finishme warning spam in journalctl

  • asahi: DMABuf import of multi-plane YCbCr (NV12 from ISP) not rendered correctly

  • brw: Gfx9 sampler messages violate r127 rule

  • ci: Add a full version of lavapipe-vkcts-asan

  • corrupted video when using pRefList0ModOperations on radv h264

  • es1-ABI-check and es2-ABI-check test fails

  • freedreno, tu: resource leak

  • freedreno: Fix KHR-GLES31.core.texture_cube_map_array.color_depth_attachments

  • game Interstellar Rift does not run on AMD while it does work on Nvidia (all models, all software versions)

  • gnome-control-center hitting assert

  • hk: NIR validation failed after nir_lower_vars_to_ssa

  • intel: Expose `XVE Pipelines XMX active (%)` performance counter

  • iris: OpenGL: GL_ARB_texture_cube_map: Broken reflections in Unreal Tournament 2004

  • lavapipe doesn’t expose VK_FORMAT_FEATURE_2_COPY_IMAGE_INDIRECT_DST_BIT_KHR

  • loader.c:156:14: error: call to undeclared function ‘drmCommandWriteRead’

  • lp_texture.c:1523:19: error: call to undeclared function ‘os_dupfd_cloexec’

  • lvp: VkDrmFormatModifierPropertiesList2EXT is not supported

  • mediafoundation: Sample leak/freeze with MFSinkWriter and DX11 usage

  • mesa: deleting a buffer bound only to an index also undoes the associated general target binding

  • mesa: regression caused by hash_table sizing

  • meson: When building radeonsi without llvm, it fails without setting amd_with_llvm to false explicitly

  • meson: mesa is double linking now?

  • nak: prepass instruction scheduler liveness assert fails (with nvk)

  • nir: Unit-test nir_opt_algebraic

  • nvk, nak: Broken icons in ENDLESS Legend 2 on a RTX 4080

  • nvk: CTS failures in sample_locations_ext.verify_interpolation.samples_1

  • panvk: Handle DRLR with more locations than attachments

  • panvk: Insufficient barriers for fragment self-dependencies

  • panvk: fau compute bug

  • r600/sfn: Assertion `cir.alu_vec.empty()` failed

  • radeonsi: crash with NIR_DEBUG=serialize

  • radv vulkan video encode does not process used_by_curr_pic_lt_flag correctly

  • radv, regression : Crysis 2 Remastered raytracing blocky reflections

  • radv: Forza Horizon 5 can trigger page fault on valid, mapped memory

  • radv: Hit assert when creating mix Shader Object

  • radv: Hit assert when over maxFragmentDualSrcAttachments but vkCmdSetColorBlendEnableEXT is set to false

  • radv: Is radv_wsi_get_prime_blit_queue bugged?

  • radv: Kingdom Come Deliverance 1 RDNA4 RGP capture has missing cache counters for dispatch

  • radv: No Man’s Sky XESS page fault GPU reset

  • radv: RB+ for depth-only is broken with unused color attachments

  • radv: RE4 Separate Ways DLC hangs RDNA2 GPU

  • radv: Strange perf delta in a particular CS in TLOU1

  • radv: don’t include constant data in RGP captures

  • radv: incorrect vectorization of 8-bit/16-bit causes random GPU hangs with DXVK

  • radv: shader miscompilation triggering a freeze

  • src/intel/blorp/meson.build:12:4: ERROR: Unknown variable “prog_mesa_clc”.

  • static linking regression since !37495 - spirv-tools shared library required at runtime if exists at build time

  • tu: GPU faults during LRZ clears on unallocated transient attachments in gmem mode

  • tu: resource leak

  • v3d: Build fails when ENABLE_SHADER_CACHE is disabled due to unconditional disk_cache access

  • v3d: green screen when rpivid hevc decoder is used

  • va no longer correctly converts YUV to RGB

  • venus: random failures in dEQP.api.info.image_format_properties2.1d

  • venus: synchronization tests sometimes get stuck in semaphore/fence wait

  • vulkan/runtime: Bad assertion for RT pipelines

  • win_bison random failure extern_stdin:40: ERROR: end of file in string

  • zink/radv: new cts fails on rdna3

Changes

Aaron Ruby (4):

  • device-select-layer: Implement VkNegotiateLayerInterface::pfnGetDeviceProcAddr

  • Revert “device-select-layer: Implement VkNegotiateLayerInterface::pfnGetDeviceProcAddr”

  • gfxstream: Support QNX-native swapchain in host codegen

  • gfxstream: Partial revert of “gfxstream: revert “gfxstream: Add Vulkan func/structs for passing debugging data to host””

Adam Jackson (1):

  • iris: Stop hardcoding 0:2:0 for the PCI bus address

Adrián Larumbe (2):

  • mesa: gallium: make GL object maximum label length a pipescreen cap

  • panfrost: match a GL object’s maximum label length to KMD uAPI limit

Ahmed Hesham (2):

  • panfrost/lima/panvk: Define a common vendor ID

  • panfrost: fix get_image_width for 1D buffer images

Aitor Camacho (79):

  • nir: Add KosmicKrisp required utilities

  • kk: Add KosmicKrisp

  • mr-label-maker: Add KosmicKrisp

  • CODEOWNERS: Add KosmicKrisp owners

  • ci: Add KosmicKrisp Linux build

  • kk: Fix Linux build valgrind dependency

  • kk: Hash vertex input state

  • kk: Expose missing BC formats

  • kk: Set drawID in root descriptor table

  • docs: Add KosmicKrisp to Vulkan

  • docs: Reorder VK_EXT_image_robustness

  • kk: Reorder physical device extensions and features

  • kk: Fix Xcode GPU capture crash

  • kk: Add env variables to enable Xcode GPU capture

  • kk: Use our own driverID value

  • kk: Avoid Metal validation error due to empty calls

  • kk: Fix addressModeW for unnormalized coordinates

  • kk: Ignore depth clear value if load op is not clear

  • kk: Force vertex attribute rebinding when pipeline changes

  • docs,kk: Add KosmicKrisp documentation

  • kk: Add MESA_KK_DISABLE_WORKAROUNDS to disable workarounds

  • util: Introduce HAVE_BUILD_ID for build id utils

  • util: Add build_id for macOS

  • kk: Fill driverUUID

  • kk: Merge io type modifying passes into one

  • kk: Add multiViewport and EXT_shader_viewport_index_layer support

  • kk: Fix image to image copy

  • kk: Use residency sets for user allocations

  • kk: Move all resource tracking to the residency set

  • kk: Exposes more extensions/features we already supported

  • kk: Mark root buffer as not dirty after updating it

  • kk: Remove mem leaks in NIR->MSL, device/sampler create and cmdbuf release

  • kk: Track fragment helper status since Metal does not correctly demote them

  • kk: Remove mem leaks in cmd buf destroy and residency set creation

  • kk: Force attachment load as temp solution to preserve attachment

  • kk: Handle memory coherency for textures and buffers

  • kk: Clamp negative array indices to 0

  • vulkan/cmd_queue: Use vk_strdup and free allocated string memory

  • vulkan/wsi: Fix double destroy of present_id_timeline at swapchain create

  • docs,kk: Add KosmicKrisp environment variable documentation

  • kk: Guard writes after fragment demote

  • kk: Apply robustness only when requested

  • kk: Expose more features/extensions we already support

  • wsi/metal: Fix command buffer release at destroy

  • wsi/metal: Fix blit_imate_to_image’s pool selection for cmd buffer alloc

  • kk: Expose shader storage image read/write without format

  • kk: Expose shaderImageGatherExtended

  • kk: Match float formats to actual Metal features (union of Apple and Mac2)

  • kk: Expose ASTC HDR formats

  • kk: Fix emulated format’s swizzle

  • kk: Expose 4444 and ycbcr 2plane 444 formats

  • kk: Enable fragmentStoresAndAtomics

  • kk: Enable float16 and int8

  • kk: Account for dynamic VI when flushing draw state

  • kk: Mark graphics descriptors’ root dirty when dirtying graphics state

  • kk: Remove unneeded entrypoints in kk_encoder.h

  • kk: Split internally encoder fence signal and end

  • kk: Simplify compute and blit encoder start

  • kk: Change queue writes timing for easier compute merge for Metal4 upgrade

  • kk: Update query availability only if it has availability

  • kk: Propagate availability before we reset it in vkCmdResetQueryPool

  • kk: Remove signal and end from upload writes not to end compute encoders

  • kk: Remove render pass logic in event set/reset entrypoints

  • kk: Attachmentless render passes start postponed to pipeline bind

  • kk: Expose occlusionQueryPrecise

  • kk: Add environment variable to force robustness on all shaders

  • kk: Fix maxTexelBufferElements value

  • nir/opt_varyings: Support implementations that cannot compact 16-bits

  • kk: Fix compilation error when viewMask is 0

  • util: Fix HAVE_BUILD_ID ifdefs

  • kk: Expose extendedDynamicState required by VK_EXT_extended_dynamic_state

  • kk: Fix reported maxInlineUniformBlockSize to match spec expectations

  • kk: Remove unneeded member in kk_descriptor_set_binding_layout

  • kk: Handle unbound sets that contain dynamic buffers

  • kk: Fix texturequerylod

  • kk: Disable KHR_shader_maximal_reconvergence since subgroups are broken

  • nvk: Handle unbound sets that contain dynamic buffers

  • hk: Handle unbound sets that contain dynamic buffers

  • kk: Fix disabling workaround 4

Aksel Hjerpbakk (3):

  • panvk: refactor vk_stage_to_subqueue_mask

  • panvk: cull semaphores in unrelated subqueues

  • panvk: include cmd stages for semaphores on submit

Alejandro Piñeiro (4):

  • panfrost: cleanup outputs_read/outputs_written at pan_shader_info

  • mesa/st: add a warning if can’t set SoftFP64

  • panfrost/job: avoid shadowing variable name

  • pan/bi: report stats only if the shaders got compiled

Aleksi Sapon (7):

  • nir, vk: fix MSVC unused variable warning

  • llvmpipe: doc fixes

  • llvmpipe: use half-even rounding in norm and fixed mul

  • llvmpipe: use half-even rounding in lerp

  • llvmpipe: fix 64bit unpack on x86

  • llvmpipe: lerp rounding test

  • llvmpipe, virgl: update CI traces

Alessio Belle (2):

  • pvr: add device info for GE7800 (15.5.1.64)

  • pvr: add device info for GE8300 (22.67.54.30)

Alexander von Gluck (1):

  • egl/haiku/meson: Include shared libglapi code for dispatch functions

Allen Ballway (1):

  • android: support longer property names

Alyssa Milburn (1):

  • nv50,nvc0: Don’t set caps.max_texture_mb

Alyssa Rosenzweig (65):

  • brw: use the right int8/int16 division lowering

  • util: require typeof support

  • util/dynarray: infer type in append

  • anv: use D3D-compatible texturing for Proton

  • nir/lower_two_sided_color: cleanup

  • util: add util_ptr_is_aligned helper

  • nir: use alignment helpers more

  • intel: do not NIH util_is_aligned

  • intel: use util_is_aligned more

  • asahi: do not NIH util_is_aligned

  • panfrost,tu: use util_is_aligned

  • pvr: don’t NIH alignment helpers

  • people: add Yonggang

  • asahi,ail: fix multi-plane imports

  • util: add UTIL_DYNARRAY_INIT sentinel

  • treewide: use UTIL_DYNARRAY_INIT

  • util: add BITSET_BYTES helper

  • util: add BITSET_RZALLOC

  • treewide: use BITSET_BYTES, BITSET_RZALLOC

  • asahi: clang-format

  • brw,elk: drop unused spirv->nir routines

  • agx: use sparse live-sets

  • poly: fix cull distance

  • util: fix container_of on MSVC

  • pan/bi: initialize variable to fix warning

  • pan/bi: clean up NIR

  • nir/sweep: fix use-after-free with dominance LCA

  • nir/lower_wrmasks: drop support for I/O

  • nir/lower_wrmasks: drop callback

  • nir/lower_wrmasks: clean up & deprecate pass

  • brw: only initialize sample mask flag if needed

  • brw: only lower flrp once

  • people: update Marek’s email

  • nir: print nir_tex_instr::backend_flags if present

  • util/bitset: allow BITSET_*_RANGE(x, 0, -1)

  • util: fix (amusing) find-n-replace fail

  • util: add BITSET_*_COUNT macros

  • treewide: use BITSET_*_COUNT

  • asahi: clang-format

  • hk: fix flrp lowering

  • brw: constant fold before texture lowering

  • agx: fix AGX_MESA_DEBUG=nopreamble

  • asahi: test tilebuffer offsets

  • asahi: tightly pack tilebuffer

  • asahi: use flat tile size encoding

  • asahi: inline agx_shared_layout_from_tile_size

  • asahi: fix garbage with query reads

  • hk: hide vertexPipelineStoresAndAtomics

  • asahi/ci: skip fp64 subgroup tests

  • panfrost,nir: drop my lonely Authors tags

  • pan/mdg: clean up csel typing pass

  • nir: add nir_is_shared_access helper

  • brw: use nir_is_shared_access

  • agx: use nir_is_shared_access

  • pan/mdg: use nir_is_shared_access

  • ac/nir: use nir_is_shared_access

  • nir/builder: infer txf_ms/txl/txb opcodes

  • brw/nir_lower_fs_load_output: unify texture builders

  • vk/meta_copy_fill_update: simplify tex builder

  • radv: cleanup texture builder

  • asahi: use nir_txf

  • hk: unify tex builders

  • agx: fix SSA repair with phis with constants

  • brw: combine peephole select calls

  • nir: disable fast-math for lowering conversions

Alyssa Ross (1):

  • rocket: fix building for musl

Andrew Sinclair (2):

  • gfxstream: revert “gfxstream: Add Vulkan func/structs for passing debugging data to host”

  • gfxstream: revert “gfxstream: Remove unnecessary tag to simplify perfetto trace config”

André (1):

  • nouveau: fix memory leak by freeing drm version before returning

Andy Hsu (3):

  • meson: Support intel tools on Android.

  • u_trace: remove redundant char* to string conversion (v2)

  • intel/decoder: make libvulkan_intel depend on stub decoder when buildtype=release.

Anna Maniscalco (7):

  • nir/lower_tex: copy `is_sparse` when lowering txd

  • radv: recalculate legacy_gs_info on bind

  • radv: consistently use the value in bytes for esgs_itemsize

  • freedreno/fdl: add astc hdr formats

  • tu: advertise EXT_texture_compression_astc_hdr

  • docs/features: advertise GL_KHR_texture_compression_astc_hdr on zink

  • zink: fix use_reusable_pool condition

Antonio Ospite (3):

  • mesa: replace most occurrences of getenv() with os_get_option()

  • nouveau/drm-shim: remove double ‘/’ in include path

  • meson/android: bump platform-sdk-version to Android 15

Arcady Goldmints-Orlov (10):

  • kk: enable dualSrcBlend

  • kk: enable logicOp

  • kk: enable shaderDrawParameters

  • kk: enable shaderStencilExport

  • kk: Enable VK_EXT_shader_atomic_float

  • kk: enable VK_KHR_workgroup_memory_explicit_layout

  • kk: enable VK_KHR_vertex_attribute_divisor

  • kk: Enable independentBlend

  • nir: Use nir_shader_intrinsics_pass in nir_lower_io_to_scalar

  • kk: enable shaderClipDistance

Arjob Mukherjee (1):

  • pvr: Fixup for deqp-vk.api 2d.optimal.* conformance

Arzaq Naufail Khan (1):

  • anv: eliminate dead code

Ashish Chauhan (7):

  • pvr: Make display node optional

  • pvr: store arch in device-info

  • pvr: move PVR_TEX_FORMAT_COUNT to pvr_limits.h

  • pvr: split pvr_spm.c

  • pvr: split pvr_formats.c

  • pvr: mark pvr_queue.c as multi-arch

  • pvr: prepare for multi-gen compilation

Ashley Smith (1):

  • panfrost,panvk: Enable shader_realtime_clock on panthor 1.6

Augustin Cavalier (1):

  • renderdoc: Add Haiku platform support

Autumn Ashton (1):

  • radv/video: Implement VK_VALVE_video_encode_rgb_conversion

Benjamin Cheng (24):

  • radv/video: Fill maxCodedExtent caps first

  • radv/video_enc: Cleanup slice count assert

  • radeonsi/vcn: Check and override primary_ref_frame

  • radv/video: Override H265 SPS block size parameters

  • radv/video: Override H265 SPS unaligned resolutions

  • vulkan/video: NULL check codec-specific chain

  • radeonsi/vcn: Re-enable AV1 unidir for new FW

  • radv/video: Fix dummy DPB addresses

  • ac,radeonsi/vcn,radv/video: Drop signature param

  • radv/video: Align each layer of encode DPB to 256

  • ac/parse_ib: Implement VCN dec message parsing

  • radv/video: Fix num_ref_idx_l{0,1} related overrides

  • radv/video: Fix H264/H265 reference selection

  • radv/video: Support two L0 refs on VCN3+

  • radv/video: Override direct_spatial_mv_pred to 1

  • radv/video: Fix force_integer_mv=1 on intra frame

  • radv/video: Always end ref pic modification list

  • radv/video: Move probability table filling to bind

  • radv/video: Enable write combine for decode

  • radeonsi/vcn: Factor out rec_alignment

  • radeonsi/vcn: Allocate DPBs aligned to rec_alignment

  • radv/video: Allow aliasing of video images

  • radv/video_enc: Remove CTS WA

  • radv/video: Use a more reliable way of computing tile sizes

Benjamin Otte (1):

  • radv: Limit GTK workaround to affected versions

Bernd Kuhls (1):

  • blake3: add blake3_neon.c only for little endian archs

Bohan Yu (1):

  • Panfrost: Fix un-split 64-bit address for store_scratch instruction

Boris Brezillon (40):

  • nir: Prepare nir_lower_io_vars_to_temporaries() for optional PLS lowering

  • nir: Teach nir_lower_io_vars_to_temporaries() about PLS vars

  • nir: Add a pass to downgrade inout PLS vars to {in,out} only ones

  • panvk/bifrost: Fix YCbCr texture/sampler array indexing

  • pan/cs: Fix cs_extract_tuple()

  • pan/cs: Fix bitop helpers

  • pan/cs: Rename cs_select_sb_entries_for_async_ops()

  • pan/decode: Print defer mode in deferrable instructions

  • panvk/csf: Make sure we don’t get the same iter SB assigned twice in a row

  • panvk/csf: Prepare for more complex scoreboard transitions

  • panvk/csf: Make sure FINISH_FRAGMENTs are properly ordered

  • panvk/csf: Use cs_vt_{start,end}()

  • pan/ci: Bump kernel versions for platforms testing panvk

  • pan/ci: Disable THP on panfrost-g52-piglit

  • people: Add Christoph Pillmayer to the list

  • pan/kmod: Cache the device props at the pan_kmod_dev level

  • pan/kmod: Expose the IO coherency property

  • pan/kmod: Enforce PAN_KMOD_BO_FLAG_NO_MMAP

  • panvk: Don’t allocate memory for a buffer descriptor in CreateBufferView()

  • panvk: Add a panvk_priv_mem_check_alloc() helper and use it

  • panvk: Rely on supported_bo_flags to mask PAN_KMOD_BO_FLAG_GPU_UNCACHED

  • panvk: Add a debug flag to force CPU-uncached mappings

  • panvk: Add a debug flag to force CPU map syncs through the kernel

  • panvk: Flush pending map syncs before submission

  • panvk: Force a cacheline alignment when allocating objects from WB shared pools

  • panvk: Use WB mappings for the global RW and executable memory pools

  • panvk: Fix a memory leak in the descriptor set logic

  • pan/bi: Fix leak in bi_iterator_schedule()

  • panvk: Don’t leak shader binaries when loaded from the cache

  • pan/cs: Don’t leak builder resources

  • panvk: Free the decode context in the create_device() error path

  • pan/ci: Update the g610 flakes to avoid UnexpectedImprovement(Pass)

  • pan/ci: Extend g610-vk pre-merge test coverage

  • ci: Add panfrost drivers to debian-arm64-asan

  • pan/ci: Replace the g610-vk-full job by a g610-vk-asan one

  • zink/ci: Add tests to the anv-tgl fails list to reflect CI state

  • panvk/csf: Fix BY_REGION dependencies

  • panvk: Fix set_compute_sysval()

  • pan/ci: Keep THP enabled on the g52-piglit job

  • panvk: Fix the deviceID reported by the driver

Bram Stolk (1):

  • loader: fix UB in wayland helper code.

Caio Oliveira (32):

  • mesa/st: Lower to ALU scalar after fp64 subgroup lowering

  • intel/mda: Allow to specify directories with `-f`

  • brw: Consolidate late lowering of int64 operations

  • iris: Enable GL_KHR_shader_subgroup_* extensions for Gfx >= 9 when possible

  • brw: Fix EU validation of VxH and Vx1 region

  • brw: Fix MOV_INDIRECT lowering for various platforms

  • brw: Set relevant immediate bits for Gfx9-11 in JIP and UIP helpers

  • brw: Don’t set destination of branch instructions

  • anv, hasvk: Don’t assert on alignment if the value is known to be zero

  • brw: Remove 3src_exec_size from the field macros

  • brw: Properly set ‘desc as register’ for SEND in assembler

  • intel/mda: Use function to read content of objects

  • intel/mda: Handle better processing a lot of archives

  • brw: Move MUL related validation

  • brw: Move AVG related validation

  • brw: Move ADD related validation

  • brw: Drop asserts for brw_SRND

  • brw: Remove LINE from brw_builder and brw_generator

  • brw: Make LINE normalization into validation

  • brw: Move PLN/LINE normalization

  • brw: Add EU validation for ROR/ROL

  • brw: Move MATH related validation

  • nir/gcm: Consider dead code elimination done by GCM as progress

  • brw: Perform mark_last_urb_write_with_eot optimization after CFG

  • brw: Move normalization of 3-src instructions swizzles to a single place

  • brw: Move LRP related validation

  • brw: Consolidate generator code for emitting “regular” instructions

  • brw: Rework UIP and JIP setting code

  • brw/scoreboard: Use a predicate helper for the nomask workaround

  • brw/scoreboard: Disable nomask workaround for Xe2+

  • brw: Fix and properly use increment_a64_address()

  • brw: Fix cooperative matrix constant sources other than src0

Calder Young (16):

  • brw: fix SIMD lowering of fp16 sampler message data with multiple components

  • anv: Fix ray query shadow stack buffer size

  • intel: Fix calculation of max_scratch_ids on fused devices

  • anv: Fix missing const qualifiers on some params in anv_blorp.c

  • anv: Add shorthand for executing on the companion cmd buffer

  • anv: Use companion cmd buffer for CCS and MCS image barriers

  • anv: Fix scratch pool buffer allocation sizes

  • anv: Fix misplaced assertion in anv_scratch_pool_alloc

  • anv: Fix typo when checking if async rt scratch size changed

  • anv: Fix valgrind errors on batch buffers allocated from bo_pool

  • anv: Fix load factor for batch buffer allocation

  • anv/rt: Disable compaction for updatable acceleration structures

  • anv,brw: Allow multiple ray queries without spilling to a shadow stack

  • anv,brw: Add helper to get stack ids per dss for ray queries

  • Revert “anv,brw: Allow multiple ray queries without spilling to a shadow stack”

  • anv: Avoid dumping BVH before command buffer is submitted

Carlos Santa (5):

  • intel/tools: intel_hang_replay refactoring

  • intel/hang_replay: move common code into a lib

  • intel/tools: Handle new replay properties in the Xe KMD error dump file

  • intel/hang_replay: add Xe support

  • intel/hang_replay: add option to dump VM state as part of the dump

Casey Bowman (2):

  • anv: Remove vf_flush for start of command buffers

  • anv: Make pipeline mode switches show which mode is being entered

Caterina Shablia (9):

  • panvk: move sparse blackhole stuff to panvk_sparse.{c,h}

  • pan/lib: introduce row_align_B and array_align_B constraints

  • panvk: sparse partially-resident image -related queries

  • panvk: align rows and layers of sparse resident images

  • panvk/csf: implement sparse image non-opaque binds

  • panvk: report support for sparseResidencyImage2D

  • panvk: do not access the image in image view’s destructor

  • panvk: remove AFBC header zeroing

  • panvk: fix sparse image non-opaque binds

Chia-I Wu (4):

  • panfrost: make RUN_COMPUTE.ep_limit configurable

  • panvk: set compute_ep_limit on v12+

  • panvk: fix calculate_task_axis_and_increment

  • panvk: rework calculate_task_axis_and_increment

Christian Gmeiner (39):

  • bin/ci: Fix SyntaxWarning about return in finally block

  • bin/ci: Update python-gitlab to 5.x for Python 3.14 compatibility

  • anv: Convert DEBUG_PIPE_CONTROL logging to use mesa_log_stream

  • etnaviv: isa: Add norm_mul instruction

  • anv: Convert DEBUG_SPARSE logging to use mesa_log

  • anv: Convert DEBUG_HEAPS logging to use mesa_log

  • anv: Fix needs_temp_copy() incorrectly matching depth/stencil formats

  • util/log: Add MESA_LOG_PREFIX environment variable to control log prefixes

  • etnaviv: Disable trilinear filtering for shadow samplers

  • etnaviv: blt: Add S8_UINT_Z24_UNORM format translation

  • etnaviv: blt: Add Z16_UNORM format translation

  • mesa: OES_texture_stencil8 requires OpenGL ES 3.1

  • meson: require sysprof-capture-4 >= 4.49.0

  • anv: Convert DEBUG_SPARSE logging to use mesa_logi

  • etnaviv/ci: Add KHR-GLES2 conformance testing

  • etnaviv: Add support for ARB_vertex_type_2_10_10_10_rev

  • etnaviv: Improve flatshading

  • lavapipe: Trivially expose VK_GOOGLE_user_type extension

  • etnaviv: rs: Move RS_SINGLE_BUFFER control to per-operation basis

  • etnaviv: Defer GPU state reset until first draw call

  • lavapipe: Advertise variableMultisampleRate

  • vulkan/wsi: Add wsi_common_is_swapchain_image() helper

  • treewide: Use wsi_common_is_swapchain_image() helper

  • etnaviv: isa: Print parser error

  • etnaviv: isa: Add type suffixes to immediate value encoding

  • etnaviv: isa: Remove dual16 mode parameter from parser API

  • etnaviv: isa: Fix f16 immediate encoding

  • etnaviv: isa: Add assembler support for infinity and NaN immediates

  • pvr: Use BUILD_ID_EXPECTED_HASH_LENGTH

  • etnaviv: Update headers from rnndb

  • etnaviv: blt: Set 64BPP_FORMAT flag for clears and copies

  • etnaviv/ci: Add gitlab-ci-inc.yml to file list

  • ci: Describe imagination farm

  • ci: Build imagination vulkan driver

  • pvr/ci: Add dEQP-VK testing for BXS-4-64 on TI AM68 SK

  • pvr/ci: Increase timeout to prevent job failures

  • pvr/ci: Update CI expectations

  • meson: Restore .clang-format for ninja clang-format target

  • pan/compiler: Fix progress reporting in pan_nir_lower_store_component

Christoph Pillmayer (20):

  • pan: Enable rematerialization for more ops

  • pan: Fix bi_load_tl dst arg name

  • pan: Pull out normal block logic from compute_w_entry

  • pan: Add spill cost metric

  • pan: Make W_entry loop aware

  • nir: Fix preserved metadata in sort_unstructured_blocks

  • nir: Update progress info in nir_sort_unstructured_blocks

  • pan: Avoid some redundant SSA spills

  • pan: Copy nir_dominance.c to bi_dominance.c

  • pan: Adapt calc_dominance from nir to bi

  • pan: Fix bi_find_loop_blocks

  • pan: Use bitset instead of bool array in bi_find_loop_blocks

  • pan/bi: Add missing 8bit widen swizzles

  • pan/decode: Fix indent in pandecode_dcd

  • pan/preload: Prepare for reading from single sampled view

  • panvk: Create MS shadow images and views

  • panvk: Setup attachments for ms to ss rendering

  • panvk: Implement VkSubpassResolvePerformanceQueryEXT

  • panvk: Expose EXT_multisampled_render_to_single_sampled

  • pan/bi: Fix bi_find_loop_blocks for single block loops

Collabora’s Gfx CI Team (10):

  • Uprev Piglit to 2ac68e5fb59215ecf89049ec15f3f7494b51a589

  • Uprev Piglit to ec76cc7a31f03c4f4f9d6e3b00f8a70c8ee0fb32

  • Uprev ANGLE to e9626fbced6841d804e7eaf48bb078770822032b

  • Uprev Piglit to 5309e3401d6b03e8a0bb7bfdc1e0f5bc1ad754af

  • Uprev ANGLE to 127a84404b88dbc4327ffb7f831a9a36c3b111bc

  • Uprev ANGLE to ee05836a4934129527544385203ecf420afc5dd1

  • Uprev ANGLE to 2ed4b049c064add3109c7b1e0c954a0bce856df8

  • Uprev Piglit to 2842979ebe03b99c33c3e49af5960c69be6c6d46

  • Uprev ANGLE to b406401e42080c2f8fe479e6c5fa48dfae97c482

  • Uprev Piglit to 62d499d63d2b8b29a67efd9d93ed9b6a94d4950e

Connor Abbott (60):

  • tu: Fix corner case with clearing input attachment

  • tu: Remove useless tu_image_view_init parameter

  • tu: Don’t patch GMEM for input attachments never in GMEM

  • tu: Don’t resolve twice in between subpasses

  • tu: Clear RB_MRT_BUF_INFO::LOSSLESSCOMPEN for stencil

  • tu: Fix 3d load path with D24S8 on a7xx

  • tu: Also disable stencil load for attachments not in GMEM

  • tu: Make blit setup take source and destination samples

  • tu: Add CCU_RESOLVE_CLEAN workaround

  • tu: Remove tu_attachment_info

  • tu: Make r*d_src_depth and r*d_src_stencil generic

  • tu: Add support for “unresolve” ops

  • tu: Implement VK_EXT_multisampled_render_to_single_sampled

  • tu: Fix RT count with remapped color attachments

  • tu: Rename tu_render_pass_attachment::clear_views to used_views

  • tu: Fix attachment stores with subpasses with partial views

  • tu: Zero MSRTSS temporary image before creating it

  • freedreno: Document BV BIN_PREAMBLE usage

  • freedreno/a7xx: Document GRAS_LRZ_CB_CNTL

  • freedreno: Expand a7xx LRZ metadata definition

  • freedreno/registers: Fix encoding fields in 64b registers

  • freedreno/crashdec: Add support for CP_BV_MEMPOOL

  • freedreno: Add synchronization-related control registers

  • freedreno: Decode CP_RESOURCE_LIST

  • freedreno/a7xx: Add BV registers for ROQ status

  • tu: Refactor VSC bo initialization

  • tu: Use scratch mem for conditional loads/stores on a7xx

  • tu: Add tu7_thread_control helper

  • tu/cs: Allow conditional execution in substreams

  • tu: Initialize registers for BV

  • tu: Rewrite visibility stream allocation

  • tu: Correctly set GRAS_LRZ_CB_CNTL

  • freedreno: Add has_pred_bit feature bit

  • tu: Use predicate bit for perf queries

  • tu/a7xx: Support concurrent binning

  • freedreno: Make BV ROQ registers a7xx-only

  • tu: Handle case where pipeline writes unused color attachments

  • editorconfig: Set for glsl files

  • glsl/float64: Fix fmax with NaNs

  • nir, glsl: Add support for softfloat32

  • tu: Expose preserving fp32 denorms via softfloat32

  • tu: Make softfloat shader compiled on demand

  • util/glsl2spirv: Use better glslang flag for -Olib

  • tu: Support softfloat64

  • tu: Stop setting RB_CCU_DBG_ECO_CNTL to 0 for GMEM passes

  • tu: Stop setting GRAS_LRZ_CB_CNTL before GMEM render passes

  • tu: Set GRAS_MODE_CNTL once

  • tu: Set 8E09 once

  • tu: Stop setting view_index_is_input

  • tu: Call nir_lower_sysvals_to_varyings once

  • spirv: Remove view_index_is_input

  • tu: Fix GRAS_BIN_FOVEAT* programming with more than 1 layer

  • tu: Fix FragCoord offset when HW viewport offset is enabled

  • nir, tu: Add and use load_frag_coord_gmem_ir3

  • ir3: Support addr0 align of 8

  • tu: Implement VK_QCOM_subpass_shader_resolve

  • tu: Implement VK_EXT_custom_resolve

  • tu: Fill render pass state when resuming

  • ir3: Fix condition for using uniform predicates

  • freedreno/crashdec: Fix crash with older kernels

Corentin Noël (1):

  • ci: Uprev crosvm and virglrenderer

Daivik Bhatia (4):

  • v3dv: move format helpers to new v3dv format table header files.

  • v3dv: replace raw integers with enum types in helper functions.

  • v3dv: centralize limit macros in v3dv_limits.h

  • v3dv: improve barrier handling for secondary command buffers

Daniel Lang (1):

  • etnaviv: Use FLOAT type for R32G32B32A32_{U,S}INT vertex formats

Daniel Schürmann (77):

  • nir: add nir_imul_nuw() and nir_imul_imm_nuw() helpers

  • nir: don’t use nir_build_alu() with incomplete sources

  • nir: guard nir_def_as_alu()

  • nir/constant_folding: switch to nir_shader_lower_instructions()

  • vulkan/nir: call nir_opt_constant_folding() during vk_spirv_to_nir()

  • nir/builder: add option to immediately constant-fold ALU instructions upon insertion

  • nir/lower_flrp: ad-hoc constant-fold ALU instructions

  • tree-wide: don’t call nir_opt_constant_folding after nir_lower_flrp

  • nir/algebraic: ad-hoc constant-fold ALU instructions

  • radv/shader_info: remove unused output_usage_mask

  • radv/shader_info: use union for precomputed register values of non-overlapping stages

  • radv/shader_info: rename gs_ring_info -> legacy_gs_info and use union with ngg_info

  • radv/shader_info: repack and compact struct radv_shader_info

  • radv: skip shader cache if trap handler is enabled

  • radv: hash keep_executable_info into shader key rather than device cache key

  • radv/null_device: don’t attempt to upload shaders

  • radv/null_device: set more options which affect compilation

  • radv/device: return early in radv_CreateDevice() if creating a null device

  • radv: remove radeon_winsys::get_chip_name() and use info->marketing_name directly

  • amd, radv: create null device without winsys

  • radv: delete winsys/null/*

  • radeonsi: use ac_null_device_create() when AMD_FORCE_FAMILY is set

  • amd/common: rename ac_fake_hw_db.h -> ac_surface_test.h

  • aco/scheduler: remove unused include

  • aco/scheduler: assert that the register demand stays within pre-determined bounds

  • aco/scheduler: remove MoveState::RAR_dependencies_clause

  • aco/scheduler: use hashmap for RAR_dependencies

  • aco/scheduler: refactor downwards dependency check

  • aco/scheduler: move clauses through RAR dependencies

  • nir/opt_load_store_vectorize: don’t add negative offsets to load/store_shared2_amd

  • amd: enable load/store_shared2_amd for GFX6

  • nir/opt_large_constants: Fix dead deref instructions accessing lowered variables

  • treewide: Never preserve nir_metadata_dominance without nir_metadata_block_index

  • radv: Only call nir_opt_memcpy once

  • radv: Only call nir_opt_dead_write_vars once

  • radv: call nir_opt_find_array_copies before first radv_optimize_nir()

  • radv: don’t lower_vars_to_ssa during optimization loop

  • nir/lower_vars_to_ssa: return early if there is no local variables to lower

  • radv: Only call nir_lower_alu_width once in radv_optimize_nir()

  • radv: move nir_opt_copy_prop_vars out of optimization loop

  • drm-shim: handle DRM_CAP_ADDFB2_MODIFIERS

  • amd/drm-shim: handle AMDGPU_INFO_HW_IP_COUNT

  • amd: remove radeon_info::dev_filename

  • amd: remove radeon_info::lowercase_name

  • amd: replace uses of radeon_info::name with ac_get_family_name()

  • amd: remove radeon_info::is_pro_graphics

  • amd: restrict radeon_info::marketing_name to 64 characters and copy it

  • Revert “radv: move nir_opt_copy_prop_vars out of optimization loop”

  • Revert “radv: Only call nir_opt_dead_write_vars once”

  • radv: remove precomputed registers from radv_shader_binary

  • amd: add newer small APUs to get_task_num_entries()

  • amd/common: link with libamdgpu_addrlib

  • radeonsi: use si_shader_encode_{sgprs|vgprs} in si_compute.c

  • aco: disable XNACK on all GPUs

  • ac/gpu_info: move some CU information into separate struct ac_cu_info

  • ac/gpu_info: correct some SGPR and VGPR allocation values in ac_cu_info

  • ac/gpu_info: create separate function ac_fill_cu_info() to fill out CU info

  • aco/tests: don’t pass CHIP_UNKNOWN to ACO

  • aco: pass aco_compiler_options to init_program()

  • aco: add ac_cu_info to aco_compiler_options

  • ac/gpu_info: add some more flags to ac_cu_info

  • aco: use additional flags from ac_cu_info

  • amd: add ac_cu_info::has_mad32 flag and use in ACO

  • amd: add ac_cu_info::has_point_sample_accel flag and use in ACO

  • amd: add and use ac_cu_info::has_gfx6_mrt_export_bug

  • amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug

  • aco: remove radeon_family from aco::Program

  • aco/lower_to_hw: Fix SGPR Operand RegClasses of subdword copies

  • aco/lower_to_hw: Don’t use 2 SGPR operands before GFX10 in a single VOP3 instruction in do_pack_2x16()

  • aco/lower_to_hw: Fix SGPR Operand RegClasses for pack_2x16

  • aco/validate: Validate correct RegisterClasses after lowering to HW instructions

  • aco/tests: Add test for subdword extraction from SGPR

  • aco/tests: Add new test to pack 2x16 SGPRs into VGPR

  • aco/validate: validate constant bus limit after register allocation based on PhysReg

  • nir/loop_analyze: determine for all ALU whether it can be constant-folded

  • nir/loop_analyze: determine whether all control flow gets eliminated upon loop unrolling

  • nir/opt_load_store_vectorize: delay aliasing test in try_vectorize_shared2()

Danylo Piliaiev (28):

  • tu/lrz: Fold disable_write_for_rp check into tu_lrz_disable_write_for_rp

  • tu/lrz: Disable LRZ when CmdSetRenderingAttachmentLocations is used

  • tu/lrz: Disable LRZ writes when draw doesn’t write to all attachments

  • tu: Faster descriptor set allocator

  • vulkan: Always fill DS state for EXT_dynamic_rendering_unused_attachments

  • tu: Use cmd->rp_trace u_trace for draw calls

  • tu: Fix renderpass-level tracepoints not showing up in binning

  • tu: Add concurrent_binning_barrier tracepoint

  • tu: Add a reason for concurrent binning disablement to RP tracepoint

  • freedreno/fdl: Move LRZ FC size calculation to a separate function

  • tu/lrz: Try harder to have LRZ fast-clear enabled with FDM offset

  • tu: Fix CB barrier description

  • tu: Don’t CONCURRENT_BIN_DISABLE when there is no depth image

  • tu: Do not WAIT_FOR_BR if concurrent binning is disabled

  • tu/cs: Helpers to create a region that can be easily enabled/disabled

  • tu: Disable by default CB running alongside renderpasses

  • tu: Disable FLAG_WAIT_FOR_BR sync when CB is disabled

  • freedreno/layout: Use blocks for linear mipmap fallback where possible

  • tu: Handle mismatch in mip layouts for reinterpreted compressed images

  • freedreno: Update A7XX_RB_UNKNOWN_8E09 to be in line with blob

  • tu: Add custom resolve tracepoints

  • ir3: Generify helper_sched to support other flags

  • ir3: Schedule (eolm)/(eogm)

  • tu: Fix passing tmp arrays to tu_desc_set_swiz/fdl6_buffer_view_init

  • tu: Don’t use u_trace_address::bo, only raw iova

  • tu: Restore PC_TESS_BASE after BIN preemption save/restore

  • tu: Fix misleading lrz_disabled_at_draw values for RP

  • tu: Fix typo in min bounds calculation of FDM scissors

Dave Airlie (36):

  • lavapipe: drop lavapipe specific macro for generic one.

  • lavapipe: cleanup some whitespace in lvp_private.h

  • lavapipe: drop unused macro.

  • lavapipe: remove image pointer from lvp_image_view

  • lavapipe: drop device pointer from lvp_cmd_buffer

  • lavapipe: drop device pointer from pipeline object

  • lavapipe: drop device pointer from queue

  • lavapipe: drop device pointer from pipeline cache

  • lavapipe: drop unneeded physical device in sparse image format props

  • lavapipe: drop physical device pointer from lvp_device

  • lavapipe: drop instance pointer from lvp_device.

  • lavapipe: use vk_query_pool as the base for lvp_query_pool

  • intel/elk: drop a bunch of tables for unused elk gens.

  • c11/threads: fix build on c23

  • nir: add a cmat call instruction type.

  • nir: add a flag for functions that are used in cmat calls.

  • nir: add support for cooperative matrix reduction operations.

  • spirv: add support for cooperative matrix reduction operation

  • radv: add support for cooperative matrix reductions.

  • nir: add coopmat per element operations.

  • spirv: add initial support for cooperative matrix per-element ops

  • radv: add support for cooperative matrix per element operations.

  • dozen: return INCOMPATIBLE_DRIVER on instance create failure

  • nak/cmat: free the type mapping hash table.

  • device-select: add a layer setting to disable device selection logic

  • zink: use device select layer settings to disable device selection

  • lavapipe: drop apiVersion from instance

  • lavapipe: repack render attachment.

  • lavapipe: drop mem pointer and offset from buffer

  • lavapipe: drop data pointer from lvp_query

  • lavapipe: drop unused defines

  • radv/coopmat: fix deref stride

  • gallivm: swap 1d array coords before casting.

  • gallivm: let reduce ops use llvm intrinsics

  • lavapipe: add support for VK_KHR_cooperative_matrix.

  • gallivm: handle u16 correct on const loads.

David Heidelberg (1):

  • ci: implement debian-cross-riscv64

David Rosca (72):

  • frontends/va: Move encode functions to separate file

  • frontends/va: Move decode functions to separate file

  • frontends/va: Move remaining processing functions to postproc.c

  • radeonsi/vpe: Stop clearing embedded buffer on allocation

  • radeonsi/vcn: Don’t use temporary feedback buffer when not needed

  • radeonsi/uvd_enc: Don’t use temporary feedback buffer when not needed

  • radeonsi/video: Change si_vid_resize_buffer to take si_resource

  • radeonsi/vcn: Stop using rvid_buffer

  • radeonsi/vce,uvd_enc: Stop using rvid_buffer

  • radeonsi/vpe: Stop using rvid_buffer

  • radeonsi/video: Remove rvid_buffer

  • frontends/va: Support H264 encode pic_order_cnt_type 1

  • radeonsi/vcn: Support H264 encode pic_order_cnt_type 1

  • frontends/va: Always reset H264 slice ref modification and marking count

  • radeonsi/vce: Don’t check ref modification and marking flags

  • radv/video: Introduce two levels of write_memory support

  • radv/video: Only use write_memory for encode feedback with full support

  • radv/ci: Enable video tests on navi21 and navi31

  • radeonsi/vcn: Fix creating context buffer on VCN5

  • radeonsi/vcn: Fix AV1 bidir compound encode with order_hint disabled

  • radv/video: Don’t require encode FW version >= interface version

  • radv/video: Fix AV1 bidir compound encode with order_hint disabled

  • ac/parse_ib: Fix parsing multiple engine commands in one VCN IB

  • ac/parse_ib: Parse VCN_IB_COMMON_OP_RESOLVEINPUTPARAMLAYOUT

  • vulkan/video: Add chroma subsampling to video session

  • vulkan/video: Avoid NULL pointers in session parameters

  • radv/video: Correctly handle no feedback query for encode

  • radv/video: Add NULL checks for picture parameters

  • radv/video: Support intra only without dpb

  • radeonsi/vcn: Remove before_encode() func

  • radeonsi/vcn: Drop vcn_enc_2_0 encode() override

  • radeonsi/vcn: Only allow to enable pre-encode on first frame

  • radeonsi/vcn: Update spec, slice, quality and deblock params each frame

  • vulkan/video: Fix coding AV1 seq_choose_screen_content_tools = 1

  • radv/video: Fix coding allow_screen_content_tools and force_integer_mv

  • radv/video: Fix coding used_by_curr_pic_lt_flag

  • radeonsi/vce: Add workaround for unaligned input surface

  • frontends/va: Add AV1 encode high_bitdepth flag

  • radeonsi/video: Add VPS/SPS/PPS and sequence header functions to radeon_bitstream

  • radeonsi/uvd_enc: Use radeon_bitstream functions to code headers

  • radeonsi/vce: Use radeon_bitstream functions to code headers

  • radeonsi/vcn: Use radeon_bitstream functions to code headers

  • radeonsi/video: Make helper radeon_bitstream functions static

  • radeonsi/vcn: Cleanup HEVC encode deblock params handling

  • radeonsi/uvd_enc: Cleanup HEVC encode deblock params handling

  • radeonsi/vcn: Cleanup AV1 screen content tools coding

  • radeonsi/vcn: Remove unnecessary vars for AV1 encode

  • radeonsi/vcn: Reduce allocated size for pre-encode recon pics

  • radeonsi/vcn: Fix maybe uninitialized warning

  • frontends/va: Use util_dynarray for decode slice data buffers

  • radv/video: Remove enc_session from video session state

  • radv/video: Use radv_enc_aligned_coded_extent for session params overrides

  • radv/video: Remove tile config and skip mode from video session state

  • radv/video: Init session and update rate control in ControlVideoCoding

  • radv/video: Drop casts from vk_find_struct*

  • radv/video: Fix AV1 quantization map maxQIndexDelta value

  • radv: Enable DCC modifiers for multi plane formats on GFX12

  • radv/video: Use different dpb swizzle mode for 10 bit encode

  • radv/amdgpu: Only wait on queue syncobj when needed

  • frontends/va: Use correct pipe profile for VAProfileH264ConstrainedBaseline

  • radeonsi/video: Don’t report support for H264 Baseline profile

  • frontends/va: Support VA_PICTURE_H264_NON_EXISTING

  • radeonsi/vcn: Use is_non_existing H264 ref flag

  • frontends/va: Remove MPEG4 decode support

  • radeonsi/video: Remove MPEG4 decode support

  • r600: Remove MPEG4 decode support

  • virgl: Remove MPEG4 decode support

  • nouveau: Remove MPEG4 decode support

  • pipe: Remove MPEG4 decode support

  • frontends/va: Also treat PRI/TRC_RESERVED0 as unspecified

  • frontends/va: Fix RGB/YUV conversion in Get/PutImage

  • radv/video: Fix maxActiveReferencePictures for H265 decode

Dmitry Baryshkov (10):

  • ci: drop google-freedreno remnants

  • ci: describe my small lab

  • freedreno/ci: add a200 nightly jobs

  • ethosu: drop file names from the generated file

  • rocket: drop file names from the generated file

  • freedreno/ci: mark egl_chromium_sync_control tests as passing

  • freedreno/ci: update fails / flakes list for a750-gl-cl job

  • freedreno/ci: correct rules for a618-gles-asan

  • gfxstream: don’t dump genvk.py args to generated files

  • freedreno/ci: use third A200 runner

Dmitry Osipenko (3):

  • virtio/vdrm: Fix varying offsets of struct vdrm_device members

  • virgl: Implement resource_create_with_modifiers

  • virgl: Support new resource-layout command

Dorinda Bassey (1):

  • util/rust: Add handle type detection to descriptor API

Dylan Baker (47):

  • Version: Bump to 26.0

  • docs: reset new_features.txt

  • docs: update calendar for 25.3.0-rc1

  • intel/mda/tests: use an ASSERT on fread()

  • intel/mda: Fix potential underflow in printing code

  • intel/compiler/brw: fix potential unsigned overflow

  • intel/compiler/brw: Add assert that we don’t have a negative value

  • intel/mda: Use GTEST fixtures to manage File handles

  • intel/mda: Use a vector to track the contents variable

  • anv: Fix potential overflow from doing 32bit math on 64bit types

  • anv: try to help coverity understand we’re not racing

  • anv: assert that we don’t overflow

  • anv: prevent potential, but unlikely, overflow

  • docs: Extend calendar entries for 25.3 by 1 releases.

  • docs: update calendar for 25.3.0-rc2

  • docs: update calendar for 25.3.0-rc3

  • docs: update calendar for 25.3.0-rc4

  • docs: add release notes for 25.3.0

  • docs: Add sha sums for 25.3.0

  • docs: update calendar for 25.3.0

  • docs/relnotes/25.3.0: Remove duplicate bug fixes

  • docs/relnotes/25.3.0: Escape some rst language constructs

  • bin/gen_release_notes: Remove cast that does nothing

  • bin/gen_release_notes: Remove duplicate bug entries

  • meson: make dep_lua a disabler

  • meson: make libarchive a disabler

  • docs/release-calendar: Shift 25.3 releases by one week

  • docs: add release notes for 25.3.1

  • docs: Add checksums for 25.3.1

  • docs: update calendar for 25.3.1

  • anv/video: void cast array we intentionally read off the end of

  • anv/video: Read the right source for memcpy

  • anv/video: Cast intentional read past end of struct member to void*

  • iris: remove uses of pipe_surface as a pointer

  • docs: add release notes for 25.3.2

  • docs: Add checksums for 25.3.2

  • docs: update calendar for 25.3.2

  • docs: add release notes for 25.3.3

  • docs: Add 25.3.3 checksums

  • docs: update calendar for 25.3.3

  • anv: Use { 0 } to initialize struct

  • anv: initialize anv_address to ensure that the protection field is set

  • docs/releasing: fix which commit is cherry-picked

  • docs/releasing: Use a pull request instead of push for relnotes

  • docs/releasing: Add a section to update the website

  • docs/releasing: Use the GitLab CI as the test procedure

  • bin/pick: When the main widget is replaced, trigger a redraw

Ella Stanforth (13):

  • pvr: Avoid putting tile buffer allocators on the heap

  • pvr: Add routine for filling out usc_mrt_setup from dynamic rendering state

  • pvr: add pipeline handling to use dynamic rendering info

  • pvr: make pvr_get_tile_buffer_size static

  • pvr: move tile_buffer_size logic to pvr_device.c

  • pvr: move pvr_load_op to pvr_mrt.h

  • pvr: move pvr_load_op_state to pvr_mrt.h

  • pvr: move load_op_shader_generate to pvr_mrt

  • pvr: use linked list to back deferred clears

  • pvr: Convert format table to indexing with pipe_format

  • pvr: Fix bugs in the format table

  • pvr: fix suspend and resume for dynamic rendering

  • pvr/csbgen: fix packing multiple addresses

Emma Anholt (120):

  • wsi: Fix the flagging of dma_buf_sync_file for the amdgpu workaround.

  • virgl: Fix VIRGL_DEBUG=tgsi to work on debugoptimized builds.

  • nir/link_opt_varyings: Make it participate in NIR_DEBUG=print.

  • tu: Make sure we clear dead writes to vars before nir_link_opt_varyings().

  • nir/shrink_stores: Don’t shrink stores to an invalid num_components.

  • nir/copy_prop_vars: Mask out no-op writes to variables.

  • docs/perfetto: Add row for panvk support.

  • docs/perfetto: Be helpful and opinionated about config selection.

  • docs/perfetto: Give a hint on how to cross compile the tools.

  • docs/perfetto: Explain using tracebox, and put commands in the list.

  • docs/perfetto: Be more clear about the role of MESA_GPU_TRACES=perfetto

  • docs/perfetto: Put V3D at the same level of heading as other drivers.

  • pps: Remove the cpu.cfg file.

  • v3dv: Fix assertion failure for not-found primary_fd during enumeration.

  • docs: Give more reproducible instructions for how to build the docs.

  • tu: Fix leak of MSRTSS temporaries.

  • tu: Fix leak of compute shader pipeline->base.executables_mem_ctx;

  • tu: Fix buffer overflow optimizing MSRTT.

  • tu: Avoid buffer overflows during inline uniform block updates.

  • tu: Add a loop count to VK_pipeline_executable_properties.

  • ir3: Drop use of nir_lower_wrmasks().

  • ir3: Drop ir3_nir_lower_64b_intrinsics

  • ir3: Drop the vector splitting and simplify ir3_nir_lower_64b_global().

  • ir3: Fix incorrect use of predicated ifs on getlast.

  • ir3: Make the debug-print block numbers be the NIR block numbers.

  • ir3: Perform vectorization on ldg/stg just like other memory access.

  • ir3: Drop old comment about ldg vectorization limitation.

  • tu: Use a register pack for VPC_PS_CNTL.

  • tu: Template tu6_emit_window_scissor by CHIP.

  • tu: Template tu6_emit_rt_workaround() by CHIP.

  • tu: Use tu_cs_emit_regs() for SU_POLY_OFFSET setup.

  • tu: Template tu6_build_depth_plane_z_mode by CHIP.

  • tu: Template tu7_emit_tile_render_begin_regs by CHIP.

  • tu: Template r2d_coords by CHIP.

  • tu: Template tu_CmdBeginTransformFeedbackEXT() by CHIP.

  • tu: Template tu_CmdBindTransformFeedbackBuffersEXT by CHIP.

  • tu: Template tu_CmdBindIndexBuffer2KHR by CHIP.

  • tu: Use non-deprecated reg packing in tu6_setup_streamout()’s CRBs.

  • tu: Template fdm_apply_store_coords() by CHIP.

  • tu: Template update_vsc_pipe by CHIP.

  • tu: Template tu6_emit_msaa() by CHIP.

  • tu: Template tu7_emit_subpass_shading_rate by CHIP.

  • tu: Template tu6_emit_vpc_varying_modes() by CHIP.

  • tu: Template tu_pipeline_builder_parse_rasterization_order() by CHIP.

  • tu: Convert tu_init_cmdbuf_start_a725_quirk() to non-deprecated packing.

  • tu: Move VPC_SO_FLUSH_BASE to use reg packing.

  • tu: Move tu6_emit_gs() to use reg packing.

  • tu: Explicitly use 6XX scratch reg packing in perfcntrs_pass_cs_entries.

  • tu: Use non-deprecated names for scratch regs.

  • tu: Use appropriate chip variants for FOVEAT regs.

  • tu: Use appropriate chip variants for LRZ reg packing.

  • tu: Use appropriate chip variants for VRS reg packing.

  • tu: Use appropriate chip variants for SC_BIN_CNTL reg packing.

  • tu: Use appropriate chip variants for VPC/PC reg packing.

  • tu: Use appropriate chip variants for SP_CS reg packing.

  • tu: Use appropriate chip variants in SC scissor/viewport reg packing.

  • tu: Use appropriate chip variants in PS setup.

  • tu: Use appropriate chip variants for CONSERVATIVE_RAS_CNTL.

  • tu: Use appropriate chip variants for A2D reg packing.

  • tu: Use appropriate chip variants for RB regs.

  • tu: Only emit GRAS_SU_RENDER_CNTL and SP_RENDER_CNTL on >=a7xx.

  • tu: Use appropriate variants for GRAS_SU regs.

  • tu: Use a register pack for VPC_VARYING_LM_TRANSFER_CNTL_DISABLE[].

  • tu: Use non-deprecated packing for SP_DITHER_CNTL.

  • tu: use non-deprecated packing for GRAS_CL_ARRAY_SIZE.

  • tu: Use appropriate variants for other GRAS regs.

  • tu: Use appropriate variants for SP regs.

  • tu: Use proper reg packing in another place.

  • tu: Use appropriate variant for HLSQ regs.

  • tu: Pass around the new packing struct for GRAS_LRZ_CNTL.

  • tu: Use non-deprecated reg packing for RB_CLEAR_TARGET().

  • tu: Convert remaining tu_cs_emit_pkt4()s to avoid deprecated reg definitions.

  • freedreno/registers: Apply autopep8 to gen_header.py.

  • freedreno/registers: Simplify a bit of reg printing.

  • freedreno/registers: Restore reg definitions required by kernel.

  • tu: Drop emitting of deprecated packing.

  • nir/shader_bisect: Fix C code printing after review feedback changes.

  • nir/shader_bisect: Allow passing in a --lo / --hi to continue a run.

  • nir/loop_analyze: Use nir_unsigned_upper_bound for loop trip limits.

  • nir/uub: Use an optional max_samples from drivers for sample counts.

  • nir: Optimistically unroll loops using induction var as a sample id.

  • tu,freedreno: Drop the “.bo_write” flag.

  • tu: Add CRB builder.

  • tu: Move pipeline SO setup to the CRB builder.

  • tu: Move VFD CRBs to the CRB builder.

  • tu: Move tu6_emit_mrt() to use CRB.

  • tu: Move tu6_emit_window_offset() to use CRB.

  • tu: move tu6_emit_msaa() to use CRB.

  • tu: Move a bunch of program config to CRB.

  • tu: Split loading immediates for a program from the program config.

  • tu: Move tu_xs_config() to use the CRB builder.

  • nir: Drop the mode argument of nir_lower_vars_to_scratch().

  • nir: Introduce nir_lower_vars_to_scratch_global().

  • ir3: Move the compute shader threadsize forcing earlier.

  • ir3/ra: Make a helper to get RA register pressure limits.

  • ir3: Improve spilling of NIR vars to scratch.

  • freedreno/a3xx: Improve the name of CONSTFOOTPRINT and fix constlen==0 case.

  • freedreno/a3xx-a5xx: restore cbuf0 direct upload.

  • tu: Fix use-after-free in device destruction on old kernels

  • ir3: Fix leak in vars_to_scratch callback.

  • nir/opt_algebraic: Fix return type of fdot(vec(a, 0.0, …), b).

  • nir: Avoid UB of (int)0xff << 24 evaluating usadd_4x8_vc4.

  • nir/algebraic: Apply autopep8.

  • nir: Add a note on how load_sample_pos_from_id works.

  • ir3: Use the new NIR pass for load_barycentric_at_* optimization.

  • ir3: Rename the file for ir3_nir_lower_load_sample_pos().

  • nir: Fix constant evaluation of non-32-bit bitfield_extract.

  • nir: Let nir_eval_const_opcode() return a poison mask in case of UB.

  • nir/constant_expressions: Set the poison flag during i/ubitfield_extract.

  • nir: Specify f2i/f2u as undefined if the float is out of range of the int.

  • nir: Define extract/insert_i8 and friends to be UB if the shift is too large.

  • nir: Define udot_2x16_uadd_sat to have UB according to the SPIRV spec.

  • nir: Rename the unit_test_*_amd intrinsics to be un-vendored.

  • nir/opcodes: Avoid technical UB left shifting ints.

  • nir/opcodes: Cast isub/iadd3’s args to uint to avoid UB integer underflow.

  • nir/search_helpers: Avoid UB in is_2x_16_bits()/is_neg2x_16_bits().

  • nir/algebraic: Fix typo in error message print.

  • nir/opt_algebraic_tests: Mark patterns as unsupported or xfails.

  • lima/ci: Remove erroneous skips.

  • ci/tu: Clear stale xfails from the nightlies.

Eric Engestrom (114):

  • mr-label-maker: fix label for mesa release MRs

  • ci: uprev vkd3d

  • docs: update/fix vk spec urls

  • docs: update calendar for 25.2.6

  • docs: add release notes for 25.2.6

  • docs: add sha sum for 25.2.6

  • asahi/virtio: fix memleak

  • util/meson: don’t build libmesa_util_clflushopt unless needed

  • util/meson: don’t build libmesa_util_clflush unless needed

  • lavapipe/ci: document fixed tests

  • lavapipe/ci: mark more tests as flaky

  • ci: track src/c11/ changes

  • ci: track src/android_stub/ changes

  • ci: uprev vkd3d

  • docs: update calendar for 25.2.7

  • docs: add release notes for 25.2.7

  • docs: add sha sum for 25.2.7

  • docs: add 25.2.8 to the calendar

  • broadcom/ci: automatically reboot rpi3 when they fail to find the root device

  • broadcom/ci: fix rpi4 retries

  • docs/release-calendar: add 26.0 branchpoint and release candidates

  • perfetto: use the new upstream repo

  • meson: auto-disable `amd-use-llvm` when `llvm=disabled`

  • meson: auto-disable `draw-use-llvm` when `llvm=disabled`

  • ci: use $CI_TRON_JOB_PRIORITY tag on all ci-tron jobs

  • broadcom/ci: apply “Cannot open root device” reboot workaround to all rpi boards

  • broadcom/ci: update device count in ci-tron farm

  • docs: update calendar for 25.2.8

  • docs: add release notes for 25.2.8

  • docs: add sha sum for 25.2.8

  • rust: configure clippy to only report issues relevant to our MSRV

  • ci: read the MSRV from clippy.toml to avoid having too many copies to keep in sync

  • meson: add rust_global_args for flags for all the rust compilations

  • meson/rust: allow `else { if {} }`

  • meson/rust: allow “needless lifetimes”

  • meson/rust: allow explicit `if x.is_none { return None }` instead of `x?`

  • rusticl/meson: deny all clippy lints before allowing global ones

  • rusticl: rewrite blocks using if/else for clarity

  • etnaviv: allow ISA struct to be spelled all uppercase

  • nil: drop duplicate lib in “liblibnil.a”

  • nak: set nir_shader_compiler_options in one step

  • nak: use filter() instead of open-coding it

  • nak: use `matches!()` instead of open-coding it

  • nak: avoid errors when generated code is empty

  • nak: silence clippy warning about `x * 0`

  • nak: remove unnecessary use of `format!()`

  • nak: drop empty string from `eprintln!()`

  • nak: remove conversion into the same type

  • nak: remove “reference which is immediately dereferenced by the compiler”

  • nak: rewrite `repeat().take()` into `repeat_n()`

  • nak: drop unnecessary reference on both sides of `==`

  • nak: use `assert_eq!(a, b)` instead of `assert!(a == b)`

  • nak: use `foo &= bar` instead of `foo = foo & bar`

  • nak: add all identical values in one step

  • nak: remove unused lifetime

  • nak: drop redundant closure

  • nak: drop unnecessary mutable reference

  • nak: replace `!foo.is_{none,some}()` with their positive counterpart

  • compiler/rust: replace `!first.is_none()` with `first.is_some()`

  • compiler/rust: rewrite `match` into a simpler `if let`

  • compiler/rust: remove unnecessary lifetimes

  • compiler/rust: allow CFG & BitSetStreamTrait to have a `len()` without also having an `is_empty()`

  • compiler/rust: drop “borrow of a value the compiler would automatically borrow”

  • rusticl: silence incorrect clippy error about re-implementing memcpy

  • nak: drop “reference which is immediately dereferenced by the compiler”

  • nak: use saturating_sub() instead of open-coding it

  • nak: drop clone of Copy-able types (RegOrigin & SSAValue)

  • nak: drop cast of u8 to u8

  • nak: allow LdCacheOp values to be named `Cache*`

  • nak: drop “reference which is immediately dereferenced by the compiler”

  • nak: drop “deref on an immutable reference”

  • nak: replace .get(0) with .first()

  • nak: merge identical if branches for blackwell, ampere and ada

  • nak: replace `.find(x).is_some()` with `.contains(x)`

  • nak: drop “unneeded `return` statement”

  • nak: use std::mem::size_of_val(data) instead of open-coding it

  • util/rust: cleanup derelict allow(dead_code) annotations

  • rusticl: drop collapsible_else_if annotation now that it’s allowed globally

  • rusticl: cleanup derelict allow(non_upper_case_globals) annotation

  • nak: cleanup derelict allow(dead_code) annotations

  • nil: cleanup derelict allow(dead_code) annotations

  • ci: fix path to clippy.toml

  • panvk: fix accidental assignment in assert

  • radv/ci: document recent flakes

  • broadcom/ci: document recent flakes

  • turnip/ci: document recent flakes

  • lavapipe/ci: document recent flakes

  • etnaviv/ci: document fixed tests

  • zink+nvk/ci: document fixed tests

  • virtgpu_kumquat: cleanup derelict allow(dead_code) & allow(unused) annotations

  • virtgpu_kumquat_ffi: mark the remaining allow annotations (all non_camel_case_types) as expected

  • virtgpu_kumquat_ffi: use `mutex.get_mut()` instead of `mutex.lock()` to get compile-time guarantee that the mutex isn’t already locked

  • virtgpu_kumquat_ffi: use auto-deref instead of doing it by hand

  • virtgpu_kumquat_ffi: mark single-item match as expected

  • mr-label-maker: tag src/virtio/virtgpu_kumquat* as part of gfxstream

  • rusticl: fix ‘enable-drivers’ meson option

  • Revert “renderdoc: Add Haiku platform support”

  • vk/runtime,zink: only integrate renderdoc on supported platforms

  • docs: update url to ci-tron docs

  • docs: delay 26.0 branchpoint by a week

  • etnaviv: run rustfmt

  • ci: run rustfmt on all rust files

  • nir/meson: only try to generate the nir_opt_algebraic tests when requested

  • nir/meson: drop redundant --build-tests in favour of just checking if --out-tests is set

  • VERSION: bump for 26.0.0-rc1

  • .pick_status.json: Update to bed1576b141a5d4398c71abeec5af3674b390aa0

  • pick-ui: update for python 3.14 support

  • nir/meson: fix cpp_args of nir_opt_algebraic_pattern_tests

  • VERSION: bump for 26.0.0-rc2

  • .pick_status.json: Update to 248b8184078c6df2c00c987e499348532b52a6e5

  • .pick_status.json: Mark a66d19b691bb8531dd075861b6b103eb488d9237 as denominated

  • Revert “meson: static link spirv-tools for darwin”

  • VERSION: bump for 26.0.0-rc3

  • .pick_status.json: Update to d7814bcad0426c26e88874a3ef2d99d7220bcf48

Eric R. Smith (23):

  • panfrost/panvk: Add size calculations to compiler register code

  • panvk: sanity check block size for unorm format

  • panfrost: add explicit get_dmabuf_modifier_planes override

  • panfrost: update AFBC code to handle tiling for 64bpp formats

  • pan: Add 16 bit AFBC support (v10+ only)

  • mesa: Add R16G16_R16B16_UNORM and related formats

  • dri: check modifier in dri_create_image_from_winsys

  • panfrost: add 422 AFBC formats

  • nir: add intrinsics for pixel local storage

  • pan: fix a bifrost disassembly assert failure

  • panvk: fix ycbcr format issues on bifrost

  • panvk: enable ycbcr on bifrost

  • panfrost: do not allow skipping of fragment shader when alpha-to-coverage

  • panfrost: add benchmarking documentation

  • pan: add variant to shader name for G310 variants

  • pan drm-shim: add a way to specify the GPU variant in PAN_GPU_ID

  • pan: add actual register usage to the shaderdb stats

  • pan: pass a pointer to bi_compile_variant_nir, rather than a struct

  • pan: prettier output when statsfull flag is set

  • pan: pass pan_shader_info data to pan_stats_verbose

  • pan: refactor shader info setting

  • pan: move pan_shader_update_info call for bifrost

  • mesa: do not unbind general point when different indexed points are deleted

Erico Nunes (2):

  • ci: lima farm maintenance

  • Revert “ci: lima farm maintenance”

Erik Faye-Lund (115):

  • zink/ci: document a flake

  • zink/ci: document a nightly failure

  • radeonsi/ci: document flake

  • pvr: remove unused macros

  • pvr: remove needless include

  • pvr: move queue function to pvr_queue.c

  • pvr: break out pvr_instance and pvr_physical_device

  • pvr: factor out pvr_sampler

  • pvr: rename rogue_get_slc_cache_line_size

  • pvr: move non-rogue helpers to pvr_hw_utils.h

  • pvr: move static_asserts to source-files

  • pvr: rework pds_state array length logic

  • panfrost: initialize sig before use

  • panfrost: remove needless variable

  • pan/kmod: fix priority query logic

  • panvk: do not open-code debug_get_num_option

  • panvk: assert that shader_present isn’t zero

  • panfrost: remove stale code

  • pvr: split idep_pco_uscgen_programs_h in two

  • pvr: encapsulate border-table

  • pvr: encapsulate clear-state

  • pvr: factor out write_immutable_samplers

  • pvr: store has_pbe_stride_align_1pixel in pvr_device_features

  • pvr: respect has_pbe_stride_align_1pixel

  • pvr: limit availability of HW defs

  • mesa/main: correct formatquery error-handling

  • mesa/st: do not enable EXT_texture_buffer_object with rgba only

  • mesa/main: correct error message

  • mesa/main: do not check for ARB_texture_buffer_object for GL 3.1

  • v3d: only expose rgba buffer-textures

  • panfrost: only expose rgba buffer-textures

  • zink: only expose rgba buffer-textures

  • mesa: introduce and use _mesa_has_texture_buffer_range

  • panfrost/ci: remove some out-of-date xfails

  • mesa/st: do not drop binding prematurely

  • docs/panfrost: remove some stray newlines

  • pan: make S8_UINT code behave like the rest

  • pan: add support for float-formats

  • pvr: move include to source-file

  • pvr: do not store VkFormat in pvr_format

  • pvr: remove unused member

  • pvr: move border-specific format-code into pvr_border.c

  • pvr: split out pbe-details from main format-table

  • pvr: rework format binding flags

  • pvr: use strongly-typed enum instead of uint32_t

  • pvr: do not store compressed pbe-formats

  • pvr: add helpers to query limits based on device-info

  • pvr: fixup some includes

  • pvr: make queries arch-agnostic

  • pvr: replace constant-returning function with a macro

  • pvr: break out pvr_free_list into a separate module

  • pvr: rename colliding symbol

  • pvr: disable has_gs_rta_support for ge7800 as well

  • pvr: run clang-format

  • panfrost: do not over-estimate memory needed for dummy-rt

  • panfrost: factor out meat of pan_bytes_per_pixel_tib to helper

  • panfrost: do not over-estimate format tib-size

  • mesa/st: always override internal-format for 10-bit formats

  • pvr: add missing include

  • pvr: add missing forward-decl

  • pvr: store format-table in pvr_physical_device

  • pvr: limit availability of HW defs

  • pvr: factor out cmdbuf functions from pvr_query.c

  • pvr: factor out pvr_rt_dataset to separate module

  • pvr: factor out framebuffer-specific code

  • pvr: split pvr_device.c

  • pvr: split pvr_csb.c

  • pvr: split pvr_descriptor_set.c

  • pvr: mark pvr_border.c as multi-arch

  • pvr: mark pvr_pass.c as multi-arch

  • pvr: mark pvr_tex_state.c as multi-arch

  • pvr: mark pvr_job_compute.c as per-arch

  • pvr: mark pvr_cmd_buffer.c as per-arch

  • pvr: mark pvr_cmd_query.c as per-arch

  • pvr: mark pvr_job_render.c as per-arch

  • pvr: mark pvr_job_transfer.c as per-arch

  • pvr: mark pvr_job_context.c as per-arch

  • pvr: split pvr_image.c

  • pvr: mark pvr_hw_pass.c as per-arch

  • pvr: mark pvr_job_common.c as per-arch

  • pvr: mark pvr_sampler.c as per-arch

  • pvr: mark pvr_query_compute.c as per-arch

  • pvr: mark pvr_mrt.c as multi-arch

  • pvr: mark pvr_framebuffer.c as per-arch

  • pvr: prepare winsys files for multi-arch

  • pvr: only build pvr_dump_csb.c for rogue

  • pvr: make blit/clear-code rogue-specific

  • pvr: use rogue-prefix for rogue-specific code

  • pvr: pass device-info to a few winsys functions

  • pvr: build pvr_arch_*.c as a multi-arch sources

  • pvr: make some winsys files multi-arch

  • pvr: limit hw-defs to rogue

  • panfrost: enable texel-buffers for three-component formats

  • pvr: add missing include

  • pvr: add missing forward-declaration

  • pvr: clean up include

  • pvr: use per-arch macro for pvr_device_init_spm_load_state

  • pvr: use pvr_arch aliases

  • pvr: do not use PVR_PER_ARCH for static function

  • pvr: do not use alias in definition

  • pvr: rename PVR_PER_ARCH aliases to pvr_arch_ for clarity

  • pvr: use pvr_arch defines in implementations

  • docs/pvr: document the multi-arch approach

  • pvr: encapsulate format-table

  • docs: upgrade bootstrap to 5.3.8

  • docs: use option-directive

  • panfrost/ci: remove fixed CTS-flakes

  • panfrost/ci: remove fixed failures

  • panfrost/ci: add warning about g720 results

  • docs: remove ancient stuff from faq

  • docs/faq: do not recommend basing drivers on i965

  • lima: update unknown field

  • panvk: move cmd_resolve_attachments to panvk_vX_cmd_meta.c

  • panvk/meta: make helpers static

  • panvk: promote VK_EXT_robustness2 to VK_KHR_robustness2

Faith Ekstrand (201):

  • panvk: Fix integer dot product properties

  • util: Don’t advertise cache ops on x86 without SSE2

  • util: Build util/cache_ops_x86.c with -msse2

  • nvk: Include the chipset in the pipeline/binary cache UUID

  • nvk: Disable sampleLocationsSampleCounts for 1x MSAA

  • nvk: Emit inactive vertex attributes

  • nvk: Look at the right pointer in GetDescriptorInfo for SSBOs

  • nvk: Capture/replay buffer addresses for EDB capture/replay

  • panvk/shader: Implement [de]serialization of ASM and NIR strings

  • panvk/shader: [de]serialize desc_info.max_varying_loads

  • panvk/shader: Use the right copy size for deserializing dynamic UBOs/SSBOs

  • panvk: Add an in-memory shader cache

  • panvk: Use the build SHA for the pipeline/binary cache UUIDs

  • panvk: Enable the disk cache

  • zink: Disable building the zink_check_requirements tool for now

  • Update the Vulkan-profiles wrap to 1.4.330 and re-enable zink_check_requirements

  • vulkan/util: Add a vk_format_srgb_to_linear() helper

  • vulkan/meta: Handle VK_RENDERING_ATTACHMENT_RESOLVE_SKIP_TRANSFER_FUNCTION_BIT

  • vulkan/meta: Handle VkResolveImageModeInfoKHR

  • nvk: Plumb attachment flags through to MSAA resolve

  • nvk: Switch to CmdEndRendering2KHR()

  • nvk: Advertise the new maintenance10 format features

  • nvk: Advertise VK_KHR_maintenance10

  • nvk: Don’t re-initialize the descriptor writer if the set matches

  • nvk: Document some environment variables

  • nvk: Add an NVK_DEBUG=coherent flag

  • nvk: Enable ASTC on Tegra

  • nvk: VK_EXT_shader_uniform_buffer_unsized_array

  • pan: roll lower_texture() into postprocess()

  • pan/bi: Call constant folding in postprocess()

  • nir: Handle lowered I/O in lower_viewport_transform()

  • nir: Check the deref mode in lower_point_size()

  • pan/bi: Move lower_noperspective*() to postprocess()

  • pan: Move point size and viewport lowering to postprocess

  • vulkan/runtime: Add a get_push_range_for_stage() helper

  • vulkan/runtime: Add a vk_compile_shaders() helper

  • vulkan/runtime: Add an environment variable to validate shader binaries

  • nvk: Advertise VK_KHR_pipeline_binary

  • panvk: Initialize the disk cache earlier

  • panvk: Advertise VK_KHR_pipeline_binary

  • nir: Simplify assign_io_var_locations()

  • drm-uapi: Import the new NVIDIA modifiers

  • nil: Add support for Blackwell 8 and 16-bit modifiers

  • nir: Add a couple panfrost sysvals to divergence analysis

  • pan/compiler: Expose the bifrost optimization loop

  • panvk: Split var copies and lower local vars early

  • panvk: Lower copy_deref and indirect derefs before nir_lower_io

  • panvk: Only lower outputs to temporaries

  • panvk: Optimize in the preprocess hook

  • nir: Add a type parameter to nir_lower_point_size()

  • pan: Use nir_lower_point_size for the float16 conversion

  • panvk: Make noperspective_varyings const

  • panvk/dispatch: s/shader/cs/g

  • panvk: Add a panvk_common_sysvals struct

  • spirv: Assume variable workgroup size unless it’s set

  • pan/bi: Add some helpers and an info field for needing the extended FIFO

  • pan/bi: Add support for writing gl_PrimitiveID from IDVS

  • pan/genxml: Rename Primitive Index Override

  • panvk: Set primitive_index_override when prim ID is written by IDVS

  • spirv: Only set workgroup_size_variable on compute-like stages

  • vulkan/drm-syncobj: Stop returning early waiting for sync files

  • poly,asahi: Rename poly_tess_args to poly_tess_params

  • poly,asahi: Rename poly_ia_state to poly_vertex_params

  • asahi: Upload vertex and geom/tess params together

  • hk: Expose the vertex param buffer to other stages

  • poly,asahi: Move vertex_output_buffer to poly_vertex_param

  • poly,asahi: Fetch directly from poly_vertex_state::output_buffer in GS

  • SQUASH: poly,asahi: Move the output mask to poly_vertex_state

  • asahi: Reorder state uploads in agx_draw_patches()

  • poly: Rename poly_nir_lower_gs.h to poly_nir.h

  • poly: Add a poly_nir_lower_sysvals() pass

  • nir: Improve comments for a couple poly intrinsics

  • poly: Fetch the index size from a sysval

  • poly,asahi: Put the indirect draw directly in the geometry params

  • poly: Add helpers for filling out poly_geometry_params

  • poly: Add helpers for filling out poly_vertex_params

  • hk: Use the new poly param helpers

  • agx: Use the new poly param helpers

  • poly: Move vs_grid to poly_vertex_params

  • poly/asahi: Pull a bunch of vertex_id_for helpers into poly/prim.h

  • poly,asahi: Pull restart unrolling into libpoly

  • poly: Generalize unroll_restart() to arbitrary workgroup/subgroup sizes

  • poly: Make all heap allocations atomic

  • nvk: Add a dedicated_image to nvk_device_memory

  • nir: Add LAYER_ID and VIEW_INDEX to nir_lower_sysvals_to_varyings()

  • spirv: Emit SYSTEM_VALUE_LAYER_ID for fragment shaders

  • nir: Support sysval intrinsics in lower_sysvals_to_varyings()

  • microsoft: Run lower_sysvals_to_varyings after lower_input_attachments

  • tu: Set use_layer_id_sysval for nir_lower_input_attachments

  • nir: Always use sysvals in lower_input_attachments()

  • pan: Move compiler to compiler/bifrost

  • pan: Move midgard to compiler/midgard

  • pan: Move util/* to compiler/

  • pan/bi: Add separate meson files for bifrost tests

  • pan: Add a central libpanfrost_compiler library

  • pan: Move pan_arch() to pan_model.h

  • pan: Move disassembly wrappers to a new pan_compiler.h

  • pan: Move pan_shader NIR helpers to pan_compiler.h

  • pan/compiler: Move all NIR pass definitions to pan_nir.h

  • pan/compiler: Move pan_ir.h into pan_compiler.h

  • pan: Move pan_shader_compile() to pan_compiler.h

  • pan/genxml: Decode blend shaders on CSF

  • pan/blend: Use flat inputs for blend shaders

  • panvk/jm: Delete panvk_varying_hw_format()

  • pan/bi: Fix LD_VAR_BUF indirect offset calculations

  • pan: Move PRINTF_BUFFER_SIZE to the compiler

  • pan: Drop bifrost_shader_blend_info::format

  • pan: Move pan_compile_shader to pan_compiler.c

  • pan/bi: Handle small vectors in bi_src_index()

  • pan/bi: Only delete function temp variables

  • pan/bi: Move opt_sink and opt_move calls to postprocess

  • panvk: Run pan_preprocess_nir() in the preprocess step

  • pan/bi: Run nir_lower_all_phis_to_scalar() late

  • panvk: Upload all variants at the end of compile_shader()

  • panvk: Add separate COMPUTE and FRAGMENT cases in compile_shader()

  • panvk: Use nir_instr_clone() for input attachment loads

  • panvk: Stop using descriptor helpers in lower_input_attachments

  • panvk: Break input attachment lowering into its own file

  • panvk: Call lower_input_attachment_loads() from compile_shader()

  • panvk: Re-prefix panvk_shader_desc_info/map with lower_

  • panvk: Move I/O lowering out of panvk_lower_nir()

  • panvk: Store the varying attribute descriptor count in desc_info

  • panvk: Only pass the panvk_shader_desc_info to panvk_lower_nir()

  • panvk: Make compile_inputs const in panvk_compile_nir()

  • panvk: Restructure VS variant handling

  • panvk: Pull multiview lowering out of panvk_lower_nir()

  • panvk: Drop compile_inputs from panvk_lower_nir()

  • drm-uapi: Sync the panthor header

  • drm-uapi: Sync the panfrost header

  • util: Move STACK_ARRAY into util

  • pan/kmod: Add a panfrost_kmod_driver_version_at_least() helper

  • pan/kmod: Expose the BO flags supported by a pan_kmod_device

  • pan/kmod: Add new helpers to sync BO CPU mappings

  • panvk: Mask off BO_FLAG_WB_MMAP in adjust_bo_flags()

  • panvk: Implement Flush/InvalidateMappedMemoryRanges()

  • panvk: Sync CPU maps around host image copies

  • panvk: Store the memory heaps/types in the physical device

  • panvk: Base memoryTypeBits on phys_dev->type_count

  • panvk: Advertise a HOST_CACHED memory type if we have WC maps

  • panvk: Add various flush/invalidate helpers for internal BOs

  • panvk: Map our standalone private BOs writeback when it makes sense

  • panvk: Add a write_desc_data() helper

  • panvk: Use write-back maps for descriptor sets

  • panvk: Use WB maps for command buffer memory

  • nvk: Check before claiming UNIFORM_TEXEL_BUFFER_BIT

  • nil: Claim buffer support for R64_[US]INT

  • nak: Use nir_lower_io_lower_64bit_to_32

  • nvk: Add support for 64-bit vertex attributes

  • pan/genxml: Get rid of non-existent Tiler Heap fields

  • pan/genxml: Add float internal and writeback formats

  • pan: Add a helper for packing blend constants

  • pan/blend: Add support for float blending

  • panfrost: Only set blend constants if needed

  • panfrost: Plumb through float blending equations

  • panvk: Set pan_blend_equation.is_float

  • panvk: Check can_fixed_function() before checking constants

  • pan: Add support for blending with F16 and F11/10 formats

  • util: Add a helper to convert color blend factors to alpha

  • pan/blend,panvk: Optimize blend equations

  • pan/bi: Dump shader to stderr

  • pan/bi: Use nir_print_shader() instead of nir_log_shader()

  • panvk: Lock around the compile_shaders() when debug dumping

  • nir: Add some new panfrost fragment shader intrinsics

  • pan/nir: Add a NIR pass to lower FS outputs to the new intrinsics

  • pan: Implement the new NIR FS intrinsics

  • pan/bi: Use bi_emit_collect_to() for load_const

  • pan/bi: Use MUX for setting LD_TILE sample indices

  • pan: Use nir_intrinsic_blend_pan for blend shaders

  • pan: Switch to nir_intrinsic_load_blend_input_pan

  • pan/compiler: Drop pan_compile_inputs::bifrost::rt_conv

  • pan/bi: Lower FS outputs to blend in NIR

  • pan: Move pan_nir_lower_writeout to midgard/

  • pan/genxml: Fix some sizeof() asserts

  • pan/genxml: Enable CSF tracing of RUN_FULLSCREEN

  • panvk: Use a full-screen barrier draw for FB barriers

  • pan/bi: Add a bi_instr::blend_target

  • panvk/csf: Stop calling blend_emit_descs() with no FS

  • pan/genxml: The BLEND array must be 64B aligned

  • panvk/csf: Set the correct DCD_FLAGS_1.render_target_mask

  • panvk/blend: Stop setting color_mask = 0

  • pan/bi: Mark whole flat variables

  • panvk/jm: Drop the loads_blend_const hack for uniform_count

  • panvk: Push our own blend descriptors

  • nir: panfrost tile loads are always divergent

  • pan/bi: Implement pack_32_4x8 natively

  • nir,pan: Rework the panfrost tile load intrinsic

  • nir,pan: Add and implement a new store_tile_pan intrinsic

  • nir/lower_blend: Move the format to nir_lower_blend_rt

  • nir/lower_blend: Optimize trivial logic op cases

  • nir: Expose the guts of nir_lower_blend as builder helpers

  • pan/blend: Use the blend builder helpers instead of nir_lower_blend()

  • panfrost: Lower pixel-local storage to load/store_tile in NIR

  • ci: Mark fbo-blending-format-quirks as a fail on G52

  • panfrost: SPDX everything

  • pan/genxml: Add license blocks to the XML files

  • panfrost: Add a few missing license blocks

  • panvk: Map ro_sink_address_poly to an OOB address

  • nvk: Enable ZPASS_PIXEL_COUNT in draw_state_init()

  • nir/lower_bool_to_bit_size: Use the correct num_components for conversions

  • pan/bi: Run lower_alu_width after opt_algebraic_late

  • pan/bi: Don’t attempt to fuse AND(ICMP, ICMP) if the AND is swizzled

Felix DeGrood (12):

  • intel/tools: make frame and cb index base-0 in intel_measure

  • intel/tools: add eop timestamp to intel_measure

  • intel/tools: make eop default

  • intel/tools: add cmdbuf/queue annotation parsing

  • intel/ds: reduce min sampling period of pps-producer to 5us

  • anv/pps: remove assert for double init

  • anv/rt: rewrite encode.comp for better performance

  • anv/rt: fully restore code to write instance_count

  • anv/rt: multithread writing of invalid leaves

  • anv/rt: reduce writes to block_incr_and_start_prim

  • anv/perfetto: include all pc reasons

  • anv/rt: avoid out of bound access by clamping global id

Frank Binns (5):

  • pvr: sort extensions alphabetically

  • pvr: Advertise VK_KHR_relaxed_block_layout

  • pvr: Advertise VK_KHR_storage_buffer_storage_class

  • nvk: remove duplicate header include

  • pvr: check image usage features against image features

Franz Hoeltermann (1):

  • device-select: Avoid usage of legacy GetPhysicalDeviceProperties. This caused validation errors and redundantly called both the new “2” variant and the legacy variant.

Georg Lehmann (172):

  • nir: remove manual nir_load_global

  • treewide: use nir_load_global alias of nir_build_load_global

  • nir: remove manual nir_store_global

  • treewide: use nir_store_global alias of nir_build_store_global

  • nir: remove manual nir_load_global_constant

  • treewide: use nir_load_global_constant alias of nir_build_load_global_constant

  • aco/optimizer: re-index labels

  • aco/optimizer: add separate fp16 abs/neg/fcanonicalize labels

  • aco/optimizer: rework canonicalized label

  • aco/optimizer: replace 64bit mul with 1.0/-1.0 with bitwise instruction if possible

  • aco/isel: emit v_mul_f64 with modifiers for fneg/fabs

  • aco/isel: emit v_mul_f64 for fp64 fsat

  • aco/optimizer: fix applying 64bit neg/abs

  • aco/optimizer: apply fp64 modifiers

  • aco/tests: add some simple fp64 modifier tests

  • aco/lower_to_hw: emit vop2 for gfx12+ fp64 reductions

  • aco/isel: emit vop2 v_fadd_f64 for gfx12+

  • aco/isel: emit vop2 v_mul_f64 for gfx12+

  • aco/isel: emit vop2 v_min_f64 for gfx12+

  • aco/isel: emit vop2 v_max_f64 for gfx12+

  • aco/isel: emit vop2 v_lshlrev_b64 for gfx12+

  • aco/opcodes: remove VOP3 alias for new gfx12 VOP2 opcodes

  • aco: fix v_mad_mix denorm behavior

  • aco: allow v_fma_mix with denorms for gfx9 chips where it’s fused

  • aco/optimizer: never unfuse fma

  • radv: do not report wave32 in gl_SubgroupSize for Doom Dark Ages

  • aco/gfx10_3: work around NSA hazard

  • nir/opt_algebraic: optimize open coded pack_32_2x16

  • aco/insert_NOPs: remove redundant VALUMaskWriteHazard waits

  • aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits

  • aco,nir: support subdword v_permlane_b16

  • aco/optimizer: refactor insert

  • aco/optimizer: add extract_float helper

  • aco/optimizer: make label_mad more generic

  • aco/optimizer: add new helper functions for combining two instructions

  • aco/optimizer: use new helpers to create fma

  • aco/optimizer: create fma with s_mul_f32/f16

  • aco/optimizer: add less aggressive pattern matching option

  • aco/optimizer: use new helpers for min3/max3/minmax/maxmin

  • aco/optimizer: use new helper functions to create med3

  • aco/optimizer: create max3/min3/med3 with salu min/max

  • aco/optimizer: use new helpers to optimize mul(b2f(a), b)

  • aco/optimizer: use new helpers for add16 opts

  • aco/optimizer: use new helpers for packed fma

  • aco/tests: test packed fma opts

  • aco/optimizer: reduce max alu_opt_info stack operands to 4

  • aco/optimizer: parse pseudo alu instructions

  • aco/optimizer: use new helpers for v_or opts

  • aco/optimizer: use new helpers for xor opts

  • aco/optimizer: use new helpers for v_add_u32 opts

  • aco/optimizer: optimize add(mad_u32_u16(a, b, 0), c)

  • aco/optimizer: use new helpers for s_lshl<n>_add_u32

  • aco/optimizer: use new helpers for v_add_lshl_u32

  • aco/optimizer: add more v_add_lshl_u32 opts

  • aco/optimizer: use new helpers for v_and opt

  • aco/optimizer: use new helpers for remaining add opts

  • aco/optimizer: use new helpers for v_sub opts

  • aco/optimizer: use new helpers for bitwise n2 opts

  • aco/optimizer: add some bitop combining

  • aco/optimizer: use cndmask for neg(b2i)

  • aco/optimizer: some more mul opts

  • aco/optimizer: create ff0/bcnt0

  • aco/optimizer: extend existing patterns to handle b2f/b2i(not(a))

  • aco/optimizer: optimize cndmask(a, b, not(c)) to cndmask(b, a, c)

  • nir/opt_algebraic: create more bit test

  • aco/opt_postRA: allow v_cmpx to clobber exec before nop split/create vector

  • aco/optimizer: move med3 -> add_clamp opt later

  • aco/optimizer: add new helpers for applying output modifiers

  • aco/optimizer: handle gfx11+ vinterp as fma special case

  • aco/optimizer: use new helpers to apply neg/abs to output of instructions

  • aco/optimizer: back propagate modifiers through rcp

  • aco/optimizer: use new helpers to apply packed fsat

  • aco/optimizer: use new helpers to apply insert

  • aco/optimizer: use new helpers to create v_fma_mixlo_f16

  • aco/optimizer: use new helpers for omod/clamp

  • aco/optimizer: apply omod to pseudo scalar trans instructions

  • nir: don’t sink alu that uses ballot(true)

  • nir/peephole_select: allow ballot

  • nir/peephole_select: allow mbcnt_amd

  • aco/optimizer: fix uses in to_uniform_bool_instr

  • aco/optimizer: validate uses

  • aco/optimizer: propagate salu fneg

  • aco/optimizer: propagate salu fabs

  • nir/opt_uniform_subgroup: don’t try to optimize non trivial clustered reduce

  • nir/opt_uniform_subgroup: fix swizzle_amd without fetch_inactive

  • nir/divergence_analysis: fix swizzle_amd without fetch inactive

  • nir/opt_uniform_subgroup: use nir_shader_intrinsics_pass

  • nir/opt_uniform_subgroup: wire up mbcnt_amd path

  • nir/opt_uniform_subgroup: handle more trivial shuffles/votes

  • aco/optimizer: keep pass_flags valid for all instructions

  • aco/isel: emit exec copy for ballot(true)

  • aco/optimizer: fix skip_smem_offset_align with non temp register operands

  • aco/optimizer: propagate fixed registers

  • aco/optimizer: propagate fixed regs to copy/extract/insert

  • aco/isel: emit register copies for workgroup ids

  • radv: optimize known front_face_fsign too

  • aco/gfx6: move mrtz writemask workaround to assembler and handle all mrt

  • ac/llvm/gfx6: move mrtz writemask workaround to ac_build_export

  • ac/nir/lower_ps_late: remove gfx6 mrtz writemask workaround

  • radv/nir: fix radv_nir_remap_color_attachment progress

  • radv: consider dual src blend for when epilog needs alpha

  • radv: gather color0_written with scalar io correctly

  • radv: eliminate unused FS output channels

  • radv/nir: fix front_face_fsign opt

  • radv: use nir_opt_uniform_subgroup

  • radeonsi: use nir_opt_uniform_subgroup

  • aco/isel: remove uniform reduce/scan optimization

  • aco/optimizer: reassociate mul(mul(a, const), b) into mul_omod(a, b)

  • aco/optimizer: reassociate rcp(mul(a, const)) into rcp_omod(a)

  • zink/ci: update radv trace checksums

  • nir/divergence: add nir_def_is_divergent_at_use_block helper

  • nir/opt_uniform_subgroup: optimize min/max/and/or reduce of bcsel(div, con, con)

  • nir/opt_uniform_subgroup: optimize add/xor reduce of bcsel(div, con, con)

  • gallivm: use nir_alu_instr_is_sz/nan_preserve

  • nir: use a separate enum for per alu floating point math control

  • nir/opt_varyings: use per instruction inf/nan flag for moving past interp

  • nir/opt_varyings: use per instruction nan flag for promoting to flat

  • vtn: implement default fp_math_ctrl without using execution mode

  • gallivm: stop using per shader float fast math flags

  • nir: remove per shader float fast math flags

  • ac/nir/cull: do not reuse variables if subgroup ops are used

  • aco: allow opsel for last v_alignbyte/bit operand

  • nir/opt_uniform_subgroup: optimize uniform ddx/ddy

  • nir/opt_algebraic: explicitly add some -0.0 variants of patterns

  • nir/opt_algebraic: canonicalize scmp with -0.0

  • nir/search: respect sign of zero when comparing floats

  • nir/opt_algebraic: replace is_negative_zero with constant -0.0

  • ci: disable vmware farm

  • util: add IEEE 754-2019 min/max number

  • nir/opcodes: fix fsat signed zero correctness

  • nir/opcodes: use util_max_num/util_min_num for fmin/fmax constant folding.

  • nir: prevent undefined behavior in idiv/imod/irem constant folding

  • nir/opt_varyings: actually clone alu math control to different shader

  • nir: document signed zero, inf, nan preserve flags

  • nir: add nir_alu_instr_is_exact helper

  • spirv: don’t set float control for integer dot

  • nir: move exact bit to nir_fp_math_control

  • amd/drm-shim: add vega20

  • nir/opt_algebraic: move fsat last for fsqrt(fsat(a))

  • ac/nir/lower_sin_cos: use nir_shader_alu_pass

  • ac/nir/lower_sin_cos: preserve fp_math_ctrl

  • ac/nir/opt_pack_half: preserve fp_math_ctrl

  • ac/nir/lower_ps_late: preserve signed zero, inf, nan for exports

  • aco/insert_NOPs: explicitly wait for sa_sdst to resolve SALU -> VALU hazards

  • aco/tests: test VALUReadSGPRHazard with v_cmpx

  • aco/tests: test VALUMaskWriteHazard with v_cmpx

  • aco/tests: don’t destroy vk_device if it was never created

  • nir: make fquantize2f16 32bit only

  • nir/constant_expression: remove fquantize2f16 denorm special case

  • hasvk: create a new intrinsic for push constant to uniform load lowering

  • brw: make sure nir_opt_algebraic_late was called after late brw_nir_optimize

  • nir/constant_expressions: don’t avoid unused source variable warnings

  • nir/constant_expressions: flush input denorms if denorms have to be flushed

  • nir: document that both input and output denorms have to be flushed

  • nir/opt_algebraic: use fcanonicalize

  • nir/search: allow inexact patterns if denorms have to be flushed

  • ci: update trace checksums

  • radeonsi: only override float_mode for llvm

  • aco: add fma_mix opcodes with rtz fp16 rounding

  • aco/insert_fp_mode: exclude some instructions that will never round

  • aco/insert_fp_mode: insert fp mode in reverse

  • aco/optimizer: support fma_mix with rtz

  • aco/optimizer: apply v_cvt_pkrtz_f16_f32 as fma_mix to operands

  • ac/nir,radv: remove ac_nir_opt_pack_half

  • aco/optimizer: fix parsing salu p_insert as shift

  • aco: fix demote in header of single iteration loop

  • aco: add a helper function for non supported DPP opcodes

  • aco: disable DPP for rev integer subs and shifts

  • nir/opt_algebraic: use correct syntax to create exact fsat

  • aco/lower_branches: consider jump target of conditional branches based on vcc

  • aco: handle all SALU that modifies PC in needs_exec_mask

  • aco/opt_postRA: don’t optimize across calls

Gert Wollny (23):

  • r600/sfn: rework 64 bit to vec2 32 bit lowering

  • r600/sfn: drop unused code

  • r600/sfn: correct register interference range

  • r600/sfn: drop range pinning for registers after RA

  • r600/sfn: extract function to update group after instr insert

  • r600/sfn: move some common code into try_readport

  • r600/sfn: Track whether an ALU group has an exec flag update

  • r600/sfn: make sure kill and update_exec don’t happen in one group

  • r600/sfn: AR loads are not dependent on the future and other code blocks

  • etnaviv: isa: Add “thread” info to TEX instruction

  • r600/sfn: Don’t start a new ALU-CF if LDS pipeline loads are pending

  • r600: Handle dummy dest in assembler and disass

  • r600/sfn: remove some unused static variables

  • r600/sfn: Silence warning about unused parameter

  • r600/sfn: Don’t assign dest registers in non-write interpolation slots

  • r600/sfn: fix querying number of sources for LDS ops in readport validation

  • r600/sfn: don’t use dummy register with non-write 64 bit slots

  • r600/sfn: change register ID of dummy dest register

  • r600/sfn: Add slot access operator to AluGroup

  • r600/sfn: Make value factory a member of the block scheduler

  • r600/sfn: Add method to force-override the dest of an AluInstr

  • r600/sfn: Fix test creation and handling of 3-src without dest

  • r600/sfn: use PS and PV inline registers when possible

Gil Pedersen (1):

  • intel: Add PIPE_FORMAT_R10G10B10X2_UNORM support

Gurchetan Singh (38):

  • virtio: kumquat: slice length fix

  • gfxstream: kumquat: opaque fd or dmabuf, not both

  • gfxstream: codegen: add vkTraceAsyncGOOGLE to GLOBAL_COMMANDS_WITHOUT_DISPATCH

  • gfxstream: codegen: remove CheckOutOfMemory

  • gfxstream: fix build after VK 1.4.33.0 spec update

  • gfxstream: meson format -i {all meson files}

  • subprojects: update rustix and libc to newer versions

  • subprojects: enable proper cross-compile on MinGW of certain crates

  • subprojects: add windows-link and windows-sys

  • subprojects: rustix: enable windows + macos build support

  • subprojects: errno: support for windows

  • util: rust: more rust support for windows/MacOS

  • util: be consistent about transitive dependencies

  • gfxstream: WindowsVirtGPU.h --> WindowsVirtGpu.h

  • gfxstream: enable kumquat building on Windows

  • gfxstream: silence non-null Clang check on Android

  • gfxstream: make functions static when needed

  • gfxstream: delete createImmutableSamplersFilteredImageInfo

  • gfxstream: codegen: don’t generate custom protocols in function table

  • gfxstream: more fixes for missing prototypes

  • util: fix arithmetic on a pointer to void warning

  • meson: add -Wgnu-pointer-arith to _trial_msvc

  • android_stub: add missing definition

  • util: fix error about missing include

  • android_stub: fix missing prototypes issues

  • gfxstream: fix logspam in TLS helper function

  • gfxstream: fix warning

  • gallium/tessellator: fix -Wmissing-prototype issues

  • gfxstream: drm_fourcc.h --> drm-uapi/drm_fourcc.h

  • gfxstream: explicitly list Python dependencies for gfxstream codegen

  • gfxstream: filter VkPhysicalDeviceProperties2 structs before encoder call

  • freedreno: check dependencies before running custom_target(..)

  • virtio/kumquat: fixes to enable meson2hermetic

  • meson: check for <poll.h>

  • meson: avoid calling nm.full_path() when tool is not found

  • meson: add dependency on android-hwvulkan-headers

  • meson,gfxstream: add Android support via meson2hermetic

  • gallium: fix sometimes-uninitialized warning

Hans-Kristian Arntzen (4):

  • vulkan/wsi: Promote EXT_swapchain/surface_maintenance1.

  • vulkan: Add KHR_swapchain_maintenance1 promotions.

  • vulkan/wsi: Add missing KHR_surface_maintenance1 promotions.

  • egl/x11: Fix memory leak when querying translated coord.

Hyunjun Ko (9):

  • anv/video: rework for handling alternative quantizer for vp9 decoding.

  • anv/video: handle segmentation features for vp9 decoding

  • vulkan/video: Fix H.265 short-term reference picture set handling

  • vulkan/video: Fix H.265 long-term reference handling

  • anv/video: fix VP9 chroma subsampling format detection

  • anv/video: clean up VP9 picture state setup

  • anv/video: fix a typo in Vulkan AV1 decoding.

  • anv/video: Compute AV1 tile positions internally

  • anv/video: disable encoder on untested platforms

Iago Toral Quiroga (2):

  • broadcom/compiler: use nir_opt_uub

  • nir/opt_vectorize_load_store: allow sizes unaligned with high offset for loads

Ian Forbes (6):

  • svga: Check if Stencil buffer is NULL

  • svga: Enable GL_ARB_texture_mirror_clamp_to_edge

  • svga: Fix vertex-fallbacks Piglit test

  • svga: Don’t crash if only one of Depth or Stencil buffer is present

  • svga: Report “VRAM” more accurately

  • svga: Set modifier in surface_get_handle

Ian Romanick (36):

  • nir/algebraic: Don’t generate integer min or max that will need to be lowered

  • brw: Apply Gfx9 vgrf127 workaround in more cases

  • elk: Apply vgrf127 workaround in more cases

  • nir/opt_if: See through inot

  • brw: Correctly generate conditional modifier for BFN

  • vulkan: Fix incorrect assert

  • nir/opt_if: Specify which branches are valid for evaluate_if_condition

  • nir/opt_if: Conditionally do not propagate constants through bcsel

  • nir/opt_if: Both parts of logic-joined conditions can be evaluated

  • brw: Don’t spill_all on internal shaders

  • brw: Force allow_spilling when spill_all is set

  • brw: Don’t pass compressed to brw_lower_vgrf_to_fixed_grf

  • brw: Return the new register from brw_lower_vgrf_to_fixed_grf

  • brw: Add OPT macro to brw_shader.cpp like brw_opt.cpp

  • brw: Add fill and spill opcodes for LSC platforms

  • brw: Eliminate redundant fills and spills

  • brw: Eliminate duplicate fills

  • lavapipe: fp16 flrp must also be lowered

  • nir/lower_flrp: Check and set shader_info::flrp_lowered

  • glsl: Move flrp lowering out of the loop

  • elk: only lower flrp once

  • broadcom/compiler: only lower flrp once

  • vc4: Don’t call nir_lower_flrp in vc4_optimize_nir

  • nir/algebraic: Mask with shifted constant instead of shift-then-mask

  • brw: Add brw_reg::is_grf

  • brw/cmod: Don’t propagate between instructions in different groups

  • brw: elk: Disable can_do_cmod for MACH

  • brw/cmod: Allow FIXED_GRF

  • brw/dce: Don’t generate more NULL destinations after brw_lower_3src_null_dest

  • brw/cmod: Propagate to an instruction with same source

  • brw: Do cmod prop again after post-RA scheduling

  • brw: Do cmod prop again after scheduling

  • nir/algebraic: Add missing f on F-strings

  • nir/algebraic: Detect missing f on F-strings

  • mesa: Fix segfaults in _mesa_delete_program and _mesa_reference_program_

  • iris/elk: Restore setting nir->num_uniforms to zero.

Icenowy Zheng (19):

  • gallivm: orcjit: remember Context in addition to ThreadSafeContext

  • pvr: enable samplerMirrorClampToEdge feature

  • pvr: fix cleaning up failed CreateDevice

  • pvr: fix PVR_DEBUG=info when running w/o KHR_display

  • pvr: copy WSI can_present_on_device function from PanVK

  • vulkan/wsi/headless: do not destroy images that are never created

  • pvr: advertise VK_KHR_incremental_present

  • pvr: prevent a NULL dereference for pass-less pipeline creation w/o info

  • pvr: advertise VK_EXT_headless_surface

  • pvr: advertise X11-related WSI instance extensions

  • zink: add Mesa powervr to explicit sync / invalid<->linear allowlists

  • zink: only warn about fillModeNonSolid when used

  • gallivm: orcjit: support GALLIVM_DEBUG=dumpbc

  • mesa: workaround GL_INVALID_OPERATION in GLES 2.0 draws

  • mesa: fix GL_INVALID_OPERATION with GLES1/2 + Kopper

  • mesa: fix GL_INVALID_OPERATION when releasing buffer in GLES1/2 ctx

  • nir/algebraic: fix Python-3.10-incompatible syntax

  • pco: add NIR global_atomic lowering

  • vk: descriptors: sort bindings along with flags

Isaac Marovitz (1):

  • kk: BCn Formats

Iván Briano (10):

  • hasvk: don’t report custom sample locations for sample count 1

  • brw: plug some holes in brw_wm_prog_data

  • brw: shut -Wmaybe-uninitialized up

  • anv: report actual AS descriptor limits

  • nir: clear SAMPLE_MASK_IN if we lowered it

  • nir: add nir_lower_single_sampled::lower_sample_mask_in option

  • anv: maxFragmentShadingRateCoverageSamples is 16 on all platforms

  • anv: coarse_pixel doesn’t require any InputCoverageMaskState

  • anv: enable fragmentShadingRateWithShaderSampleMask on Xe2+

  • brw: fix local_invocation_index with quad derivatives on mesh/task shaders

Janne Grunau (3):

  • hk: Report the correct plane count in VkDrmFormatModifierProperties2?EXT

  • meson: Add asahi to aarch64’s auto-generated drivers

  • util/driconf/asahi: Override GL renderer for web browsers

Jarrett Johnson (1):

  • kk: advertise multiDrawIndirect

Jason Macnak (6):

  • gfxstream: Handle BGRA in Gfxstream AHB format conversions

  • gfxstream: codegen changes for new filenames and namespaces

  • gfxstream: Add Vulkan func/structs for passing debugging data to host

  • gfxstream: Remove unnecessary tag to simplify perfetto trace config

  • gfxstream: Reland “Add Vulkan func/structs for passing debugging da…”

  • gfxstream: Reland “Remove unnecessary tag to simplify perfetto trac…”

Jeff Burnett (1):

  • util: Don’t force 64-bit division on 32-bit platforms

Jesse Natalie (17):

  • d3d12: Only try to compute scaled point size for stream 0

  • u_threaded_context: Use 64-bit bitmask utils

  • zink: Fix 64-bit bitmask usage

  • mesa: Cast bitmasks to 64-bit before negating

  • dzn: Suppress new MSVC warning by upconverting to uint64_t

  • spirv2dxil: Move clip/cull merging from common passes to just spirv2dxil passes

  • wgl: Support contexts created from non-window DCs

  • wgl: Only swap back and front buffers after a successful present

  • wgl/d3d12: Return success based only on Present return

  • d3d12: Allow state promotion for non-simultaneous access textures

  • d3d12: Decay state when resolving context -> global state

  • d3d12: Assert that there’s no front buffer writes

  • d3d12: Ensure that flush_resource causes batches to get flushed

  • d3d12: Don’t promote to read-write states

  • d3d12: Fix resolving global state vs per-context state with promotion

  • d3d12: Don’t use D3D12 B8G8R8X8 format

  • nir: Suppress ‘potentially uninitialized local pointer variable used’ warning

Jianxun Zhang (5):

  • anv: Add a new function to consolidate import paths

  • isl: Add a macro for number of maximum planes of modifiers

  • anv: Replace ANV_MAX_PLANES with ISL_MODIFIER_MAX_PLANES

  • anv: Use gralloc helper to get tiling

  • anv: Enable compression on importing Android buffers (xe2)

Job Noorman (31):

  • ir3: move ir3_catN_absneg to ir3.c

  • ir3: add has_sel_b_fneg compiler flag

  • ir3: allow (neg) on sel.b on a6xx gen4+

  • ci,marge_queue: read token from file by default

  • nir: mark fneg distribution through fadd/ffma as nsz

  • ir3/ra: fix assert during file start reset

  • ir3/ra: reset merge set preferred reg when unavailable

  • spirv: don’t set in_bounds for structs

  • spirv: set in_bounds for ptr_as_array

  • nir: print in_bounds info for deref_type(_ptr_as)_array

  • rusticl: fix mismatched-lifetime-syntaxes lint warning

  • nir: add has_umul_16x16 option

  • nir: add opt_uub pass

  • ir3: add support for umul24

  • ir3: remove unused parameter from ir3_optimize_loop

  • ir3: add options parameter to ir3_optimize_loop

  • ir3: enable nir_opt_uub

  • ir3: add ir3_disasm_options struct

  • freedreno/computerator: add option to print raw disassembly

  • ir3: don’t use list_head for rpt groups

  • ir3: merge rpt groups after postsched

  • ir3/ra: try to allocate subreg movs earlier

  • ir3/ra: try to allocate overlapping regs for shared subreg movs

  • tu: add UBO lowering workaround for Yooka-Laylee

  • ir3/legalize: run dbg nop/sync sched later

  • ir3: print eq and needs_helpers instruction flags

  • ir3/legalize: schedule (eq) more accurately

  • ir3/bisect: fix off-by-one issues while bisecting

  • ir3/legalize: fix (eq) scheduling for sam.s2en

  • ir3: print (eolm)/(eogm) flags

  • tu,freedreno: add chicken bit to enable (eolm)

John Anthony (3):

  • panfrost: Add shader core count to RENDERER string

  • panvk: Add shader core count to deviceName

  • pan: Use correct architecture name for v12+

Jonathan Marek (1):

  • tu: remove magic bo reg packing (use iovas directly)

Jordan Justen (5):

  • intel/dev: Add INTEL_PLATFORM_NVL_U platform enum

  • intel/dev: Add NVL-S/U device info

  • intel/dev: Add NVL-S/U PCI IDs (with FORCE_PROBE required)

  • intel/brw: Add brw_data_type_float/brw_data_type_int

  • intel/brw: Add new encode/decode for use with brw_data_type_float/int

Jose Maria Casanova Crespo (10):

  • v3d: mark FRAG_RESULT_COLOR as output_written on SAND blits FS

  • v3dv: use vk_drm_syncobj_copy_payloads helper

  • v3dv: Enable VK_FORMAT_A2R10G10B10_UINT_PACK32 format

  • v3dv: Enable VK_FORMAT_B8G8R8A8_SINT and VK_FORMAT_B8G8R8A8_UINT formats

  • v3dv: Enable VK_FORMAT_B8G8R8A8_SNORM format

  • v3dv: only apply simulator stride alignment for from_wsi images

  • broadcom/compiler: enable umul24 and imul24 ALU opcodes

  • broadcom: Drop use of nir_lower_wrmasks

  • v3d: Enable TFU blits with raster destinations on 7.1 HW (RPi5)

  • v3dv: Enable TFU blits with raster destinations on 7.1 HW (RPi5)

Josh Simmons (1):

  • radv: Fix crash in sqtt due to uninitialized value

Joshua Ashton (1):

  • vulkan/wsi: Handle 0xFFFFFFFF special case in vk_wsi_force_swapchain_to_current_extent driconf

Joshua Simmons (1):

  • vtn: Fix OpCopyLogical destination type

José Expósito (2):

  • winsys/amdgpu: Fix userq job info log on PPC

  • venus: Fix error log on PPC

José Roberto de Souza (29):

  • intel/dev: Add supports_low_latency_hint to intel_device_info

  • anv: Add support for low latency hint on Xe KMD

  • iris: Release global_bufmgr_list_mutex on missing error paths

  • iris: Move code to emit binding tables to its own function

  • iris: Improve iris_emit_binding_tables()

  • iris: Move code to emit push constants to its own function

  • iris: Rename iris_binding_table::sizes to iris_binding_table::surf_count

  • intel/brw: Split to a function the code that calculates sampler channels that should be written

  • iris: Fix slab memory leak

  • iris: Make uint32 the type used for slab sizes

  • intel/brw: Nuke brw_inst::is_volatile()

  • drm-uapi: Sync xe_drm.h

  • anv: Add support to DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION

  • iris: Add support to DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION

  • intel/brw: Add comment to ubo_ranges

  • intel/brw: Document UBO_START

  • anv: Fix variable shadowing

  • anv: Set push_constant_range once

  • iris: Reduce COMPUTE_WALKER_BODY code duplication

  • iris: Remove duplicated iris_measure_snapshot(INTEL_SNAPSHOT_COMPUTE)

  • iris: Nuke iris_emit_execute_indirect_dispatch()

  • intel/blorp: Remove duplicated calls in blorp_exec_compute()

  • anv/hasvk: Nuke register_config from anv_performance_configuration_intel

  • anv/hasvk: Add intel_perf_get_configuration_id() and replace intel_perf_load_configuration() usage

  • intel/perf: Nuke intel_perf_load_configuration() and related code

  • intel/perf: Add Xe2 mdap_metrics struct and set it

  • intel/perf: Extend Xe2 mdap_metrics to Xe3

  • intel/perf: Change mdapi switch cases from ver to verx

  • intel/perf: Add Gfx 12.5 mdap_metrics struct and set it

Juan A. Suarez Romero (31):

  • v3d/ci: add new flakes for rpi5

  • broadcom/ci: document some of the failures

  • broadcom/ci: disable baremetal jobs already running with CI-Tron

  • broadcom/ci: unlock more CI-Tron jobs

  • vc4/simulator: create GEM BOs in GTT memory for AMD GPUs

  • vc4/simulator: add helper to get stride alignment

  • vc4: set stride alignment when using simulator

  • v3d/simulator: create GEM BOs in GTT memory for AMD GPUs

  • broadcom/simulator: add helper to get stride alignment

  • v3d: set stride alignment when using simulator

  • v3dv: align width to 256 when using simulator

  • v3d: enable forward facing primitive for lines and points

  • v3dv: enable forward facing primitive for lines and points

  • broadcom/ci: adjust fractions for nightly jobs

  • broadcom/ci: unlock more CI-Tron jobs

  • v3dv/ci: add timeout in expected list

  • broadcom/ci: unlock more CI-Tron jobs

  • v3d/ci: add SKQP failure

  • v3d/ci: update expected results

  • broadcom/ci: remove all baremetal nightly jobs

  • broadcom/ci: remove `ci-tron-` prefix from nightly jobs

  • broadcom/ci: update expected list

  • broadcom/ci: set testgroup size for asan

  • v3d: don’t build disk cache access on shader disablement

  • v3dv/ci: skip tests causing GPU issues

  • broadcom/compiler: enable skip_helpers

  • v3dv: create a proper load_uniform instruction

  • nir: add ACCESS to load_uniforms

  • broadcom/compiler: use skip_helpers with textures, UBOs and SSBOs

  • v3d/ci: add new fail in expected list

  • broadcom/cle: bump up gen version for v3d

Julia Zhang (1):

  • amdgpu/virtio: unmap bo in destroy_host_blob

Julian Orth (1):

  • kopper: disable color management for wayland surfaces

Juston Li (2):

  • anv/android: align AHardwareBuffer naming to ahb

  • anv/android: query and use explicit layout for ahb resolve

Karmjit Mahil (12):

  • tu: Use TU_BREADCRUMBS_ENABLED value

  • ir3: fix comparison of different signedness build issue

  • freedreno/afuc: Fix potentially uninitialized variable

  • freedreno/decode: Add code to extend pkt processing with Lua

  • freedreno/cffdump: Emulate RMW

  • freedreno/docs: Add -k option to nc command

  • freedreno/registers: Remove extra space in reg definition

  • meson: Use adreno-pm4-pack.xml.h instead of custom definitions

  • freedreno/decode: Add some code to the already present generate-rd

  • freedreno/registers: Mark functions as constexpr where possible

  • freedreno/registers: Clarify bit 64B of CP_REG_TO_MEM

  • gallium: Fix gnu-empty-initializer error

Karol Herbst (36):

  • nak: extract cmat load/store element offset calculation

  • nak: ensure deref has a ptr_stride in cmat load/store lowering

  • nak: simplify SM80 HMMA latency categorization

  • nak: improve fp16 latencies on Ampere

  • nak: fix MMA latencies on Ampere

  • st/interop: fix fence leak

  • rusticl/queue: fix error code for invalid queue properties part 1

  • rusticl/queue: fix error code for invalid queue properties part 2

  • rusticl/queue: fix error code for invalid sampler kernel arg

  • rusticl/kernel: take no kernel_info reference inside the launch closure

  • rusticl/spirv: preserve signed zeroes by default

  • rusticl/kernel: fix clGetKernelSuggestedLocalWorkSizeKHR implementation

  • rusticl/kernel: Do not run kernels with a workgroup size beyond work_dim

  • nvk/ci: add broken coop matrix CTS tests to skips

  • nak/cmat: add alignment info to matrix load/stores

  • nak/cmat: add optimisation to cmat load/store to do 32-bit load for f16vec2

  • nir: mark cmat_load_shared_nv as CAN_ELIMINATE

  • nak: add Movm

  • nak/cmat: use movm

  • nir: add ACCESS to shared_uniform_block_intel

  • rusticl: remove unnecessary transmutes around uuids

  • rusticl/mesa: remove unnecessary lifetimes

  • rusticl/mesa: convert pointer to ref without transmute in PipeScreen::from_raw

  • gallium: add SUBGROUP_FEATURE bits for rotate and rotate_clustered

  • llvmpipe: advertise support for subgroups in all stages

  • clc: handle all optional subgroup extensions

  • rusticl: properly check for subgroup support

  • docs: reorder and add zink to CL subgroup entries

  • clc: reorder headers to fix compilation errors due to UNUSED

  • clc: support some atomic and generic address space features

  • clc: enable generic address space and seq_cst and device scope atomic features

  • nir: fix nir_fixup_is_exported for LLVM-22

  • clc: fix compile compatibility with LLVM-22

  • rusticl/mesa: only use resource_from_user_memory if the cap is advertised

  • vtn/opencl: flush denorms for cbrt()

  • vtn: set default fp_math_ctrl values for kernels

Kenneth Graunke (52):

  • iris, crocus: Disable new IO slot validation for FB fetch load_output

  • elk: Disable IO semantic validation when remapping patch offsets

  • ci: Run intel shader-db on Haswell, Broadwell, and Meteorlake

  • nir: Drop writemask from all Intel memory store intrinsics

  • brw: Add an assertion that writemasks can be fully ignored

  • brw: Use nir_intrinsic_[set_]base rather than poking at const_index[0]

  • brw: Store brw_urb_inst::offset in bytes on Xe2

  • iris: Use iris_any_prog_key, not brw_any_prog_key

  • brw: Delete program_string_id from brw program keys

  • brw: Delete input_slots_valid from brw_wm_prog_key

  • brw: Set extended_bindless_surface_offset to true for Gfx12.5+

  • nir: add new intrinsics to load/store from URB on intel

  • brw: Implement URB handle intrinsics for TCS and TES stages

  • brw: Pass devinfo to brw_nir_lower_tes_inputs

  • brw: Flip the TESS_LEVEL_INNER/OUTER vue map slot assignments

  • brw: Rework the tess level remapping interface

  • brw: Remap tesslevels before other patch remapping

  • brw: Rename remap_non_header_patch_values to remap_patch_values

  • brw: Pass devinfo into remap_patch_urb_offsets

  • brw: Lower tesslevel vars to vectors even for unlinked TCS/TES

  • brw, anv, iris: Switch to reversed patch header layouts

  • brw: Rename read_attribute_payload_intel to load_attribute_payload_intel

  • brw: Generalize read_attribute_payload_intel to handle more cases

  • brw: Use io_sem.location instead of base to get varying slots

  • brw: Add infrastructure for lowering to URB intrinsics

  • brw: Switch to NIR URB intrinsics for TCS outputs

  • brw: Switch to NIR URB intrinsics for TES inputs

  • brw: Switch to URB intrinsics for TCS inputs

  • brw: Rewrite legacy tess level remapping

  • brw: Drop check for legacy tess levels from remap_patch_urb_offsets

  • brw: Combine output stores for TCS outputs even when unlinked

  • intel/elk: Also disable output constant offset src folding

  • brw: Fix outdated comments about urb->offset units

  • brw: Use LSC extended descriptor offsets for Xe2 URB messages

  • intel: Replace signed char with int8_t

  • brw: Calculate tessellation URB offsets when lowering to URB intrinsics

  • brw: Rename brw_nir_lower_vue_inputs to brw_nir_lower_gs_inputs

  • brw: Add missed access to store_urb_lsc_intel intrinsics

  • brw: Delete attr_desc struct

  • nir: Fix mod analysis of ishl to shift the recursive result

  • brw: Extend load_urb/store_urb to handle 32-bit non-vec4-aligned access

  • brw: Make lower_{inputs,outputs}_to_urb_intrinsics non-static

  • brw: Extend URB lowering infrastructure to handle mesh shader outputs

  • brw: Lower mesh shader outputs in NIR

  • brw: Lower task shader payload access in NIR

  • brw: Delete all the old backend mesh/task URB handling code

  • nir: Support Intel URB intrinsics in nir_opt_offsets

  • brw: Call nir_opt_offsets for mesh shaders

  • brw: Update try_load_push_input to handle dword-unit offsets too

  • brw: Make max_push_bytes a parameter to URB lowering data

  • brw: Move GS URB Read Length limiting to brw_nir_lower_gs_inputs()

  • brw: Convert GS pulled inputs to use URB intrinsics

Khem Raj (1):

  • glx: fix const qualifier warnings found with C23 glibc support

Kitlith (3):

  • hk: override can_present_on_device

  • panvk: Free drm device in can_present_on_device

  • pvr: Free drm device in can_present_on_device

Konstantin Seurer (60):

  • vulkan/cmd_queue: Fix indentation for struct array copies

  • vulkan/cmd_queue: Free all elements of struct arrays

  • radv/bvh: Add radv_first_active_invocation

  • vulkan: Add vk_ir_header::driver_internal

  • vulkan: Bump MAX_ENCODE_PASSES to 4

  • vulkan/bvh: Add some debug helpers

  • radv/rra/gfx12: Properly validate geometry indices

  • radv: Emit compressed primitive nodes on GFX12

  • vulkan: Remove the vk_ir_triangle_node::id field

  • vulkan/bvh: Add leaf.h to vk_bvh_includes

  • radv/bvh: Pair compress triangles in more cases

  • aco: Fixup out_launch_size_y in the RT prolog for 1D dispatch

  • radv: Always use compact bvh encoding

  • radv: Report smaller bvh sizes when possible

  • lavapipe: Bump maxPrimitiveCount

  • lavapipe: Zero image null descriptors

  • lavapipe: Bump MAX_DESCRIPTOR_UNIFORM_BLOCK_SIZE

  • gallivm/nir/soa: Use the sign of src1 for imod

  • llvmpipe: Always recompute 1/w

  • nir: Remove parallel copy handling from rewrite_uses_to_load_reg

  • nir/from_ssa: Stop using nir_parallel_copy_instr

  • nir: Remove nir_parallel_copy_instr

  • radv: Add re-format commit to .git-blame-ignore-revs

  • nir: Move nir_def directly after nir_instr

  • treewide: add & use parent instr helpers

  • nir: Remove nir_def::parent_instr

  • nir: Fix typo in nir_opt_ray_query_ranges

  • nir: Ignore ray query ranges that don’t start with rq_initialize

  • radv: Use hw_leaf_node_count for computing BVH size

  • radv/rra/gfx12: Fix primitive/geometry index validation

  • radv/bvh: Assert that indices_midpoint is valid

  • radv/bvh: Fix calculating the vertex payload/prefix sizes

  • radv/bvh: Avoid a slow case when compressing triangles

  • radv/nir: Use fmt_idx correctly

  • radv: Optimize BVH4 acceleration structure updates

  • nir/opt_algebraic: Remove a pattern for 8bit floats

  • nir/opt_algebraic: Do not emit patterns for 64bit booleans

  • nir/print: Print annotations as comments

  • nir: Allow shaders in tests to be annotated

  • nir: Allow using nir_eval_const_opcode in C++ code

  • nir: Add f2f16_ru/rd opcodes

  • spirv: Add internal f2f16 opcodes

  • aco: Add support to f2f16 with rtpi/rtni

  • radv/rra: Count box16 nodes properly

  • radv/bvh: Add radv_aabb16 and use it for box16 nodes

  • radv/bvh: Use box16 nodes when bvh8 is not used

  • radv: Fix crash if proceed comes before initialize

  • nir: Add an assert_eq intrinsic for testing nir_opt_algebraic

  • nir: Fix the types of udot_.*_uadd_sat

  • nir: Add a unit test base class for algebraic patterns

  • nir/opt_algebraic_tests: Add an option for generating unit tests

  • nir: Generate unit tests for nir_opt_algebraic

  • vulkan: Implement HPLOC

  • radv: Use HPLOC for TLAS builds

  • vulkan: Handle inactive primitives with LBVH builds

  • vulkan: Avoid NAN in the IR BVH

  • vulkan: Limit the number of LBVH invocations

  • radv/rra: Fix nullptr dereference

  • vulkan: Make sure no NaNs end up in the BVH

  • radv/bvh: Make sure internal nodes are collapsed when possible

Lakshman Chandu Kondreddy (1):

  • dri: Add R32F,RG32F,RGBA32F format mappings for DRIImage

Lars-Ivar Hesselberg Simonsen (23):

  • panvk/v9+: Reduce maxBoundDescriptorSets to 7

  • panvk: Only call req_res when required

  • panvk: Fix IUB decode

  • pan/format: Fix mapping for I16F

  • pan/format: Disable PAN_BIND_STORAGE_IMAGE for RGBA4/BGRA4

  • panfrost: Rename (LD|LEA)_BUFFER to (LD|LEA)_PKA

  • pan/va: Change LEA_BUF_IMM src description

  • pan/va: Add LEA_BUF

  • pan/genxml: Remove reg_format from v9+ ConversionDesc

  • nir: Add pan intrinsics for texel buffer access

  • pan/va: Add late lowering passes for texel buffers

  • pan/format: Add PAN_BIND_TEXEL_BUFFER

  • panvk: Increase maxBufferSize to UINT32_MAX

  • pan/v9+: Remove unnecessary nir_u2u32 from load_tex_size

  • panfrost/bi: Fix potential out-of-bounds writes

  • glsl/nir: Add texture_buffers to shader info

  • nir: Add channels to pan texel_buf intrinsics

  • pan/bi: Add texel buf lowering support for Bifrost

  • pan/bi: Add lowering pass for texel buffer indices

  • panvk/bi: Add texel buffer branch to meta_desc_copy

  • pan/bi: Make texel buffers use Attribute Buffers

  • pan/bi: Change texel buffer limits

  • panfrost/bi: Fix unbound texel buffers

Laura Nao (3):

  • ci: Enable Perfetto tracing support in Mesa builds for Linux/Android

  • ci/prepare-artifacts: Keep pps-producer binary in artifacts

  • ci/container: Add script to build Perfetto tracebox

Leon Perianu (1):

  • pvr: pvr_pds_fragment_program_create fix allocation callback usage

LingMan (5):

  • rust: build `equivalent` dependency with the correct edition

  • rust: build `paste` dependency with the correct edition

  • rust: build `ucd-trie` dependency with the correct edition

  • meson: silence warnings in rust subprojects

  • meson: specify minimal target meson version for rust subprojects

Linus Karl (2):

  • rocket: fix build on non LP64 architectures

  • ethos: fix build on non LP64 architectures

Lionel Landwerlin (168):

  • Revert “wsi: Implements scaling controls for DRI3 presentation.”

  • brw: add a new sampler payload parameter description

  • brw: port some NIR lowering to the sampler payload description

  • brw: switch to new sampler payload description scheme

  • brw: new Xe2 sampler opcodes

  • anv: reenable KHR_maintenance8 on Xe2+

  • brw: get rid of GET_BUFFER_SIZE opcode

  • anv: fix image-to-image copies of TileW images

  • brw: account for disabled SEND fused message in cycle computation

  • Revert “brw: add serialize send stats”

  • brw: add missing offset to MCS fetching messages

  • brw: constant fold u2u16 conversion on MCS messages

  • brw: only consider cross lane access on non scalar VGRFs

  • brw: fix ballot() type operations in shaders with HALT instructions

  • brw: fix missing generation requirement on sampler opcode

  • nir/divergence: fix handling of intel uniform block load

  • brw: mark divergence data as valid for debug purposes

  • brw: handling dynamic programmable offsets pre-Xe2

  • anv: reenable VK_KHR_maintenance8 on pre-Xe2 platforms

  • anv: rename structure holding 3DSTATE_WM_DEPTH_STENCIL state

  • brw: handle GLSL/GLSL tessellation parameters

  • nir/lower_io: add missing levels intrinsics to get_io_index_src_number

  • anv/brw: fix output tcs vertices

  • anv: destroy sets when destroying pool

  • anv: expose VK_EXT_shader_uniform_buffer_unsized_array

  • vulkan/runtime: enable null pointer to vkCmdSetSampleMaskEXT()

  • vulkan/render_pass: Add a missing sType

  • vulkan/render_pass: handle maintenance10 resolve flags

  • anv: implement VK_KHR_maintenance10

  • Revert “anv: Convert DEBUG_SPARSE logging to use mesa_log”

  • brw: disable io_semantic validation for mesh intrinsics

  • u_trace: reserve chunk space before emitting copies

  • anv: avoid null pointer access in utrace copies on CCS

  • brw: avoid invalid URB messages

  • anv: enable accelerationStructureCaptureReplay

  • anv: avoid invalid timestamp generation due to skipped commands

  • anv: don’t use IndirectStatePointersDisable at the end of secondaries

  • anv: avoid unnecessary stalling on secondaries

  • brw: stop emitting flush operations for begin/end interlock

  • vulkan/runtime: split out partitioning logic

  • vulkan/runtime: simplify robustness state hashing

  • vulkan/runtime: drop some geometry shader hashing

  • vulkan/runtime: drop blake3 hash on precomp shaders

  • vulkan/runtime: split precomp shader hashing from precomp loading

  • vulkan/runtime: keep the set layouts on the stack until pipeline creation

  • vulkan/runtime: use stage flags to track valid stages

  • vulkan/runtime: split compute shader hashing from compile

  • vulkan/runtime: split graphics shaders hashing from compile

  • vulkan/runtime: split rt shaders hashing from compile

  • vulkan/runtime: use only blake3_hash to shader key

  • vulkan/runtime: switch precomp shaders to blake3 hashes

  • vulkan/runtime: track imported stages

  • vulkan/runtime: implement VK_KHR_pipeline_binary

  • anv: enable KHR_pipeline_binary support

  • anv: limit maxComputeSharedMemorySize to 48KiB

  • anv/blorp/iris: rework Wa_14025112257

  • anv: disable software detiling on Xe2+ for image atomics 64bits

  • intel/isl: add INTEL_DEBUG=noccs-modifier to disable CCS modifiers

  • anv: ensure shader printf is functional on all backends

  • brw: fixup 64bit atomics emulation on 2D array images

  • anv: consider 64bit atomics on similar formats with mutable images

  • brw: fixup immediate bindless surface handling

  • brw: fix SIMD lowering of sampler messages with fp16 data

  • vulkan/runtime: fix incorrect assert on empty shader groups

  • anv: track descriptor mode in SBA tracepoint

  • anv: optimize pipeline switching with secondaries

  • brw: fix workaround fence rlen field

  • anv: fixup load_ubo lowering

  • anv: ensure slab allocated memory matches image requirements

  • anv: split non binding related intrinsics from apply_layout

  • anv: bump maxTessellationControlTotalOutputComponents

  • anv: Wa_18040903259 only applies to RCS when in GPGPU mode

  • anv: avoid pipe control reason tracking in emit_pipe_control

  • anv: put more readable PIPE_CONTROL reasons

  • brw: compute final copy propagation resulting source

  • brw: fix SS surfaces usage

  • nir: print out number of printfs

  • nir: fix lower_printf with no arguments

  • spirv: fix printf generation

  • nir/lower_printf: fix array alignment

  • nir/lower_printf: fix missing singleton add

  • anv: enable mesh/task shader hashes

  • anv: enable application shader printfs with debug option

  • brw: switch to load_(pixel_coord|frag_coord_z|frag_coord_w) intrinsics

  • anv: shrink image opaque data

  • brw: use default builder for urb handle adjustment

  • brw: Implement load/store URB intrinsics

  • anv: remove errors on format queries

  • brw: fix sample mask flag emission

  • anv: add 32-wide subgroup requirement heuristic

  • brw/iris: remove fs key for coherent_fb_fetch

  • vulkan/runtime: track dynamic descriptor offsets for RT pipelines

  • anv: fix broken ray tracing dynamic descriptors

  • vulkan/runtime: add an internal flag for independent sets

  • anv: reintroduce non independent sets dynamic descriptor optimization

  • iris: lower load_num_workgroups

  • anv: move load_num_workgroups tracking to driver

  • brw: remove driver specific load_num_workgroup lowering

  • vulkan/runtime: include unaligned dispatch bit in hashing

  • anv/brw: drop cs_prog_key::lower_unaligned_dispatch usage

  • anv: fix internal representations of shaders

  • intel: remove unused show_shader_stage debug option

  • anv: add missing device_memory_report for shaders

  • anv: fixup error path for shader allocation

  • anv: program STATE_BASE_ADDRESS instruction ptr using pdevice address

  • anv: fix dynamic buffers & independent sets

  • anv: switch shader heap placement to beginning of heap by default

  • anv: remove unused gpu_memcpy function

  • anv: remove use of emit_apply_pipe_flushes() in various helpers

  • anv: add tracking of involved stages in pipe flushes

  • anv: move cs/pb-stall detection to flushing function

  • anv: remove pb-stalls from various locations

  • anv: update pipeline barriers for Xe2+

  • anv: consider CS coherent with L3 on Xe2+

  • anv: disable deferred bits on Gfx20+

  • anv: remove unused event field

  • anv: store event creation flags

  • anv: use the blitter/video barrier helper for event signalling

  • anv: switch events to use 0/!0 values for unsignaled/signaled

  • anv: use flushing PIPE_CONTROL for event signaling

  • anv: use anv_add_pending_pipe_bits for event reset

  • intel: rename DCFlushEnable to ForceDeviceCoherency

  • anv: introduce a new virtual pipecontrol flag for BTI change

  • anv: implement Wa_18037648410

  • anv: use RESOURCE_BARRIER for event waiting when possible

  • anv: instrument resource barriers instruction in u_trace

  • anv: implement WA_18039014283

  • anv: add a no-resource-barrier debug flag

  • anv: disable crast on SKL

  • brw: Implement URB handle intrinsics for task/mesh stages

  • brw: move MUE initialization out of the SIMD loop

  • anv: remove CS-L3 coherency on Xe2

  • nir/printf-helpers: set writes_memory at printf emission

  • nir: add missing divergence handling for ray_query_global_intel

  • nir: use load() helper for inline_data_intel

  • nir: add a new push_data_intel intrinsic

  • brw: invert condition to reduce code nesting

  • brw: add a pass to lower ubo to push constant data

  • anv: stop going through push ranges on the first empty slot

  • anv: ensure internal compute kernels are run at SIMD16

  • anv/brw/iris: get rid of param array on prog_data

  • iris: manage TBIMR null push constant wa in driver

  • intel: rework push constant handling

  • anv/brw: prep work for SIMD32 ray queries

  • brw: enable ray query spilling in SIMD32

  • brw: handle lowering of a couple of opcodes

  • brw: enable topology opcodes in SIMD32

  • brw/nir/rt: ensure we can load 2 RT_DISPATCH_GLOBALS

  • brw: enable SIMD32 compute shaders with ray queries

  • brw: fix derivatives on non 32bit floats

  • brw: handle layer_id only through system value

  • brw: drop unused color_outputs_valid key

  • brw: switch buffer/image size intrinsics lowering to NIR

  • anv: remove all kinds of useless info for internal shaders

  • anv: enable debug printfs on internal shaders

  • brw: add missing base offset decoding

  • brw: improve push constant loading using base offsets

  • brw: apply same workaround to spawn as trace opcode

  • brw: treat inline parameters like UNIFORM

  • nir/compiler_options: add nir_load_pixel_coord

  • brw: set nir_shader_compiler_options::has_pixel_coord

  • brw: populate wm_prog_data earlier

  • brw: make coarse pixel bit available to NIR lowering

  • nir: add intrinsics for Z calculation in shaders with FSR

  • brw: move coarse_z computation to NIR

  • brw: use fp64 to compute coarse_z

  • iris: fix incorrect intrinsic usage on ELK

  • vulkan/wsi/direct: remove VkDisplay created from GetDrmDisplayEXT on ReleaseDisplayEXT

Lorenzo Rossi (8):

  • vulkan: increase MESA_VK_MAX_DISCARD_RECTANGLES

  • nvk: implement VK_EXT_discard_rectangles

  • nak/dataflow: Fix typo in comments

  • nak: Add latency_upper_bound to ShaderModel

  • nak/reg_tracker: Add SparseRegTracker

  • nak: Add cross-block instruction delay scheduling

  • nak: Fix delay insertion missing WaR

  • nak/sm120: Fix panic for CS2R during prepass

Loïc Molinari (1):

  • panfrost: Fix clean_pixel_write_enable forced check for AFBC

Lucas Fryzek (8):

  • util: Move ASTC unpack routines to common util

  • anv: For HIC only convert tile worth of memory at a time

  • anv: Implement host_image_copy astc emulation on CPU

  • anv: Enable host_image_copy on emulated formats

  • lvp: Enable VK_FORMAT_R4G4B4A4_UNORM_PACK16

  • lp: Implement gallium depth_bounds_test capability

  • drisw: Modify drisw_swap_buffers_with_damage to swap entire buffer

  • Revert “drisw: Copy entire buffer ignoring damage regions”

Lucas Stach (5):

  • etnaviv: blt: fix tile count calculation for in-place resolve

  • etnaviv: don’t emit steering state when uniforms are unchanged

  • etnaviv: check all necessary dirty bits when marking constbufs during draw

  • etnaviv: simplify constant dirty bit handling during state emission

  • etnaviv: idle the pipe before flushing texture caches

Ludvig Lindau (8):

  • panfrost/panvk: Merge stores in vector spills

  • panfrost/panvk: Reduce fills from LCRA

  • panfrost: Make instrs_equal check res table/index

  • pan/va: Add LD_CVT

  • pan/genxml: Move BufferDescriptor for v9+

  • pan/genxml: Add ConversionDesc to v9+ BufferDescriptor

  • pan/v9+: Make texel buffers use BufferDescriptor

  • pan/v9+: Change texel buffer limits

Luigi Santivetti (11):

  • pvr: split out driver specific framebuffer data

  • pvr: split framebuffer attachments allocation and setup

  • pvr: split framebuffer clear values allocation and setup

  • pvr: split out device tile buffers teardown

  • pvr: split out command buffer render pass inheritance

  • pvr: be more restrictive of when to emit vdm terminate

  • pvr: do not assert in multi-layer rta emulated path

  • pvr: get the format for start of render clears from pass info

  • pvr: move code for resolving attachments

  • pvr: add support for VK_KHR_dynamic_rendering

  • pvr: enable VK_KHR_dynamic_rendering

Marek Olšák (182):

  • r300: fix DXTC blits

  • winsys/radeon: fix completely broken tessellation for gfx6-7

  • radeonsi/ci: update hawaii failures

  • radeonsi: rename si_get_strmout_en -> si_get_streamout_enable_state

  • radeonsi: rename num_active_shader_queries -> streamout.num_ngg_queries

  • radeonsi: return false from si_update_ngg early on gfx11+

  • radeonsi: allow queries to return more than UINT32_MAX

  • radeonsi: cosmetic changes for queries

  • radeonsi: mostly fix NGG streamout overflow queries when XFB is disabled

  • zink: fix mesh and task shader pipeline statistics

  • amd: don’t use non-existent GLM packet fields on gfx12

  • amd: don’t use non-existent GL1 packet fields on gfx12

  • ac/surface: add helper use_tile_swizzle to consolidate that logic

  • winsys/amdgpu: don’t set ac_surf_info::surf_index = NULL

  • radv: don’t set ac_surf_index::surf_index to NULL

  • radv: don’t check VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT for surf_index

  • radv: don’t check vk_format_is_depth_or_stencil for surf_index

  • radv: move VK_IMAGE_USAGE_HOST_TRANSFER_BIT checking to ac_surface.c

  • radv: move more surf_index logic to use_tile_swizzle

  • radv: set RADEON_SURF_SHAREABLE for surf_index logic

  • ac/surface: pass ac_addrlib* everywhere instead of ADDR_HANDLE

  • ac/surface: move surf_index and fmask_surf_index into ac_addrlib

  • amd: constify struct radeon_surf

  • ac/surface: pass all ac_compute_surface info via ac_surf_config, not radeon_surf

  • radeonsi: enable ACO by default

  • radeonsi/ci: update failures

  • nir/lower_indirect_derefs: don’t lower compact arrays unconditionally to fix perf

  • ac/nir: set support_indirect_inputs/outputs in common code

  • nir: add nir_intrinsic_ssbo_descriptor_amd for lowering get_ssbo_size

  • amd: lower get_ssbo_size in ac_nir_lower_resinfo

  • nir/lower_io: force src offset=0 for any indirect access with num_slots == 1

  • nir/validate: expand IO intrinsic validation with nir_io_semantics

  • Revert ABI breakage “amd: Add user queue HQD count to hw_ip info”

  • nir/lower_interpolation: check IO location correctly

  • gallium/noop: don’t unref buffers passed to set_vertex_buffers to fix crashes

  • radv: set ZMM_TRI_EXTENT for conservative rasterization == overestimate

  • nir: add NIR_PASS_ASSERT_NO_PROGRESS

  • nir/opt_copy_propagate: refactor for readability, describe missing stuff

  • nir: rename nir_copy_prop -> nir_opt_copy_prop

  • nir: document how nir_opt_dce works

  • nir: document how nir_opt_cse works and suggest improvements

  • nir: add nir_separate_merged_clip_cull_io

  • nir,glsl,zink: remove the option nir_io_separate_clip_cull_distance_arrays

  • ac,radeonsi: remove gfx11 FW-based MCBP

  • nir: for nir_shift_channels, fill undefined components with undef instead of .x

  • nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees

  • nir/lower_io_passes: lower indirect TCS outputs sooner and clarify the behavior

  • nir/lower_io_passes: simplify conditions for when to lower IO to temps

  • nir/lower_io_passes: fold bool lower_indirect_inputs

  • nir/lower_io_passes: only sort variables for nir_lower_io_vars_to_temporaries

  • ac: document RELEASE_MEM limitation with PS_DONE/CS_DONE on gfx6-11

  • ac,winsys/amdgpu: report why ac_query_gpu_info failed

  • nir: fix a typo in NIR_PASS_ASSERT_NO_PROGRESS for non-debug builds

  • nir/lower_io_passes: call nir_opt_undef to eliminate undef output stores

  • st/mesa: call nir_opt_intrinsics for the GL_SELECT shader

  • st/mesa: call nir_opt_intrinsics slightly later

  • gallium/hud: don’t fclose stdout for GALLIUM_HUD=…,stdout

  • ac/nir: move aco_nir_op_supports_packed_math_16bit here

  • ac,radv: move opt_vectorize_callback to common code

  • nir/validate: don’t require offset src to be 0 if constant

  • nir: handle load_fs_input_interp_deltas in nir_is_input_load

  • nir: add shader_info::disable_input/output_offset_src_constant_folding

  • nir/opt_constant_folding: add nir_io_add_const_offset_to_base behavior

  • nir: remove nir_io_add_const_offset_to_base

  • nir/recompute_io_bases: don’t use safe iterators

  • nir/recompute_io_bases: move color input bases after all other inputs

  • nir/recompute_io_bases: report progress only if anything was changed

  • gallium: add a flag to finalize_nir to allow drivers to skip NIR opts

  • amd: rename most GFX115x definitions for released chips

  • nir/has_divergent_loop: require divergence metadata, check all function impls

  • winsys/amdgpu: retry the CS ioctl on -ENOMEM only if GDS OA is used

  • winsys/amdgpu: protect driver stats changes by a mutex

  • iris: add struct iris_scissor_state because pipe_scissor_state will be changed

  • panfrost: don’t expose 32K textures because st/mesa doesn’t support them

  • gallium: change pipe_scissor_state to 32 bit integer

  • gallium: change pipe_framebuffer_state width/height to 32-bit integer

  • gallium: declare pipe_resource::height0 as 32-bit integer for 64K textures

  • gallium/u_blitter: change width/height parameters to 32-bit integer

  • mesa: remove unused _mesa_total_texture_memory

  • mesa: remove unused mesa_store_cleartexsubimage, _mesa_store_compressed_teximage

  • mesa: remove unused make_null_texture

  • mesa: merge mostly duplicated mesa_format_image_size & mesa_format_image_size64

  • mesa: use size_t for image address computations

  • mesa: remove MaxTextureMbytes, use the cap instead

  • mesa: bump MAX_TEXTURE_RECT_SIZE, MAX_RENDERBUFFER_SIZE

  • mesa: raise MAX_TEXTURE_LEVELS to 17 to allow 64K mipmap textures

  • st/mesa: don’t use the PBO GetTexImage compute shader for 64K textures

  • st/mesa: disallow the PBO upload fragment shader

  • radeonsi: fix a few non-critical 64-bit integer overflows

  • radeonsi: reject textures that don’t fit in the CPU address space

  • radeonsi: allow 64K viewports

  • st/mesa: remove bogus framebuffer state assertions

  • radeonsi: enable 64K x 64K textures

  • zink/ci: update fixed tests

  • nir/lower_io_vars: don’t insert output stores for unrelated streams before emits

  • nir/gather_info: clear clip/cull_distance_array_size if the IO is not present

  • nir: split gathering array sizes from nir_lower_clip_cull_distance_array_vars

  • nir: give nir_lower_clip_cull_distance_array_vars a better name

  • nir: add FRAG_RESULT_DUAL_SRC_BLEND and an option to use it

  • radv,radeonsi: use FRAG_RESULT_DUAL_SRC_BLEND

  • ac/nir: allow smaller workgroups for GS

  • nir: fix the value of nir_io_use_frag_result_dual_src_blend

  • nir/print: print tex->sampler_dim

  • nir/lower_io: remove unused option nir_lower_io_lower_64bit_float_to_32

  • nir/lower_io: explain properly how nir_lower_io_lower_64bit_to_32* options work

  • nir/opt_cse: update potential future plans merging copy propagation with CSE

  • radeonsi: double pixel throughput in certain cases of PS without inputs

  • radeonsi: don’t load sampler states for buffer and MS samplers

  • radv: double pixel throughput in certain cases of PS without interpolated inputs

  • mesa: allow pipeline statistics in glCreateQueries

  • radeonsi: fix color interpolation when finalize_nir is called twice

  • radeonsi: assert that invalid FS inputs aren’t present

  • radeonsi: assert that IO bases don’t have holes & the same base isn’t used twice

  • radeonsi: remove unused FS input slots due to colors

  • radeonsi: don’t scalarize IO in finalize_nir

  • radeonsi: rename si_nir_scan_shader -> si_nir_gather_info, etc.

  • radeonsi: remove unnecessary NIR divergence analysis invocations

  • radeonsi: call si_nir_mark_divergent_texture_non_uniform later

  • Revert “radeonsi: use nir_opt_large_constants earlier”

  • radeonsi: update XFB info in the correct place after mediump IO lowering

  • radeonsi: lower nir_var_mem_shared later

  • radeonsi: fold nir_lower_compute_system_values_options into pass parameters

  • radeonsi: rename si_shader_info & si_shader_variant_info sysval fields

  • radeonsi: move CS user SGPR layout determination into si_shader_variant_info

  • radeonsi: move CS sysval si_shader_info fields into si_shader_variant_info

  • radeonsi: lower compute system values later

  • radeonsi: use si_preprocess/postprocess_nir function names

  • radeonsi/ci: update gfx12 flakes

  • radeonsi: move NIR callbacks to si_get.c

  • radeonsi: call nir_lower_fp16_casts in si_postprocess_nir

  • radeonsi: don’t set progress uselessly in si_postprocess_nir

  • radeonsi: call nir_opt_16bit_tex_image in si_postprocess_nir

  • radeonsi: use ac_nir_opt_vectorize_cb

  • radeonsi: call nir_lower_gs_intrinsics in si_preprocess_nir

  • radeonsi: lower task & mesh shader IO in si_preprocess_nir

  • radeonsi: move sparse intrinsic lowering to a separate file, call it later

  • radeonsi: remove glsl_tests subdirectory

  • radeonsi: move more lowering from si_lower_nir to si_preprocess_nir

  • radeonsi: remove the rest of si_lower_nir

  • radeonsi: call si_nir_lower_color_inputs_to_sysvals in si_preprocess_nir

  • radeonsi: merge 2 PS color input lowering passes for monolithic shaders

  • nir,radeonsi: simplify load_color0 & load_color1 intrinsics and shader_info

  • ac,radeonsi: move lowering to load_color0/1 to ac_nir_lower_ps_early

  • radeonsi: remove si_shader_selector::*_descriptors_index fields

  • radeonsi: move info fields from si_shader_selector to si_shader_info

  • rusticl: call nir_opt_intrinsics

  • radv: fix halved pixel throughput for a few non-blended 16bpp/32bpp formats

  • radeonsi: fix halved pixel throughput for a few non-blended 16bpp/32bpp formats

  • ac,radeonsi: move SX PS downconversion code into ac_formats.c

  • radv: use ac_set_sx_downconvert_state_for_mrt

  • nir/clip_cull_distance_utils: fix assertion failures with GL_EXT_mesh_shader

  • nir/clip_cull_distance_utils: add more assertions validating the type & sizes

  • ac/gpu_info: don’t read uninitialized dev_filename

  • ac/lower_ngg_mesh: fix a segfault accessing out_variables out of bounds

  • radeonsi: remove the PointSize output if it has no effect

  • radeonsi: fix slightly incorrect assertions in si_shader_ps

  • radeonsi: fix incorrect PS shader key with sample shading

  • radeonsi: fix clip/cull distance gathering for mesh shaders

  • amd: demystify various optimizations we already have for memory channels

  • ac/gpu_info: add #define AMD_MEMCHANNEL_INTERLEAVE_BYTES

  • ALL: use SHA1_DIGEST_LENGTH etc. instead of hardcoding the numbers

  • ALL: use #define and a copy helper to check and copy build_id

  • anv: use SHA1_DIGEST_LENGTH

  • util: use SHA1_DIGEST_STRING_LENGTH in fossilize_db

  • util: increase SHA1_DIGEST_LENGTH to 32 (BLAKE3_KEY_LEN)

  • util: remove SHA1, use BLAKE3 in its functions to switch everything to BLAKE3

  • gallium/util: print task/mesh statistics in util_end_pipestat_query

  • radv,radeonsi: don’t set LINE_STIPPLE_TEX_ENA on gfx12

  • ac: remove never enabled gfx12 HiS

  • radv: rename hiz_his to gfx12_*hiz

  • radeonsi: set WALK_ALIGN8_PRIM_FITS_ST=0 for 64K rendering

  • radeonsi: set FORCE_STENCIL_VALID less often on gfx12

  • radeonsi: rename hiz_his to gfx12_*hiz

  • radeonsi: use deprecated fb_cbufs and fb_zsbuf less

  • radeonsi: move most si_surface color fields into new si_cb_surface_info

  • radeonsi: move most si_surface z/s fields into new si_zs_surface_info

  • radeonsi: stop using si_surface::base

  • radeonsi: remove si_surface::dcc_incompatible

  • radeonsi: remove dead code in si_create_surface

  • radeonsi: move si_surface::width0/height0 code into si_initialize_color_surface

  • radeonsi: stop using create_surface

  • radeonsi: remove si_surface & create_surface

Mario Kleiner (12):

  • hk: Enable VK_KHR_present_id[2] and VK_KHR_present_wait[2]

  • wsi/display: Accept 0 nits for HDR light level properties for “undefined”

  • wsi/display: Initially set default HDR metadata from EDID for HDR modes

  • wsi/display: Allow atomic modeset for change of Colorspace or HDR properties

  • wsi/wayland: Zero min_luminance, max_luminance HDR light levels are valid.

  • util/format: Add util_format_is_unorm16()

  • dri,gallium: Add support for RGB[A]16_UNORM display formats.

  • egl/wayland: Support RGB[A]16_UNORM formats for display.

  • egl/drm: Support RGB[A]16_UNORM formats for display.

  • egl/surfaceless,device: Support RGB[A]16_UNORM formats for pbuffers.

  • ci/deqp: Pull in a fix for EGL render tests for rgba16 and rgb16 unorm

  • util/driconf: Disable EGL RGB[A]16 unorm configs on panfrost for now

Martin Roukala (né Peres) (12):

  • radv/ci: update the expectations of pre-merge jobs

  • zink/ci: update the expectations of RADV-based pre-merge jobs

  • ci: disable mupuf’s farm during the planned electric outage

  • Revert “ci: disable mupuf’s farm during the planned electric outage”

  • ci: disable mupuf’s farm

  • Revert “ci: disable mupuf’s farm”

  • freedreno/ci/a750: switch to the linux-firmware-provided gpu fw

  • freedreno/ci: update the a750 expectations

  • turnip/ci: update the vkd3d expectations

  • zink/ci: update the a750 expectations

  • ci: disable the valve-kws farm

  • Revert “ci: disable the valve-kws farm”

Mary Guillemard (30):

  • asahi/libagx: Stop exposing fake entrypoint _libagx_prefix_sum

  • asahi/libagx: Do not expose anything not used externally

  • nir: Rename stat_query_address_agx to stat_query_address_poly

  • compiler: rename vs.tes_agx bit to vs.tes_poly

  • asahi/gs: Remove agx_nir_* prefix around static functions

  • asahi: Move compiler preprocess out of agx_nir_lower_gs

  • asahi,nir: Stop relying on zero and scratch page in GS/TESS code

  • asahi/gs: Reuse GS shader compiler options

  • poly: Migrate AGX’s GS/TESS emulation to common code

  • mr-label-maker: Add poly

  • mr-label-maker: Remove mapi label

  • hk: Fix maxVariableDescriptorCount with inline uniform block

  • hk: Disable 1x in sampleLocationsSampleCounts

  • hk: Remove unused allocation in queue_submit

  • hk: Make width and height per block in HIC

  • hk: Allocate the temp tile buffer in copy_image_to_image_cpu

  • asahi: Update CI expectations

  • mailmap: Update my email

  • people: Update my email

  • nvk: Implement ISBE space sharing on vertex stage

  • panvk: Move FAU space info to panvk_compile_nir

  • panvk: Move late lowering to panvk_compile_nir()

  • nvk: Use rendering state attachment count when setting SET_CT_SELECT

  • docs/features: add anv to VK_EXT_shader_uniform_buffer_unsized_array

  • hk: Advertise VK_EXT_shader_uniform_buffer_unsized_array

  • docs/features: Update info on VK_KHR_pipeline_binary

  • hk: Uses vk_device::mem_cache

  • hk: Advertise VK_KHR_pipeline_binary

  • hk: Hash the multiview mask for both vertex and fragment stages

  • nvk: Reenable compression support with nouveau 1.4.2

Matt Turner (2):

  • meson: Fix sysprof-capture-4 dependency

  • meson: Let -Ddraw-use-llvm=false work for R300 on non-x86

Mauro Rossi (3):

  • util: Fix gnu-empty-initializer error

  • radv/rt: Fix gnu-empty-initializer error

  • radv/rt: Fix gnu-empty-initializer error in radv_pipeline_rt.c

Maíra Canal (4):

  • teflon: Improve dumped graph formatting

  • teflon: List all supported operations on tflite_builtin_op_name()

  • docs/envvars: Document Teflon environment variables

  • docs/teflon: Update documentation with more recent output

Mel Henning (62):

  • nvk: Really fix maxVariableDescriptorCount w/ iub

  • nvk: VK_DEPENDENCY_ASYMMETRIC_EVENT_BIT_KHR

  • vulkan: Add vk_collect_dependency_info_src_stages

  • treewide: Use vk_collect_dependency_info_src_stages

  • docs/nvk: Add a list of external hardware docs

  • docs/nvk: Add some developer hardware docs

  • docs/nvk: Update hardware support

  • docs/nvk: Document NVK_DEBUG=trash_memory

  • docs/envvars: Remove references to nine

  • nak/nvdisasm_tests: Test plop3

  • nak/opt_lop: Don’t handle modifiers in dedup_srcs

  • nak/nvdisasm_tests: Turn sm_list() into a function

  • nak/nvdisasm_tests: Skip SM70 on cuda 13

  • docs/nvk: Fix description of supported GPUs

  • zink: Return zink_device in create_logical_device

  • zink: Make screen->queue_lock a pointer

  • zink: Create one queue lock per device

  • zink: Lock queue_lock in zink_destroy_screen

  • zink: Lock around screen_debug_marker_{begin,end}

  • nvk: Use the OS page size in nvk_AllocateMemory

  • nouveau/headers: Use 906f defines for nv_push.c

  • nouveau: Deduplicate drf.h

  • nouveau/headers: Use drf defines in nv_push.c

  • nouveau/headers: Use drf and cl906f.h in nv_push.h

  • nak: Split LegalizeBuilder into its own type

  • nak: DCE after legalize

  • nak/legalize: Use ConstTracker to skip some movs

  • nir: Add nir_deref_instr_is_arr() helper

  • treewide: Use nir_deref_instr_is_arr()

  • nir: Use instr_clone in rematerialize_deref_in_block

  • nak: Handle CS2R latencies in SSA form

  • nak: Add a Dst::file() helper function

  • nak: Set variable_latency=0 for !needs_scoreboard

  • nak: Add ShaderModelInfo

  • nak: Replace &dyn ShaderModel w/ &ShaderModelInfo

  • nak: Don’t box ShaderModelInfo

  • nak: Use the hardware’s max warps_per_sm value

  • nak: Factor out prev_multiple_of

  • nak: Reserve capacity in LiveSet::from_iter,extend

  • nak: Add a prepass instruction scheduler

  • nvk: Disable compression for image import/export

  • nvk: Set maxStorageBufferRange = maxBufferSize

  • nvk: Use semaphore helper for BufferMarker2AMD

  • nvk: Skip barriers if engine is not present

  • nouveau/winsys: nv_device_info.has_transfer_queue

  • nouveau/winsys: Set channel_alloc.tt_ctxdma_handle

  • nvk: Expose transfer-only queues

  • nak: impl fmt::Debug for SSAValue

  • nak: Take &ShaderModelInfo in instr_sched_common

  • nak: Use .file() helper in sm120_instr_latencies

  • util/rmq: Fix test upper bound

  • util/rmq: Fix uninitialized read in preprocess

  • util/rmq: Remove unused header

  • nouveau/drm-shim: Implement new getparam values

  • nak/copy_prop: Split out prop_to_ssa_values helper

  • nak: Copy-prop bindless cbuf handles

  • nak/instr_sched_prepass: Fix RegOut special case

  • nvk: Ignore meta ops in occlusion queries

  • nvk: Disable large pages for now

  • nvk: Initialize SET_ALPHA_TO_COVERAGE_OVERRIDE

  • nvk: Report additional host_image_copy layouts

  • zink: Emit float controls for preserve_denorms too

Michael Cheng (2):

  • anv: Add VMA allocator for shader binaries

  • anv: Switch shaders to dedicated VMA allocator

Michael Tretter (3):

  • r600: remove obsolete option for experimental NIR support

  • r600: fix documentation of preoptir

  • r600: remove LLVM dependency

Michal Krol (1):

  • lavapipe: Bump maxGeometryInputComponents to 128.

Michal Vanis (2):

  • glsl: replace gl ctx direct access

  • mesa: replace gl ctx direct access

Mike Blumenkrantz (28):

  • zink: consistently set/unset msrtss in begin_rendering

  • zink: disable primitiveFragmentShadingRateMeshShader feature

  • zink: set gfx_pipeline_state::mesh_pipeline when updating pipeline

  • zink: collapse gfx pipeline fetching and binding conditionals

  • zink: collapse mesh pipeline fetching and binding conditionals

  • zink: don’t destroy old push layout when enabling fbfetch descriptor

  • lavapipe: maintenance10

  • zink: return mesh pipeline when creating mesh pipelines

  • zink: add back atomics for internal refcounts

  • zink: correctly use GENERAL layout for dynamic texture clears

  • zink: allow rendering to emulated alpha images for clears

  • zink: flatten out params to nir_to_spirv()

  • zink: move xfb stride off zink_shader_info struct

  • zink: move ntv params into zink_shader_info

  • zink: move the ntv sparse checks into ntv

  • zink: use vk enum members for ntv util returns

  • zink: move ntv shader info to single-use screen member

  • zink: delete all the no-op checks when rewriting clears

  • zink: automatically rewrite clears where possible to avoid using format views

  • zink: rename msaa_expand to attachment_shadow

  • zink: improve checks for srgb mutability

  • zink: flag immutable handles as such when creating resources

  • zink: create new transient image if the sample count doesn’t match

  • zink: explicitly null pipe_resource::next when creating transients

  • zink: reuse transient attachments for format view shadowing

  • zink: re-allow transient images during blitting

  • ntv: emit demote extension/capability when emitting demote

  • ntv: emit ViewIndex with flat for fragment stage

Mohamed Ahmed (6):

  • nouveau/winsys: Store the nouveau kernel version

  • nouveau/winsys: Retrieve and store the PTE kind in the nouveau_ws_bo

  • nvk/nvkmd: Fix alignments

  • nil, nvk: Add plumbing for compression

  • nvk: Move non-sparse image plane VA allocation to bind time

  • nvk: Enable compression

Nanley Chery (17):

  • anv: Limit the SCANOUT flag to color images

  • anv: Allow modifiers on depth images

  • anv: Don’t allow STORAGE + CCS for Y_TILED mod

  • intel/isl: Only assert surface addresses on gfx9+

  • iris: Fix pipe control around fast-clears

  • iris: Add comments from Bspec fast-clear preamble page

  • intel/isl: Fix miptail selection for compressed textures

  • blorp: Fix Tile64 clear redescription assertion

  • intel/isl: Fix QPitch of arrayed MCS

  • iris: Set missing flags on clear color changes

  • iris: Use the CLEAR state on Xe2+ for MCS

  • anv: Update predicated resolve documentation

  • anv: Fix the fast clear type for FCV writes

  • anv: Don’t return the Xe2+ fast-clear type early

  • anv: Fix clear state of WSI blit sources during presentation

  • anv: Treat non-WSI PRESENT_SRC as TRANSFER_SRC

  • anv: Don’t set the display flag on WSI blit sources

Natalie Vock (69):

  • nir/lower_shader_calls: Repair SSA after wrap_instrs

  • aco: Add preload_preserved pseudo instruction

  • aco/ra: Add utility to clear PhysRegInterval

  • aco/ra: Also consider blocked registers as not containing temps

  • aco/ra: Skip blocked regs in get_reg_impl

  • aco/ra: Don’t clear fixed operand sources if they were blocked

  • aco/ra: Handle callee ABI preserved register constraints

  • aco/ra: Handle call ABI constraints

  • util/bitset: Wrap __size in braces

  • util: Add sparse bitset data structure

  • nir: Use sparse bitset for liveness information

  • radv: Fix PSO history with RT pipelines

  • aco/insert_nops: Consider s_setpc target susceptible to VALUReadSGPRHazard

  • radv/rt: Keep updated nodes always active

  • radv/rt: Correctly copy culling flags when updating to separate AS

  • radv: Move VMID reservation to vkCreateDevice

  • radv/rt: Refactor and split radv_nir_rt_shader.c

  • radv/rt: Use traversal vars for object origin/direction in ahit/isec

  • aco/live_var_analysis: Count linear VGPRs as always preserved by calls

  • aco: Remove unused p_reload_preserved def

  • aco: Record required call spills during live-var analysis

  • aco/spill: Handle calls

  • aco/spill: Reset scratch_rsrc on calls

  • aco/ra: Handle linear VGPRs allocated by p_startpgm

  • aco/spill: Create linear VGPRs for spilling ABI-preserved SGPRs

  • aco/spill: Restore registers spilled by call immediately

  • aco/lower_to_hw_instr: Add scratch size in call lowering

  • aco/util: Add aco::unordered_set

  • aco: Add pass for spilling call-related registers

  • radv/rt: Use subgroup invocation for stack index

  • radv/rt,aco: Always dispatch 1D workgroups for RT

  • aco: Swizzle ray launch IDs in the RT prolog

  • aco: Include arbitrarily fixed registers in max_reg_demand

  • aco/spill_preserved: Only reload linear VGPRs at end

  • aco: Don’t insert p_reload_preserved in loops

  • aco/lower_to_hw_instr: Preserve linearity of lowered linear VGPRs

  • aco/insert_waitcnt: Don’t determine linearity by reg number

  • aco/spill: Fix preserved reload operand update

  • aco/spill_preserved: Preserve linear VGPRs even if they aren’t p_spill operands

  • radv: Add traversal stack size to cache

  • aco/spill_preserved: Fix spilled VGPR overflow handling

  • nir/intrinsics: Add incoming/outgoing payload load/store instructions

  • aco/ra: Move register preservation logic in last block to p_return

  • aco: Remove bypass_reg_preservation

  • aco: Note if a parameter needs to be explicitly preserved

  • radv: Temporarily disable RT pipelines

  • radv: Refactor RT lowering decisions and add RADV_PERFTEST CPS override

  • radv/rt: Use function call structure in NIR lowering

  • radv,aco: Use function call structure for RT programs

  • radv: Re-enable RT pipelines

  • docs: Document RADV/ACO function calls

  • nir,aco: Clean up useless lowering of sbt_base_amd

  • radv: Use wave32 for RT on gfx11+

  • aco: Put boolean parameters inside SGPRs

  • aco: Tweak ABI register param limits

  • radv/rt: Don’t consider non-internal INTERSECTION shaders as the traversal shader

  • radv/nir: Add and use radv_nir_return_param_from_type helper

  • radv/nir: Make nir_lower_intersection_shader public

  • radv/rt: Fix terminate_ray handling for intersection shaders

  • radv/rt: Compile ahit/isec shaders to asm

  • radv/rt: Call ahit/isec shaders

  • aco: Add and use nir_abi_to_aco helper

  • aco: Add parameter assignment hints

  • aco: Use parameter assignment hints for any-hit shaders

  • aco: Fix parameter stack size calculation

  • radv/rt: Refactor shader group stack size calculation to include traversal stack

  • aco: Don’t exclude discardable parameters from register preservation

  • radv/rt: Fix some tail-call compatibility checks

  • radv/rt: Fix discardable attributes on chit and traversal shaders

Nick Hamilton (8):

  • pvr: Fix staging buffer realloc usage

  • pvr: Fix missing frees in error exit paths

  • pvr: Fix missing sample mask test instructions

  • pco: Fix encoding of branch to an empty block

  • pco: Fix for shadow sampler comparison not clamping the compare value

  • pvr: Temporarily disable the buffer device address extension

  • pco: Fix for atomic operations on an image buffer

  • pvr: Fix the isp samples per tile calculation

OPNA2608 (2):

  • vc4: Fix printing of get_tiling.modifier

  • rocket: Fix printing of rknpu_mem_create.dma_addr

Olivia Lee (15):

  • panfrost: fix cl_local_size for precompiled shaders

  • hk: fix data race when initializing poly_heap

  • panvk/csf: fix uninitialized read in draw context

  • panvk/csf: explicitly set ls_sb_slot in set_fbds_provoking_vertex

  • panvk/csf: put precomp syncobj behind PANLIB_BARRIER_CSF_SYNC option

  • panvk/csf: add PANLIB_BARRIER_CSF_WAIT, to insert WAIT after precomp

  • panvk/csf: factor out cs_match_iter_sb helper macro

  • panvk/csf: merge v10 and v11 paths in issue_fragment_jobs

  • poly: add messages to static_assert calls

  • panvk/csf: implement VK_EXT_primitives_generated_query except primitive restart

  • panvk/csf: implement dynamic precomp dispatch size

  • panvk/csf: implement VK_EXT_primitives_generated_query primitive restart

  • panvk: advertise VK_EXT_primitives_generated_query on v10+

  • Revert “panvk: advertise VK_EXT_primitives_generated_query on v10+”

  • hk: fix hk_passthrough_gs_key size computation

Patrick Lerda (12):

  • r600: fix r600_draw_rectangle refcnt imbalance

  • r600: update nplanes support

  • r600: limit pre-evergreen predicate ready size

  • r600: fix rv770 read scratch compatibility

  • r600: fix error filters compatibility

  • r600: improve cayman scissor 1x1 workaround

  • r600: fix cayman msaa shading behavior

  • r600: fix rv770 dot4 operations

  • r600: make vertex r10g10b10a2_sscaled conformant on palm and beyond

  • r600: fix rv770 clamp to max_texel_buffer_elements

  • r600: update cubearray imagesize calculation

  • r600: improve vs_as_ls switch reliability

Paul Gofman (1):

  • driconf: add a workaround for Investigation Stories : gunsound

Paulo Zanoni (9):

  • hasvk: restore anv_is_aligned()

  • blorp: fix argument indentation

  • blorp: replace magic ‘2’ with BLORP_NUM_BT_ENTRIES

  • blorp: reorganize struct blorp_params

  • intel/blorp: blorp_blit_vars_init() doesn’t need ‘key’

  • intel/blorp: generate the fast_clear_surf shaders later

  • intel/blorp: unionize blorp_params->wm_inputs

  • intel/blorp: add blorp_shaders.cl

  • meson: crocus and intel_hasvk now require clc

Pavel Ondračka (13):

  • r300/ci: update expectations

  • r300: fix dummy_vs leak

  • r300: fix overflow in r300_draw_elements_immediate

  • r300: fix locked_zbuffer leak

  • r300: fix constant remap table leak

  • r300/ci: asan testing

  • r300/ci: remove RV530 and RV380 non-asan deqp jobs

  • r300: program explicit scissor around viewport

  • r300: pop-free clipping

  • r300: enable guardband for draw

  • nir/opt_algebraic: improve dot product narrowing

  • r300: add explicit late lowering for a + -0

  • r300: invalidate texture cache when clearing texture bound for sampling

Peyton Lee (2):

  • radeonsi/vpe: correct tone mapping parameters

  • radeonsi/vpe: correct format setting

Pierre-Eric Pelloux-Prayer (25):

  • radeonsi: limit the sqtt buffer size

  • radeonsi: set VS dirty bit from si_vs_key_update_inputs

  • radeonsi: propagate shader updates for merged shaders

  • ac/virtio: remove dead code

  • ac/virtio: fix incorrect NULL check

  • ac/info: get vm_always_valid support through ac_linux_drm

  • radv: enable global BO list if vm_always_valid is supported

  • radeonsi/sqtt: clear out sqtt bo on resize

  • mesa: fix function prototype

  • mesa: remove unused image debug code

  • mesa: consider Attrib.MinLayer in do_blit_framebuffer

  • hud: only increase y if the pane contains graphs

  • hud: add new ‘dev’ pseudo-graph

  • ac/descriptors: account for num_storage_samples for gfx10

  • mesa: add assert to validate the no atomic path

  • Revert “glthread: mark internal bufferobjs for the ctx they belong to”

  • ci: enable shader-db test for radeonsi

  • ac/sdma: fix ac_sdma_get_tiled_header_dword for older gen

  • ac/sdma: fix src/dst pitch for sdma < 4

  • radeonsi: add a si_set_barrier_flags helper

  • radeonsi: fix references to sctx->flags in documentation

  • radeonsi: add a si_clear_and_set_barrier_flags helper

  • radeonsi: add extra flags param to si_emit_barrier_direct

  • radeonsi/sqtt: restore barrier_flags in si_sqtt_init_cs

  • radeonsi: add asserts to validate emit functions use of atoms

Piotr Masłowski (5):

  • hk: promote VK_EXT_robustness2 to VK_KHR_robustness2

  • hasvk: promote VK_EXT_robustness2 to VK_KHR_robustness2

  • nvk: promote VK_EXT_robustness2 to VK_KHR_robustness2

  • tu: promote VK_EXT_robustness2 to VK_KHR_robustness2

  • lvp: promote VK_EXT_robustness2 to VK_KHR_robustness2

Pohsiang (John) Hsu (24):

  • mediafoundation: add stats resource pool so we can use pool for QP map as well

  • mediafoundation: fix sporadic build failure with u_inlines.h not found on test target

  • mediafoundation: for low latency, change stats pool size to 2, because there is no synchronization between returning MF sample and ProcessInput

  • mediafoundation: periodic clang-format, no code changes

  • mediafoundation: setup wpp logging in more of the files and add some error handling on dpb manager and reference frame tracker

  • mediafoundation: add support for initial pool size and max pool size for stats pool

  • mediafoundation: periodic clang-format

  • mediafoundation: remove private CODECAPI_AVEncVideoEnableFramePsnrYuv as this is published

  • mediafoundation: remove unused code

  • mediafoundation: propagate input timestamp / duration to output

  • mediafoundation_frontend: update version to 1.08

  • mediafoundation: log warning if dx11 device is not created with multithread protected

  • d3d12: Fix lack of flushing when encoding h264 with SVC

  • mediafoundation: turn on slice auto on frames with dirty rect only

  • mediafoundation: propagate PrepareForEncode error up.

  • mediafoundation: add some end of function error logging for diagnosing error

  • mediafoundation: remove unneeded memset (~34KB for hevc)

  • mediafoundation: remove unused templ and small code cleanup

  • mediafoundation: handle the case where output sample is returning after MFT has been released.

  • d3d12: add missing updating of pMetadata

  • mediafoundation: add logging

  • mediafoundation: rename VideoEncodeReconstructedPicture to VideoEncodeD3D12ReconstructedPicture

  • d3d12: fix slice support for setting number of coding units per slice

  • mediafoundation: set rc mode in GetCodecPrivateData for 2 pass rc mode

Qiang Yu (74):

  • radeonsi: enlarge SI_NUM_SHADERS for mesh shader

  • radeonsi: handle mesh shader when si_create_shader

  • radeonsi: add context shader state for mesh shader

  • radeonsi: inline uniform support mesh shader

  • radeonsi: add si_mesh_resources_add_all_to_bo_list

  • radeonsi: add task/mesh shader info to si_shader_info

  • radeonsi: calc workgroup size for mesh shader

  • radeonsi: init mesh shader args

  • radeonsi: init pm4 state for mesh shader

  • radeonsi: no ngg culling for mesh shader

  • radeonsi: add task info to screen

  • radeonsi: lower task/mesh shader io to mem

  • radeonsi: kill outputs for mesh shader

  • radeonsi: share some vertex pipe function with mesh pipe

  • radeonsi: update scratch va for mesh shader

  • radeonsi: si_get_vs support mesh shader

  • radeonsi: simplify si_update_rasterized_prim while handle mesh shader

  • radeonsi: save mesh shader when blit

  • mesa,radeonsi: add comments about vertex and mesh pipeline shader states

  • gallium/blitter: no need to save TS state

  • mesa,gallium: not touch TS when internal draws

  • radeonsi: call si_shader_change_notify when vs bind

  • radeonsi: emit shader pointer for mesh shader

  • radeonsi: export si_set_user_data_base for mesh shader usage

  • radeonsi: add mesh shader state create/delete/bind

  • radeonsi: add mesh shader debug options

  • radeonsi: select key for mesh shader

  • radeonsi: support mesh shader per vertex output

  • radeonsi: si_get_output_prim_simplified support mesh shader

  • radeonsi: si_select_hw_stage support mesh shader

  • radeonsi: compile mesh shader with ACO only

  • radeonsi: dump shader key for mesh shader

  • radeonsi: add mesh shader bits for dirty_shaders_mask

  • radeonsi: compute vs_output_ps_input_cntl for mesh shader

  • radeonsi: support mesh shader per primitive output

  • radeonsi: support fragment shader per primitive input

  • radeonsi: handle primitive indices for mesh shader

  • radeonsi: lower mesh shader outputs

  • radeonsi: add si_emit_buffered_gfx_sh_regs_for_mesh

  • radeonsi: add radeon_emit_alt_hiz_packets for mesh shader

  • radeonsi: don’t put descs in user sgpr for task shader

  • radeonsi: init task shader args

  • radeonsi: change arg for si_cp_dma_prefetch

  • radeonsi: export si_setup_compute_scratch_buffer for task shader

  • radeonsi: add si_upload_shader_descriptos

  • radeonsi: add si_emit_task_shader_pointers

  • winsys/amdgpu: support gang submit for kernel queue

  • radeonsi: add task/mesh shader context states

  • radeonsi: implement task ring nir intrinsic lower

  • radeonsi: log cs support mesh shader

  • radeonsi: export si_init_compute_preamble_state for task shader

  • radeonsi: move shared_size to si_shader_variant_info

  • radeonsi: add si_create_compute_state_for_nir

  • radeonsi: init mesh shader ngg info

  • radeonsi: implement nir_intrinsic_load_ring_mesh_scratch_amd

  • radeonsi: increase task wait count when emit barrier

  • radeonsi: add task shader queries support

  • radeonsi: lower mesh shader local id and workgroup id

  • radeonsi: si_emit_buffered_compute_sh_regs support gang cs

  • radeonsi: compute culldist_mask and clipdist_mask for mesh shader

  • radeonsi: add si_update_shaders_shared_by_vertex_and_mesh_pipe

  • radeonsi: add si_update_shaders_for_mesh

  • radeonsi: add si_emit_rasterizer_prim_state_for_mesh

  • radeonsi: add mesh shader functions

  • radeonsi: handle maybe per primitive input for fragment shader

  • radeonsi: si_calculate_max_simd_waves support task and mesh shader

  • radeonsi: enable EXT_mesh_shader

  • doc: mark GL_EXT_mesh_shader as done

  • dri: avoid sending too many present requests when app starts or pauses

  • glsl: support barrier() for task and mesh shader

  • ac/llvm: workaround legacy fma intrinsic crash on gfx12

  • radeonsi: fix primitive restart gpu hang for pre gfx10

  • radv: fix primitive restart gpu hang for pre gfx10

  • radeonsi: fix mesh shader outputs kill

Radu Costas (1):

  • pvr: Add calculation for spill/scratch buffers

Reilly Brogan (1):

  • amd,compiler: fix const errors found with C23 glibc support

Rhys Perry (63):

  • amd/lower_mem_access_bit_sizes: don’t create subdword UBO loads with LLVM

  • amd/lower_mem_access_bit_sizes: improve subdword/unaligned SMEM lowering

  • amd/lower_mem_access_bit_sizes: be more careful with 8/16-bit scratch load

  • nir/lower_mem_access_bit_sizes: increase chunk limit

  • amd/lower_mem_access_bit_sizes: fix shared access when bytes<bit_size/8

  • ac/nir: stop using NIR_PASS in ac_nir_lower_ngg_nogs()

  • radv: remove NIR_PASS in radv_nir_lower_rt_abi

  • radv: stop rallocing objects which don’t belong to the shader under it

  • radv: remove NIR_PASS in insert_rt_case

  • nir/lower_shader_calls: reobtain impl after NIR_PASS

  • nir/lower_tex: optimize txd(coord, ddx/ddy(coord))

  • ac/nir: refactor move_coords_from_divergent_cf a bit

  • ac/nir: optimize txd(coord, ddx/ddy(coord))

  • radv,radeonsi: use optimize_txd

  • ac/nir: don’t consider quads incomplete inside loops

  • aco/scheduler: fix register demand check

  • ac/nir: add some tests for ac_nir_lower_mem_access_bit_sizes

  • aco/ra: copy vector_info to affinities

  • aco/ra: add first loop header phi operand to temp_to_phi_resources

  • aco: print large p_parallelcopy using several lines

  • ac/nir: fix calculation of aligned_new_size

  • ac/nir: fix check for increasing size of non-descriptor loads

  • ac/nir: don’t vectorize 16-bit shared loads to 8-bit

  • aco: micro-optimize ray launch ID swizzling

  • aco: use correct addition opcodes in gfx6-8 RT prolog

  • aco/ra: refactor update_renames slightly

  • aco/ra: omit renaming when necessary when moving copy definitions

  • aco: always run RA validation during tests

  • aco: add RA validation for p_call

  • aco/ra: remove dead code in split_blocking_vectors

  • aco/ra: discard tmp_file after get_regs_for_copies fails

  • aco/ra: fix operands when recreating blocking vectors

  • aco/ra: use original name for blocking vectors rename

  • aco/ra: update register file when recreating blocking vectors

  • aco/ra: fix split_blocking_vectors with some subdword vectors

  • aco/ra: emit p_split_vector after p_parallelcopy

  • aco/tests: add function call regalloc tests

  • radv/rt: cleanup phis after lowering parameter variables to SSA

  • radv/rt: lower non-return load_param to variable loads

  • aco: track number of post-RA spilled vgprs/sgprs

  • aco: don’t try to preserve SCC in callees

  • aco/ra: don’t use update_vgpr_sgpr_demand in increase_register_file

  • aco: move update(fixed_reg_demand) into update_vgpr_sgpr_demand

  • aco: increase max_reg_demand to help avoid preserved VGPRs

  • aco/ra: always prefer earlier regs in get_reg_impl() if costs are the same

  • aco/ra: always abort loop in get_regs_for_copies() if candidate is worse

  • aco/ra: refactor get_reg_impl and get_regs_for_copies using tuples

  • aco/ra: prefer clobbered registers in callees

  • aco/ra: prefer clobbered registers in get_reg_specified()

  • aco/ra: consider already-used preserved registers to be free

  • aco/sched: don’t use previously unused preserved registers

  • aco: don’t spill no-op copies of input parameters in preserved registers

  • lavapipe,nv50/ir,lima: run nir_opt_algebraic_late

  • nir: add fcanonicalize

  • aco/ra: copy precolor affinities to p_create_vector/p_split_vector

  • aco/ra: move split_blocking_vectors higher

  • aco/ra: split blocking vectors if needed when handling fixed operands

  • aco: remove dead p_call code in live_var_analysis

  • aco/tests: remove vcc definitions from p_call

  • aco: use size_t for monotonic_buffer_resource

  • aco: reduce memory usage of live_var_analysis

  • aco/insert_fp_mode: remove incorrect assertion

  • radv: fix when incomplete rt pipeline libraries are loaded from cache

Ritesh Raj Sarraf (3):

  • ci: Use Linux 6.17.3 for mesa gfx-ci

  • freedreno/ci: Drop KERNEL_TAG retargeting the new Linux 6.17.3

  • ci/virgl: Mark test job for Linux 6.16

Rob Clark (165):

  • freedreno/a6xx: Additional handle import logging

  • loader: Ignore empty override strings

  • freedreno: Move *_POWER_CNTL to raw_magic_regs

  • freedreno: Move TPL1_DBG_ECO_CNTL to raw_magic_regs

  • freedreno: Move GRAS_DBG_ECO_CNTL to raw_magic_regs

  • freedreno: Move SP_CHICKEN_BITS to raw_magic_regs

  • freedreno: Move UCHE_CLIENT_PF to raw_magic_regs

  • freedreno: Move PC_MODE_CNTL to raw_magic_regs

  • freedreno: Move SP_DBG_ECO_CNTL to raw_magic_regs

  • freedreno: Move HLSQ_DBG_ECO_CNTL to raw_magic_regs

  • freedreno: Move VPC_DBG_ECO_CNTL to raw_magic_regs

  • freedreno: Move UCHE_UNKNOWN_0E12 to raw_magic_regs

  • freedreno: Move RB_CCU_DBG_ECO_CNTL to raw_magic_regs

  • freedreno: Flatten fd_dev_info props

  • freedreno: Move magic/magic_raw out of props

  • freedreno: Collapse A6XXProps/A7XXProps

  • freedreno/a6xx: Fix UB in convert_color()

  • freedreno: Fix internal VBO reference leak

  • freedreno: Remove use of FDL_MIN_UBWC_WIDTH

  • freedreno/registers: Fix definition of CP_COND_EXEC

  • freedreno/crashdec: Dump cmdstream at end

  • freedreno/crashdec: Log IBs to snapshot

  • freedreno/registers: Convert events to hex

  • freedreno/registers: Event cleanups

  • freedreno/registers: Move FLAGS_REGID

  • freedreno/a6xx: Move VFD_RENDER_MODE emit

  • freedreno/a6xx: Use with_crb() helper

  • freedreno: flip template param order

  • freedreno/a6xx: genx helper for additional template param

  • freedreno/decode: Drop summary override for CRB

  • freedreno/a6xx: Add RB_DBG_ECO_MODE helper

  • freedreno: More ergonomic cs casting

  • freedreno/a6xx: Pass cs to fd6_clear_lrz()

  • freedreno/a6xx: Drop emit_marker6()

  • freedreno/a6xx: Drop fd6_emit_blit()

  • freedreno/a6xx: Rework where we emit ccu cache cntl

  • freedreno/a6xx: Emit RB buffer setup for sysmem too

  • freedreno/a6xx: Split preamble for gmem vs sysmem

  • freedreno/a6xx: Be more precise about CP_SET_MARKER

  • freedreno/a6xx: Actually use lrz fast clear

  • freedreno/a6xx: Add helper to set render mode

  • freedreno/a6xx: Add helpers for preamble const loads

  • freedreno/a6xx: Fix debug comment

  • freedreno/decode: Fix bindless descriptor dumping

  • freedreno/decode: Print mode for compute shaders

  • freedreno/decode: Add extra indent levels

  • freedreno/registers: Fix a few field names

  • freedreno/registers: Rename SP_HLSQ_MODE_CNTL

  • freedreno/registers: Name RB_LRZ_CNTL2

  • freedreno/registers: Name HYSTERESIS regs

  • freedreno/registers: Fix GRAS_LRZ_CNTL definition

  • freedreno: Add chip range template helpers

  • freedreno/registers: Extend ncrb builder for new gens

  • freedreno/lrz: Extend lrz fc helpers for gen8

  • freedreno/event: Extend event helpers for gen8

  • freedreno: Add gen8 device info

  • freedreno/common: Make max tile dimensions a param

  • freedreno/common: Add placeholder a8xx device

  • freedreno/drm-shim: Add a830

  • freedreno: Add gen8 chip template-fu

  • freedreno/registers: pm4 updates for gen8

  • freedreno/registers: Fix gen8 swizzle enum

  • ir3: Skip non-bindless ldc warmups

  • ir3: Fix gen8 instruction timings

  • ir3: Fix cat3 latency

  • ir3: Limit CS lock/unlock quirk

  • ir3: Extract out helper for nop flags

  • ir3: Add (sy) before end of preamble when necessary

  • ir3: Add disasm test macro for gen8

  • ir3: Add (eostsc)

  • ir3: Add cat1 (sat) bit

  • ir3: Add cat3 alt immed encoding

  • ir3: Add cat3 flut src encoding

  • ir3: Add mova .u bit

  • ir3: Use ldc.u in preamble

  • ir3: Add mova.r encoding

  • ir3: Fix gen8 ldc encoding

  • ir3: Add new cat2 instructions

  • ir3: dp2acc is removed in gen8

  • ir3: Add new cat3 instructions

  • freedreno/registers: Fix gen8 UBWC array pitch

  • freedreno/registers: Add TPL1_MODE_CNTL bitfields

  • freedreno/fdl: Fix gen8 TEX_LINE_OFFSET

  • freedreno/fdl: Fix gen8 buffer depth

  • freedreno/a6xx: Handle tess_bo size differences for gen8

  • freedreno/registers: More gen8 prep

  • freedreno/registers: gen8 support

  • freedreno/a6xx: Drop log_pipeline_stats()

  • freedreno/a6xx: Add gen8 query support

  • freedreno/a6xx: Fix VSC_BIN_SIZE for gen8

  • freedreno/computerator: gen8 support

  • freedreno: gen8 support

  • freedreno/common: Add A840 and X2-85

  • ir3: Fix early-preamble (sy)

  • gallium/aux: Add debug option to force u_upload rollover

  • gallium: Make upload_cb0 return a releasebuf

  • asahi: Set prefer_real_buffer_in_constbuf0

  • freedreno/devices: Add num_slices

  • freedreno/a6xx: Fix GRAS_LRZ_BUFFER_PITCH

  • freedreno/a6xx: Fix GRAS_LRZ_BUFFER_SLICE_PITCH

  • freedreno/lrz: Add gen8 lrz layout support

  • freedreno/a6xx: Fix layered lrz

  • freedreno/a6xx: gen8 lrz support

  • freedreno/a6xx: Set FD_BO_NO_HARDPIN from meson

  • freedreno/registers: Mark LOAD_IMMED as a5xx

  • freedreno/a6xx: Drop legacy CP_EVENT_WRITE builders

  • freedreno/registers: Move ‘unknown’ last

  • freedreno/registers: Reintroduce FD_NO_DEPRECATED_PACK

  • tu: Drop tu_cs_image_*_ref

  • tu: Drop use of legacy reg offset macros

  • tu: Rework pipeline stat queries

  • tu: Convert tu_clear_bit deprecated reg builders

  • tu: Convert tu_cmd_buffer deprecated reg builders

  • tu: Convert tu_shader deprecated reg builders

  • tu: Rework emit_xs_config()

  • tu: Rework emit_vpc()

  • tu: Convert rest of tu_pipeline deprecated reg builders

  • tu: Drop FD_NO_DEPRECATED_PACK

  • freedreno/a6xx: Move assert

  • freedreno/a6xx: Extract out GMEM cache helper

  • tu: Use GMEM cache helper

  • tu: Convert viewport state to CRB

  • tu: Convert emit_lrz_buffer to CRB

  • freedreno/fdl: Fix gen8 buffer descriptors

  • freedreno/fdl: Add STRUCTSIZETEXELS arg

  • tu: Replace A6XX_TEX_CONST_DWORDS

  • tu: Plumb CHIP thru descriptor set building

  • tu: Use more fdl6_buffer_view_init()

  • tu: Extract out descriptor helpers

  • tu: Fix TU_DRAW_STATE_VB size

  • tu: Fix zero length pkt4

  • freedreno/a6xx: Fix gen8 blitter resolve

  • freedreno/computerator: Use correct CP_SET_RENDER_MODE

  • tu: Use correct LRZ flush events on A7XX

  • tu: Track dirty TCS state

  • tu: Move PC_DS_PARAM emit after early-exit

  • tu: Move CP_SET_SUBDRAW_SIZE out of SDS

  • freedreno/rnn: Track min/max offset

  • freedreno/decode: Add regex support for query-mode

  • freedreno: Disable has_rt_workaround for gen8

  • freedreno: Disable supports_double_threadsize for gen8

  • tu: Convert foveat state to CRB

  • freedreno/fdl: Fix gen8 MUTABLEEN

  • freedreno/fdl: Fix gen8 sRGB buffers

  • freedreno/registers: Fix gen8 UV_PITCH

  • freedreno/registers: Add subpass fence events

  • freedreno/registers: Fix gen8 GRAS_SU_STEREO_CNTL

  • freedreno/registers: Fix gen8 TPL1_MODE_CNTL

  • freedreno/registers: Fix gen8 TPL1_A2D_BLT_CNTL

  • freedreno/registers: Fix GRAS_LRZ_CB_CNTL

  • freedreno/registers: Fix py array reg offsets

  • freedreno/registers: Update gen8 FDM regs

  • freedreno/registers: Update gen8 VRS registers

  • ir3: Avoid narrowing int conversions from GPR on SALU

  • ir3: Skip shading_rate lowering when unneeded

  • ir3: Limit 64b atomic 16b offset quirk to a7xx

  • tu: Support acceleration_structure for wave64

  • tu: gen8 descriptor support

  • tu: Add helper to set render mode

  • tu: gen8 sampler support

  • tu: gen8 support

  • freedreno/common: Fix gen8 EFU float control

  • freedreno: Force single wavesize if double threadsize is unsupported

  • freedreno/lrz: Correct lrz fc layout for gen8

  • freedreno/a6xx: Better program state size calc

Rohan Garg (2):

  • anv: program STATE_COMPUTE_MODE to flush the L1 cache

  • anv: implement resource barrier emissions

Roland Scheidegger (3):

  • llvmpipe: do bounds checking for shared memory

  • llvmpipe: implement strict d3d11 rules for centroid interpolation

  • llvmpipe: optimize the centroid implementation

Romaric Jodin (10):

  • pan/va: make valhall_parse_isa input explicit

  • aux/trace: remove -I argument

  • pan/bi: improve bi_alu_src_index to avoid bi_make_vec when possible

  • pan/bi: improve vectorization of 8bit alu

  • pan/bi: do not vectorize nir_op_f2{i,u}8

  • pan/bi: do not vectorize nir_op_f2fmp

  • pan/bi: fix destination of v4i8 instruction returning only v2i8

  • pan/bi: bi_alu_src_index: remove invalid assert

  • pan/va: Add missing 8bit widen swizzles

  • pan/bi: Keep vectorized phis

Rudi Heitbaum (1):

  • mesa: retain const qualifier from pointer

Ryan Houdek (2):

  • freedreno/fdl: Fix typo in tiled_to_linear_2cpp

  • freedreno/fdl: Optimize linear_to_tiled with avx2

Ryan Mckeever (10):

  • mailmap: update my name and email

  • people: update my name/email

  • nir: add support for pixel_local_storage variables

  • compiler/glsl: replace tabs with spaces

  • glapi: add EXT_shader_pixel_local_storage extension

  • glsl, mesa: add EXT_shader_pixel_local_storage extension

  • gallium, mesa: keep track of pixel local storage state

  • pan/bi: introduce EXT_shader_pixel_local_storage support to compiler

  • pan/lib: prepare for pixel local storage support

  • panfrost: enable EXT_shader_pixel_local_storage

Sagar Ghuge (18):

  • anv: Call brw_nir_lower_rt_intrinsics_pre_trace lowering pass

  • brw/rt: Move nir_build_vec3_mat_mult_col_major helper to header

  • brw/rt: fix ray_object_(direction|origin) for closest-hit shaders

  • vulkan/runtime: Fix typo in stack size calculation

  • anv: Use correct engine class for companion RCS

  • anv: Drop unwanted untyped flush for AS query

  • intel/common: Consider 0 threads while setting TG

  • intel/genxml: Update CS_CHICKEN1 register for gfx20

  • anv: Replay mode is only available on Gfx < 20

  • anv: Convert indirect to direct dispatch

  • vulkan/runtime: Account for pipeline libraries stage count

  • anv/rt: Increment block count only for valid children

  • blorp: Set persample_msaa_dispatch for render shader

  • blorp: Handle 2D MSAA array image copies on compute shader

  • anv: Stop using RCS companion for MSAA copy/clear on Xe3+

  • anv: Add host barrier while dumping out BVH data

  • anv/rt: Don’t always set disableOpacityCull bit

  • anv/rt: Drop atomic operations on opacity flags

Samuel Pitoiset (313):

  • radeonsi: use ac_emit_write_data_imm() more

  • radv: use ac_emit_cond_exec() more

  • amd,radv,radeonsi: add ac_emit_cp_set_predication()

  • amd: add a predicate parameter to ac_emit_cp_pfp_sync_me()

  • radv: use ac_emit_cp_pfp_sync_me() more

  • amd,radv,radeonsi: add ac_emit_cp_gfx11_ge_rings()

  • amd,radv,radeonsi: add ac_emit_cp_tess_rings()

  • amd,radv,radeonsi: add ac_emit_cp_gfx_scratch()

  • amd,radv,radeonsi: add ac_emit_cp_acquire_mem()

  • amd,radv,radeonsi: add ac_cmdbuf_flush_vgt_streamout()

  • radv/ci: uprev kernel to 6.17.3 + drm/buddy backported fixes for zerovram

  • radv/ci: use the custom 6.17.3 kernel for NAVI21/NAVI31

  • radv/ci: use the custom 6.17.3 kernel for POLARIS10

  • radv/ci: drop RADV_PERFTEST=video_decode,video_encode for NAVI31

  • radv/ci: bump number of deqp-runner jobs to 32 for GFX1201

  • radv/ci: set RADV_DEBUG=novideo for NAVI21

  • radv/ci: set RADV_DEBUG=novideo for NAVI31 too

  • radv: remove a useless check when destroying descriptor sets

  • radv: add a small helper to destroy descriptor pool entries

  • radv: simplify allocating pool entries for descriptor sets

  • radv: use vk_zalloc2() for allocating the descriptor pool

  • radv: simplify error handling when creating descriptor pools

  • radv: pass int_sel to radv_cs_emit_write_event_eop()

  • radv: remove useless parameter to gfx10_cs_emit_cache_flush()

  • radv: simplify L2 cache flushes on < GFX12

  • radv: remove an obsolete comment about SMEM stores

  • radv: use ac_emit_cp_copy_data() more for perfcounters

  • amd,radv: add ac_emit_cp_atomic_mem()

  • amd: add missing _cp_ to some emit helpers

  • amd,radv,radeonsi: add ac_emit_cp_nop()

  • amd,radv,radeonsi: add ac_emit_cp_load_context_reg_index()

  • amd: add a predicate parameter to ac_emit_cp_copy_data()

  • radv: use ac_emit_cp_copy_data() more

  • amd,radv,radeonsi: add ac_emit_cp_write_data_{head}()

  • amd,radv: move SDMA utility helpers to common code

  • amd: move CP emit helpers to ac_cmdbuf_cp.c/h

  • radv: gather push constant size from shaders for ESO

  • radv/rt: radv: gather push constant size from shaders for RT

  • radv: gather push constant size from shaders for pipelines

  • radv: remove radv_shader_layout::push_constant_size

  • radv: remove radv_pipeline_layout::push_constant_size

  • radv: bump maxImageArrayLayers to 8192 on GFX10+

  • radv: bump maxImageDimension3D to 8192 on GFX10+

  • radv: initialize image properties earlier

  • radv: configure the screen scissor to the maximum image dimension

  • radv: bump image limit properties on GFX12

  • amd,radv,radeonsi: add ac_pm4_emit_commands()

  • radv/amdgpu: use common emit helpers in radv_amdgpu_cs_chain_dgc_ib()

  • amd,radv: add ac_emit_cp_indirect_buffer()

  • radv/amdgpu: remove now unused radeon_emit helpers

  • amd,radv,radeonsi: add and use more ac_cmdbuf_XXX helpers

  • amd,radv,radeonsi: add ac_emit_cp_inhibit_clockgating()

  • amd,radv,radeonsi: add ac_emit_cp_spi_config_cntl()

  • amd,radv,radeonsi: add ac_emit_spm_setup()

  • amd,radv,radeonsi: add ac_emit_cp_release_mem()

  • radv/ci: stop skipping dEQP-VK.descriptor_indexing.* on Cezanne

  • radv/ci: update comments around video failures

  • radv: dirty dynamic descriptors when required

  • radv: add radv_bind_{graphics,rt,compute}_pipeline() helpers

  • radv: use a linked-list for storing descriptor pool sets

  • radv: implement a new descriptor sets allocator

  • vulkan: update spec to 1.4.330

  • vulkan: exclude non-existent Shader64BitIndexingEXT SPIR-V capability

  • spirv: Update the JSON and headers

  • radv: use radv_buffer_get_va() more

  • radv/amdgpu: use radv_amdgpu_bo_va_op() for BOs from pointer

  • radv/amdgpu: add a way to wait for VM updates at alloc time

  • radv: add radv_wait_for_vm_map_updates drirc and enable for Forza Horizon 5

  • amd,radv,radeonsi: move some GFX12 emit helpers to common code

  • amd,radv,radeonsi: add ac_{gfx11_reg_pair,gfx12_reg}

  • amd,radv,radeonsi: add ac_buffered_sh_regs

  • amd,radv,radeonsi: move GFX12 push SH REGS helpers to common code

  • radv: advertise VK_EXT_shader_uniform_buffer_unsized_array

  • radv: remove some RADV_DEBUG deprecated options

  • radv: fix reserving enough space for emitting the SPM setup

  • radv: ignore dual-source blending when blending isn’t enabled for MRT0

  • radv: implement vkCmdEndRendering2KHR()

  • radv: allow NULL pSamplesMask with vkCmdSetSampleMaskEXT()

  • radv: add support for depth/stencil resolves with vkCmdResolve2()

  • radv: reverse the logic for NO_CONCURRENT_WRITES_BITS_MESA

  • radv: implement new input attachment information for dynamic rendering

  • radv: allow ds<->color copies on compute/transfer queues

  • radv: add support for controlling sRGB transfer function with resolves

  • radv: advertise VK_KHR_maintenance10

  • radv,vulkan: replace VK_RENDERING_INPUT_ATTACHMENT_NO_CONCURRENT_WRITES_BIT_MESA

  • radv: add a workaround for illegal depth/stencil descriptors with No Man’s Sky

  • radv: fix creating linked graphics ESOs with a compute shader

  • radv: use radv_get_shader_layout() more with ESO

  • radv/sqtt: do not try to resize the SQTT buffer for per-submit captures

  • aco: fix reserving VGPRs for 64-bit attributes in VS prologs

  • radv,aco: wait for all VMEM loads when the prolog loads large 64-bit attributes

  • amd,radeonsi: add GFX11 packed context registers helpers to common code

  • radv: add GFX11 packed context registers helpers

  • radv: add separate functions for emitting framebuffer on GFX11-11.5

  • radv: use GFX11 packed context regs

  • radv: support more tessellation parameters with TCS for ESO unlinked shaders

  • radv/ci: remove RADV_PERFTEST=video_encode,video_code for GFX6-7

  • radv/tests: use vkGetPipelineKeyKHR() instead of compiling pipelines

  • radv: move back ac_sqtt_{init,finish}() to the right places

  • ac/surface: ban 256KB swizzle modes for non-MSAA images on GFX11+

  • radv: add vk_wsi_disable_unordered_submits and enable for GTK

  • radv/meta: remove useless blit2d_src_temps

  • radv/meta: split radv_meta_blit2d() into two separate functions

  • radv/meta: remove radv_meta_blit2d_rect

  • radv/meta: remove multiple aspects in radv_gfx_copy_memory_to_image()

  • radv/meta: simplify radv_gfx_copy_memory_to_image() even more

  • radv/meta: simplify aspect/formats in radv_gfx_copy_image()

  • radv/meta: rework radv_meta_nir_texel_fetch_build_func

  • radv/meta: fuse depth/stencil aspects copy with the GFX path

  • radv/amdgpu: add a way to identify preamble/postamble when dumping CS

  • radv: add RADV_DEBUG=dumpibs to dump command buffers

  • ac/parse_ib: decode SDMA_OPCODE_POLL_REGMEM

  • radv: fix supporting more tess parameters with TCS for ESO unlinked shaders

  • radv: bump maxRayDispatchInvocationCount to 2^30

  • radv: fix gathering push constants from shaders with ESO

  • radv: add a workaround for color<->stencil only copies on SDMA4-5

  • spirv: Update the JSON and headers

  • vulkan: update spec to 1.4.333

  • ac,radv: add ac_emit_sdma_constant_fill()

  • ac,radv,radeonsi: add ac_emit_sdma_copy_linear()

  • radv: remove unnecessary handling of SDMA in radv_cs_emit_write_event_eop()

  • ac,radv,radeonsi: add ac_emit_sdma_copy_linear_sub_window()

  • ac,radv,radeonsi: add ac_emit_sdma_copy_tiled_sub_window()

  • ac,radv: add ac_emit_sdma_copy_t2t_sub_window()

  • radv: remove now unused SDMA helpers

  • vulkan: add support for vkCustomResolveCreateInfoEXT

  • radv: implement VK_EXT_custom_resolve

  • radv: advertise VK_EXT_custom_resolve

  • ci: build drm-shim for RADV tests in debian-vulkan

  • radv/tests: require drm-shim and use it instead of RADV_FORCE_FAMILY

  • radv: always use MALL for CP DMA operations on GFX12

  • radv: remove unreachable code for prefetch in radv_cs_emit_cp_dma()

  • radv: fix RB+ for depth-only with unused attachments

  • amd/drm-shim: export a function that allows to select a different device

  • meson: require drm-shim for ACO tests

  • aco/tests: switch to drm-shim

  • vulkan: stop excluding Shader64BitIndexingEXT SPIR-V cap

  • radv: allocate the SQTT BO in GTT for faster readback

  • ac/spm: add cache counters configuration for GFX12

  • ac/spm: adjust the granularity of SPM results on GFX12

  • ac/spm: use hardware names for performance counters

  • radv: enable RADV_THREAD_TRACE_CACHE_COUNTERS on GFX12

  • radv: remove the ability to create NULL devices with RADV_FORCE_FAMILY

  • ac/spm,radv,radeonsi: configure the SPM sample interval in common code

  • radv: only reset SPM when cache counters are enabled with RGP

  • ac,radv,radeonsi: add more SPM helpers to common code

  • radv: reformat debug/perftest options arrays

  • radv: use a separate parameter for radv_rt_wave64

  • radv: use a separate parameter for radv_disable_dcc

  • radv: ignore radv_disable_dcc{_mips} drirc options on GFX12

  • radv: fix per-submit RGP captures on video queues

  • radv: add a new dirty state for the VRS surface state on GFX11+

  • radv: implement VRS for flat shading on GFX11+

  • radv: enable VRS for flat shading on GFX11+

  • radv: fix resetting descriptor pool since the new descriptor sets allocator

  • radv: add radv_hide_rebar_on_dgpu and enable for Red Dead Redemption 2

  • radv: make sure to reset uses_fbfetch_output for NULL fragment shaders

  • radv: fix fbfetch output with ESO

  • ac/surface: do not use tile swizzle for replayable/aliased FMASK surfaces

  • radv: remove the workaround for DISPATCH_TASKMESH_INDIRECT_MULTI_ACE on GFX10.3

  • ci: uprev vkd3d

  • Revert “radv: remove the workaround for DISPATCH_TASKMESH_INDIRECT_MULTI_ACE on GFX10.3”

  • ci: uprev VKCTS main to 211e452358f5cafd14bdd76d78342b62741e94aa

  • radv: reduce maxTexelBufferElements to 1<<29

  • vulkan: update spec to 1.4.335

  • radv: add support for computeDerivativeGroupQuads on < GFX12

  • radv: enable conservativeRasterizationPostDepthCoverage on GFX10+ when possible

  • radv: remove redundant buffered regs emission for dispatches on GFX12+

  • radv: constify radv_gfx12_emit_buffered_regs()

  • radv: decouple RT and compute dispatches paths

  • radv: add radv_cmd_state::emitted_rt_pipeline

  • radv: only include executable size when capturing shaders with RGP

  • radv: fix race condition when getting the blit queue

  • radv: add RADV_DEBUG=vm option

  • radv: rename RADEON_FLAG_VA_UNCACHED to RADEON_FLAG_GL2_BYPASS

  • radv: constify radv_{cb,ds}_buffer_info parameters

  • radv/meta: inject image view usage info

  • radv/meta: stop passing a stencil attachment for depth decompress

  • radv: create descriptors for color/depth-stencil surfaces earlier

  • zink/ci: add two tests to the skip lists

  • ac,radeonsi: move si_tracked_reg to common code

  • ac/cmdbuf: add new slots to ac_tracked_reg

  • radv: switch to AC_TRACKED_xxx

  • ac,radv,radeonsi: add ac_tracked_regs

  • radeonsi: remove dead code in si_set_tracked_regs_to_clear_state()

  • ac,radv,radeonsi: add functions to initialize tracked regs

  • radv: remove redundant assertions in radeon_emit_{array}()

  • ac,radv: add more cmdbuf emit helpers

  • ac,radv: add ac_cmdbuf::context_roll and use it

  • ac,radv,radeonsi: add tracked register macros to common code

  • radv: add the SQTT relocated shaders BO to the cmdbuf list

  • radv/nir: fix front_face opts for points/lines and unknown prim

  • ac/perfcounter: add a separate group for GFX10.3

  • ac/perfcounter: adjust the number of events for TD on GFX10.3

  • ac/perfcounter: add GCEA block description on GFX10-11

  • ac/spm: adjust configuration of some GPU blocks

  • ac/spm: add an assertion to check the number of global instances

  • ac/spm: fix programming more than one counter slot

  • ac/spm: print an error message when a group is unknown

  • ac/spm: add an ID to raw performance counters

  • ac/spm: implement the new derived SPM chunk for performance counters

  • ac/spm: add support for new LDS counters in RGP 2.6

  • ac/spm: add support for new Memory bytes counters in RGP 2.6

  • ac/spm: add support for new Memory percentage counters in RGP 2.6

  • ac/spm: add support for Ray Tracing counters in RGP

  • ac/rgp: enable new performance counters for RGP 2.6 on GFX10-GFX11

  • radv: change the default value of RADV_TRACE_CACHE_COUNTERS on < GFX10

  • radv: fix capturing performance counters with SPM

  • amd,radv,radeonsi: add a new function to update windowed perf counters

  • ac/perfcounter: move configuration for GFX12 in a separate file

  • ac/perfcounter: define a distribution mode for all perf blocks on GFX12

  • ac/perfcounter: update the number of events for GRBME_SE on GFX12

  • ac/perfcounter: fix the number of static instances for some blocks on GFX12

  • ac/perfcounter: rework computing the number of block instances on GFX12

  • ac/perfcounter: update configuration of many blocks on GFX12

  • ac/spm: fix GRBM broadcasting for global blocks

  • ac/spm: select correct broadcasting mode for CPF/GCEA blocks

  • ac/spm: prevent selecting invalid broadcast mode for SPM blocks

  • radv: increase the reserved CS space size for SPM

  • ac/perfcounter: remove setting unused fields for GFX12 blocks

  • ac/perfcounter: fix configuration of SQ/SQ_WGP blocks on GFX12

  • ac/perfcounter: define new GPU blocks on GFX11+

  • ac/perfcounter: move configuration for GFX11 in a separate file

  • ac/perfcounter: define a distribution mode for all perf blocks on GFX11

  • ac/perfcounter: fix the number of static instances for some blocks on GFX11

  • ac/perfcounter: update configuration of many blocks on GFX11

  • ac/perfcounter: compute the number of block instance properly on GFX11

  • ac/surface: add RADEON_SURF_VIEW_3D_AS_2D_ARRAY

  • radv: use 2D swizzle modes for 3D CB render targets when optimal

  • radv: add new drirc radv_prefer_2d_swizzle_for_3d_storage

  • radv: enable radv_prefer_2d_swizzle_for_3d_storage for TLOU1

  • ac/perfcounter: fix number of instances for GCEA

  • ac/perfcounter: move configuration for GFX10.3 in a separate file

  • ac/perfcounter: define a distribution mode for all perf blocks on GFX10.3

  • ac/perfcounter: fix the number of static instances for some blocks on GFX10.3

  • ac/perfcounter: update configuration of many blocks on GFX10.3

  • ac/perfcounter: compute the number of block instance properly on GFX10.3

  • ac/perfcounter: move configuration for GFX10 in a separate file

  • ac/perfcounter: define a distribution mode for all perf blocks on GFX10

  • ac/perfcounter: fix the number of static instances for some blocks on GFX10

  • ac/perfcounter: update configuration of many blocks on GFX10

  • ac/perfcounter: compute the number of block instance properly on GFX10

  • ac/spm: fix a crash with the RT counters on GFX10

  • ac/perfcounter: add new GCEA_CPWD block definition on GFX12

  • ac/perfcounter: add new GCEA_SE block definition on GFX12

  • ac/spm: rework indexing of the derived groups/counters/components

  • ac/spm: update the cache group on GFX12

  • ac/spm: add support for new LDS counters in RGP 2.6 on GFX12

  • ac/spm: add support for new Memory bytes counters in RGP 2.6 on GFX12

  • ac/spm: add support for new Memory percentage counters in RGP 2.6 on GFX12

  • ac/spm: add support for Ray Tracing counters in RGP on GFX12

  • ac/rgp: enable the new derived SPM chunk for performance counters on GFX12

  • ac/spm: use GPU block distribution mode to determine broadcasting

  • ac/spm: use GPU block distribution mode to determine instances

  • ac/perfcounter: add num_{16,32}bit_spm_counters to GPU blocks

  • ac/perfcounter: rename ac_pc_block::num_instances to num_scoped_instances

  • ac/perfcounter: rename ac_pc_block::num_global_instances to num_instances

  • Revert “radv: allocate the SQTT BO in GTT for faster readback”

  • radv/sqtt: add a comment about the allocation strategy of the SQTT BO

  • ci: uprev vkd3d

  • radv: fix flushing gang semaphore with SDMA/ACE

  • radv/ci: document a regression with transfer queue on RENOIR

  • radv/rt: fix a compilation warning about uninitialized fields

  • radv: use UNREACHABLE for illegal texture filter

  • radv: remove extra instructions after UNREACHABLE

  • ac/spm: fix typo in one GPU perf block name

  • ac/spm: define new per-shader engine blocks

  • ac/perfcounter: fix number of 32-bit SPM counters

  • ac/perfcounter: fix computing number of 16-bit/32-bit SPM counters

  • ac/perfcounter: define more GPU blocks on GFX12

  • ac/perfcounter: re-order GPU perf blocks on GFX12

  • ac,radv,radeonsi: rename num_spm_counters to num_spm_modules

  • ac/perfcounter: add missing configuration for GCEA on GFX11

  • ac/perfcounter: fix number of scoped instances for RMI block

  • ac/perfcounter: define more GPU blocks on GFX11

  • ac/perfcounter: re-order GPU perf blocks on GFX11

  • radv/sqtt: use VkCommandBuffer objects for SQTT start/stop sequences

  • radv/sqtt: rework allocating the SQTT buffer

  • radv/sqtt: use a staging buffer for faster reads on dGPUS

  • radv/spm: rework allocating the SPM buffer

  • radv/spm: use a staging buffer for faster reads on dGPUS

  • ac,radv: sample and set correct shader/memory clocks for RGP

  • ac/perfcounter: use GFX11 definition for GFX11.5

  • radv: enable SPM for GFX11.5

  • ac/sdma: fix stencil only copies on GFX9

  • ci: uprev VKCTS main to 4d3bedc74e2258c483cf968753207cff84d9e4fc

  • zink/ci: document a GLX crash on RADV/POLARIS10

  • radv/sqtt: rework radv_emit_sqtt_userdata() to support gang CS

  • radv/sqtt: emit userdata in the gang CS when needed

  • radv: fix missing SQTT markers for task+mesh draws

  • radv/dgc: adjust task+mesh SQTT markers

  • ac/cmdbuf: disable ENABLE_PING_PONG_BIN_ORDER on GFX11.5

  • radv/sqtt: delay VMID reservation at capture time

  • radv/sqtt: fail if GPU clocks can’t be sampled

  • radv/meta: use 2D array for color resolves with compute

  • radv/meta: batch resolving all color image layers with compute

  • radv/meta: always use mip level 0 for source image resolves

  • ac/debug: add a function that dumps texture descriptors

  • radv/meta: fix layered depth stencil resolves with compute

  • radv: always fast-clear color image with comp-to-single on GFX11-11.5

  • radv/meta: add support for fast clearing color images with non-zero baseArrayLayer

  • radv: optimize layered fast clear colors when comp-to-single is supported

  • ac/nir: fix computing cube derivatives when the major axis is negative

  • vulkan: fix missing begin debug marker for HPLOC

  • radv: fix applying radv_ssbo_non_uniform=true for Crysis 2/3 remastered

  • radv: add a workaround for a synchronization bug in Strange Brigade Vulkan

  • radv: zero-initialize image view objects

  • radv: fix tracking of pipelines used in secondaries

  • radv: disable unordered submits when SQTT queue events are enabled

  • radv: emit pending flushes after late decompressions with fbfetch

  • radv/meta: fix the key for DCC decompress on compute

  • radv: fix late decompressions for fbfetch with more corner cases

  • radv/meta: fix CmdCopyBufferToImage2() on compute queue with compressed HTILE

Saroj Kumar (1):

  • radeonsi: Move binary upload, dump code to new file

Serdar Kocdemir (3):

  • gfxstream: Check host allocation mode for external memory

  • gfxstream: Enable VK_EXT_blend_operation_advanced

  • gfxstream: Add VK_EXT_frame_boundary support

Sergi Blanch Torne (18):

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • crnm: default wo coloring when unknown GitLab job status

  • crnm: clean uncolored job status

  • ci,piglit: update expectations from piglit nightly

  • ci,crnm: enable attempts ctr include status

  • ci,crnm: warning message when a job can’t be enabled

  • ci,crnm: enhancement within a GitLab job

  • ci,crnm: inhibit single target trace dump

  • ci,crnm: refresh_wait_job as argument

  • ci,tools: know if running within a GitLab job

  • ci,crnm: inhibit pretty_wait within a GitLab job

  • ci,crnm: fix round error in pretty_duration

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci,piglit: update expectations from gc2000 piglit nightly

Sharjeel Khan (1):

  • gfxstream: [C++23] Fixes for C++23 issues

Silvio Vilerino (84):

  • p_video_codec::encode_bitstream_sliced: Add last_slice_completion_fence for PIPE_VIDEO_SLICE_MODE_AUTO

  • mediafoundation: Helpers ConfigureBitstreamOutputSampleAttributes/ConfigureStatsMetadataOutputSampleAttributes

  • mediafoundation: Add Resolve completion fence to stats IDXGIBuffers

  • mediafoundation: Set ConfigureBitstreamOutputSampleAttributes earlier for async subregion notifications do not need resolved metadata for it

  • mediafoundation: Attach async stats DXGI buffers without CPU fence wait

  • mediafoundation: emit subregions samples before pAsyncFence wait to reduce latency

  • mediafoundation: Add support for setting CODECAPI_AVEncSliceGenerationMode

  • mediafoundation: Prepare for multi sample multi slice

  • mediafoundation: Emit multiple MFSamples per slice when CODECAPI_AVEncSliceGenerationMode = 1i

  • mediafoundation: Add some more trace logging

  • mediafoundation: Attach stats deferred buffers to all samples for simplicity

  • d3d12: Implement last slice signal by splitting Encode/Resolve in two ECL

  • mediafoundation: Refactor frame, multi slice and combine slice IMFSample emission to make it simpler

  • mediafoundation: Add pLastSliceFence shortcircuit wait for auto slice mode async slices mode

  • mediafoundation: Only attach stats to last slice mfsample

  • d3d12: Use a separate queue for encode resolve operations

  • d3d12: Optimize d3d12_video_encoder_flush

  • d3d12: Remove multiple index calc in d3d12_video_encoder_begin_frame

  • d3d12: Remove multiple index calc in d3d12_video_encoder_prepare_input_buffers

  • d3d12: Cache ID3D12VideoDevice4 instance if supported

  • d3d12: Cache ID3D12VideoEncoderHeap1 instance if supported

  • d3d12: Cache ID3D12VideoEncodeCommandList4 instance if supported

  • d3d12: Remove per frame allocation slice_sizes(picture->num_slice_descriptors)

  • d3d12: Only call CheckFeatureSupport(D3D12_FEATURE_FORMAT_INFO when video format changes

  • d3d12: Only check HEVC video caps if configuration changed between frames

  • d3d12: Remove unused d3d12_video_encoder::m_transitionsBeforeCloseCmdList

  • d3d12: Use cached heap allocations for barriers instead of allocating per frame

  • d3d12: Use cached heap allocations for output bitstreams instead of allocating per frame

  • d3d12: Video Encode - Make some parameters const & instead of by value

  • d3d12: Use readback heaps for staging bitstream allocations

  • d3d12: d3d12_video_encoder_get_slice_bitstream_data use regular Map/Unmap

  • d3d12: Only check H264 video caps if configuration changed between frames

  • d3d12: Make output metadata frame buffer READBACK and use direct Map() in get_feedback

  • d3d12: d3d12_promote_to_permanent_residency to accept res array batch

  • d3d12: Only check for GetDeviceRemovedReason in debug builds

  • mediafoundation: SliceGeneration=1: Zero copy IMFSample output with wrapped ID3D12Resource frame/slice buffers

  • mediafoundation: Only use sliced mode when CODECAPI_AVEncSliceGenerationMode is set, disregarding num slices configured

  • mediafoundation: Allocate pro-rated buffer sizes for multi-slice encoding

  • mediafoundation: Only wait on pSyncObjectQueue for stats completion if any stat was enabled

  • mediafoundation: Also set pSyncObjectQueue = m_spStagingQueue when DX11 input sample

  • d3d12: Fix d3d12_video_enc.cpp(4794,33): Error C4244: initializing: conversion from uint64_t to SIZE_T, possible loss of data

  • d3d12_video_encoder_nalu_writer_hevc: Reuse per frame scratch allocations

  • d3d12_video_encoder_nalu_writer_h264: Reuse per frame scratch allocations

  • d3d12: d3d12_video_encoder_references_manager_hevc remove double resize() and add reserve() to cached vectors

  • d3d12: Fix d3d12_promote_to_permanent_residency always making resident

  • mediafoundation: Fix width/height typo in alignment calculation

  • mediafoundation: encode.cpp: Remove redundant lock() and memset()

  • mediafoundation: Optimize STL usage in reference_frames_tracker_hevc.cpp

  • mediafoundation: Cleanup MaxL1References variables

  • mediafoundation: Remove unused AllocatePipeResourceFromAllocator

  • d3d12: Use EnqueueMakeResident with GPU Wait for video permanent residency promotions

  • d3d12: Remove redundant d3d12_promote_to_permanent_residency overload

  • d3d12: Video Encode - Remove unnecessary resource waits and syncs since we sync batch fence

  • d3d12: Video Encode - Remove redundant buffer barriers

  • d3d12: Video Encode - Flush the pipe context async while submitting encode

  • pipe: Add PIPE_VIDEO_CAP_ENC_READABLE_RECONSTRUCTED_PICTURE

  • d3d12: Implement PIPE_VIDEO_CAP_ENC_READABLE_RECONSTRUCTED_PICTURE

  • d3d12: d3d12_video_proc - Use async residency functions

  • d3d12: Add get_video_enc_last_slice_completion_fence interop

  • d3d12: Support d3d12_video_buffer_creation_mode::place_on_resource in d3d12_video_buffer_from_handle

  • d3d12: Optimize d3d12_video_proc heap allocations

  • d3d12: Support PIPE_BIND_SHARED resource creation

  • d3d12: video_processor: Use d3d12_video_buffer subresource indices

  • d3d12: d3d12_video_buffer - Expose associated data with subresource idx

  • mediafoundation: Add m_bHWSupportReadableReconstructedPicture

  • mediafoundation: Add AVEncVideoReconstructedPictureOutputMode and MFSampleExtension_VideoEncodeReconstructedPicture

  • d3d12: Fix hang in d3d12_video_encoder_extract_encode_metadata with PIPE_VIDEO_SLICE_MODE_AUTO

  • d3d12: Fix max slice worst case estimation for PIPE_VIDEO_SLICE_MODE_AUTO

  • mediafoundation: Fix num_output_buffers for PIPE_VIDEO_SLICE_MODE_AUTO

  • mediafoundation: Add a min slice buffer size stopgap

  • d3d12: Bump min size in d3d12_video_encoder_calculate_max_output_compressed_bitstream_size

  • mediafoundation: Remove stale call to MFCreateMemoryBuffer

  • d3d12: Add buffer size check to d3d12_video_encoder_get_slice_bitstream_data

  • mediafoundation: Check for PIPE_VIDEO_CODEC_UNIT_LOCATION_FLAG_MAX_SLICE_SIZE_OVERFLOW in calls to get_slice_bitstream_data

  • d3d12: Video Encode - Do not flush on direct buffer maps

  • d3d12: Video Encode - Reduce unnecessary syncs between encoder and context queues

  • mediafoundation: Move dpb_buffer_manager::get_read_only_handle into d3d12 driver and cache resource

  • mediafoundation: Remove redundant fence openings in ProcessInput

  • mediafoundation: Take m_EncoderLock only for work submission in ProcessInput

  • d3d12: Add video encode bitstream buffer full frame size check in get_feedback

  • mediafoundation: Copy and remove padding gaps in output IMFMediaBuffer if necessary

  • d3d12: Prefer video encode suballocated buffer mode for subregion notification mode

  • d3d12: Add missing using Microsoft::WRL:ComPtr in d3d12_context_common

  • d3d12: Add HAVE_GALLIUM_D3D12_VIDEO guards for d3d12_video_encoder_set_max_async_queue_depth/d3d12_video_encoder_get_last_slice_completion_fence

Simon McVittie (2):

  • vulkan: Don’t emit library_arch if the library_path is just a basename

  • vulkan: Optionally share one JSON manifest per driver between architectures

Simon Perretta (4):

  • nir: commonize barycentric intrinsic opt pass

  • pvr: temporarily disable gs_rta_support on all cores

  • pco: restrict shadow sampler comparator clamping to unorm formats

  • pco: update formatless skip check

Simon Richter (1):

  • anv, hasvk: Fix reported CPU page size

Steev Klimaszewski (1):

  • tu: Stop printing descriptor pool allocation failures

Sushma Venkatesh Reddy (7):

  • intel/dev: Add geometry, color and depth pipes count

  • intel/perf: Update perf scripts to get additional performance counters

  • drirc: Add anv_assume_full_subgroups for Detroit: Become Human

  • brw: Add BRW_TYPE_BF8 and BRW_TYPE_HF8 for float8

  • brw: Add EU assembler support for float8

  • compiler: Add FP8 types to GLSL type decoder

  • brw: Use lookup tables for Gfx12+ 3src type encoding/decoding

Sviatoslav Peleshko (4):

  • mesa,driconf: Add WA to initialize vertex program outputs to vec4(0,0,0,1)

  • driconf: Add vertex_program_default_out option for Penumbra: Overture

  • nir/normalize_cubemap_coords: Handle the projector before the normalization

  • mesa/main/ff_frag: Don’t generate the projector for cubemap sampling

Tapani Pälli (25):

  • anv: bring back some lost game drirc workarounds for subgroups

  • intel/dev: update mesa_defs.json from internal database

  • intel/genxml: add registers handling autostrip for gfx200

  • iris: implement autostrip disable for Wa_14024997852

  • anv: implement autostrip disable for Wa_14024997852

  • anv: fix issues found with indirect data stride

  • anv: throw anv_finishme warnings only on debug builds

  • anv: remove own GetRenderingAreaGranularityKHR

  • drirc/iris: add drirc to disable threaded context

  • drirc: set intel_disable_threaded_context for Amnesia The Bunker

  • compiler/glsl: validate input blocks with opaque/booleans

  • anv: add furmark workaround layer

  • anv: add vk_wsi_disable_unordered_submits and enable for GTK

  • crocus: add struct crocus_scissor_state to clamp values to 16bit

  • anv/drirc: disable Xe2 CCS drm modifiers for GTK engine

  • anv: hand over ANV_PIPE_RT_BTI_CHANGE to pipe control

  • crocus: make sure we have at least 1x1 surface to create null surf

  • anv: fix setting emitted_flush_bits

  • anv: fix queue check in anv_blorp_execute_on_companion on xe3

  • blorp: fix asserts hit with msaa blorp blits on xe3

  • anv: route clear operations on compute to companion

  • intel/genxml: add CHICKEN_RASTER_2 with required bit for Xe3

  • anv: set DisableAnyMCTRresponsefix to zero on init

  • iris: set DisableAnyMCTRresponsefix to zero on init

  • anv: skip compressed flag for bo if not supported by modifier

Taras Pisetskyi (1):

  • drirc/anv: force_vk_vendor=-1 for Wuthering Waves

Thong Thai (5):

  • meson: add libva wrap and fallback option

  • frontends/va: get libva api version from va_version.h

  • meson: add jpeg as a video-codec

  • meson: add mpeg12dec as a video-codec

  • frontends/va: include picture_*.c based on selected codec

Tim Van Patten (1):

  • docs/envvars: Add section: Android System Properties

Timothy Arceri (7):

  • mesa: skip redundant uniform update optimisation if unsafe

  • glsl: assign block indices in the order they appear

  • mesa: fix _mesa_update_texture_matrices()

  • util/driconf: Add linux version of Penumbra fixes

  • util/driconf: add Cursemark workaround

  • driconf: add a way to override GLX_CONTEXT_RESET_ISOLATION_BIT_ARB

  • util/driconf: add workaround for Interstellar Rift

Timur Kristóf (60):

  • ac/nir/ngg_mesh: Lower num_subgroups to constant

  • ac/nir/ngg: Fix scratch space for NGG GS streamout

  • ac/nir/ngg: Use align() instead of ALIGN()

  • radeonsi/ci, zink+radv/ci: Remove GS primitive_counter tests from flakes

  • radv: Disable sparse mapping when unsupported by VM

  • ac/gpu_info: Disable sparse VM mappings pre-Polaris, for now

  • radeonsi: Inline si_choose_spi_color_formats

  • radeonsi: Respect if rbplus is allowed when choosing color formats

  • radv, radeonsi: Move GFX6-7 CB clamp issue to ac_gpu_info

  • ac: Improve description of some HW workarounds

  • ac/gpu_info: Rename has_sparse_vm_mappings to has_sparse

  • ac/gpu_info: Fix determining when CP DMA supports sparse

  • ac/surface: Use ADDR_TM_PRT_TILED_THIN1 on GFX6-8

  • ac/gpu_info: Add different sparse features

  • radv: Advertise sparse features pre Polaris with perftest flag

  • radv: Check RADV_PERFTEST=sparse for image formats and sparse queue

  • aco: Use only VGPR offset on buffer atomics on GFX6-7

  • radv: Use zero-filled BO for GFX6 and GFX10 null index buffer bug

  • ac/nir/lower_taskmesh_io_to_mem: Don’t hardcode num_entries in shaders

  • ac/nir/lower_taskmesh_io_to_mem: Don’t hardcode payload entry size in shaders

  • radv, radeonsi: Don’t pass task ring info to mesh/task payload lowering

  • ac/nir/lower_taskmesh_io_to_mem: Use AC_TASK_DRAW_ENTRY_BYTES

  • radv: Bypass L2 for gang semaphore BO with SDMA/ACE

  • radv: Add function to determine if SDMA supports an image.

  • radv: Require gang submit and compute for transfer queues

  • radv: Update comments for gang semaphores

  • radv: Implement gang semaphores for transfer queues.

  • radv: Use SDMA fence packet when flushing gang semaphores

  • radv: Declare some gang submit functions in radv private header.

  • radv: Initialize transfer queue gang when needed

  • nir/opt_vectorize_io: Fix allow_holes option

  • radv: Lower 64-bit VS inputs to 32-bit

  • radv: Scalarize and re-vectorize unlinked shader I/O

  • radv: Only run some optimizations when scalarization made progress

  • radv: Don’t call nir_opt_combine_stores anymore

  • radv: Don’t call nir_compact_varyings anymore

  • radv: Don’t call nir_remove_unused_varyings anymore

  • radv: Don’t call nir_link_opt_varyings anymore

  • nir: Add new nir_remove_outputs pass

  • radv: Use nir_remove_outputs with the noop FS.

  • radv: Remove radv_remove_varyings.

  • radv: Add layout argument to transfer_copy_buffer_image.

  • radv: Use compute for transfer operations unsupported by SDMA

  • radv: Use compute copy for emulated formats

  • radv/ci: Adjust expected failures list for transfer queues

  • nir: Add pass to lower workgroup size

  • ac/cu_info: Add GFX6-7 SMEM OOB bug

  • ac/nir: Add pass to fixup SMEM on GFX6-7

  • radv/amdgpu: Add ability to pad BOs with a read-only VM page

  • radv: Mitigate GFX6-7 SMEM bug for NULL and mutable descriptors

  • radv: Mitigate GFX6-7 SMEM bug for robust OOB access

  • mesa: Require at least 512 variable invocations for ARB_compute_variable_group_size

  • radeonsi: Limit variable workgroup size to 256 for CS regalloc bug

  • radv: Lower larger workgroups to 256 for CS regalloc bug

  • radeonsi: Lower larger workgroups to 256 for CS regalloc bug

  • radv: Allow using compute queue with CS regalloc hang bug on GFX7

  • radeonsi: Allow using compute queue with regalloc hang bug on GFX7

  • radv: Remove previous mitigation of CS regalloc hang bug

  • radeonsi: Remove previous mitigation of CS regalloc hang bug

  • ac/gpu_info: Remove FIXME from regalloc hang description

Tomeu Vizoso (1):

  • dril: don’t build a rocket_dri.so

Tomoki Imai (1):

  • lavapipe: Support VkDrmFormatModifierPropertiesList2EXT

Utku Iseri (19):

  • panfrost,panvk: rename pan_fb_info::extent to draw_extent

  • panfrost,panvk: distinguish fbd bounding box from framebuffer size

  • panvk: prevent aliased images from using AFBC

  • panvk: only add storage usage without AFBC

  • panvk: explicit fallback to linear for legacy scanout images

  • panvk: change AFBC subresource layout pitches to byte sizes

  • pan/mod: allow non-tiled modifiers to be optimal

  • panvk: allow TILING_DRM_MODIFIER_EXT with AFBC

  • panvk: advertise support for AFBC WSI behind a debug flag

  • panvk: fix for clearing render targets with 8+ layers

  • panvk: set allow_forward_pixel_to_be_killed for draws

  • panfrost: add earlyzs FPK condition for v6-

  • zink: fix layer count with cubemaps

  • st/pbo: set src_type on the upload path

  • zink: set gfx_pipeline_state.dirty for blit rp changes

  • zink: don’t set gfx_pipeline_state.dirty if min_samples didn’t change

  • zink: don’t set pipeline_state.dirty for halfz with full_ds3

  • zink: use gfx_pipeline_state.dirty as a pipeline update condition

  • zink: handle split DS blits with zink_blit calls

Val Packett (1):

  • tu: support driconf option force_vk_vendor

Valentine Burley (48):

  • docs: Update LAVA caching setup

  • ci/deqp: Also print logs to logcat on Android

  • tu: Fix indexing with variable descriptor count

  • tu: Fix maxVariableDescriptorCount with inline uniform blocks

  • zink/ci: Document ANV flake

  • venus/ci: Skip slow test on ANV with Cuttlefish

  • ci: Update linux-firmware version to pick up more ARM firmware

  • panfrost/ci: Drop redundant KERNEL_IMAGE_NAME for rock-5b

  • panvk/ci: Add a VKCTS job on G925

  • panvk/ci: Add an ANGLE job on G925

  • panfrost/ci: Drop redundant PAN_MESA_DEBUG variables

  • panfrost: Don’t dump shader disassembly by default on CSF

  • panfrost/ci: Enable G610 piglit job

  • turnip/ci: Increase coverage of a660-vk job

  • freedreno/ci: Move a660-gl-cl job back to pre-merge

  • ci: Remove Piglit replayer from test-vk container/rootfs

  • venus/ci: Add missing Collabora farm rules to ANV jobs

  • ci/lava: Use a660_zap.mbn from linux-firmware

  • intel/ci: Drop timeout overrides for pre-merge jobs

  • lavapipe/ci: Run vkd3d job in parallel

  • anv/ci: Run vkd3d job in parallel

  • ci/android: Build zink for arm64 as well

  • egl: Disable kopper on Android

  • Revert “anv/ci: Run vkd3d job in parallel”

  • ci: Drop hardware-job prerequisite check jobs

  • anv/ci: Increase timeout for nightly JSL job

  • ci: Uprev VKCTS

  • ci: Uprev GL & GLES CTS

  • ci/deqp: Backport Android logcat commit

  • zink/ci: Document recent Turnip flakes

  • panfrost/ci: Fix GitLab rules after YAML split

  • ci: Allow PIGLIT_TAG to be unset in deqp-runner script

  • lavapipe/ci: Add a nightly ASAN job

  • zink/ci: Mark new TGL glx failures as flakes

  • ci/android: Update to Android 16

  • ci/android: Remove custom kernel

  • ci/android: Reduce Cuttlefish log verbosity

  • ci/android: Quieten extracting Mesa artifacts

  • Revert “ci/android: add sudo to EPHEMERAL deps for debian/x86_64_test-android.sh”

  • lavapipe/ci: Move android-angle-lavapipe-cts job to nightly

  • venus/ci: Switch Alder Lake job to Xe KMD

  • ci: Uprev Vulkan Validation Layers

  • ci: Update deqp-runner to pull in gtest suite support

  • radeonsi/ci: Convert libva-utils job to deqp-runner suite

  • radeonsi/ci: Remove redundant radeonsi-vaapi-fluster-rules

  • radeonsi/ci: Merge VA-API jobs

  • tu: Handle VkDrmFormatModifierPropertiesList2EXT

  • tu: Fix memory leak of patchpoints_ctx in dynamic rendering

Vinson Lee (6):

  • gfxstream: Fix GfxStreamVulkanMapper.cpp build error

  • bin/symbols-check: Fix undefined symbol detection on macOS

  • util/u_printf: Fix const correctness in util_printf_next_spec_pos

  • util/blob: Fix const correctness warning in blob_read_string

  • compiler/clc: Fix const correctness in libclc_add_generic_variants

  • freedreno/decode: Fix const correctness in get_tex_count

Xaver Hugl (2):

  • vulkan/wsi: require extended target volume support for scRGB

  • vulkan/wsi: remove support for VK_COLOR_SPACE_EXTENDED_SRGB_NONLINEAR_EXT

Yiwei Zhang (113):

  • panvk: fix to advance vs driver_set properly

  • panvk: fix to advance vs res_table properly

  • panvk: minor cleanup in cmd_prepare_push_uniforms

  • panvk: use cs_move_reg32 and lower to cs_add32 if needed

  • panvk: support VK_EXT_external_memory_acquire_unmodified

  • venus: skip feedback cmd record on incompatible queue families

  • venus: add vn_queue_family_can_feedback helper

  • venus: allow fence feedback to suspend and resume

  • venus: update sfb cmd lookup to follow ffb

  • venus: rename async_wait_mtx to counter_mtx

  • venus: allow timeline semaphore feedback to suspend and resume

  • venus: enable sparse only queue family along with feedback

  • panvk: use nir_log_shader to log NIR on Android

  • panvk: support VK_EXT_device_memory_report

  • panvk: fix sample shading of internal blend shader for MSAA

  • llvmpipe: zero is also a valid fd

  • llvmpipe: fix udmabuf mmap error check

  • llvmpipe: add a missing alloc error handling in fd import

  • llvmpipe: misc fixes for sparse binding

  • llvmpipe: support sparse resource with LLVMPIPE_MEMORY_FD_TYPE_DMABUF

  • llvmpipe: handle mmap failure for lp_texture

  • llvmpipe: handle os_dupfd_cloexec failure

  • llvmpipe: refactor dmabuf and opaque fd handling

  • util: add get_fd_header helper in os_memory_fd

  • util: add os_map_memory_fd_placed for placed mapping support

  • llvmpipe: add fd type INVALID and ANONYMOUS

  • llvmpipe: split sparse binding part to llvmpipe_resource_bind_sparse

  • llvmpipe: refactor llvmpipe_resource_bind_sparse

  • llvmpipe: support sparse resource with LLVMPIPE_MEMORY_FD_TYPE_OPAQUE

  • venus: enable sparse resource support on lavapipe

  • glcpp/meson: fix libglcpp generated header dependency

  • llvmpipe: add missing util/os_file.h header

  • panvk: fix mem alloc size for VkBuffer backed by imported blob AHB

  • pps/meson: amend missing util deps for os_get_option usage

  • pps/meson: minor refactor for pps_deps

  • venus: use seq_cst for ring cs and tail update ordering

  • venus: add a wsi image log

  • venus: avoid re-imported dma-buf to have a larger map size

  • venus: properly fix the blob mem mapping size

  • venus: add error log coverage for virtgpu backend

  • venus: fix racy semaphore feedback counter update

  • ci/venus: skip Android incremental and shared present tests

  • ci/venus: skip those causing oom killer to kill deqp

  • venus: sync to latest protocol for v1.4.334

  • venus: enable promoted VK_KHR_robustness2

  • docs: add VK_KHR_robustness2 and supported drivers

  • venus: add renderer support for placed mapping

  • venus: implement VK_EXT_map_memory_placed

  • venus: sync protocol for sorted VkCommandTypeEXT enum defines

  • venus: sync latest protocol for more shader extensions support

  • venus: support VK_KHR_cooperative_matrix

  • venus: support VK_KHR_shader_bfloat16

  • venus: support VK_KHR_shader_untyped_pointers

  • venus: support VK_EXT_shader_float8

  • venus: support VK_EXT_shader_uniform_buffer_unsized_array

  • venus: device create to filter promoted swapchain_maintenance1

  • venus: sync protocol for VK_EXT_mesh_shader support

  • venus: add VK_EXT_mesh_shader support

  • ci: uprev virglrenderer

  • pan: fix pan_blend_reads_dest to consider special min/max funcs

  • nir: suppress clang warnings for cooperative matrix lowering

  • venus: add missing VKAPI_ATTR/CALL

  • kk: add mtl_device_get_gpu_timestamp bridge

  • kk: support VK_(KHR|EXT)_calibrated_timestamps

  • vulkan: update ALLOWED_ANDROID_VERSION for api level 36

  • venus: hide unsupported wsi extensions on Windows

  • venus: hide unsupported device extensions on Windows

  • venus: hide unsupported external extensions on Windows

  • venus: disable TLS ring prio forwarding on Windows

  • venus: refactor meson to be more flexible for additions

  • venus: hide vtest from Windows build

  • venus: use vk_common_GetPhysicalDeviceCalibrateableTimeDomainsKHR

  • venus: adopt vk_common_GetCalibratedTimestampsKHR

  • venus: amend missing VKAPI_ATTR/CALL for render pass APIs

  • venus: track renderer driver version for driver workaround

  • venus: allow hw wsi for newer Nvidia proprietary driver

  • zink: tighten up export paths that require true dmabuf support

  • panvk: fix to defer disk cache init after vk_physical_device_init

  • venus: respect VK_SUBOPTIMAL_KHR returned from wsi image acquire

  • venus: remove TP in vn_ResetDescriptorPool

  • venus: refactor to avoid nesting vn_QueueSubmit entrypoint

  • venus: vn_GetFenceFdKHR no need to block wait

  • venus: add vn_wsi_sync_wait to handle implicit sync workaround

  • venus: track swapchains

  • venus: prepare chain access for async present

  • venus: add chain lock helpers for async present

  • venus: add back vn_QueuePresentKHR

  • venus: add a deep copy helper for VkPresentInfoKHR

  • venus: prepare to flush async queue present

  • venus: implement async present

  • venus: vn_wsi_sync_wait to relax chain acquire for async present

  • venus: enable async presentation along with a perf option

  • venus: vn_wsi_sync_wait to relax queue access for async present

  • venus: only preserve 12 bits for VkBufferCreateFlagBits

  • venus: refactor vn_buffer_get_cache_index

  • venus: cache VkBufferUsageFlags2CreateInfo

  • venus: amend missing logs for image format cache dump

  • venus: workaround to consider ALIAS for image mem req cache

  • venus: refactor vn_QueueSubmit2

  • venus: allow vtest to properly wait for present

  • venus: fix aliased image memory requirement caching

  • vulkan/wsi: avoid host stage when blit to foreign queue

  • vulkan/wsi: fix present wait support and present id creation condition

  • vulkan/wsi: rename khr_present_wait to has_present_wait

  • vulkan/wsi: improve present wait enablement tracking

  • venus: track prime blit dst buffer memory in the wsi image

  • venus: track dedicated image during mem alloc

  • venus: add vn_renderer_bo_export_sync_file helper

  • venus: refactor vn_AcquireNextImage2KHR

  • venus: properly handle wsi implicit in-fence

  • venus: refactor Android ANB tracking to avoid confusions with WSI

  • venus: remove obsolete asserts for ANB image creation

  • pan/kmod: drop pan_kmod_bo_check_import_flags validation

Yogesh Mohan Marimuthu (10):

  • winsys/amdgpu: use correct vm_timeline_point for userq creation

  • winsys/amdgpu: fwm packet pre-emption for gfx 11.5

  • winsys/amdgpu: add assert that if kernel fence passes then user fence must pass

  • winsys/amdgpu: enable userq reg shadowing for gfx11.5

  • ac: update amdgpu_drm.h for uq metadata query info

  • winsys/amdgpu,ac: get eop and csa size,alignment from kernel query

  • util/log: add MESA_LOG_FILE_AUTO to generate log file

  • winsys/amdgpu: use mesa_log functions instead of fprintf

  • winsys/amdgpu: print userq job info

  • winsys/amdgpu: userq job log fwm packet debug count

Yonggang Luo (36):

  • treewide: Use os_get_option_secure instead secure_getenv

  • util: Add new function os_get_option_internal to improve os_get_option*

  • util: Add function os_get_option_dup and os_get_option_secure_dup for latter use

  • d3d12/dozen: Use os_get_option_dup for passing to ID3D12SDKConfiguration_SetSDKVersion

  • util,vulkan,llvmpipe: Use os_get_option_dup instead getenv

  • util: Add PRAGMA_DIAGNOSTIC_IGNORED_CLANG PRAGMA_DIAGNOSTIC_IGNORED_GCC for latter use

  • nir: Disable gcc warning -Wstringop-overflow for nir_intrinsic_set_* for latter commit

  • freedreno: Do not use align as variable name, as it’s a function in u_math.h and will be used

  • freedreno: Use align64 instead ALIGN for 64 bits input

  • panfrost/drm-shim: Use align_uintptr instead of ALIGN for size_t input

  • brw: Do not use align as variable name, as it’s a function in u_math.h and will be used

  • anv: use align/align64 instead ALIGN, as the input is size_t/uint64_t

  • aco: Use align64 instead ALIGN for 64 bits input

  • radeon/drm: use align64 for 64 bits input instead of ALIGN

  • radeon/drm: Replace all usage of ALIGN to align and remove ALIGN macro

  • treewide: Replace calling to function ALIGN with align

  • util: Remove unused ALIGN function to prevent future use

  • treewide: strip unneeded inc_gallium inc_gallium_aux

  • util: Getting util_align_npot to be same with ALIGN_NPOT so it can be merged latter

  • ci/microsoft: Downgrading WinFlexBison.win_flex_bison to version 2.5.24

  • ci: MSVC 2019 is not support anymore, remove it.

  • ci: update image tags for windows container

  • docs: Update the minimal MSVC version requirements

  • gfxstream: Use VK_DRIVER_FILES instead of VK_ICD_FILENAMES as VK_ICD_FILENAMES is deprecated for a while.

  • gfxstream: Use os_get_option_dup(VK_DRIVER_FILES)

  • util,asahi,vulkan,panfrost: Replace the remaining usage of getenv with os_get_option

  • util: Add function os_unset_option/os_set_option for latter use

  • util: Update os_get_option* comments to match os_set_option

  • treewide: Replace the usage of setenv manually and #include "util/os_misc.h" when needed

  • treewide: Use regexp to replace usage of unsetenv with os_unset_option.

  • treewide: Use regexp to replace usage of setenv with os_set_option.

  • gfxstream: os_set_option can be used on windows now

  • util,wgl: Replace usage of putenv with os_unset_option,os_set_option

  • meson: Use /Zc:enumTypes enables C++ conforming enum underlying type and enumerator type deduction

  • meson: Remove VK_ICD_FILENAMES totally from source tree.

  • meson: do not reconstruct ICD paths

Yurii Kolesnykov (2):

  • loader: Wrap nouveau_zink_predicate with HAVE_LIBDRM

  • apple_cgl.c: Fix error: call to undeclared function ‘os_get_option’

Yuxuan Shui (1):

  • wsi/display: Set atomic client cap in Acquire{Drm,Xlib}DisplayEXT as well.

Zan Dobersek (11):

  • tu: don’t advertise sample location support for VK_SAMPLE_COUNT_1_BIT

  • tu: emit PC_DGEN_SO_CNTL for any shader type during streamout setup

  • tu/a7xx: use DI_SRC_SEL_AUTO_XFB for CmdDrawIndirectByteCountEXT

  • tu: remove data size assert in tu_GetQueryPoolResults

  • driconf: use vk_dont_care_as_load workaround for Spilled!

  • tu: use application name matching for Yooka-Laylee driconf option

  • tu: enable storageBuffer8BitAccess on all a7xx hardware

  • freedreno/registers: add a custom build target for adreno_pm4.xml.gz

  • tu: handle DS_DEPTH_BOUNDS_TEST_BOUNDS state under TU_DYNAMIC_STATE_RB_DEPTH_CNTL

  • tu: allocate transient attachments used for LRZ

  • tu/kgsl: wait-only submit handling should not ignore sparse bind commands

anonymix007 (3):

  • venus: Expose deviceLUID in props if available

  • venus: Guard Linux-specific code against being compiled on Windows

  • tgsi/nir: Store output variables before each TGSI_OPCODE_RET

hmtheboy154 (1):

  • v3dv: add support for driconf

hwandy (2):

  • anv: fix a memory leak in slab allocator.

  • anv/tests: Add a slab test to cover the memory leak issue.

jaap aarts (1):

  • radv/sqtt: Prevent concurrent submit when sqtt is enabled

jglrxavpok (1):

  • radv/aco: Print source location debug info inside ACO disassembly if we have the information

leonperianu (2):

  • pvr: Change has_fbcdc_algorithm to 1-bit bit-field

  • pvr: feature promotion to core from derived

llyyr (4):

  • Revert “drirc/anv: force_vk_vendor=-1 for Wuthering Waves”

  • vulkan/wsi/headless: populate VkSurfacePresentModeCompatibilityKHR

  • vulkan/wsi/headless: add stub for VkSurfacePresentScalingCapabilitiesKHR

  • vulkan/wsi/headless: implement vkReleaseSwapchainImagesKHR for headless

spencer-lunarg (8):

  • llvmpipe: Fix warning casting 32-bit int to 8-bit

  • lavapipe: Remove trailing whitespace

  • llvmpipe: Remove trailing whitespace

  • llvmpipe: Remove unnecessary includes

  • lavapipe: Fix crash when using zero queues

  • lavapipe: Add VK_KHR_copy_memory_indirect formats

  • lavapipe: Expose EXT version of global_priority

  • lavapipe: Check for VkCopyMemoryIndirectCommandKHR::size of zero

stefan11111 (3):

  • glx: Add some NULL pointer checks

  • gallium/frontends/dri: Don’t force dri cursor buffers to be 64x64

  • gbm: Make documentation for `gbm_bo_map` more explicit

volodymyr (4):

  • mesa: ctx->API != API_OPENGL_COMPAT --> !_mesa_is_desktop_gl_compat(ctx)

  • mesa: ctx->API != API_OPENGLES --> !_mesa_is_gles1(ctx)

  • mesa: ctx->API != API_OPENGLES2 --> !_mesa_is_gles2(ctx)

  • mesa: ctx->API != API_OPENGL_CORE --> !_mesa_is_desktop_gl_core(ctx)