Mesa 26.0.0 Release Notes / 2026-02-11
Mesa 26.0.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 26.0.1.
Mesa 26.0.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
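Because the reported version varies by driver and context type, applications usually parse it at run time rather than assume 4.6. The following is a minimal sketch of extracting the major and minor numbers from a GL_VERSION string. It assumes the string has already been obtained from a live context via glGetString(GL_VERSION); the helper name parse_gl_version and the sample strings in the comments are illustrative, not taken from any particular driver.

```python
import re

# Desktop GL version strings start with "major.minor"; OpenGL ES strings
# start with an "OpenGL ES " (or "OpenGL ES-CM ") prefix before the number.
_GL_VERSION_RE = re.compile(r"^(?:OpenGL ES(?:-[A-Z]+)? )?(\d+)\.(\d+)")


def parse_gl_version(version_string):
    """Return (major, minor) parsed from a GL_VERSION string.

    The input is assumed to come from glGetString(GL_VERSION) on a
    current context; this helper itself never touches OpenGL.
    """
    match = _GL_VERSION_RE.match(version_string)
    if match is None:
        raise ValueError("unrecognized GL_VERSION string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


# Illustrative inputs only; real strings depend on the driver in use.
# parse_gl_version("4.6 (Core Profile) Mesa 26.0.0")
# parse_gl_version("OpenGL ES 3.2 Mesa 26.0.0")
```

On contexts that support them, glGetIntegerv(GL_MAJOR_VERSION) and glGetIntegerv(GL_MINOR_VERSION) return the same numbers without any string parsing.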
Mesa 26.0.0 implements the Vulkan 1.4 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
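The apiVersion property is a single packed 32-bit integer, so comparing it against "1.4" requires unpacking it first. The sketch below mirrors the bit layout used by the VK_API_VERSION_VARIANT, VK_API_VERSION_MAJOR, VK_API_VERSION_MINOR, and VK_API_VERSION_PATCH macros in the Vulkan headers. The Python function names are hypothetical, and the value passed in is assumed to have been read from VkPhysicalDeviceProperties by some Vulkan binding.

```python
def make_vk_api_version(variant, major, minor, patch):
    """Pack version fields the same way VK_MAKE_API_VERSION does."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_vk_api_version(api_version):
    """Unpack a VkPhysicalDeviceProperties.apiVersion value.

    Layout: 3 bits variant, 7 bits major, 10 bits minor, 12 bits patch.
    """
    variant = (api_version >> 29) & 0x7
    major = (api_version >> 22) & 0x7F
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return variant, major, minor, patch


def supports_vulkan(api_version, major, minor):
    """True if the device's reported version is at least major.minor."""
    _, dev_major, dev_minor, _ = decode_vk_api_version(api_version)
    return (dev_major, dev_minor) >= (major, minor)
```

A driver that only exposes Vulkan 1.3 would fail a supports_vulkan(api_version, 1, 4) check even though this Mesa release implements 1.4 overall.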
SHA checksums
SHA256: 2a44e98e64d5c36cec64633de2d0ec7eff64703ee25b35364ba8fcaa84f33f72 mesa-26.0.0.tar.xz
SHA512: d39d190d0a17306f0aa69033e38dd8cf458dbf8da483b768841e2dc681dd670735999b212fbe0b29be839702a20750c87d6587bd925dca10693950830a17cd55 mesa-26.0.0.tar.xz
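One way to check a downloaded tarball against the SHA256 value above is sketched below, using only Python's standard hashlib module. The helper names are illustrative, and the file path in the usage comment assumes the tarball sits in the current directory.

```python
import hashlib

# SHA256 value copied from the checksum list above.
EXPECTED_SHA256 = "2a44e98e64d5c36cec64633de2d0ec7eff64703ee25b35364ba8fcaa84f33f72"


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary file-like object in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path):
    """Return the hex SHA256 digest of the file at the given path."""
    with open(path, "rb") as stream:
        return sha256_of_stream(stream)


def digests_match(actual, expected):
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return actual.strip().lower() == expected.strip().lower()


# Usage (assumes the tarball is in the current directory):
#   ok = digests_match(sha256_of_file("mesa-26.0.0.tar.xz"), EXPECTED_SHA256)
```

The same approach works for the SHA512 value by swapping hashlib.sha256 for hashlib.sha512.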
New features
VK_KHR_relaxed_block_layout on pvr
VK_KHR_storage_buffer_storage_class on pvr
VK_EXT_external_memory_acquire_unmodified on panvk
VK_EXT_discard_rectangles on NVK
VK_KHR_present_id on HoneyKrisp
VK_KHR_present_id2 on HoneyKrisp
VK_KHR_present_wait on HoneyKrisp
VK_KHR_present_wait2 on HoneyKrisp
VK_KHR_maintenance10 on ANV, NVK, RADV
VK_EXT_shader_uniform_buffer_unsized_array on ANV, HK, NVK, RADV
VK_EXT_device_memory_report on panvk
VK_VALVE_video_encode_rgb_conversion on radv
VK_EXT_custom_resolve on RADV
GL_EXT_shader_pixel_local_storage on Panfrost v6+
VK_EXT_image_drm_format_modifier on panvk/v7
VK_KHR_sampler_ycbcr_conversion on panvk/v7
sparseResidencyImage2D on panvk v10+
sparseResidencyStandard2DBlockShape on panvk v10+
VK_KHR_surface_maintenance1 promotion everywhere EXT is exposed
VK_KHR_swapchain_maintenance1 promotion everywhere EXT is exposed
VK_KHR_dynamic_rendering on PowerVR
VK_EXT_multisampled_render_to_single_sampled on panvk
VK_KHR_pipeline_binary on HoneyKrisp
VK_KHR_incremental_present on pvr
VK_KHR_xcb_surface on pvr
VK_KHR_xlib_surface on pvr
VK_KHR_robustness2 on panvk v10+
VK_KHR_robustness2 on HoneyKrisp
VK_KHR_robustness2 on hasvk
VK_KHR_robustness2 on NVK
VK_KHR_robustness2 on Turnip
VK_KHR_robustness2 on lavapipe
Bug fixes
3c5c96fe raster performance regression on doom eternal
A commit within the last few days of this writing causes hasvk to only display black.
ACO: assertion in insert_exec_mask()
ACO: fix a hazard when the number of attributes loaded/consumed don’t match with VS prologs
ACO: loading 64-bit attributes can override the fetch index in VS prologs
ADL, ANV: Wuthering Waves leads to gpu reset on Alder Lake iGPU
After 25.3 update some app windows became glitchy on uhd 620
Amnesia: The Bunker (2023) OpenGL graphics glitch on Intel graphics
CI: It’s not enough to start build tests to run CI, rustfmt must also be started manually
Clarify gallium-rusticl-enable-drivers build option
Commit bc1a6b0a4121d09cab70506ad0addf70a18730bf breaks Chromium > Save As
Ethos gallium driver does not build on 32bit
Firefox crashes in some Gallium drivers since mesa 25.3.0
Flat bool variables (GLSL_TYPE_BOOL) are not properly managed
FurMark gets glitchy graphics when using Vulkan API on UHD 620 (mesa 25.2.6 and 26.0)
GL_AMD_framebuffer_multisample_advanced freezes the whole system
GTT memory leak when running OpenGL games/software on an AMD RX 6600 XT
Ghost of Tsushima page fault
Intel BDW regression due to load_push_data_intel intrinsic
Issue with blit from framebuffer using texture view to array texture layer
JSON manifest compatibility with multiarch systems
KHR-GL46.geometry_shader.limits.max_output_components
LLVM crashes when loading specific Minecraft Shaderpacks
LLVM instruction selection compilation error
LLVMPipe’s `VkPhysicalDeviceAccelerationStructurePropertiesKHR::maxPrimitiveCount` is lower than Vulkan requires.
MR 37884 breaks Encoding via VAAPI with FFMPEG on RX 9070XT
Main branch cannot be built with Python 3.10 now
Missing definition of __builtin_ia32_clflush since “util/cache_ops: Add some cache flush helpers”
Penumbra: Overture OpenGL game has graphical glitch for ice
Polaris, amdgpu: Application using VCE wedges GPU
Properly unlock global_bufmgr_list_mutex on error conditions in iris_bufmgr_get_for_fd()
RADV: ANGLE deqp regression
RADV: gfx12 RGP trace questionable utilization and time duration
RFE: Use _mesa_is_foo(ctx) helpers more
RX Vega 64 driver hang when processing a large amount of vertex shaders (OpenGOAL: Jak And Daxter 1)
Radv nir lowering seg-faults if given ray query proceed before initialize
Regression in Vulkan driver for Intel iGPU.
Regression: MSVC fails to build 32 bit binaries
RustiCL: fence fd leak on CL-GL interop
Shader inputs/outputs for vertex/pixel shaders that have the integer (int) type are broken on RDNA 3 and 4 graphics cards
Steam Deck/9060 XT Consistently hang with game demo “Cursemark”
Texture matrix stack pops do not seem to always update the texture matrix
Transcoding mpeg2video with ffmpeg h264_vulkan on Intel cause Conversion failed!
UB in NIR when using reallocated range_minimum_query_table
Uniform variable not updated correctly with shared contexts
Update Vulkan-Profiles and re-enable zink_check_requirements
VkPhysicalDeviceLimits.minMemoryMapAlignment uses hardcoded page size
Zink on Android: failed to create dri2 screen
[26.0.0~rc1] d3d12_screen.cpp:1165:(.text._ZL31d3d12_interop_query_device_infoP11pipe_screenjPv+0x4b): undefined reference to `d3d12_video_encoder_get_last_slice_completion_fence(pipe_video_codec*, void*, pipe_fence_handle**)’
[ANV][BMG] Regression - Flickering objects in Resident Evil Village
[ANV][DG2/LNL] SolarBay extreme RT regression
[ANV][EXT_debug_utils] descriptor set object_name leak when not calling vkFreeDescriptorSets
[ANV][LNL] - Alan Wake II (EGS) - Water surface at the beginning of the game has blocky textures
[ANV][LNL] - Detroit: Become Human (1222140) - Flickering horizontal artifacts across the screen
[ANV][LNL] - Eternal Strands (1491410) - Colorful graphical aberrations are present whenever a 3D asset is visible.
[ANV][PTL] R.E.P.O. GPU Hang
[ANV][PTL][DG2] Flickering textures in Assassin’s Creed Valhalla benchmark
[ANV][Slab][Low-ram] GPU fragmentation in low-memory device when turning on slab.
[BMG] Metro Exodus Enhanced Edition (1449560) - Crash
[RADV] [Performance] Indiana Jones TGC - lags very badly sometimes
[RADV]: cooperative matrix regression
[RADV][ACO][Feature Request] Allow op_sel in v_alignbit_b32 etc in GFX9 and GFX10
[RADV][GFX12] increase max image dimensions to 32768
[RADV][bisected][regression] - Doom: The Dark Ages (3017860) - Square flickering artifacts around Hebeth
[RX 9060 XT / gfx1200] VCN page fault & ring timeout during VAAPI HEVC encode with scale_vaapi
[anv] mpv video playback blacks out when resized larger than video resolution
[bisected][iris] - Celeste - Lighting artifacts during gameplay
[bsw][regression][bisected][hasvk] various crashing tests
[radv] - WITCH ON THE HOLY NIGHT (2052410) - Flickering squares on some UI elements with gfx1150/1151
[radv] Borderlands 4 triggers a consistent GPU page fault on RDNA2
[radv] Regression causes Resident Evil 4 crashes with instruction QA checks in vkd3d-proton
[radv] Regression causes glitches in Strange Brigade (Vulkan renderer)
[radv] Stuttering with latest mesa git (21 sept) on radv/6900 XT
[radv] [feature request] Add an env var to not expose resizable bar to app
[radv][bisected][regression] GhostwireTokyo RT gpu hangs with HPLOC commit
[regression] [bisected] RuneLite GPU Experimental - GPU crash
[venus] Many functions lack the VKAPI_CALL modifier, which results in compilation failure on the Win32 i686 platform.
[wsi_common_headless] `VkSurfacePresentModeCompatibilityKHR` is not populated when using `VK_EXT_headless_surface`
a8f5ced6 regression on Silent Hill 2 Remake
amdgpu: ring gfx_0.0.0 timeout, in vr when opening apps
android/driconf: sysprops get truncated
anv/intel-brw: enable SIMD32 shaders with ray queries
anv: Support VK_KHR_pipeline_binary
anv_finishme warning spam in journalctl
asahi: DMABuf import of multi-plane YCbCr (NV12 from ISP) not rendered correctly
brw: Gfx9 sampler messages violate r127 rule
ci: Add a full version of lavapipe-vkcts-asan
corrupted video when using pRefList0ModOperations on radv h264
es1-ABI-check and es2-ABI-check test fails
freedreno, tu: resource leak
freedreno: Fix KHR-GLES31.core.texture_cube_map_array.color_depth_attachments
game Interstellar Rift does not run on AMD while it does work on Nvidia (all models, all software versions)
gnome-control-center hitting assert
hk: NIR validation failed after nir_lower_vars_to_ssa
intel: Expose `XVE Pipelines XMX active (%)` performance counter
iris: OpenGL: GL_ARB_texture_cube_map: Broken reflections in Unreal Tournament 2004
lavapipe doesn’t expose VK_FORMAT_FEATURE_2_COPY_IMAGE_INDIRECT_DST_BIT_KHR
loader.c:156:14: error: call to undeclared function ‘drmCommandWriteRead’
lp_texture.c:1523:19: error: call to undeclared function ‘os_dupfd_cloexec’
lvp: VkDrmFormatModifierPropertiesList2EXT is not supported
mediafoundation: Sample leak/freeze with MFSinkWriter and DX11 usage
mesa: deleting a buffer bound only to an index also undoes the associated general target binding
mesa: regression caused by hash_table sizing
meson: When building radeonsi without llvm, it fails without setting amd_with_llvm to false explicitly
meson: mesa is double linking now?
nak: prepass instruction scheduler liveness assert fails (with nvk)
nir: Unit-test nir_opt_algebraic
nvk, nak: Broken icons in ENDLESS Legend 2 on a RTX 4080
nvk: CTS failures in sample_locations_ext.verify_interpolation.samples_1
panvk: Handle DRLR with more locations than attachments
panvk: Insufficient barriers for fragment self-dependencies
panvk: fau compute bug
r600/sfn: Assertion `cir.alu_vec.empty()` failed
radeonsi: crash with NIR_DEBUG=serialize
radv vulkan video encode does not process used_by_curr_pic_lt_flag correctly
radv, regression : Crysis 2 Remastered raytracing blocky reflections
radv: Forza Horizon 5 can trigger page fault on valid, mapped memory
radv: Hit assert when creating mix Shader Object
radv: Hit assert when over maxFragmentDualSrcAttachments but vkCmdSetColorBlendEnableEXT is set to false
radv: Is radv_wsi_get_prime_blit_queue bugged?
radv: Kingdom Come Deliverance 1 RDNA4 RGP capture has missing cache counters for dispatch
radv: No Man’s Sky XESS page fault GPU reset
radv: RB+ for depth-only is broken with unused color attachments
radv: RE4 Separate Ways DLC hangs RDNA2 GPU
radv: Strange perf delta in a particular CS in TLOU1
radv: don’t include constant data in RGP captures
radv: incorrect vectorization of 8-bit/16-bit causes random GPU hangs with DXVK
radv: shader miscompilation triggering a freeze
src/intel/blorp/meson.build:12:4: ERROR: Unknown variable “prog_mesa_clc”.
static linking regression since !37495 - spirv-tools shared library required at runtime if exists at build time
tu: GPU faults during LRZ clears on unallocated transient attachments in gmem mode
tu: resource leak
v3d: Build fails when ENABLE_SHADER_CACHE is disabled due to unconditional disk_cache access
v3d: green screen when rpivid hevc decoder is used
va no longer correctly converts YUV to RGB
venus: random failures in dEQP.api.info.image_format_properties2.1d
venus: synchronization tests sometimes get stuck in semaphore/fence wait
vulkan/runtime: Bad assertion for RT pipelines
win_bison random failure extern_stdin:40: ERROR: end of file in string
zink/radv: new cts fails on rdna3
Changes
Aaron Ruby (4):
device-select-layer: Implement VkNegotiateLayerInterface::pfnGetDeviceProcAddr
Revert “device-select-layer: Implement VkNegotiateLayerInterface::pfnGetDeviceProcAddr”
gfxsteam: Support QNX-native swapchain in host codegen
gfxstream: Partial revert of “gfxstream: revert “gfxstream: Add Vulkan func/structs for passing debugging data to host””
Adam Jackson (1):
iris: Stop hardcoding 0:2:0 for the PCI bus address
Adrián Larumbe (2):
mesa: gallium: make GL object maximum label length a pipescreen cap
panfrost: match a GL object’s maximum label length to KMD uAPI limit
Ahmed Hesham (2):
panfrost/lima/panvk: Define a common vendor ID
panfrost: fix get_image_width for 1D buffer images
Aitor Camacho (79):
nir: Add KosmicKrisp required utilities
kk: Add KosmicKrisp
mr-label-maker: Add KosmicKrisp
CODEOWNERS: Add KosmicKrisp owners
ci: Add KosmicKrisp Linux build
kk: Fix Linux build valgrind dependency
kk: Hash vertex input state
kk: Expose missing BC formats
kk: Set drawID in root descriptor table
docs: Add KosmicKrisp to Vulkan
docs: Reorder VK_EXT_image_robustness
kk: Reorder physical device extensions and features
kk: Fix Xcode GPU capture crash
kk: Add env variables to enable Xcode GPU capture
kk: Use our own driverID value
kk: Avoid Metal validation error due to empty calls
kk: Fix addressModeW for unnormalized coordinates
kk: Ignore depth clear value if load op is not clear
kk: Force vertex attribute rebinding when pipeline changes
docs,kk: Add KosmicKrisp documentation
kk: Add MESA_KK_DISABLE_WORKAROUNDS to disable workarounds
util: Introduce HAVE_BUILD_ID for build id utils
util: Add build_id for macOS
kk: Fill driverUUID
kk: Merge io type modifying passes into one
kk: Add multiViewport and EXT_shader_viewport_index_layer support
kk: Fix image to image copy
kk: Use residency sets for user allocations
kk: Move all resource tracking to the residency set
kk: Exposes more extensions/features we already supported
kk: Mark root buffer as not dirty after updating it
kk: Remove mem leaks in NIR->MSL, device/sampler create and cmdbuf release
kk: Track fragment helper status since Metal does not correctly demote them
kk: Remove mem leaks in cmd buf destroy and residency set creation
kk: Force attachment load as temp solution to preserve attachment
kk: Handle memory coherency for textures and buffers
kk: Clamp negative array indices to 0
vulkan/cmd_queue: Use vk_strdup and free allocated string memory
vulkan/wsi: Fix double destroy of present_id_timeline at swapchain create
docs,kk: Add KosmicKrisp environment variable documentation
kk: Guard writes after fragment demote
kk: Apply robustness only when requested
kk: Expose more features/extensions we already support
wsi/metal: Fix command buffer release at destroy
wsi/metal: Fix blit_imate_to_image’s pool selection for cmd buffer alloc
kk: Expose shader storage image read/write without format
kk: Expose shaderImageGatherExtended
kk: Match float formats to actual Metal features (union of Apple and Mac2)
kk: Expose ASTC HDR formats
kk: Fix emulated format’s swizzle
kk: Expose 4444 and ycbcr 2plane 444 formats
kk: Enable fragmentStoresAndAtomics
kk: Enable float16 and int8
kk: Account for dynamic VI when flushing draw state
kk: Mark graphics descriptors’ root dirty when dirtying graphics state
kk: Remove unneeded entrypoints in kk_encoder.h
kk: Split internally encoder fence signal and end
kk: Simplify compute and blit encoder start
kk: Change queue writes timing for easier compute merge for Metal4 upgrade
kk: Update query availability only if it has availability
kk: Propagate availability before we reset it in vkCmdResetQueryPool
kk: Remove signal and end from upload writes not to end compute encoders
kk: Remove render pass logic in event set/reset entrypoints
kk: Attachmentless render passes start postponed to pipeline bind
kk: Expose occlusionQueryPrecise
kk: Add environment variable to force robustness on all shaders
kk: Fix maxTexelBufferElements value
nir/opt_varyings: Support implementations that cannot compact 16-bits
kk: Fix compilation error when viewMask is 0
util: Fix HAVE_BUILD_ID ifdefs
kk: Expose extendedDynamicState required by VK_EXT_extended_dynamic_state
kk: Fix reported maxInlineUniformBlockSize to match spec expectations
kk: Remove unneeded member in kk_descriptor_set_binding_layout
kk: Handle unbound sets that contain dynamic buffers
kk: Fix texturequerylod
kk: Disable KHR_shader_maximal_reconvergence since subgroups are broken
nvk: Handle unbound sets that contain dynamic buffers
hk: Handle unbound sets that contain dynamic buffers
kk: Fix disabling workaround 4
Aksel Hjerpbakk (3):
panvk: refactor vk_stage_to_subqueue_mask
panvk: cull semaphores in unrelated subqueues
panvk: include cmd stages for semaphores on submit
Alejandro Piñeiro (4):
panfrost: cleanup outputs_read/outputs_written at pan_shader_info
mesa/st: add a warning if can’t set SoftFP64
panfrost/job: avoid shadowing variable name
pan/bi: report stats only if the shaders got compiled
Aleksi Sapon (7):
nir, vk: fix MSVC unused variable warning
llvmpipe: doc fixes
llvmpipe: use half-even rounding in norm and fixed mul
llvmpipe: use half-even rounding in lerp
llvmpipe: fix 64bit unpack on x86
llvmpipe: lerp rounding test
llvmpipe, virgl: update CI traces
Alessio Belle (2):
pvr: add device info for GE7800 (15.5.1.64)
pvr: add device info for GE8300 (22.67.54.30)
Alexander von Gluck (1):
egl/haiku/meson: Include shared libglapi code for dispatch functions
Allen Ballway (1):
android: support longer property names
Alyssa Milburn (1):
nv50,nvc0: Don’t set caps.max_texture_mb
Alyssa Rosenzweig (65):
brw: use the right int8/int16 division lowering
util: require typeof support
util/dynarray: infer type in append
anv: use D3D-compatible texturing for Proton
nir/lower_two_sided_color: cleanup
util: add util_ptr_is_aligned helper
nir: use alignment helpers more
intel: do not NIH util_is_aligned
intel: use util_is_aligned more
asahi: do not NIH util_is_aligned
panfrost,tu: use util_is_aligned
pvr: don’t NIH alignment helpers
people: add Yonggang
asahi,ail: fix multi-plane imports
util: add UTIL_DYNARRAY_INIT sentinel
treewide: use UTIL_DYNARRAY_INIT
util: add BITSET_BYTES helper
util: add BITSET_RZALLOC
treewide: use BITSET_BYTES, BITSET_RZALLOC
asahi: clang-format
brw,elk: drop unused spirv->nir routines
agx: use sparse live-sets
poly: fix cull distance
util: fix container_of on MSVC
pan/bi: initialize variable to fix warning
pan/bi: clean up NIR
nir/sweep: fix use-after-free with dominance LCA
nir/lower_wrmasks: drop support for I/O
nir/lower_wrmasks: drop callback
nir/lower_wrmasks: clean up & deprecate pass
brw: only initialize sample mask flag if needed
brw: only lower flrp once
people: update Marek’s email
nir: print nir_tex_instr::backend_flags if present
util/bitset: allow BITSET_*_RANGE(x, 0, -1)
util: fix (amusing) find-n-replace fail
util: add BITSET_*_COUNT macros
treewide: use BITSET_*_COUNT
asahi: clang-format
hk: fix flrp lowering
brw: constant fold before texture lowering
agx: fix AGX_MESA_DEBUG=nopreamble
asahi: test tilebuffer offsets
asahi: tightly pack tilebuffer
asahi: use flat tile size encoding
asahi: inline agx_shared_layout_from_tile_size
asahi: fix garbage with query reads
hk: hide vertexPipelineStoresAndAtomics
asahi/ci: skip fp64 subgroup tests
panfrost,nir: drop my lonely Authors tags
pan/mdg: clean up csel typing pass
nir: add nir_is_shared_access helper
brw: use nir_is_shared_access
agx: use nir_is_shared_access
pan/mdg: use nir_is_shared_access
ac/nir: use nir_is_shared_access
nir/builder: infer txf_ms/txl/txb opcodes
brw/nir_lower_fs_load_output: unify texture builders
vk/meta_copy_fill_update: simplify tex builder
radv: cleanup texture builder
asahi: use nir_txf
hk: unify tex builders
agx: fix SSA repair with phis with constants
brw: combine peephole select calls
nir: disable fast-math for lowering conversions
Alyssa Ross (1):
rocket: fix building for musl
Andrew Sinclair (2):
gfxstream: revert “gfxstream: Add Vulkan func/structs for passing debugging data to host”
gfxstream: revert “gfxstream: Remove unnecessary tag to simplify perfetto trace config”
André (1):
nouveau: fix memory leak by freeing drm version before returning
Andy Hsu (3):
meson: Support intel tools on Android.
u_trace: remove redundant char* to string conversion (v2)
intel/decoder: make libvulkan_intel to depend on stub decoder when buildtyle=release.
Anna Maniscalco (7):
nir/lower_tex: copy `is_sparse` when lowering txd
radv: recalculate legacy_gs_info on bind
radv: consistently use the value in bytes for esgs_itemsize
freedreno/fdl: add astc hdr formats
tu: advertise EXT_texture_compression_astc_hdr
docs/features: advertise GL_KHR_texture_compression_astc_hdr on zink
zink: fix use_reusable_pool condition
Antonio Ospite (3):
mesa: replace most occurrences of getenv() with os_get_option()
nouveau/drm-shim: remove double ‘/’ in include path
meson/android: bump platform-sdk-version to Android 15
Arcady Goldmints-Orlov (10):
kk: enable dualSrcBlend
kk: enable logicOp
kk: enable shaderDrawParameters
kk: enable shaderStencilExport
kk: Enable VK_EXT_shader_atomic_float
kk: enable VK_KHR_workgroup_memory_explicit_layout
kk: enable VK_KHR_vertex_attribute_divisor
kk: Enable independentBlend
nir: Use nir_shader_intrinsics_pass in nir_lower_io_to_scalar
kk: enable shaderClipDistance
Arjob Mukherjee (1):
pvr: Fixup for deqp-vk.api 2d.optimal.* conformance
Arzaq Naufail Khan (1):
anv: eliminate dead code
Ashish Chauhan (7):
pvr: Make display node optional
pvr: store arch in device-info
pvr: move PVR_TEX_FORMAT_COUNT to pvr_limits.h
pvr: split pvr_spm.c
pvr: split pvr_formats.c
pvr: mark pvr_queue.c as multi-arch
pvr: prepare for multi-gen compilation
Ashley Smith (1):
panfrost,panvk: Enable shader_realtime_clock on panthor 1.6
Augustin Cavalier (1):
renderdoc: Add Haiku platform support
Autumn Ashton (1):
radv/video: Implement VK_VALVE_video_encode_rgb_conversion
Benjamin Cheng (24):
radv/video: Fill maxCodedExtent caps first
radv/video_enc: Cleanup slice count assert
radeonsi/vcn: Check and override primary_ref_frame
radv/video: Override H265 SPS block size parameters
radv/video: Override H265 SPS unaligned resolutions
vulkan/video: NULL check codec-specific chain
radeonsi/vcn: Re-enable AV1 unidir for new FW
radv/video: Fix dummy DPB addresses
ac,radeonsi/vcn,radv/video: Drop signature param
radv/video: Align each layer of encode DPB to 256
ac/parse_ib: Implement VCN dec message parsing
radv/video: Fix num_ref_idx_l{0,1} related overrides
radv/video: Fix H264/H265 reference selection
radv/video: Support two L0 refs on VCN3+
radv/video: Override direct_spatial_mv_pred to 1
radv/video: Fix force_integer_mv=1 on intra frame
radv/video: Always end ref pic modification list
radv/video: Move probability table filling to bind
radv/video: Enable write combine for decode
radeonsi/vcn: Factor out rec_alignment
radeonsi/vcn: Allocate DPBs aligned to rec_alignment
radv/video: Allow aliasing of video images
radv/video_enc: Remove CTS WA
radv/video: Use a more reliable way of computing tile sizes
Benjamin Otte (1):
radv: Limit GTK workaround to affected versions
Bernd Kuhls (1):
blake3: add blake3_neon.c only for little endian archs
Bohan Yu (1):
Panfrost: Fix un-split 64-bit address for store_scratch instruction
Boris Brezillon (40):
nir: Prepare nir_lower_io_vars_to_temporaries() for optional PLS lowering
nir: Teach nir_lower_io_vars_to_temporaries() about PLS vars
nir: Add a pass to downgrade inout PLS vars to {in,out} only ones
panvk/bifrost: Fix YCbCr texture/sampler array indexing
pan/cs: Fix cs_extract_tuple()
pan/cs: Fix bitop helpers
pan/cs: Rename cs_select_sb_entries_for_async_ops()
pan/decode: Print defer mode in deferrable instructions
panvk/csf: Make sure we don’t get the same iter SB assigned twice in a row
panvk/csf: Prepare for more complex scoreboard transitions
panvk/csf: Make sure FINISH_FRAGMENTs are properly ordered
panvk/csf: Use cs_vt_{start,end}()
pan/ci: Bump kernel versions for platforms testing panvk
pan/ci: Disable THP on panfrost-g52-piglit
people: Add Christoph Pillmayer to the list
pan/kmod: Cache the device props at the pan_kmod_dev level
pan/kmod: Expose the IO coherency property
pan/kmod: Enforce PAN_KMOD_BO_FLAG_NO_MMAP
panvk: Don’t allocate memory for a buffer descriptor in CreateBufferView()
panvk: Add a panvk_priv_mem_check_alloc() helper and use it
panvk: Rely on supported_bo_flags to mask PAN_KMOD_BO_FLAG_GPU_UNCACHED
panvk: Add a debug flag to force CPU-uncached mappings
panvk: Add a debug flag to force CPU map syncs through the kernel
panvk: Flush pending map syncs before submission
panvk: Force a cacheline alignment when allocating objects from WB shared pools
panvk: Use WB mappings for the global RW and executable memory pools
panvk: Fix a memory leak in the descriptor set logic
pan/bi: Fix leak in bi_iterator_schedule()
panvk: Don’t leak shader binaries when loaded from the cache
pan/cs: Don’t leak builder resources
panvk: Free the decode context in the create_device() error path
pan/ci: Update the g610 flakes to avoid UnexpectedImprovement(Pass)
pan/ci: Extend g610-vk pre-merge test coverage
ci: Add panfrost drivers to debian-arm64-asan
pan/ci: Replace the g610-vk-full job by a g610-vk-asan one
zink/ci: Add tests to the anv-tgl fails list to reflect CI state
panvk/csf: Fix BY_REGION dependencies
panvk: Fix set_compute_sysval()
pan/ci: Keep THP enabled on the g52-piglit job
panvk: Fix the deviceID reported by the driver
Bram Stolk (1):
loader: fix UB in wayland helper code.
Caio Oliveira (32):
mesa/st: Lower to ALU scalar after fp64 subgroup lowering
intel/mda: Allow to specify directories with `-f`
brw: Consolidate late lowering of int64 operations
iris: Enable GL_KHR_shader_subgroup_* extensions for Gfx >= 9 when possible
brw: Fix EU validation of VxH and Vx1 region
brw: Fix MOV_INDIRECT lowering for various platforms
brw: Set relevant immediate bits for Gfx9-11 in JIP and UIP helpers
brw: Don’t set destination of branch instructions
anv, hasvk: Don’t assert on alignment if the value is known to be zero
brw: Remove 3src_exec_size from the field macros
brw: Properly set ‘desc as register’ for SEND in assembler
intel/mda: Use function to read content of objects
intel/mda: Handle better processing a lot of archives
brw: Move MUL related validation
brw: Move AVG related validation
brw: Move ADD related validation
brw: Drop asserts for brw_SRND
brw: Remove LINE from brw_builder and brw_generator
brw: Make LINE normalization into validation
brw: Move PLN/LINE normalization
brw: Add EU validation for ROR/ROL
brw: Move MATH related validation
nir/gcm: Consider dead code elimination done by GCM as progress
brw: Perform mark_last_urb_write_with_eot optimization after CFG
brw: Move normalization of 3-src instructions swizzles to a single place
brw: Move LRP related validation
brw: Consolidate generator code for emitting “regular” instructions
brw: Rework UIP and JIP setting code
brw/scoreboard: Use a predicate helper for the nomask workaround
brw/scoreboard: Disable nomask workaround for Xe2+
brw: Fix and properly use increment_a64_address()
brw: Fix cooperative matrix constant sources other than src0
Calder Young (16):
brw: fix SIMD lowering of fp16 sampler message data with multiple components
anv: Fix ray query shadow stack buffer size
intel: Fix calculation of max_scratch_ids on fused devices
anv: Fix missing const qualifiers on some params in anv_blorp.c
anv: Add shorthand for executing on the companion cmd buffer
anv: Use companion cmd buffer for CCS and MCS image barriers
anv: Fix scratch pool buffer allocation sizes
anv: Fix misplaced assertion in anv_scratch_pool_alloc
anv: Fix typo when checking if async rt scratch size changed
anv: Fix valgrind errors on batch buffers allocated from bo_pool
anv: Fix load factor for batch buffer allocation
anv/rt: Disable compaction for updatable acceleration structures
anv,brw: Allow multiple ray queries without spilling to a shadow stack
anv,brw: Add helper to get stack ids per dss for ray queries
Revert “anv,brw: Allow multiple ray queries without spilling to a shadow stack”
anv: Avoid dumping BVH before command buffer is submitted
Carlos Santa (5):
intel/tools: intel_hang_replay refactoring
intel/hang_replay: move common code into a lib
intel/tools: Handle new replay properties in the Xe KMD error dump file
intel/hang_replay: add Xe support
intel/hang_replay: add option to dump VM state as part of the dump
Casey Bowman (2):
anv: Remove vf_flush for start of command buffers
anv: Make pipeline mode switches show which mode is being entered
Caterina Shablia (9):
panvk: move sparse blackhole stuff to panvk_sparse.{c,h}
pan/lib: introduce row_align_B and array_align_B constraints
panvk: sparse partially-resident image -related queries
panvk: align rows and layers of sparse resident images
panvk/csf: implement sparse image non-opaque binds
panvk: report support for sparseResidencyImage2D
panvk: do not access the image in image view’s destructor
panvk: remove AFBC header zeroing
panvk: fix sparse image non-opaque binds
Chia-I Wu (4):
panfrost: make RUN_COMPUTE.ep_limit configurable
panvk: set compute_ep_limit on v12+
panvk: fix calculate_task_axis_and_increment
panvk: rework calculate_task_axis_and_increment
Christian Gmeiner (39):
bin/ci: Fix SyntaxWarning about return in finally block
bin/ci: Update python-gitlab to 5.x for Python 3.14 compatibility
anv: Convert DEBUG_PIPE_CONTROL logging to use mesa_log_stream
etnaviv: isa: Add norm_mul instruction
anv: Convert DEBUG_SPARSE logging to use mesa_log
anv: Convert DEBUG_HEAPS logging to use mesa_log
anv: Fix needs_temp_copy() incorrectly matching depth/stencil formats
util/log: Add MESA_LOG_PREFIX environment variable to control log prefixes
etnaviv: Disable trilinear filtering for shadow samplers
etnaviv: blt: Add S8_UINT_Z24_UNORM format translation
etnaviv: blt: Add Z16_UNORM format translation
mesa: OES_texture_stencil8 requries OpenGL ES 3.1
meson: require sysprof-capture-4 >= 4.49.0
anv: Convert DEBUG_SPARSE logging to use mesa_logi
etnaviv/ci: Add KHR-GLES2 conformance testing
etnaviv: Add support for ARB_vertex_type_2_10_10_10_rev
etnaviv: Improve flatshading
lavapipe: Trivially expose VK_GOOGLE_user_type extension
etnaviv: rs: Move RS_SINGLE_BUFFER control to per-operation basis
etnaviv: Defer GPU state reset until first draw call
lavapipe: Advertise variableMultisampleRate
vulkan/wsi: Add wsi_common_is_swapchain_image() helper
treewide: Use wsi_common_is_swapchain_image() helper
etnaviv: isa: Print parser error
etnaviv: isa: Add type suffixes to immediate value encoding
etnaviv: isa: Remove dual16 mode parameter from parser API
etnaviv: isa: Fix f16 immediate encoding
etnaviv: isa: Add assembler support for infinity and NaN immediates
pvr: Use BUILD_ID_EXPECTED_HASH_LENGTH
etnaviv: Update headers from rnndb
etnaviv: blt: Set 64BPP_FORMAT flag for clears and copies
etnaviv/ci: Add gitlab-ci-inc.yml to file list
ci: Describe imagination farm
ci: Build imagination vulkan driver
pvr/ci: Add dEQP-VK testing for BXS-4-64 on TI AM68 SK
pvr/ci: Increase timeout to prevent job failures
pvr/ci: Update CI expectations
meson: Restore .clang-format for ninja clang-format target
pan/compiler: Fix progress reporting in pan_nir_lower_store_component
Christoph Pillmayer (20):
pan: Enable rematerialization for more ops
pan: Fix bi_load_tl dst arg name
pan: Pull out normal block logic from compute_w_entry
pan: Add spill cost metric
pan: Make W_entry loop aware
nir: Fix preseved metadata in sort_unstructured_blocks
nir: Update progress info in nir_sort_unstructured_blocks
pan: Avoid some redundant SSA spills
pan: Copy nir_dominance.c to bi_dominance.c
pan: Adapt calc_dominance from nir to bi
pan: Fix bi_find_loop_blocks
pan: Use bitset instead of bool array in bi_find_loop_blocks
pan/bi: Add missing 8bit widen swizzles
pan/decode: Fix indent in pandecode_dcd
pan/preload: Prepare for reading from single sampled view
panvk: Create MS shadow images and views
panvk: Setup attachments for ms to ss rendering
panvk: Implement VkSubpassResolvePerformanceQueryEXT
panvk: Expose EXT_multisampled_render_to_single_sampled
pan/bi: Fix bi_find_loop_blocks for single block loops
Collabora’s Gfx CI Team (10):
Uprev Piglit to 2ac68e5fb59215ecf89049ec15f3f7494b51a589
Uprev Piglit to ec76cc7a31f03c4f4f9d6e3b00f8a70c8ee0fb32
Uprev ANGLE to e9626fbced6841d804e7eaf48bb078770822032b
Uprev Piglit to 5309e3401d6b03e8a0bb7bfdc1e0f5bc1ad754af
Uprev ANGLE to 127a84404b88dbc4327ffb7f831a9a36c3b111bc
Uprev ANGLE to ee05836a4934129527544385203ecf420afc5dd1
Uprev ANGLE to 2ed4b049c064add3109c7b1e0c954a0bce856df8
Uprev Piglit to 2842979ebe03b99c33c3e49af5960c69be6c6d46
Uprev ANGLE to b406401e42080c2f8fe479e6c5fa48dfae97c482
Uprev Piglit to 62d499d63d2b8b29a67efd9d93ed9b6a94d4950e
Connor Abbott (60):
tu: Fix corner case with clearing input attachment
tu: Remove useless tu_image_view_init parameter
tu: Don’t patch GMEM for input attachments never in GMEM
tu: Don’t resolve twice in between subpasses
tu: Clear RB_MRT_BUF_INFO::LOSSLESSCOMPEN for stencil
tu: Fix 3d load path with D24S8 on a7xx
tu: Also disable stencil load for attachments not in GMEM
tu: Make blit setup take source and destination samples
tu: Add CCU_RESOLVE_CLEAN workaround
tu: Remove tu_attachment_info
tu: Make r*d_src_depth and r*d_src_stencil generic
tu: Add support for “unresolve” ops
tu: Implement VK_EXT_multisampled_render_to_single_sampled
tu: Fix RT count with remapped color attachments
tu: Rename tu_render_pass_attachment::clear_views to used_views
tu: Fix attachment stores with subpasses with partial views
tu: Zero MSRTSS temporary image before creating it
freedreno: Document BV BIN_PREAMBLE usage
freedreno/a7xx: Document GRAS_LRZ_CB_CNTL
freedreno: Expand a7xx LRZ metadata definition
freedreno/registers: Fix encoding fields in 64b registers
freedreno/crashdec: Add support for CP_BV_MEMPOOL
freedreno: Add synchronization-related control registers
freedreno: Decode CP_RESOURCE_LIST
freedreno/a7xx: Add BV registers for ROQ status
tu: Refactor VSC bo initialization
tu: Use scratch mem for conditional loads/stores on a7xx
tu: Add tu7_thread_control helper
tu/cs: Allow conditional execution in substreams
tu: Initialize registers for BV
tu: Rewrite visibility stream allocation
tu: Correctly set GRAS_LRZ_CB_CNTL
freedreno: Add has_pred_bit feature bit
tu: Use predicate bit for perf queries
tu/a7xx: Support concurrent binning
freedreno: Make BV ROQ registers a7xx-only
tu: Handle case where pipeline writes unused color attachments
editorconfig: Set for glsl files
glsl/float64: Fix fmax with NaNs
nir, glsl: Add support for softfloat32
tu: Expose preserving fp32 denorms via softfloat32
tu: Make softfloat shader compiled on demand
util/glsl2spirv: Use better glslang flag for -Olib
tu: Support softfloat64
tu: Stop setting RB_CCU_DBG_ECO_CNTL to 0 for GMEM passes
tu: Stop setting GRAS_LRZ_CB_CNTL before GMEM render passes
tu: Set GRAS_MODE_CNTL once
tu: Set 8E09 once
tu: Stop setting view_index_is_input
tu: Call nir_lower_sysvals_to_varyings once
spirv: Remove view_index_is_input
tu: Fix GRAS_BIN_FOVEAT* programming with more than 1 layer
tu: Fix FragCoord offset when HW viewport offset is enabled
nir, tu: Add and use load_frag_coord_gmem_ir3
ir3: Support addr0 align of 8
tu: Implement VK_QCOM_subpass_shader_resolve
tu: Implement VK_EXT_custom_resolve
tu: Fill render pass state when resuming
ir3: Fix condition for using uniform predicates
freedreno/crashdec: Fix crash with older kernels
Corentin Noël (1):
ci: Uprev crosvm and virglrenderer
Daivik Bhatia (4):
v3dv: move format helpers to new v3dv format table header files.
v3dv: replace raw integers with enum types in helper functions.
v3dv: centralize limit macros in v3dv_limits.h
v3dv: improve barrier handling for secondary command buffers
Daniel Lang (1):
etnaviv: Use FLOAT type for R32G32B32A32_{U,S}INT vertex formats
Daniel Schürmann (77):
nir: add nir_imul_nuw() and nir_imul_imm_nuw() helpers
nir: don’t use nir_build_alu() with incomplete sources
nir: guard nir_def_as_alu()
nir/constant_folding: switch to nir_shader_lower_instructions()
vulkan/nir: call nir_opt_constant_folding() during vk_spirv_to_nir()
nir/builder: add option to immediately constant-fold ALU instructions upon insertion
nir/lower_flrp: ad-hoc constant-fold ALU instructions
tree-wide: don’t call nir_opt_constant_folding after nir_lower_flrp
nir/algebraic: ad-hoc constant-fold ALU instructions
radv/shader_info: remove unused output_usage_mask
radv/shader_info: use union for precomputed register values of non-overlapping stages
radv/shader_info: rename gs_ring_info -> legacy_gs_info and use union with ngg_info
radv/shader_info: repack and compact struct radv_shader_info
radv: skip shader cache if trap handler is enabled
radv: hash keep_executable_info into shader key rather than device cache key
radv/null_device: don’t attempt to upload shaders
radv/null_device: set more options which affect compilation
radv/device: return early in radv_CreateDevice() if creating a null device
radv: remove radeon_winsys::get_chip_name() and use info->marketing_name directly
amd, radv: create null device without winsys
radv: delete winsys/null/*
radeonsi: use ac_null_device_create() when AMD_FORCE_FAMILY is set
amd/common: rename ac_fake_hw_db.h -> ac_surface_test.h
aco/scheduler: remove unused include
aco/scheduler: assert that the register demand stays within pre-determined bounds
aco/scheduler: remove MoveState::RAR_dependencies_clause
aco/scheduler: use hashmap for RAR_dependencies
aco/scheduler: refactor downwards dependency check
aco/scheduler: move clauses through RAR dependencies
nir/opt_load_store_vectorize: don’t add negative offsets to load/store_shared2_amd
amd: enable load/store_shared2_amd for GFX6
nir/opt_large_constants: Fix dead deref instructions accessing lowered variables
treewide: Never preserve nir_metadata_dominance without nir_metadata_block_index
radv: Only call nir_opt_memcpy once
radv: Only call nir_opt_dead_write_vars once
radv: call nir_opt_find_array_copies before first radv_optimize_nir()
radv: don’t lower_vars_to_ssa during optimization loop
nir/lower_vars_to_ssa: return early if there are no local variables to lower
radv: Only call nir_lower_alu_width once in radv_optimize_nir()
radv: move nir_opt_copy_prop_vars out of optimization loop
drm-shim: handle DRM_CAP_ADDFB2_MODIFIERS
amd/drm-shim: handle AMDGPU_INFO_HW_IP_COUNT
amd: remove radeon_info::dev_filename
amd: remove radeon_info::lowercase_name
amd: replace uses of radeon_info::name with ac_get_family_name()
amd: remove radeon_info::is_pro_graphics
amd: restrict radeon_info::marketing_name to 64 characters and copy it
Revert “radv: move nir_opt_copy_prop_vars out of optimization loop”
Revert “radv: Only call nir_opt_dead_write_vars once”
radv: remove precomputed registers from radv_shader_binary
amd: add newer small APUs to get_task_num_entries()
amd/common: link with libamdgpu_addrlib
radeonsi: use si_shader_encode_{sgprs|vgprs} in si_compute.c
aco: disable XNACK on all GPUs
ac/gpu_info: move some CU information into separate struct ac_cu_info
ac/gpu_info: correct some SGPR and VGPR allocation values in ac_cu_info
ac/gpu_info: create separate function ac_fill_cu_info() to fill out CU info
aco/tests: don’t pass CHIP_UNKNOWN to ACO
aco: pass aco_compiler_options to init_program()
aco: add ac_cu_info to aco_compiler_options
ac/gpu_info: add some more flags to ac_cu_info
aco: use additional flags from ac_cu_info
amd: add ac_cu_info::has_mad32 flag and use in ACO
amd: add ac_cu_info::has_point_sample_accel flag and use in ACO
amd: add and use ac_cu_info::has_gfx6_mrt_export_bug
amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug
aco: remove radeon_family from aco::Program
aco/lower_to_hw: Fix SGPR Operand RegClasses of subdword copies
aco/lower_to_hw: Don’t use 2 SGPR operands before GFX10 in a single VOP3 instruction in do_pack_2x16()
aco/lower_to_hw: Fix SGPR Operand RegClasses for pack_2x16
aco/validate: Validate correct RegisterClasses after lowering to HW instructions
aco/tests: Add test for subdword extraction from SGPR
aco/tests: Add new test to pack 2x16 SGPRs into VGPR
aco/validate: validate constant bus limit after register allocation based on PhysReg
nir/loop_analyze: determine for all ALU whether it can be constant-folded
nir/loop_analyze: determine whether all control flow gets eliminated upon loop unrolling
nir/opt_load_store_vectorize: delay aliasing test in try_vectorize_shared2()
Danylo Piliaiev (28):
tu/lrz: Fold disable_write_for_rp check into tu_lrz_disable_write_for_rp
tu/lrz: Disable LRZ when CmdSetRenderingAttachmentLocations is used
tu/lrz: Disable LRZ writes when draw doesn’t write to all attachments
tu: Faster descriptor set allocator
vulkan: Always fill DS state for EXT_dynamic_rendering_unused_attachments
tu: Use cmd->rp_trace u_trace for draw calls
tu: Fix renderpass-level tracepoints not showing up in binning
tu: Add concurrent_binning_barrier tracepoint
tu: Add a reason for concurrent binning disablement to RP tracepoint
freedreno/fdl: Move LRZ FC size calculation to a separate function
tu/lrz: Try harder to have LRZ fast-clear enabled with FDM offset
tu: Fix CB barrier description
tu: Don’t CONCURRENT_BIN_DISABLE when there is no depth image
tu: Do not WAIT_FOR_BR if concurrent binning is disabled
tu/cs: Helpers to create a region that can be easily enabled/disabled
tu: Disable by default CB running alongside renderpasses
tu: Disable FLAG_WAIT_FOR_BR sync when CB is disabled
freedreno/layout: Use blocks for linear mipmap fallback where possible
tu: Handle mismatch in mip layouts for reinterpreted compressed images
freedreno: Update A7XX_RB_UNKNOWN_8E09 to be in line with blob
tu: Add custom resolve tracepoints
ir3: Generify helper_sched to support other flags
ir3: Schedule (eolm)/(eogm)
tu: Fix passing tmp arrays to tu_desc_set_swiz/fdl6_buffer_view_init
tu: Don’t use u_trace_address::bo, only raw iova
tu: Restore PC_TESS_BASE after BIN preemption save/restore
tu: Fix misleading lrz_disabled_at_draw values for RP
tu: Fix typo in min bounds calculation of FDM scissors
Dave Airlie (36):
lavapipe: drop lavapipe specific macro for generic one.
lavapipe: cleanup some whitespace in lvp_private.h
lavapipe: drop unused macro.
lavapipe: remove image pointer from lvp_image_view
lavapipe: drop device pointer from lvp_cmd_buffer
lavapipe: drop device pointer from pipeline object
lavapipe: drop device pointer from queue
lavapipe: drop device pointer from pipeline cache
lavapipe: drop unneeded physical device in sparse image format props
lavapipe: drop physical device pointer from lvp_device
lavapipe: drop instance pointer from lvp_device.
lavapipe: use vk_query_pool as the base for lvp_query_pool
intel/elk: drop a bunch of tables for unused elk gens.
c11/threads: fix build on c23
nir: add a cmat call instruction type.
nir: add a flag for functions that are used in cmat calls.
nir: add support for cooperative matrix reduction operations.
spirv: add support for cooperative matrix reduction operation
radv: add support for cooperative matrix reductions.
nir: add coopmat per element operations.
spirv: add initial support for cooperative matrix per-element ops
radv: add support for cooperative matrix per element operations.
dozen: return INCOMPATIBLE_DRIVER on instance create failure
nak/cmat: free the type mapping hash table.
device-select: add a layer setting to disable device selection logic
zink: use device select layer settings to disable device selection
lavapipe: drop apiVersion from instance
lavapipe: repack render attachment.
lavapipe: drop mem pointer and offset from buffer
lavapipe: drop data pointer from lvp_query
lavapipe: drop unused defines
radv/coopmat: fix deref stride
gallivm: swap 1d array coords before casting.
gallivm: let reduce ops use llvm intrinsics
lavapipe: add support for VK_KHR_cooperative_matrix.
gallivm: handle u16 correctly on const loads.
David Heidelberg (1):
ci: implement debian-cross-riscv64
David Rosca (72):
frontends/va: Move encode functions to separate file
frontends/va: Move decode functions to separate file
frontends/va: Move remaining processing functions to postproc.c
radeonsi/vpe: Stop clearing embedded buffer on allocation
radeonsi/vcn: Don’t use temporary feedback buffer when not needed
radeonsi/uvd_enc: Don’t use temporary feedback buffer when not needed
radeonsi/video: Change si_vid_resize_buffer to take si_resource
radeonsi/vcn: Stop using rvid_buffer
radeonsi/vce,uvd_enc: Stop using rvid_buffer
radeonsi/vpe: Stop using rvid_buffer
radeonsi/video: Remove rvid_buffer
frontends/va: Support H264 encode pic_order_cnt_type 1
radeonsi/vcn: Support H264 encode pic_order_cnt_type 1
frontends/va: Always reset H264 slice ref modification and marking count
radeonsi/vce: Don’t check ref modification and marking flags
radv/video: Introduce two levels of write_memory support
radv/video: Only use write_memory for encode feedback with full support
radv/ci: Enable video tests on navi21 and navi31
radeonsi/vcn: Fix creating context buffer on VCN5
radeonsi/vcn: Fix AV1 bidir compound encode with order_hint disabled
radv/video: Don’t require encode FW version >= interface version
radv/video: Fix AV1 bidir compound encode with order_hint disabled
ac/parse_ib: Fix parsing multiple engine commands in one VCN IB
ac/parse_ib: Parse VCN_IB_COMMON_OP_RESOLVEINPUTPARAMLAYOUT
vulkan/video: Add chroma subsampling to video session
vulkan/video: Avoid NULL pointers in session parameters
radv/video: Correctly handle no feedback query for encode
radv/video: Add NULL checks for picture parameters
radv/video: Support intra only without dpb
radeonsi/vcn: Remove before_encode() func
radeonsi/vcn: Drop vcn_enc_2_0 encode() override
radeonsi/vcn: Only allow to enable pre-encode on first frame
radeonsi/vcn: Update spec, slice, quality and deblock params each frame
vulkan/video: Fix coding AV1 seq_choose_screen_content_tools = 1
radv/video: Fix coding allow_screen_content_tools and force_integer_mv
radv/video: Fix coding used_by_curr_pic_lt_flag
radeonsi/vce: Add workaround for unaligned input surface
frontends/va: Add AV1 encode high_bitdepth flag
radeonsi/video: Add VPS/SPS/PPS and sequence header functions to radeon_bitstream
radeonsi/uvd_enc: Use radeon_bitstream functions to code headers
radeonsi/vce: Use radeon_bitstream functions to code headers
radeonsi/vcn: Use radeon_bitstream functions to code headers
radeonsi/video: Make helper radeon_bitstream functions static
radeonsi/vcn: Cleanup HEVC encode deblock params handling
radeonsi/uvd_enc: Cleanup HEVC encode deblock params handling
radeonsi/vcn: Cleanup AV1 screen content tools coding
radeonsi/vcn: Remove unnecessary vars for AV1 encode
radeonsi/vcn: Reduce allocated size for pre-encode recon pics
radeonsi/vcn: Fix maybe uninitialized warning
frontends/va: Use util_dynarray for decode slice data buffers
radv/video: Remove enc_session from video session state
radv/video: Use radv_enc_aligned_coded_extent for session params overrides
radv/video: Remove tile config and skip mode from video session state
radv/video: Init session and update rate control in ControlVideoCoding
radv/video: Drop casts from vk_find_struct*
radv/video: Fix AV1 quantization map maxQIndexDelta value
radv: Enable DCC modifiers for multi plane formats on GFX12
radv/video: Use different dpb swizzle mode for 10 bit encode
radv/amdgpu: Only wait on queue syncobj when needed
frontends/va: Use correct pipe profile for VAProfileH264ConstrainedBaseline
radeonsi/video: Don’t report support for H264 Baseline profile
frontends/va: Support VA_PICTURE_H264_NON_EXISTING
radeonsi/vcn: Use is_non_existing H264 ref flag
frontends/va: Remove MPEG4 decode support
radeonsi/video: Remove MPEG4 decode support
r600: Remove MPEG4 decode support
virgl: Remove MPEG4 decode support
nouveau: Remove MPEG4 decode support
pipe: Remove MPEG4 decode support
frontends/va: Also treat PRI/TRC_RESERVED0 as unspecified
frontends/va: Fix RGB/YUV conversion in Get/PutImage
radv/video: Fix maxActiveReferencePictures for H265 decode
Dmitry Baryshkov (10):
ci: drop google-freedreno remnants
ci: describe my small lab
freedreno/ci: add a200 nightly jobs
ethosu: drop file names from the generated file
rocket: drop file names from the generated file
freedreno/ci: mark egl_chromium_sync_control tests as passing
freedreno/ci: update fails / flakes list for a750-gl-cl job
freedreno/ci: correct rules for a618-gles-asan
gfxstream: don’t dump genvk.py args to generated files
freedreno/ci: use third A200 runner
Dmitry Osipenko (3):
virtio/vdrm: Fix varying offsets of struct vdrm_device members
virgl: Implement resource_create_with_modifiers
virgl: Support new resource-layout command
Dorinda Bassey (1):
util/rust: Add handle type detection to descriptor API
Dylan Baker (47):
Version: Bump to 26.0
docs: reset new_features.txt
docs: update calendar for 25.3.0-rc1
intel/mda/tests: use an ASSERT on fread()
intel/mda: Fix potential underflow in printing code
intel/compiler/brw: fix potential unsigned overflow
intel/compiler/brw: Add assert that we don’t have a negative value
intel/mda: Use GTEST fixtures to manage File handles
intel/mda: Use a vector to track the contents variable
anv: Fix potential overflow from doing 32bit math on 64bit types
anv: try to help coverity understand we’re not racing
anv: assert that we don’t overflow
anv: prevent potential, but unlikely, overflow
docs: Extend calendar entries for 25.3 by 1 releases.
docs: update calendar for 25.3.0-rc2
docs: update calendar for 25.3.0-rc3
docs: update calendar for 25.3.0-rc4
docs: add release notes for 25.3.0
docs: Add sha sums for 25.3.0
docs: update calendar for 25.3.0
docs/relnotes/25.3.0: Remove duplicate bug fixes
docs/relnotes/25.3.0: Escape some rst language constructs
bin/gen_release_notes: Remove cast that does nothing
bin/gen_release_notes: Remove duplicate bug entries
meson: make dep_lua a disabler
meson: make libarchive a disabler
docs/release-calendar: Shift 25.3 releases by one week
docs: add release notes for 25.3.1
docs: Add checksums for 25.3.1
docs: update calendar for 25.3.1
anv/video: void cast array we intentionally read off the end of
anv/video: Read the right source for memcpy
anv/video: Cast intentional read past end of struct member to void*
iris: remove uses of pipe_surface as a pointer
docs: add release notes for 25.3.2
docs: Add checksums for 25.3.2
docs: update calendar for 25.3.2
docs: add release notes for 25.3.3
docs: Add 25.3.3 checksums
docs: update calendar for 25.3.3
anv: Use { 0 } to initialize struct
anv: initialize anv_address to ensure that the protection field is set
docs/releasing: fix which commit is cherry-picked
docs/releasing: Use a pull request instead of push for relnotes
docs/releasing: Add a section to update the website
docs/releasing: Use the GitLab CI as the test procedure
bin/pick: When the main widget is replaced, trigger a redraw
Ella Stanforth (13):
pvr: Avoid putting tile buffer allocators on the heap
pvr: Add routine for filling out usc_mrt_setup from dynamic rendering state
pvr: add pipeline handling to use dynamic rendering info
pvr: make pvr_get_tile_buffer_size static
pvr: move tile_buffer_size logic to pvr_device.c
pvr: move pvr_load_op to pvr_mrt.h
pvr: move pvr_load_op_state to pvr_mrt.h
pvr: move load_op_shader_generate to pvr_mrt
pvr: use linked list to back deferred clears
pvr: Convert format table to indexing with pipe_format
pvr: Fix bugs in the format table
pvr: fix suspend and resume for dynamic rendering
pvr/csbgen: fix packing multiple addresses
Emma Anholt (120):
wsi: Fix the flagging of dma_buf_sync_file for the amdgpu workaround.
virgl: Fix VIRGL_DEBUG=tgsi to work on debugoptimized builds.
nir/link_opt_varyings: Make it participate in NIR_DEBUG=print.
tu: Make sure we clear dead writes to vars before nir_link_opt_varyings().
nir/shrink_stores: Don’t shrink stores to an invalid num_components.
nir/copy_prop_vars: Mask out no-op writes to variables.
docs/perfetto: Add row for panvk support.
docs/perfetto: Be helpful and opinionated about config selection.
docs/perfetto: Give a hint on how to cross compile the tools.
docs/perfetto: Explain using tracebox, and put commands in the list.
docs/perfetto: Be more clear about the role of MESA_GPU_TRACES=perfetto
docs/perfetto: Put V3D at the same level of heading as other drivers.
pps: Remove the cpu.cfg file.
v3dv: Fix assertion failure for not-found primary_fd during enumeration.
docs: Give more reproducible instructions for how to build the docs.
tu: Fix leak of MSRTSS temporaries.
tu: Fix leak of compute shader pipeline->base.executables_mem_ctx;
tu: Fix buffer overflow optimizing MSRTSS.
tu: Avoid buffer overflows during inline uniform block updates.
tu: Add a loop count to VK_pipeline_executable_properties.
ir3: Drop use of nir_lower_wrmasks().
ir3: Drop ir3_nir_lower_64b_intrinsics
ir3: Drop the vector splitting and simplify ir3_nir_lower_64b_global().
ir3: Fix incorrect use of predicated ifs on getlast.
ir3: Make the debug-print block numbers be the NIR block numbers.
ir3: Perform vectorization on ldg/stg just like other memory access.
ir3: Drop old comment about ldg vectorization limitation.
tu: Use a register pack for VPC_PS_CNTL.
tu: Template tu6_emit_window_scissor by CHIP.
tu: Template tu6_emit_rt_workaround() by CHIP.
tu: Use tu_cs_emit_regs() for SU_POLY_OFFSET setup.
tu: Template tu6_build_depth_plane_z_mode by CHIP.
tu: Template tu7_emit_tile_render_begin_regs by CHIP.
tu: Template r2d_coords by CHIP.
tu: Template tu_CmdBeginTransformFeedbackEXT() by CHIP.
tu: Template tu_CmdBindTransformFeedbackBuffersEXT by CHIP.
tu: Template tu_CmdBindIndexBuffer2KHR by CHIP.
tu: Use non-deprecated reg packing in tu6_setup_streamout()’s CRBs.
tu: Template fdm_apply_store_coords() by CHIP.
tu: Template update_vsc_pipe by CHIP.
tu: Template tu6_emit_msaa() by CHIP.
tu: Template tu7_emit_subpass_shading_rate by CHIP.
tu: Template tu6_emit_vpc_varying_modes() by CHIP.
tu: Template tu_pipeline_builder_parse_rasterization_order() by CHIP.
tu: Convert tu_init_cmdbuf_start_a725_quirk() to non-deprecated packing.
tu: Move VPC_SO_FLUSH_BASE to use reg packing.
tu: Move tu6_emit_gs() to use reg packing.
tu: Explicitly use 6XX scratch reg packing in perfcntrs_pass_cs_entries.
tu: Use non-deprecated names for scratch regs.
tu: Use appropriate chip variants for FOVEAT regs.
tu: Use appropriate chip variants for LRZ reg packing.
tu: Use appropriate chip variants for VRS reg packing.
tu: Use appropriate chip variants for SC_BIN_CNTL reg packing.
tu: Use appropriate chip variants for VPC/PC reg packing.
tu: Use appropriate chip variants for SP_CS reg packing.
tu: Use appropriate chip variants in SC scissor/viewport reg packing.
tu: Use appropriate chip variants in PS setup.
tu: Use appropriate chip variants for CONSERVATIVE_RAS_CNTL.
tu: Use appropriate chip variants for A2D reg packing.
tu: Use appropriate chip variants for RB regs.
tu: Only emit GRAS_SU_RENDER_CNTL and SP_RENDER_CNTL on >=a7xx.
tu: Use appropriate variants for GRAS_SU regs.
tu: Use a register pack for VPC_VARYING_LM_TRANSFER_CNTL_DISABLE[].
tu: Use non-deprecated packing for SP_DITHER_CNTL.
tu: use non-deprecated packing for GRAS_CL_ARRAY_SIZE.
tu: Use appropriate variants for other GRAS regs.
tu: Use appropriate variants for SP regs.
tu: Use proper reg packing in another place.
tu: Use appropriate variant for HLSQ regs.
tu: Pass around the new packing struct for GRAS_LRZ_CNTL.
tu: Use non-deprecated reg packing for RB_CLEAR_TARGET().
tu: Convert remaining tu_cs_emit_pkt4()s to avoid deprecated reg definitions.
freedreno/registers: Apply autopep8 to gen_header.py.
freedreno/registers: Simplify a bit of reg printing.
freedreno/registers: Restore reg definitions required by kernel.
tu: Drop emitting of deprecated packing.
nir/shader_bisect: Fix C code printing after review feedback changes.
nir/shader_bisect: Allow passing in a --lo / --hi to continue a run.
nir/loop_analyze: Use nir_unsigned_upper_bound for loop trip limits.
nir/uub: Use an optional max_samples from drivers for sample counts.
nir: Optimistically unroll loops using induction var as a sample id.
tu,freedreno: Drop the “.bo_write” flag.
tu: Add CRB builder.
tu: Move pipeline SO setup to the CRB builder.
tu: Move VFD CRBs to the CRB builder.
tu: Move tu6_emit_mrt() to use CRB.
tu: Move tu6_emit_window_offset() to use CRB.
tu: move tu6_emit_msaa() to use CRB.
tu: Move a bunch of program config to CRB.
tu: Split loading immediates for a program from the program config.
tu: Move tu_xs_config() to use the CRB builder.
nir: Drop the mode argument of nir_lower_vars_to_scratch().
nir: Introduce nir_lower_vars_to_scratch_global().
ir3: Move the compute shader threadsize forcing earlier.
ir3/ra: Make a helper to get RA register pressure limits.
ir3: Improve spilling of NIR vars to scratch.
freedreno/a3xx: Improve the name of CONSTFOOTPRINT and fix constlen==0 case.
freedreno/a3xx-a5xx: restore cbuf0 direct upload.
tu: Fix use-after-free in device destruction on old kernels
ir3: Fix leak in vars_to_scratch callback.
nir/opt_algebraic: Fix return type of fdot(vec(a, 0.0, …), b).
nir: Avoid UB of (int)0xff << 24 evaluating usadd_4x8_vc4.
nir/algebraic: Apply autopep8.
nir: Add a note on how load_sample_pos_from_id works.
ir3: Use the new NIR pass for load_barycentric_at_* optimization.
ir3: Rename the file for ir3_nir_lower_load_sample_pos().
nir: Fix constant evaluation of non-32-bit bitfield_extract.
nir: Let nir_eval_const_opcode() return a poison mask in case of UB.
nir/constant_expressions: Set the poison flag during i/ubitfield_extract.
nir: Specify f2i/f2u as undefined if the float is out of range of the int.
nir: Define extract/insert_i8 and friends to be UB if the shift is too large.
nir: Define udot_2x16_uadd_sat to have UB according to the SPIRV spec.
nir: Rename the unit_test_*_amd intrinsics to be un-vendored.
nir/opcodes: Avoid technical UB left shifting ints.
nir/opcodes: Cast isub/iadd3’s args to uint to avoid UB integer underflow.
nir/search_helpers: Avoid UB in is_2x_16_bits()/is_neg2x_16_bits().
nir/algebraic: Fix typo in error message print.
nir/opt_algebraic_tests: Mark patterns as unsupported or xfails.
lima/ci: Remove erroneous skips.
ci/tu: Clear stale xfails from the nightlies.
Eric Engestrom (114):
mr-label-maker: fix label for mesa release MRs
ci: uprev vkd3d
docs: update/fix vk spec urls
docs: update calendar for 25.2.6
docs: add release notes for 25.2.6
docs: add sha sum for 25.2.6
asahi/virtio: fix memleak
util/meson: don’t build libmesa_util_clflushopt unless needed
util/meson: don’t build libmesa_util_clflush unless needed
lavapipe/ci: document fixed tests
lavapipe/ci: mark more tests as flaky
ci: track src/c11/ changes
ci: track src/android_stub/ changes
ci: uprev vkd3d
docs: update calendar for 25.2.7
docs: add release notes for 25.2.7
docs: add sha sum for 25.2.7
docs: add 25.2.8 to the calendar
broadcom/ci: automatically reboot rpi3 when they fail to find the root device
broadcom/ci: fix rpi4 retries
docs/release-calendar: add 26.0 branchpoint and release candidates
perfetto: use the new upstream repo
meson: auto-disable `amd-use-llvm` when `llvm=disabled`
meson: auto-disable `draw-use-llvm` when `llvm=disabled`
ci: use $CI_TRON_JOB_PRIORITY tag on all ci-tron jobs
broadcom/ci: apply “Cannot open root device” reboot workaround to all rpi boards
broadcom/ci: update device count in ci-tron farm
docs: update calendar for 25.2.8
docs: add release notes for 25.2.8
docs: add sha sum for 25.2.8
rust: configure clippy to only report issues relevant to our MSRV
ci: read the MSRV from clippy.toml to avoid having too many copies to keep in sync
meson: add rust_global_args for flags for all the rust compilations
meson/rust: allow `else { if {} }`
meson/rust: allow “needless lifetimes”
meson/rust: allow explicit `if x.is_none { return None }` instead of `x?`
rusticl/meson: deny all clippy lints before allowing global ones
rusticl: rewrite blocks using if/else for clarity
etnaviv: allow ISA struct to be spelled all uppercase
nil: drop duplicate lib in “liblibnil.a”
nak: set nir_shader_compiler_options in one step
nak: use filter() instead of open-coding it
nak: use `matches!()` instead of open-coding it
nak: avoid errors when generated code is empty
nak: silence clippy warning about `x * 0`
nak: remove unnecessary use of `format!()`
nak: drop empty string from `eprintln!()`
nak: remove conversion into the same type
nak: remove “reference which is immediately dereferenced by the compiler”
nak: rewrite `repeat().take()` into `repeat_n()`
nak: drop unnecessary reference on both sides of `==`
nak: use `assert_eq!(a, b)` instead of `assert!(a == b)`
nak: use `foo &= bar` instead of `foo = foo & bar`
nak: add all identical values in one step
nak: remove unused lifetime
nak: drop redundant closure
nak: drop unnecessary mutable reference
nak: replace `!foo.is_{none,some}()` with their positive counterpart
compiler/rust: replace `!first.is_none()` with `first.is_some()`
compiler/rust: rewrite `match` into a simpler `if let`
compiler/rust: remove unnecessary lifetimes
compiler/rust: allow CFG & BitSetStreamTrait to have a `len()` without also having an `is_empty()`
compiler/rust: drop “borrow of a value the compiler would automatically borrow”
rusticl: silence incorrect clippy error about re-implementing memcpy
nak: drop “reference which is immediately dereferenced by the compiler”
nak: use saturating_sub() instead of open-coding it
nak: drop clone of Copy-able types (RegOrigin & SSAValue)
nak: drop cast of u8 to u8
nak: allow LdCacheOp values to be named `Cache*`
nak: drop “reference which is immediately dereferenced by the compiler”
nak: drop “deref on an immutable reference”
nak: replace .get(0) with .first()
nak: merge identical if branches for blackwell, ampere and ada
nak: replace `.find(x).is_some()` with `.contains(x)`
nak: drop “unneeded `return` statement”
nak: use std::mem::size_of_val(data) instead of open-coding it
util/rust: cleanup derelict allow(dead_code) annotations
rusticl: drop collapsible_else_if annotation now that it’s allowed globally
rusticl: cleanup derelict allow(non_upper_case_globals) annotation
nak: cleanup derelict allow(dead_code) annotations
nil: cleanup derelict allow(dead_code) annotations
ci: fix path to clippy.toml
panvk: fix accidental assignment in assert
radv/ci: document recent flakes
broadcom/ci: document recent flakes
turnip/ci: document recent flakes
lavapipe/ci: document recent flakes
etnaviv/ci: document fixed tests
zink+nvk/ci: document fixed tests
virtgpu_kumquat: cleanup derelict allow(dead_code) & allow(unused) annotations
virtgpu_kumquat_ffi: mark the remaining allow annotations (all non_camel_case_types) as expected
virtgpu_kumquat_ffi: use `mutex.get_mut()` instead of `mutex.lock()` to get compile-time guarantee that the mutex isn’t already locked
virtgpu_kumquat_ffi: use auto-deref instead of doing it by hand
virtgpu_kumquat_ffi: mark single-item match as expected
mr-label-maker: tag src/virtio/virtgpu_kumquat* as part of gfxstream
rusticl: fix ‘enable-drivers’ meson option
Revert “renderdoc: Add Haiku platform support”
vk/runtime,zink: only integrate renderdoc on supported platforms
docs: update url to ci-tron docs
docs: delay 26.0 branchpoint by a week
etnaviv: run rustfmt
ci: run rustfmt on all rust files
nir/meson: only try to generate the nir_opt_algebraic tests when requested
nir/meson: drop redundant --build-tests in favour of just checking if --out-tests is set
VERSION: bump for 26.0.0-rc1
.pick_status.json: Update to bed1576b141a5d4398c71abeec5af3674b390aa0
pick-ui: update for python 3.14 support
nir/meson: fix cpp_args of nir_opt_algebraic_pattern_tests
VERSION: bump for 26.0.0-rc2
.pick_status.json: Update to 248b8184078c6df2c00c987e499348532b52a6e5
.pick_status.json: Mark a66d19b691bb8531dd075861b6b103eb488d9237 as denominated
Revert “meson: static link spirv-tools for darwin”
VERSION: bump for 26.0.0-rc3
.pick_status.json: Update to d7814bcad0426c26e88874a3ef2d99d7220bcf48
Eric R. Smith (23):
panfrost/panvk: Add size calculations to compiler register code
panvk: sanity check block size for unorm format
panfrost: add explicit get_dmabuf_modifier_planes override
panfrost: update AFBC code to handle tiling for 64bpp formats
pan: Add 16 bit AFBC support (v10+ only)
mesa: Add R16G16_R16B16_UNORM and related formats
dri: check modifier in dri_create_image_from_winsys
panfrost: add 422 AFBC formats
nir: add intrinsics for pixel local storage
pan: fix a bifrost disassembly assert failure
panvk: fix ycbcr format issues on bifrost
panvk: enable ycbcr on bifrost
panfrost: do not allow skipping of fragment shader when alpha-to-coverage
panfrost: add benchmarking documentation
pan: add variant to shader name for G310 variants
pan drm-shim: add a way to specify the GPU variant in PAN_GPU_ID
pan: add actual register usage to the shaderdb stats
pan: pass a pointer to bi_compile_variant_nir, rather than a struct
pan: prettier output when statsfull flag is set
pan: pass pan_shader_info data to pan_stats_verbose
pan: refactor shader info setting
pan: move pan_shader_update_info call for bifrost
mesa: do not unbind general point when different indexed points are deleted
Erico Nunes (2):
ci: lima farm maintenance
Revert “ci: lima farm maintenance”
Erik Faye-Lund (115):
zink/ci: document a flake
zink/ci: document a nightly failure
radeonsi/ci: document flake
pvr: remove unused macros
pvr: remove needless include
pvr: move queue function to pvr_queue.c
pvr: break out pvr_instance and pvr_physical_device
pvr: factor out pvr_sampler
pvr: rename rogue_get_slc_cache_line_size
pvr: move non-rogue helpers to pvr_hw_utils.h
pvr: move static_asserts to source-files
pvr: rework pds_state array length logic
panfrost: initialize sig before use
panfrost: remove needless variable
pan/kmod: fix priority query logic
panvk: do not open-code debug_get_num_option
panvk: assert that shader_present isn’t zero
panfrost: remove stale code
pvr: split idep_pco_uscgen_programs_h in two
pvr: encapsulate border-table
pvr: encapsulate clear-state
pvr: factor out write_immutable_samplers
pvr: store has_pbe_stride_align_1pixel in pvr_device_features
pvr: respect has_pbe_stride_align_1pixel
pvr: limit availability of HW defs
mesa/main: correct formatquery error-handling
mesa/st: do not enable EXT_texture_buffer_object with rgba only
mesa/main: correct error message
mesa/main: do not check for ARB_texture_buffer_object for GL 3.1
v3d: only expose rgba buffer-textures
panfrost: only expose rgba buffer-textures
zink: only expose rgba buffer-textures
mesa: introduce and use _mesa_has_texture_buffer_range
panfrost/ci: remove some out-of-date xfails
mesa/st: do not drop binding prematurely
docs/panfrost: remove some stray newlines
pan: make S8_UINT code behave like the rest
pan: add support for float-formats
pvr: move include to source-file
pvr: do not store VkFormat in pvr_format
pvr: remove unused member
pvr: move border-specific format-code into pvr_border.c
pvr: split out pbe-details from main format-table
pvr: rework format binding flags
pvr: use strongly-typed enum instead of uint32_t
pvr: do not store compressed pbe-formats
pvr: add helpers to query limits based on device-info
pvr: fixup some includes
pvr: make queries arch-agnostic
pvr: replace constant-returning function with a macro
pvr: break out pvr_free_list into a separate module
pvr: rename colliding symbol
pvr: disable has_gs_rta_support for ge7800 as well
pvr: run clang-format
panfrost: do not over-estimate memory needed for dummy-rt
panfrost: factor out meat of pan_bytes_per_pixel_tib to helper
panfrost: do not over-estimate format tib-size
mesa/st: always override internal-format for 10-bit formats
pvr: add missing include
pvr: add missing forward-decl
pvr: store format-table in pvr_physical_device
pvr: limit availability of HW defs
pvr: factor out cmdbuf functions from pvr_query.c
pvr: factor out pvr_rt_dataset to separate module
pvr: factor out framebuffer-specific code
pvr: split pvr_device.c
pvr: split pvr_csb.c
pvr: split pvr_descriptor_set.c
pvr: mark pvr_border.c as multi-arch
pvr: mark pvr_pass.c as multi-arch
pvr: mark pvr_tex_state.c as multi-arch
pvr: mark pvr_job_compute.c as per-arch
pvr: mark pvr_cmd_buffer.c as per-arch
pvr: mark pvr_cmd_query.c as per-arch
pvr: mark pvr_job_render.c as per-arch
pvr: mark pvr_job_transfer.c as per-arch
pvr: mark pvr_job_context.c as per-arch
pvr: split pvr_image.c
pvr: mark pvr_hw_pass.c as per-arch
pvr: mark pvr_job_common.c as per-arch
pvr: mark pvr_sampler.c as per-arch
pvr: mark pvr_query_compute.c as per-arch
pvr: mark pvr_mrt.c as multi-arch
pvr: mark pvr_framebuffer.c as per-arch
pvr: prepare winsys files for multi-arch
pvr: only build pvr_dump_csb.c for rogue
pvr: make blit/clear-code rogue-specific
pvr: use rogue-prefix for rogue-specific code
pvr: pass device-info to a few winsys functions
pvr: build pvr_arch_*.c as a multi-arch sources
pvr: make some winsys files multi-arch
pvr: limit hw-defs to rogue
panfrost: enable texel-buffers for three-component formats
pvr: add missing include
pvr: add missing forward-declaration
pvr: clean up include
pvr: use per-arch macro for pvr_device_init_spm_load_state
pvr: use pvr_arch aliases
pvr: do not use PVR_PER_ARCH for static function
pvr: do not use alias in definition
pvr: rename PVR_PER_ARCH aliases to pvr_arch_ for clarity
pvr: use pvr_arch defines in implementations
docs/pvr: document the multi-arch approach
pvr: encapsulate format-table
docs: upgrade bootstrap to 5.3.8
docs: use option-directive
panfrost/ci: remove fixed CTS-flakes
panfrost/ci: remove fixed failures
panfrost/ci: add warning about g720 results
docs: remove ancient stuff from faq
docs/faq: do not recommend basing drivers on i965
lima: update unknown field
panvk: move cmd_resolve_attachments to panvk_vX_cmd_meta.c
panvk/meta: make helpers static
panvk: promote VK_EXT_robustness2 to VK_KHR_robustness2
Faith Ekstrand (201):
panvk: Fix integer dot product properties
util: Don’t advertise cache ops on x86 without SSE2
util: Build util/cache_ops_x86.c with -msse2
nvk: Include the chipset in the pipeline/binary cache UUID
nvk: Disable sampleLocationsSampleCounts for 1x MSAA
nvk: Emit inactive vertex attributes
nvk: Look at the right pointer in GetDescriptorInfo for SSBOs
nvk: Capture/replay buffer addresses for EDB capture/replay
panvk/shader: Implement [de]serialization of ASM and NIR strings
panvk/shader: [de]serialize desc_info.max_varying_loads
panvk/shader: Use the right copy size for deserializing dynamic UBOs/SSBOs
panvk: Add an in-memory shader cache
panvk: Use the build SHA for the pipeline/binary cache UUIDs
panvk: Enable the disk cache
zink: Disable building the zink_check_requirements tool for now
Update the Vulkan-profiles wrap to 1.4.330 and re-enable zink_check_requirements
vulkan/util: Add a vk_format_srgb_to_linear() helper
vulkan/meta: Handle VK_RENDERING_ATTACHMENT_RESOLVE_SKIP_TRANSFER_FUNCTION_BIT
vulkan/meta: Handle VkResolveImageModeInfoKHR
nvk: Plumb attachment flags through to MSAA resolve
nvk: Switch to CmdEndRendering2KHR()
nvk: Advertise the new maintenance10 format features
nvk: Advertise VK_KHR_maintenance10
nvk: Don’t re-initialize the descriptor writer if the set matches
nvk: Document some environment variables
nvk: Add an NVK_DEBUG=coherent flag
nvk: Enable ASTC on Tegra
nvk: VK_EXT_shader_uniform_buffer_unsized_array
pan: roll lower_texture() into postprocess()
pan/bi: Call constant folding in postprocess()
nir: Handle lowered I/O in lower_viewport_transform()
nir: Check the deref mode in lower_point_size()
pan/bi: Move lower_noperspective*() to postprocess()
pan: Move point size and viewport lowering to postprocess
vulkan/runtime: Add a get_push_range_for_stage() helper
vulkan/runtime: Add a vk_compile_shaders() helper
vulkan/runtime: Add an environment variable to validate shader binaries
nvk: Advertise VK_KHR_pipeline_binary
panvk: Initialize the disk cache earlier
panvk: Advertise VK_KHR_pipeline_binary
nir: Simplify assign_io_var_locations()
drm-uapi: Import the new NVIDIA modifiers
nil: Add support for Blackwell 8 and 16-bit modifiers
nir: Add a couple panfrost sysvals to divergence analysis
pan/compiler: Expose the bifrost optimization loop
panvk: Split var copies and lower local vars early
panvk: Lower copy_deref and indirect derefs before nir_lower_io
panvk: Only lower outputs to temporaries
panvk: Optimize in the preprocess hook
nir: Add a type parameter to nir_lower_point_size()
pan: Use nir_lower_point_size for the float16 conversion
panvk: Make noperspective_varyings const
panvk/dispatch: s/shader/cs/g
panvk: Add a panvk_common_sysvals struct
spirv: Assume variable workgroup size unless it’s set
pan/bi: Add some helpers and an info field for needing the extended FIFO
pan/bi: Add support for writing gl_PrimitiveID from IDVS
pan/genxml: Rename Primitive Index Override
panvk: Set primitive_index_override when prim ID is written by IDVS
spirv: Only set workgroup_size_variable on compute-like stages
vulkan/drm-syncobj: Stop returning early waiting for sync files
poly,asahi: Rename poly_tess_args to poly_tess_params
poly,asahi: Rename poly_ia_state to poly_vertex_params
asahi: Upload vertex and geom/tess params together
hk: Expose the vertex param buffer to other stages
poly,asahi: Move vertex_output_buffer to poly_vertex_param
poly,asahi: Fetch directly from poly_vertex_state::output_buffer in GS
SQUASH: poly,asahi: Move the output mask to poly_vertex_state
asahi: Reorder state uploads in agx_draw_patches()
poly: Rename poly_nir_lower_gs.h to poly_nir.h
poly: Add a poly_nir_lower_sysvals() pass
nir: Improve comments for a couple poly intrinsics
poly: Fetch the index size from a sysval
poly,asahi: Put the indirect draw directly in the geometry params
poly: Add helpers for filling out poly_geometry_params
poly: Add helpers for filling out poly_vertex_params
hk: Use the new poly param helpers
agx: Use the new poly param helpers
poly: Move vs_grid to poly_vertex_params
poly/asahi: Pull a bunch of vertex_id_for helpers into poly/prim.h
poly,asahi: Pull restart unrolling into libpoly
poly: Generalize unroll_restart() to arbitrary workgroup/subgroup sizes
poly: Make all heap allocations atomic
nvk: Add a dedicated_image to nvk_device_memory
nir: Add LAYER_ID and VIEW_INDEX to nir_lower_sysvals_to_varyings()
spirv: Emit SYSTEM_VALUE_LAYER_ID for fragment shaders
nir: Support sysval intrinsics in lower_sysvals_to_varyings()
microsoft: Run lower_sysvals_to_varyings after lower_input_attachments
tu: Set use_layer_id_sysval for nir_lower_input_attachments
nir: Always use sysvals in lower_input_attachments()
pan: Move compiler to compiler/bifrost
pan: Move midgard to compiler/midgard
pan: Move util/* to compiler/
pan/bi: Add separate meson files for bifrost tests
pan: Add a central libpanfrost_compiler library
pan: Move pan_arch() to pan_model.h
pan: Move disassembly wrappers to a new pan_compiler.h
pan: Move pan_shader NIR helpers to pan_compiler.h
pan/compiler: Move all NIR pass definitions to pan_nir.h
pan/compiler: Move pan_ir.h into pan_compiler.h
pan: Move pan_shader_compile() to pan_compiler.h
pan/genxml: Decode blend shaders on CSF
pan/blend: Use flat inputs for blend shaders
panvk/jm: Delete panvk_varying_hw_format()
pan/bi: Fix LD_VAR_BUF indirect offset calculations
pan: Move PRINTF_BUFFER_SIZE to the compiler
pan: Drop bifrost_shader_blend_info::format
pan: Move pan_compile_shader to pan_compiler.c
pan/bi: Handle small vectors in bi_src_index()
pan/bi: Only delete function temp variables
pan/bi: Move opt_sink and opt_move calls to postprocess
panvk: Run pan_preprocess_nir() in the preprocess step
pan/bi: Run nir_lower_all_phis_to_scalar() late
panvk: Upload all variants at the end of compile_shader()
panvk: Add separate COMPUTE and FRAGMENT cases in compile_shader()
panvk: Use nir_instr_clone() for input attachment loads
panvk: Stop using descriptor helpers in lower_input_attachments
panvk: Break input attachment lowering into its own file
panvk: Call lower_input_attachment_loads() from compile_shader()
panvk: Re-prefix panvk_shader_desc_info/map with lower_
panvk: Move I/O lowering out of panvk_lower_nir()
panvk: Store the varying attribute descriptor count in desc_info
panvk: Only pass the panvk_shader_desc_info to panvk_lower_nir()
panvk: Make compile_inputs const in panvk_compile_nir()
panvk: Restructure VS variant handling
panvk: Pull multiview lowering out of panvk_lower_nir()
panvk: Drop compile_inputs from panvk_lower_nir()
drm-uapi: Sync the panthor header
drm-uapi: Sync the panfrost header
util: Move STACK_ARRAY into util
pan/kmod: Add a panfrost_kmod_driver_version_at_least() helper
pan/kmod: Expose the BO flags supported by a pan_kmod_device
pan/kmod: Add new helpers to sync BO CPU mappings
panvk: Mask off BO_FLAG_WB_MMAP in adjust_bo_flags()
panvk: Implement Flush/InvalidateMappedMemoryRanges()
panvk: Sync CPU maps around host image copies
panvk: Store the memory heaps/types in the physical device
panvk: Base memoryTypeBits on phys_dev->type_count
panvk: Advertise a HOST_CACHED memory type if we have WC maps
panvk: Add various flush/invalidate helpers for internal BOs
panvk: Map our standalone private BOs writeback when it makes sense
panvk: Add a write_desc_data() helper
panvk: Use write-back maps for descriptor sets
panvk: Use WB maps for command buffer memory
nvk: Check before claiming UNIFORM_TEXEL_BUFFER_BIT
nil: Claim buffer support for R64_[US]INT
nak: Use nir_lower_io_lower_64bit_to_32
nvk: Add support for 64-bit vertex attributes
pan/genxml: Get rid of non-existent Tiler Heap fields
pan/genxml: Add float internal and writeback formats
pan: Add a helper for packing blend constants
pan/blend: Add support for float blending
panfrost: Only set blend constants if needed
panfrost: Plumb through float blending equations
panvk: Set pan_blend_equation.is_float
panvk: Check can_fixed_function() before checking constants
pan: Add support for blending with F16 and F11/10 formats
util: Add a helper to convert color blend factors to alpha
pan/blend,panvk: Optimize blend equations
pan/bi: Dump shader to stderr
pan/bi: Use nir_print_shader() instead of nir_log_shader()
panvk: Lock around the compile_shaders() when debug dumping
nir: Add some new panfrost fragment shader intrinsics
pan/nir: Add a NIR pass to lower FS outputs to the new intrinsics
pan: Implement the new NIR FS intrinsics
pan/bi: Use bi_emit_collect_to() for load_const
pan/bi: Use MUX for setting LD_TILE sample indices
pan: Use nir_intrinsic_blend_pan for blend shaders
pan: Switch to nir_intrinsic_load_blend_input_pan
pan/compiler: Drop pan_compile_inputs::bifrost::rt_conv
pan/bi: Lower FS outputs to blend in NIR
pan: Move pan_nir_lower_writeout to midgard/
pan/genxml: Fix some sizeof() asserts
pan/genxml: Enable CSF tracing of RUN_FULLSCREEN
panvk: Use a full-screen barrier draw for FB barriers
pan/bi: Add a bi_instr::blend_target
panvk/csf: Stop calling blend_emit_descs() with no FS
pan/genxml: The BLEND array must be 64B aligned
panvk/csf: Set the correct DCD_FLAGS_1.render_target_mask
panvk/blend: Stop setting color_mask = 0
pan/bi: Mark whole flat variables
panvk/jm: Drop the loads_blend_const hack for uniform_count
panvk: Push our own blend descriptors
nir: panfrost tile loads are always divergent
pan/bi: Implement pack_32_4x8 natively
nir,pan: Rework the panfrost tile load intrinsic
nir,pan: Add and implement a new store_tile_pan intrinsic
nir/lower_blend: Move the format to nir_lower_blend_rt
nir/lower_blend: Optimize trivial logic op cases
nir: Expose the guts of nir_lower_blend as builder helpers
pan/blend: Use the blend builder helpers instead of nir_lower_blend()
panfrost: Lower pixel-local storage to load/store_tile in NIR
ci: Mark fbo-blending-format-quirks as a fail on G52
panfrost: SPDX everything
pan/genxml: Add license blocks to the XML files
panfrost: Add a few missing license blocks
panvk: Map ro_sink_address_poly to an OOB address
nvk: Enable ZPASS_PIXEL_COUNT in draw_state_init()
nir/lower_bool_to_bit_size: Use the correct num_components for conversions
pan/bi: Run lower_alu_width after opt_algebraic_late
pan/bi: Don’t attempt to fuse AND(ICMP, ICMP) if the AND is swizzled
Felix DeGrood (12):
intel/tools: make frame and cb index base-0 in intel_measure
intel/tools: add eop timestamp to intel_measure
intel/tools: make eop default
intel/tools: add cmdbuf/queue annotation parsing
intel/ds: reduce min sampling period of pps-producer to 5us
anv/pps: remove assert for double init
anv/rt: rewrite encode.comp for better performance
anv/rt: fully restore code to write instance_count
anv/rt: multithread writing of invalid leaves
anv/rt: reduce writes to block_incr_and_start_prim
anv/perfetto: include all pc reasons
anv/rt: avoid out of bound access by clamping global id
Frank Binns (5):
pvr: sort extensions alphabetically
pvr: Advertise VK_KHR_relaxed_block_layout
pvr: Advertise VK_KHR_storage_buffer_storage_class
nvk: remove duplicate header include
pvr: check image usage features against image features
Franz Hoeltermann (1):
device-select: Avoid usage of legacy GetPhysicalDeviceProperties. This caused validation errors and redundantly called both the new “2” variant and the legacy variant.
Georg Lehmann (172):
nir: remove manual nir_load_global
treewide: use nir_load_global alias of nir_build_load_global
nir: remove manual nir_store_global
treewide: use nir_store_global alias of nir_build_store_global
nir: remove manual nir_load_global_constant
treewide: use nir_load_global_constant alias of nir_build_load_global_constant
aco/optimizer: re-index labels
aco/optimizer: add separate fp16 abs/neg/fcanonicalize labels
aco/optimizer: rework canonicalized label
aco/optimizer: replace 64bit mul with 1.0/-1.0 with bitwise instruction if possible
aco/isel: emit v_mul_f64 with modifiers for fneg/fabs
aco/isel: emit v_mul_f64 for fp64 fsat
aco/optimizer: fix applying 64bit neg/abs
aco/optimizer: apply fp64 modifiers
aco/tests: add some simple fp64 modifier tests
aco/lower_to_hw: emit vop2 for gfx12+ fp64 reductions
aco/isel: emit vop2 v_fadd_f64 for gfx12+
aco/isel: emit vop2 v_mul_f64 for gfx12+
aco/isel: emit vop2 v_min_f64 for gfx12+
aco/isel: emit vop2 v_max_f64 for gfx12+
aco/isel: emit vop2 v_lshlrev_b64 for gfx12+
aco/opcodes: remove VOP3 alias for new gfx12 VOP2 opcodes
aco: fix v_mad_mix denorm behavior
aco: allow v_fma_mix with denorms for gfx9 chips where it’s fused
aco/optimizer: never unfuse fma
radv: do not report wave32 in gl_SubgroupSize for Doom Dark Ages
aco/gfx10_3: work around NSA hazard
nir/opt_algebraic: optimize open coded pack_32_2x16
aco/insert_NOPs: remove redundant VALUMaskWriteHazard waits
aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits
aco,nir: support subdword v_permlane_b16
aco/optimizer: refactor insert
aco/optimizer: add extract_float helper
aco/optimizer: make label_mad more generic
aco/optimizer: add new helper functions for combining two instructions
aco/optimizer: use new helpers to create fma
aco/optimizer: create fma with s_mul_f32/f16
aco/optimizer: add less aggressive pattern matching option
aco/optimizer: use new helpers for min3/max3/minmax/maxmin
aco/optimizer: use new helper functions to create med3
aco/optimizer: create max3/min3/med3 with salu min/max
aco/optimizer: use new helpers to optimize mul(b2f(a), b)
aco/optimizer: use new helpers for add16 opts
aco/optimizer: use new helpers for packed fma
aco/tests: test packed fma opts
aco/optimizer: reduce max alu_opt_info stack operands to 4
aco/optimizer: parse pseudo alu instructions
aco/optimizer: use new helpers for v_or opts
aco/optimizer: use new helpers for xor opts
aco/optimizer: use new helpers for v_add_u32 opts
aco/optimizer: optimize add(mad_u32_u16(a, b, 0), c)
aco/optimizer: use new helpers for s_lshl<n>_add_u32
aco/optimizer: use new helpers for v_add_lshl_u32
aco/optimizer: add more v_add_lshl_u32 opts
aco/optimizer: use new helpers for v_and opt
aco/optimizer: use new helpers for remaining add opts
aco/optimizer: use new helpers for v_sub opts
aco/optimizer: use new helpers for bitwise n2 opts
aco/optimizer: add some bitop combining
aco/optimizer: use cndmask for neg(b2i)
aco/optimizer: some more mul opts
aco/optimizer: create ff0/bcnt0
aco/optimizer: extend existing patterns to handle b2f/b2i(not(a))
aco/optimizer: optimize cndmask(a, b, not(c)) to cndmask(b, a, c)
nir/opt_algebraic: create more bit test
aco/opt_postRA: allow v_cmpx to clobber exec before nop split/create vector
aco/optimizer: move med3 -> add_clamp opt later
aco/optimizer: add new helpers for applying output modifiers
aco/optimizer: handle gfx11+ vinterp as fma special case
aco/optimizer: use new helpers to apply neg/abs to output of instructions
aco/optimizer: back propagate modifiers through rcp
aco/optimizer: use new helpers to apply packed fsat
aco/optimizer: use new helpers to apply insert
aco/optimizer: use new helpers to create v_fma_mixlo_f16
aco/optimizer: use new helpers for omod/clamp
aco/optimizer: apply omod to pseudo scalar trans instructions
nir: don’t sink alu that uses ballot(true)
nir/peephole_select: allow ballot
nir/peephole_select: allow mbcnt_amd
aco/optimizer: fix uses in to_uniform_bool_instr
aco/optimizer: validate uses
aco/optimizer: propagate salu fneg
aco/optimizer: propagate salu fabs
nir/opt_uniform_subgroup: don’t try to optimize non trivial clustered reduce
nir/opt_uniform_subgroup: fix swizzle_amd without fetch_inactive
nir/divergence_analysis: fix swizzle_amd without fetch inactive
nir/opt_uniform_subgroup: use nir_shader_intrinsics_pass
nir/opt_uniform_subgroup: wire up mbcnt_amd path
nir/opt_uniform_subgroup: handle more trivial shuffles/votes
aco/optimizer: keep pass_flags valid for all instructions
aco/isel: emit exec copy for ballot(true)
aco/optimizer: fix skip_smem_offset_align with non temp register operands
aco/optimizer: propagate fixed registers
aco/optimizer: propagate fixed regs to copy/extract/insert
aco/isel: emit register copies for workgroup ids
radv: optimize known front_face_fsign too
aco/gfx6: move mrtz writemask workaround to assembler and handle all mrt
ac/llvm/gfx6: move mrtz writemask workaround to ac_build_export
ac/nir/lower_ps_late: remove gfx6 mrtz writemask workaround
radv/nir: fix radv_nir_remap_color_attachment progress
radv: consider dual src blend for when epilog needs alpha
radv: gather color0_written with scalar io correctly
radv: eliminate unused FS output channels
radv/nir: fix front_face_fsign opt
radv: use nir_opt_uniform_subgroup
radeonsi: use nir_opt_uniform_subgroup
aco/isel: remove uniform reduce/scan optimization
aco/optimizer: reassociate mul(mul(a, const), b) into mul_omod(a, b)
aco/optimizer: reassociate rcp(mul(a, const)) into rcp_omod(a)
zink/ci: update radv trace checksums
nir/divergence: add nir_def_is_divergent_at_use_block helper
nir/opt_uniform_subgroup: optimize min/max/and/or reduce of bcsel(div, con, con)
nir/opt_uniform_subgroup: optimize add/xor reduce of bcsel(div, con, con)
gallivm: use nir_alu_instr_is_sz/nan_preserve
nir: use a separate enum for per alu floating point math control
nir/opt_varyings: use per instruction inf/nan flag for moving past interp
nir/opt_varyings: use per instruction nan flag for promoting to flat
vtn: implement default fp_math_ctrl without using execution mode
gallivm: stop using per shader float fast math flags
nir: remove per shader float fast math flags
ac/nir/cull: do not reuse variables if subgroup ops are used
aco: allow opsel for last v_alignbyte/bit operand
nir/opt_uniform_subgroup: optimize uniform ddx/ddy
nir/opt_algebraic: explicitly add some -0.0 variants of patterns
nir/opt_algebraic: canonicalize scmp with -0.0
nir/search: respect sign of zero when comparing floats
nir/opt_algebraic: replace is_negative_zero with constant -0.0
ci: disable vmware farm
util: add IEEE 754-2019 min/max number
nir/opcodes: fix fsat signed zero correctness
nir/opcodes: use util_max_num/util_min_num for fmin/fmax constant folding.
nir: prevent undefined behavior in idiv/imod/irem constant folding
nir/opt_varyings: actually clone alu math control to different shader
nir: document signed zero, inf, nan preserve flags
nir: add nir_alu_instr_is_exact helper
spirv: don’t set float control for integer dot
nir: move exact bit to nir_fp_math_control
amd/drm-shim: add vega20
nir/opt_algebraic: move fsat last for fsqrt(fsat(a))
ac/nir/lower_sin_cos: use nir_shader_alu_pass
ac/nir/lower_sin_cos: preserve fp_math_ctrl
ac/nir/opt_pack_half: preserve fp_math_ctrl
ac/nir/lower_ps_late: preserve signed zero, inf, nan for exports
aco/insert_NOPs: explicitly wait for sa_sdst to resolve SALU -> VALU hazards
aco/tests: test VALUReadSGPRHazard with v_cmpx
aco/tests: test VALUMaskWriteHazard with v_cmpx
aco/tests: don’t destroy vk_device if it was never created
nir: make fquantize2f16 32bit only
nir/constant_expression: remove fquantize2f16 denorm special case
hasvk: create a new intrinsic for push constant to uniform load lowering
brw: make sure nir_opt_algebraic_late was called after late brw_nir_optimize
nir/constant_expressions: don’t avoid unused source variable warnings
nir/constant_expressions: flush input denorms if denorms have to be flushed
nir: document that both input and output denorms have to be flushed
nir/opt_algebraic: use fcanonicalize
nir/search: allow inexact patterns if denorms have to be flushed
ci: update trace checksums
radeonsi: only override float_mode for llvm
aco: add fma_mix opcodes with rtz fp16 rounding
aco/insert_fp_mode: exclude some instructions that will never round
aco/insert_fp_mode: insert fp mode in reverse
aco/optimizer: support fma_mix with rtz
aco/optimizer: apply v_cvt_pkrtz_f16_f32 as fma_mix to operands
ac/nir,radv: remove ac_nir_opt_pack_half
aco/optimizer: fix parsing salu p_insert as shift
aco: fix demote in header of single iteration loop
aco: add a helper function for non supported DPP opcodes
aco: disable DPP for rev integer subs and shifts
nir/opt_algebraic: use correct syntax to create exact fsat
aco/lower_branches: consider jump target of conditional branches based on vcc
aco: handle all SALU that modifies PC in needs_exec_mask
aco/opt_postRA: don’t optimize across calls
Gert Wollny (23):
r600/sfn: rework 64 bit to vec2 32 bit lowering
r600/sfn: drop unused code
r600/sfn: correct register interference range
r600/sfn: drop range pinning for registers after RA
r600/sfn: extract function to update group after instr insert
r600/sfn: move some common code into try_readport
r600/sfn: Track whether an ALU group has an exec flag update
r600/sfn: make sure kill and update_exec don’t happen in one group
r600/sfn: AR loads are not dependent on the future and other code blocks
etnaviv: isa: Add “thread” info to TEX instruction
r600/sfn: Don’t start a new ALU-CF if LDS pipeline loads are pending
r600: Handle dummy dest in assembler and disass
r600/sfn: remove some unused static variables
r600/sfn: Silence warning about unused parameter
r600/sfn: Don’t assign dest registers in non-write interpolation slots
r600/sfn: fix querying number of sources for LDS ops in readport validation
r600/sfn: don’t use dummy register with non-write 64 bit slots
r600/sfn: change register ID of dummy dest register
r600/sfn: Add slot access operator to AluGroup
r600/sfn: Make value factory a member of the block scheduler
r600/sfn: Add method to force-override the dest of an AluInstr
r600/sfn: Fix test creation and handling of 3-src without dest
r600/sfn: use PS and PV inline registers when possible
Gil Pedersen (1):
intel: Add PIPE_FORMAT_R10G10B10X2_UNORM support
Gurchetan Singh (38):
virtio: kumquat: slice length fix
gfxstream: kumquat: opaque fd or dmabuf, not both
gfxstream: codegen: add vkTraceAsyncGOOGLE to GLOBAL_COMMANDS_WITHOUT_DISPATCH
gfxstream: codegen: remove CheckOutOfMemory
gfxstream: fix build after VK 1.4.33.0 spec update
gfxstream: meson format -i {all meson files}
subprojects: update rustix and libc to newer versions
subprojects: enable proper cross-compile on MinGW of certain crates
subprojects: add windows-link and windows-sys
subprojects: rustix: enable windows + macos build support
subprojects: errno: support for windows
util: rust: more rust support for windows/MacOS
util: be consistent about transitive dependencies
gfxstream: WindowsVirtGPU.h --> WindowsVirtGpu.h
gfxstream: enable kumquat building on Windows
gfxstream: silence non-null Clang check on Android
gfxstream: make functions static when needed
gfxstream: delete createImmutableSamplersFilteredImageInfo
gfxstream: codegen: don’t generate custom protocols in function table
gfxstream: more fixes for missing prototypes
util: fix arithmetic on a pointer to void warning
meson: add -Wgnu-pointer-arith to _trial_msvc
android_stub: add missing definition
util: fix error about missing include
android_stub: fix missing prototypes issues
gfxstream: fix logspam in TLS helper function
gfxstream: fix warning
gallium/tessellator: fix -Wmissing-prototype issues
gfxstream: drm_fourcc.h --> drm-uapi/drm_fourcc.h
gfxstream: explicitly list Python dependencies for gfxstream codegen
gfxstream: filter VkPhysicalDeviceProperties2 structs before encoder call
freedreno: check dependencies before running custom_target(..)
virtio/kumquat: fixes to enable meson2hermetic
meson: check for <poll.h>
meson: avoid calling nm.full_path() when tool is not found
meson: add dependency on android-hwvulkan-headers
meson,gfxstream: add Android support via meson2hermetic
gallium: fix sometimes-uninitialized warning
Hans-Kristian Arntzen (4):
vulkan/wsi: Promote EXT_swapchain/surface_maintenance1.
vulkan: Add KHR_swapchain_maintenance1 promotions.
vulkan/wsi: Add missing KHR_surface_maintenance1 promotions.
egl/x11: Fix memory leak when querying translated coord.
Hyunjun Ko (9):
anv/video: rework for handling alternative quantizer for vp9 decoding.
anv/video: handling segmentation features for vp9 decoding
vulkan/video: Fix H.265 short-term reference picture set handling
vulkan/video: Fix H.265 long-term reference handling
anv/video: fix VP9 chroma subsampling format detection
anv/video: clean up VP9 picture state setup
anv/video: fix a typo in Vulkan AV1 decoding.
anv/video: Compute AV1 tile positions internally
anv/video: disable encoder on untested platforms
Iago Toral Quiroga (2):
broadcom/compiler: use nir_opt_uub
nir/opt_vectorize_load_store: allow sizes unaligned with high offset for loads
Ian Forbes (6):
svga: Check if Stencil buffer is NULL
svga: Enable GL_ARB_texture_mirror_clamp_to_edge
svga: Fix vertex-fallbacks Piglit test
svga: Don’t crash if only one of Depth or Stencil buffer is present
svga: Report “VRAM” more accurately
svga: Set modifier in surface_get_handle
Ian Romanick (36):
nir/algebraic: Don’t generate integer min or max that will need to be lowered
brw: Apply Gfx9 vgrf127 workaround in more cases
elk: Apply vgrf127 workaround in more cases
nir/opt_if: See through inot
brw: Correctly generate conditional modifier for BFN
vulkan: Fix incorrect assert
nir/opt_if: Specify which branches are valid for evaluate_if_condition
nir/opt_if: Conditionally do not propagate constants through bcsel
nir/opt_if: Both parts of logic-joined conditions can be evaluated
brw: Don’t spill_all on internal shaders
brw: Force allow_spilling when spill_all is set
brw: Don’t pass compressed to brw_lower_vgrf_to_fixed_grf
brw: Return the new register from brw_lower_vgrf_to_fixed_grf
brw: Add OPT macro to brw_shader.cpp like brw_opt.cpp
brw: Add fill and spill opcodes for LSC platforms
brw: Eliminate redundant fills and spills
brw: Eliminate duplicate fills
lavapipe: fp16 flrp must also be lowered
nir/lower_flrp: Check and set shader_info::flrp_lowered
glsl: Move flrp lowering out of the loop
elk: only lower flrp once
broadcom/compiler: only lower flrp once
vc4: Don’t call nir_lower_flrp in vc4_optimize_nir
nir/algebraic: Mask with shifted constant instead of shift-then-mask
brw: Add brw_reg::is_grf
brw/cmod: Don’t propagate between instructions in different groups
brw: elk: Disable can_do_cmod for MACH
brw/cmod: Allow FIXED_GRF
brw/dce: Don’t generate more NULL destinations after brw_lower_3src_null_dest
brw/cmod: Propagate to an instruction with same source
brw: Do cmod prop again after post-RA scheduling
brw: Do cmod prop again after scheduling
nir/algebraic: Add missing f on F-strings
nir/algebraic: Detect missing f on F-strings
mesa: Fix segfaults in _mesa_delete_program and _mesa_reference_program_
iris/elk: Restore setting nir->num_uniforms to zero.
Icenowy Zheng (19):
gallivm: orcjit: remember Context in addition to ThreadSafeContext
pvr: enable samplerMirrorClampToEdge feature
pvr: fix cleaning up failed CreateDevice
pvr: fix PVR_DEBUG=info when running w/o KHR_display
pvr: copy WSI can_present_on_device function from PanVK
vulkan/wsi/headless: do not destroy images that are never created
pvr: advertise VK_KHR_incremental_present
pvr: prevent a NULL dereference for pass-less pipeline creation w/o info
pvr: advertise VK_EXT_headless_surface
pvr: advertise X11-related WSI instance extensions
zink: add Mesa powervr to explicit sync / invalid<->linear allowlists
zink: only warn about fillModeNonSolid when used
gallivm: orcjit: support GALLIVM_DEBUG=dumpbc
mesa: workaround GL_INVALID_OPERATION in GLES 2.0 draws
mesa: fix GL_INVALID_OPERATION with GLES1/2 + Kopper
mesa: fix GL_INVALID_OPERATION when releasing buffer in GLES1/2 ctx
nir/algebraic: fix Python-3.10-incompatible syntax
pco: add NIR global_atomic lowering
vk: descriptors: sort bindings along with flags
Isaac Marovitz (1):
kk: BCn Formats
Iván Briano (10):
hasvk: don’t report custom sample locations for sample count 1
brw: plug some holes in brw_wm_prog_data
brw: shut -Wmaybe-uninitialized up
anv: report actual AS descriptor limits
nir: clear SAMPLE_MASK_IN if we lowered it
nir: add nir_lower_single_sampled::lower_sample_mask_in option
anv: maxFragmentShadingRateCoverageSamples is 16 on all platforms
anv: coarse_pixel doesn’t require any InputCoverageMaskState
anv: enable fragmentShadingRateWithShaderSampleMask on Xe2+
brw: fix local_invocation_index with quad derivatives on mesh/task shaders
Janne Grunau (3):
hk: Report the correct plane count in VkDrmFormatModifierProperties2?EXT
meson: Add asahi to aarch64’s auto-generated drivers
util/driconf/asahi: Override GL renderer for web browsers
Jarrett Johnson (1):
kk: advertise multiDrawIndirect
Jason Macnak (6):
gfxstream: Handle BGRA in Gfxstream AHB format conversions
gfxstream: codegen changes for new filenames and namespaces
gfxstream: Add Vulkan func/structs for passing debugging data to host
gfxstream: Remove unnecessary tag to simplify perfetto trace config
gfxstream: Reland “Add Vulkan func/structs for passing debugging da…”
gfxstream: Reland “Remove unnecessary tag to simplify perfetto trac…”
Jeff Burnett (1):
util: Don’t force 64-bit division on 32-bit platforms
Jesse Natalie (17):
d3d12: Only try to compute scaled point size for stream 0
u_threaded_context: Use 64-bit bitmask utils
zink: Fix 64-bit bitmask usage
mesa: Cast bitmasks to 64-bit before negating
dzn: Suppress new MSVC warning by upconverting to uint64_t
spirv2dxil: Move clip/cull merging from common passes to just spirv2dxil passes
wgl: Support contexts created from non-window DCs
wgl: Only swap back and front buffers after a successful present
wgl/d3d12: Return success based only on Present return
d3d12: Allow state promotion for non-simultaneous access textures
d3d12: Decay state when resolving context -> global state
d3d12: Assert that there’s no front buffer writes
d3d12: Ensure that flush_resource causes batches to get flushed
d3d12: Don’t promote to read-write states
d3d12: Fix resolving global state vs per-context state with promotion
d3d12: Don’t use D3D12 B8G8R8X8 format
nir: Suppress ‘potentially uninitialized local pointer variable used’ warning
Jianxun Zhang (5):
anv: Add a new function to consolidate import paths
isl: Add a macro for number of maximum planes of modifiers
anv: Replace ANV_MAX_PLANES with ISL_MODIFIER_MAX_PLANES
anv: Use gralloc helper to get tiling
anv: Enable compression on importing Android buffers (xe2)
Job Noorman (31):
ir3: move ir3_catN_absneg to ir3.c
ir3: add has_sel_b_fneg compiler flag
ir3: allow (neg) on sel.b on a6xx gen4+
ci,marge_queue: read token from file by default
nir: mark fneg distribution through fadd/ffma as nsz
ir3/ra: fix assert during file start reset
ir3/ra: reset merge set preferred reg when unavailable
spirv: don’t set in_bounds for structs
spirv: set in_bounds for ptr_as_array
nir: print in_bounds info for deref_type(_ptr_as)_array
rusticl: fix mismatched-lifetime-syntaxes lint warning
nir: add has_umul_16x16 option
nir: add opt_uub pass
ir3: add support for umul24
ir3: removed unused parameter from ir3_optimize_loop
ir3: add options parameter to ir3_optimize_loop
ir3: enable nir_opt_uub
ir3: add ir3_disasm_options struct
freedreno/computerator: add option to print raw disassembly
ir3: don’t use list_head for rpt groups
ir3: merge rpt groups after postsched
ir3/ra: try to allocate subreg movs earlier
ir3/ra: try to allocate overlapping regs for shared subreg movs
tu: add UBO lowering workaround for Yooka-Laylee
ir3/legalize: run dbg nop/sync sched later
ir3: print eq and needs_helpers instruction flags
ir3/legalize: schedule (eq) more accurately
ir3/bisect: fix off-by-one issues while bisecting
ir3/legalize: fix (eq) scheduling for sam.s2en
ir3: print (eolm)/(eogm) flags
tu,freedreno: add chicken bit to enable (eolm)
John Anthony (3):
panfrost: Add shader core count to RENDERER string
panvk: Add shader core count to deviceName
pan: Use correct architecture name for v12+
Jonathan Marek (1):
tu: remove magic bo reg packing (use iovas directly)
Jordan Justen (5):
intel/dev: Add INTEL_PLATFORM_NVL_U platform enum
intel/dev: Add NVL-S/U device info
intel/dev: Add NVL-S/U PCI IDs (with FORCE_PROBE required)
intel/brw: Add brw_data_type_float/brw_data_type_int
intel/brw: Add new encode/decode for use with brw_data_type_float/int
Jose Maria Casanova Crespo (10):
v3d: mark FRAG_RESULT_COLOR as output_written on SAND blits FS
v3dv: use vk_drm_syncobj_copy_payloads helper
v3dv: Enable VK_FORMAT_A2R10G10B10_UINT_PACK32 format
v3dv: Enable VK_FORMAT_B8G8R8A8_SINT and VK_FORMAT_B8G8R8A8_UINT formats
v3dv: Enable VK_FORMAT_B8G8R8A8_SNORM format
v3dv: only apply simulator stride alignment for from_wsi images
broadcom/compiler: enable umul24 and imul24 ALU opcodes
broadcom: Drop use of nir_lower_wrmasks
v3d: Enable TFU blits with raster destinations on 7.1 HW (RPi5)
v3dv: Enable TFU blits with raster destinations on 7.1 HW (RPi5)
Josh Simmons (1):
radv: Fix crash in sqtt due to uninitialized value
Joshua Ashton (1):
vulkan/wsi: Handle 0xFFFFFFFF special case in vk_wsi_force_swapchain_to_current_extent driconf
Joshua Simmons (1):
vtn: Fix OpCopyLogical destination type
José Expósito (2):
winsys/amdgpu: Fix userq job info log on PPC
venus: Fix error log on PPC
José Roberto de Souza (29):
intel/dev: Add supports_low_latency_hint to intel_device_info
anv: Add support for low latency hint on Xe KMD
iris: Release global_bufmgr_list_mutex on missing error paths
iris: Move code to emit binding tables to its own function
iris: Improve iris_emit_binding_tables()
iris: Move code to emit push constants to its own function
iris: Rename iris_binding_table::sizes to iris_binding_table::surf_count
intel/brw: Split to a function the code that calculate sampler channels that should be written
iris: Fix slab memory leak
iris: Make uint32 the type used for slab sizes
intel/brw: Nuke brw_inst::is_volatile()
drm-uapi: Sync xe_drm.h
anv: Add support to DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION
iris: Add support to DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION
intel/brw: Add comment to ubo_ranges
intel/brw: Document UBO_START
anv: Fix variable shadowing
anv: Set push_constant_range once
iris: Reduce COMPUTE_WALKER_BODY code duplication
iris: Remove duplicated iris_measure_snapshot(INTEL_SNAPSHOT_COMPUTE)
iris: Nuke iris_emit_execute_indirect_dispatch()
intel/blorp: Remove duplicated calls in blorp_exec_compute()
anv/hasvk: Nuke register_config from anv_performance_configuration_intel
anv/hasvk: Add intel_perf_get_configuration_id() and replace intel_perf_load_configuration() usage
intel/perf: Nuke intel_perf_load_configuration() and related code
intel/perf: Add Xe2 mdap_metrics struct and set it
intel/perf: Extend Xe2 mdap_metrics to Xe3
intel/perf: Change mdapi switch cases from ver to verx
intel/perf: Add Gfx 12.5 mdap_metrics struct and set it
Juan A. Suarez Romero (31):
v3d/ci: add new flakes for rpi5
broadcom/ci: document some of the failures
broadcom/ci: disable baremetal jobs already running with CI-Tron
broadcom/ci: unlock more CI-Tron jobs
vc4/simulator: create GEM BOs in GTT memory for AMD GPUs
vc4/simulator: add helper to get stride alignment
vc4: set stride alignment when using simulator
v3d/simulator: create GEM BOs in GTT memory for AMD GPUs
broadcom/simulator: add helper to get stride alignment
v3d: set stride alignment when using simulator
v3dv: align width to 256 when using simulator
v3d: enable forward facing primitive for lines and points
v3dv: enable forward facing primitive for lines and points
broadcom/ci: adjust fractions for nightly jobs
broadcom/ci: unlock more CI-Tron jobs
v3dv/ci: add timeout in expected list
broadcom/ci: unlock more CI-Tron jobs
v3d/ci: add SKQP failure
v3d/ci: update expected results
broadcom/ci: remove all baremetal nightly jobs
broadcom/ci: remove `ci-tron-` prefix from nightly jobs
broadcom/ci: update expected list
broadcom/ci: set testgroup size for asan
v3d: don’t build disk cache access on shader disablement
v3dv/ci: skip tests causing GPU issues
broadcom/compiler: enable skip_helpers
v3dv: create a proper load_uniform instruction
nir: add ACCESS to load_uniforms
broadcom/compiler: use skip_helpers with textures, UBOs and SSBOs
v3d/ci: add new fail in expected list
broadcom/cle: bump up gen version for v3d
Julia Zhang (1):
amdgpu/virtio: unmap bo in destroy_host_blob
Julian Orth (1):
kopper: disable color management for wayland surfaces
Juston Li (2):
anv/android: align AHardwareBuffer naming to ahb
anv/android: query and use explicit layout for ahb resolve
Karmjit Mahil (12):
tu: Use TU_BREADCRUMBS_ENABLED value
ir3: fix comparison of different signedness build issue
freedreno/afuc: Fix potentially uninitialized variable
freedreno/decode: Add code to extend pkt processing with Lua
freedreno/cffdump: Emulate RMW
freedreno/docs: Add -k option to nc command
freedreno/registers: Remove extra space in reg definition
meson: Use adreno-pm4-pack.xml.h instead of custom definitions
freedreno/decode: Add some code to the already present generate-rd
freedreno/registers: Mark functions as constexpr where possible
freedreno/registers: Clarify bit 64B of CP_REG_TO_MEM
gallium: Fix gnu-empty-initializer error
Karol Herbst (36):
nak: extract cmat load/store element offset calculation
nak: ensure deref has a ptr_stride in cmat load/store lowering
nak: simplify SM80 HMMA latency categorization
nak: improve fp16 latencies on Ampere
nak: fix MMA latencies on Ampere
st/interop: fix fence leak
rusticl/queue: fix error code for invalid queue properties part 1
rusticl/queue: fix error code for invalid queue properties part 2
rusticl/queue: fix error code for invalid sampler kernel arg
rusticl/kernel: take no kernel_info reference inside the launch closure
rusticl/spirv: preserve signed zeroes by default
rusticl/kernel: fix clGetKernelSuggestedLocalWorkSizeKHR implementation
rusticl/kernel: Do not run kernels with a workgroup size beyond work_dim
nvk/ci: add broken coop matrix CTS tests to skips
nak/cmat: add alignment info to matrix load/stores
nak/cmat: add optimisation to cmat load/store to do 32-bit load for f16vec2
nir: mark cmat_load_shared_nv as CAN_ELIMINATE
nak: add Movm
nak/cmat: use movm
nir: add ACCESS to shared_uniform_block_intel
rusticl: remove unnecessary transmutes around uuids
rusticl/mesa: remove unnecessary lifetimes
rusticl/mesa: convert pointer to ref without transmute in PipeScreen::from_raw
gallium: add SUBGROUP_FEATURE bits for rotate and rotate_clustered
llvmpipe: advertise support for subgroups in all stages
clc: handle all optional subgroup extensions
rusticl: properly check for subgroup support
docs: reorder and add zink to CL subgroup entries
clc: reorder headers to fix compilation errors due to UNUSED
clc: support some atomic and generic address space features
clc: enable generic address space and seq_cst and device scope atomic features
nir: fix nir_fixup_is_exported for LLVM-22
clc: fix compile compatibility with LLVM-22
rusticl/mesa: only use resource_from_user_memory if the cap is advertised
vtn/opencl: flush denorms for cbrt()
vtn: set default fp_math_ctrl values for kernels
Kenneth Graunke (52):
iris, crocus: Disable new IO slot validation for FB fetch load_output
elk: Disable IO semantic validation when remapping patch offsets
ci: Run intel shader-db on Haswell, Broadwell, and Meteorlake
nir: Drop writemask from all Intel memory store intrinsics
brw: Add an assertion that writemasks can be fully ignored
brw: Use nir_intrinsic_[set_]base rather than poking at const_index[0]
brw: Store brw_urb_inst::offset in bytes on Xe2
iris: Use iris_any_prog_key, not brw_any_prog_key
brw: Delete program_string_id from brw program keys
brw: Delete input_slots_valid from brw_wm_prog_key
brw: Set extended_bindless_surface_offset to true for Gfx12.5+
nir: add new intrinsics to load/store from URB on intel
brw: Implement URB handle intrinsics for TCS and TES stages
brw: Pass devinfo to brw_nir_lower_tes_inputs
brw: Flip the TESS_LEVEL_INNER/OUTER vue map slot assignments
brw: Rework the tess level remapping interface
brw: Remap tesslevels before other patch remapping
brw: Rename remap_non_header_patch_values to remap_patch_values
brw: Pass devinfo into remap_patch_urb_offsets
brw: Lower tesslevel vars to vectors even for unlinked TCS/TES
brw, anv, iris: Switch to reversed patch header layouts
brw: Rename read_attribute_payload_intel to load_attribute_payload_intel
brw: Generalize read_attribute_payload_intel to handle more cases
brw: Use io_sem.location instead of base to get varying slots
brw: Add infrastructure for lowering to URB intrinsics
brw: Switch to NIR URB intrinsics for TCS outputs
brw: Switch to NIR URB intrinsics for TES inputs
brw: Switch to URB intrinsics for TCS inputs
brw: Rewrite legacy tess level remapping
brw: Drop check for legacy tess levels from remap_patch_urb_offsets
brw: Combine output stores for TCS outputs even when unlinked
intel/elk: Also disable output constant offset src folding
brw: Fix outdated comments about urb->offset units
brw: Use LSC extended descriptor offsets for Xe2 URB messages
intel: Replace signed char with int8_t
brw: Calculate tessellation URB offsets when lowering to URB intrinsics
brw: Rename brw_nir_lower_vue_inputs to brw_nir_lower_gs_inputs
brw: Add missed access to store_urb_lsc_intel intrinsics
brw: Delete attr_desc struct
nir: Fix mod analysis of ishl to shift the recursive result
brw: Extend load_urb/store_urb to handle 32-bit non-vec4-aligned access
brw: Make lower_{inputs,outputs}_to_urb_intrinsics non-static
brw: Extend URB lowering infrastructure to handle mesh shader outputs
brw: Lower mesh shader outputs in NIR
brw: Lower task shader payload access in NIR
brw: Delete all the old backend mesh/task URB handling code
nir: Support Intel URB intrinsics in nir_opt_offsets
brw: Call nir_opt_offsets for mesh shaders
brw: Update try_load_push_input to handle dword-unit offsets too
brw: Make max_push_bytes a parameter to URB lowering data
brw: Move GS URB Read Length limiting to brw_nir_lower_gs_inputs()
brw: Convert GS pulled inputs to use URB intrinsics
Khem Raj (1):
glx: fix const qualifier warnings found with C23 glibc support
Kitlith (3):
hk: override can_present_on_device
panvk: Free drm device in can_present_on_device
pvr: Free drm device in can_present_on_device
Konstantin Seurer (60):
vulkan/cmd_queue: Fix indentation for struct array copies
vulkan/cmd_queue: Free all elements of struct arrays
radv/bvh: Add radv_first_active_invocation
vulkan: Add vk_ir_header::driver_internal
vulkan: Bump MAX_ENCODE_PASSES to 4
vulkan/bvh: Add some debug helpers
radv/rra/gfx12: Properly validate geometry indices
radv: Emit compressed primitive nodes on GFX12
vulkan: Remove the vk_ir_triangle_node::id field
vulkan/bvh: Add leaf.h to vk_bvh_includes
radv/bvh: Pair compress triangles in more cases
aco: Fixup out_launch_size_y in the RT prolog for 1D dispatch
radv: Always use compact bvh encoding
radv: Report smaller bvh sizes when possible
lavapipe: Bump maxPrimitiveCount
lavapipe: Zero image null descriptors
lavapipe: Bump MAX_DESCRIPTOR_UNIFORM_BLOCK_SIZE
gallivm/nir/soa: Use the sign of src1 for imod
llvmpipe: Always recompute 1/w
nir: Remove parallel copy handling from rewrite_uses_to_load_reg
nir/from_ssa: Stop using nir_parallel_copy_instr
nir: Remove nir_parallel_copy_instr
radv: Add re-format commit to .git-blame-ignore-revs
nir: Move nir_def directly after nir_instr
treewide: add & use parent instr helpers
nir: Remove nir_def::parent_instr
nir: Fix typo in nir_opt_ray_query_ranges
nir: Ignore ray query ranges that don’t start with rq_initialize
radv: Use hw_leaf_node_count for computing BVH size
radv/rra/gfx12: Fix primitive/geometry index validation
radv/bvh: Assert that indices_midpoint is valid
radv/bvh: Fix calculating the vertex payload/prefix sizes
radv/bvh: Avoid a slow case when compressing triangles
radv/nir: Use fmt_idx correctly
radv: Optimize BVH4 acceleration structure updates
nir/opt_algebraic: Remove a pattern for 8bit floats
nir/opt_algebraic: Do not emit patterns for 64bit booleans
nir/print: Print annotations as comments
nir: Allow shaders in tests to be annotated
nir: Allow using nir_eval_const_opcode in C++ code
nir: Add f2f16_ru/rd opcodes
spirv: Add internal f2f16 opcodes
aco: Add support to f2f16 with rtpi/rtni
radv/rra: Count box16 nodes properly
radv/bvh: Add radv_aabb16 and use it for box16 nodes
radv/bvh: Use box16 nodes when bvh8 is not used
radv: Fix crash if proceed comes before initialize
nir: Add an assert_eq intrinsic for testing nir_opt_algebraic
nir: Fix the types of udot_.*_uadd_sat
nir: Add a unit test base class for algebraic patterns
nir/opt_algebraic_tests: Add an option for generating unit tests
nir: Generate unit tests for nir_opt_algebraic
vulkan: Implement HPLOC
radv: Use HPLOC for TLAS builds
vulkan: Handle inactive primitives with LBVH builds
vulkan: Avoid NAN in the IR BVH
vulkan: Limit the number of LBVH invocations
radv/rra: Fix nullptr dereference
vulkan: Make sure no NaNs end up in the BVH
radv/bvh: Make sure internal nodes are collapsed when possible
Lakshman Chandu Kondreddy (1):
dri: Add R32F,RG32F,RGBA32F format mappings for DRIImage
Lars-Ivar Hesselberg Simonsen (23):
panvk/v9+: Reduce maxBoundDescriptorSets to 7
panvk: Only call req_res when required
panvk: Fix IUB decode
pan/format: Fix mapping for I16F
pan/format: Disable PAN_BIND_STORAGE_IMAGE for RGBA4/BGRA4
panfrost: Rename (LD|LEA)_BUFFER to (LD|LEA)_PKA
pan/va: Change LEA_BUF_IMM src description
pan/va: Add LEA_BUF
pan/genxml: Remove reg_format from v9+ ConversionDesc
nir: Add pan intrinsics for texel buffer access
pan/va: Add late lowering passes for texel buffers
pan/format: Add PAN_BIND_TEXEL_BUFFER
panvk: Increase maxBufferSize to UINT32_MAX
pan/v9+: Remove unnecessary nir_u2u32 from load_tex_size
panfrost/bi: Fix potential out-of-bounds writes
glsl/nir: Add texture_buffers to shader info
nir: Add channels to pan texel_buf intrinsics
pan/bi: Add texel buf lowering support for Bifrost
pan/bi: Add lowering pass for texel buffer indices
panvk/bi: Add texel buffer branch to meta_desc_copy
pan/bi: Make texel buffers use Attribute Buffers
pan/bi: Change texel buffer limits
panfrost/bi: Fix unbound texel buffers
Laura Nao (3):
ci: Enable Perfetto tracing support in Mesa builds for Linux/Android
ci/prepare-artifacts: Keep pps-producer binary in artifacts
ci/container: Add script to build Perfetto tracebox
Leon Perianu (1):
pvr: pvr_pds_fragment_program_create fix allocation callback usage
LingMan (5):
rust: build `equivalent` dependency with the correct edition
rust: build `paste` dependency with the correct edition
rust: build `ucd-trie` dependency with the correct edition
meson: silence warnings in rust subprojects
meson: specify minimal target meson version for rust subprojects
Linus Karl (2):
rocket: fix build on non LP64 architectures
ethos: fix build on non LP64 architectures
Lionel Landwerlin (168):
Revert “wsi: Implements scaling controls for DRI3 presentation.”
brw: add a new sampler payload parameter description
brw: port some NIR lowering to the sampler payload description
brw: switch to new sampler payload description scheme
brw: new Xe2 sampler opcodes
anv: reenable KHR_maintenance8 on Xe2+
brw: get rid of GET_BUFFER_SIZE opcode
anv: fix image-to-image copies of TileW images
brw: account for disabled SEND fused message in cycle computation
Revert “brw: add serialize send stats”
brw: add missing offset to MCS fetching messages
brw: constant fold u2u16 conversion on MCS messages
brw: only consider cross lane access on non scalar VGRFs
brw: fix ballot() type operations in shaders with HALT instructions
brw: fix missing generation requirement on sampler opcode
nir/divergence: fix handling of intel uniform block load
brw: mark divergence data as valid for debug purposes
brw: handling dynamic programmable offsets pre-Xe2
anv: reenable VK_KHR_maintenance8 on pre-Xe2 platforms
anv: rename structure holding 3DSTATE_WM_DEPTH_STENCIL state
brw: handle GLSL/GLSL tessellation parameters
nir/lower_io: add missing levels intrinsics to get_io_index_src_number
anv/brw: fix output tcs vertices
anv: destroy sets when destroying pool
anv: expose VK_EXT_shader_uniform_buffer_unsized_array
vulkan/runtime: enable null pointer to vkCmdSetSampleMaskEXT()
vulkan/render_pass: Add a missing sType
vulkan/render_pass: handle maintenance10 resolve flags
anv: implement VK_KHR_maintenance10
Revert “anv: Convert DEBUG_SPARSE logging to use mesa_log”
brw: disable io_semantic validation for mesh intrinsics
u_trace: reserve chunk space before emitting copies
anv: avoid null pointer access in utrace copies on CCS
brw: avoid invalid URB messages
anv: enable accelerationStructureCaptureReplay
anv: avoid invalid timestamp generation due to skipped commands
anv: don’t use IndirectStatePointersDisable at the end of secondaries
anv: avoid unnecessary stalling on secondaries
brw: stop emitting flush operations for begin/end interlock
vulkan/runtime: split out partitioning logic
vulkan/runtime: simplify robustness state hashing
vulkan/runtime: drop some geometry shader hashing
vulkan/runtime: drop blake3 hash on precomp shaders
vulkan/runtime: split precomp shader hashing from precomp loading
vulkan/runtime: keep the set layouts on the stack until pipeline creation
vulkan/runtime: use stage flags to track valid stages
vulkan/runtime: split compute shader hashing from compile
vulkan/runtime: split graphics shaders hashing from compile
vulkan/runtime: split rt shaders hashing from compile
vulkan/runtime: use only blake3_hash to shader key
vulkan/runtime: switch precomp shaders to blake3 hashes
vulkan/runtime: track imported stages
vulkan/runtime: implement VK_KHR_pipeline_binary
anv: enable KHR_pipeline_binary support
anv: limit maxComputeSharedMemorySize to 48KiB
anv/blorp/iris: rework Wa_14025112257
anv: disable software detiling on Xe2+ for image atomics 64bits
intel/isl: add INTEL_DEBUG=noccs-modifier to disable CCS modifiers
anv: ensure shader printf is functional on all backends
brw: fixup 64bit atomics emulation on 2D array images
anv: consider 64bit atomics on similar formats with mutable images
brw: fixup immediate bindless surface handling
brw: fix SIMD lowering of sampler messages with fp16 data
vulkan/runtime: fix incorrect assert on empty shader groups
anv: track descriptor mode in SBA tracepoint
anv: optimize pipeline switching with secondaries
brw: fix workaround fence rlen field
anv: fixup load_ubo lowering
anv: ensure slab allocated memory matches image requirements
anv: split non binding related intrinsics from apply_layout
anv: bump maxTessellationControlTotalOutputComponents
anv: Wa_18040903259 only applies to RCS when in GPGPU mode
anv: avoid pipe control reason tracking in emit_pipe_control
anv: put more readable PIPE_CONTROL reasons
brw: compute final copy propagation resulting source
brw: fix SS surfaces usage
nir: print out number of printfs
nir: fix lower_printf with no arguments
spirv: fix printf generation
nir/lower_printf: fix array alignment
nir/lower_printf: fix missing singleton add
anv: enable mesh/task shader hashes
anv: enable application shader printfs with debug option
brw: switch to load_(pixel_coord|frag_coord_z|frag_coord_w) intrinsics
anv: shrink image opaque data
brw: use default builder for urb handle adjustment
brw: Implement load/store URB intrinsics
anv: remove errors on format queries
brw: fix sample mask flag emission
anv: add 32-wide subgroup requirement heuristic
brw/iris: remove fs key for coherent_fb_fetch
vulkan/runtime: track dynamic descriptor offsets for RT pipelines
anv: fix broken ray tracing dynamic descriptors
vulkan/runtime: add an internal flag for independent sets
anv: reintroduce non independent sets dynamic descriptor optimization
iris: lower load_num_workgroups
anv: move load_num_workgroups tracking to driver
brw: remove driver specific load_num_workgroup lowering
vulkan/runtime: include unaligned dispatch bit in hashing
anv/brw: drop cs_prog_key::lower_unaligned_dispatch usage
anv: fix internal representations of shaders
intel: remove unused show_shader_stage debug option
anv: add missing device_memory_report for shaders
anv: fixup error path for shader allocation
anv: program STATE_BASE_ADDRESS instruction ptr using pdevice address
anv: fix dynamic buffers & independent sets
anv: switch shader heap placement to beginning of heap by default
anv: remove unused gpu_memcpy function
anv: remove use of emit_apply_pipe_flushes() in various helpers
anv: add tracking of involved stages in pipe flushes
anv: move cs/pb-stall detection to flushing function
anv: remove pb-stalls from various locations
anv: update pipeline barriers for Xe2+
anv: consider CS coherent with L3 on Xe2+
anv: disable deferred bits on Gfx20+
anv: remove unused event field
anv: store event creation flags
anv: use the blitter/video barrier helper for event signalling
anv: switch events to use 0/!0 values for unsignaled/signaled
anv: use flushing PIPE_CONTROL for event signaling
anv: use anv_add_pending_pipe_bits for event reset
intel: rename DCFlushEnable to ForceDeviceCoherency
anv: introduce a new virtual pipecontrol flag for BTI change
anv: implement Wa_18037648410
anv: use RESOURCE_BARRIER for event waiting when possible
anv: instrument resource barriers instruction in u_trace
anv: implement WA_18039014283
anv: add a no-resource-barrier debug flag
anv: disable crast on SKL
brw: Implement URB handle intrinsics for task/mesh stages
brw: move MUE initialization out of the SIMD loop
anv: remove CS-L3 coherency on Xe2
nir/printf-helpers: set writes_memory at printf emission
nir: add missing divergence handling for ray_query_global_intel
nir: use load() helper for inline_data_intel
nir: add a new push_data_intel intrinsic
brw: invert condition to reduce code nesting
brw: add a pass to lower ubo to push constant data
anv: stop going through push ranges on the first empty slot
anv: ensure internal compute kernels are run at SIMD16
anv/brw/iris: get rid of param array on prog_data
iris: manage TBIMR null push constant wa in driver
intel: rework push constant handling
anv/brw: prep work for SIMD32 ray queries
brw: enable ray query spilling in SIMD32
brw: handle lowering of a couple of opcodes
brw: enable topology opcodes in SIMD32
brw/nir/rt: ensure we can load 2 RT_DISPATCH_GLOBALS
brw: enable SIMD32 compute shaders with ray queries
brw: fix derivatives on non 32bit floats
brw: handle layer_id only through system value
brw: drop unused color_outputs_valid key
brw: switch buffer/image size intrinsics lowering to NIR
anv: remove all kinds of useless info for internal shaders
anv: enable debug printfs on internal shaders
brw: add missing base offset decoding
brw: improve push constant loading using base offsets
brw: apply same workaround to spawn as to trace opcode
brw: treat inline parameters like UNIFORM
nir/compiler_options: add nir_load_pixel_coord
brw: set nir_shader_compiler_options::has_pixel_coord
brw: populate wm_prog_data earlier
brw: make coarse pixel bit available to NIR lowering
nir: add intrinsics for Z calculation in shaders with FSR
brw: move coarse_z computation to NIR
brw: use fp64 to compute coarse_z
iris: fix incorrect intrinsic usage on ELK
vulkan/wsi/direct: remove VkDisplay created from GetDrmDisplayEXT on ReleaseDisplayEXT
Lorenzo Rossi (8):
vulkan: increase MESA_VK_MAX_DISCARD_RECTANGLES
nvk: implement VK_EXT_discard_rectangles
nak/dataflow: Fix typo in comments
nak: Add latency_upper_bound to ShaderModel
nak/reg_tracker: Add SparseRegTracker
nak: Add cross-block instruction delay scheduling
nak: Fix delay insertion missing WaR
nak/sm120: Fix panic for CS2R during prepass
Loïc Molinari (1):
panfrost: Fix clean_pixel_write_enable forced check for AFBC
Lucas Fryzek (8):
util: Move ASTC unpack routines to common util
anv: For HIC only convert tile worth of memory at a time
anv: Implement host_image_copy astc emulation on CPU
anv: Enable host_image_copy on emulated formats
lvp: Enable VK_FORMAT_R4G4B4A4_UNORM_PACK16
lp: Implement gallium depth_bounds_test capability
drisw: Modify drisw_swap_buffers_with_damage to swap entire buffer
Revert “drisw: Copy entire buffer ignoring damage regions”
Lucas Stach (5):
etnaviv: blt: fix tile count calculation for in-place resolve
etnaviv: don’t emit steering state when uniforms are unchanged
etnaviv: check all necessary dirty bits when marking constbufs during draw
etnaviv: simplify constant dirty bit handling during state emission
etnaviv: idle the pipe before flushing texture caches
Ludvig Lindau (8):
panfrost/panvk: Merge stores in vector spills
panfrost/panvk: Reduce fills from LCRA
panfrost: Make instrs_equal check res table/index
pan/va: Add LD_CVT
pan/genxml: Move BufferDescriptor for v9+
pan/genxml: Add ConversionDesc to v9+ BufferDescriptor
pan/v9+: Make texel buffers use BufferDescriptor
pan/v9+: Change texel buffer limits
Luigi Santivetti (11):
pvr: split out driver specific framebuffer data
pvr: split framebuffer attachments allocation and setup
pvr: split framebuffer clear values allocation and setup
pvr: split out device tile buffers teardown
pvr: split out command buffer render pass inheritance
pvr: be more restrictive of when to emit vdm terminate
pvr: do not assert in multi-layer rta emulated path
pvr: get the format for start of render clears from pass info
pvr: move code for resolving attachments
pvr: add support for VK_KHR_dynamic_rendering
pvr: enable VK_KHR_dynamic_rendering
Marek Olšák (182):
r300: fix DXTC blits
winsys/radeon: fix completely broken tessellation for gfx6-7
radeonsi/ci: update hawaii failures
radeonsi: rename si_get_strmout_en -> si_get_streamout_enable_state
radeonsi: rename num_active_shader_queries -> streamout.num_ngg_queries
radeonsi: return false from si_update_ngg early on gfx11+
radeonsi: allow queries to return more than UINT32_MAX
radeonsi: cosmetic changes for queries
radeonsi: mostly fix NGG streamout overflow queries when XFB is disabled
zink: fix mesh and task shader pipeline statistics
amd: don’t use non-existent GLM packet fields on gfx12
amd: don’t use non-existent GL1 packet fields on gfx12
ac/surface: add helper use_tile_swizzle to consolidate that logic
winsys/amdgpu: don’t set ac_surf_info::surf_index = NULL
radv: don’t set ac_surf_index::surf_index to NULL
radv: don’t check VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT for surf_index
radv: don’t check vk_format_is_depth_or_stencil for surf_index
radv: move VK_IMAGE_USAGE_HOST_TRANSFER_BIT checking to ac_surface.c
radv: move more surf_index logic to use_tile_swizzle
radv: set RADEON_SURF_SHAREABLE for surf_index logic
ac/surface: pass ac_addrlib* everywhere instead of ADDR_HANDLE
ac/surface: move surf_index and fmask_surf_index into ac_addrlib
amd: constify struct radeon_surf
ac/surface: pass all ac_compute_surface info via ac_surf_config, not radeon_surf
radeonsi: enable ACO by default
radeonsi/ci: update failures
nir/lower_indirect_derefs: don’t lower compact arrays unconditionally to fix perf
ac/nir: set support_indirect_inputs/outputs in common code
nir: add nir_intrinsic_ssbo_descriptor_amd for lowering get_ssbo_size
amd: lower get_ssbo_size in ac_nir_lower_resinfo
nir/lower_io: force src offset=0 for any indirect access with num_slots == 1
nir/validate: expand IO intrinsic validation with nir_io_semantics
Revert ABI breakage “amd: Add user queue HQD count to hw_ip info”
nir/lower_interpolation: check IO location correctly
gallium/noop: don’t unref buffers passed to set_vertex_buffers to fix crashes
radv: set ZMM_TRI_EXTENT for conservative rasterization == overestimate
nir: add NIR_PASS_ASSERT_NO_PROGRESS
nir/opt_copy_propagate: refactor for readability, describe missing stuff
nir: rename nir_copy_prop -> nir_opt_copy_prop
nir: document how nir_opt_dce works
nir: document how nir_opt_cse works and suggest improvements
nir: add nir_separate_merged_clip_cull_io
nir,glsl,zink: remove the option nir_io_separate_clip_cull_distance_arrays
ac,radeonsi: remove gfx11 FW-based MCBP
nir: for nir_shift_channels, fill undefined components with undef instead of .x
nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees
nir/lower_io_passes: lower indirect TCS outputs sooner and clarify the behavior
nir/lower_io_passes: simplify conditions for when to lower IO to temps
nir/lower_io_passes: fold bool lower_indirect_inputs
nir/lower_io_passes: only sort variables for nir_lower_io_vars_to_temporaries
ac: document RELEASE_MEM limitation with PS_DONE/CS_DONE on gfx6-11
ac,winsys/amdgpu: report why ac_query_gpu_info failed
nir: fix a typo in NIR_PASS_ASSERT_NO_PROGRESS for non-debug builds
nir/lower_io_passes: call nir_opt_undef to eliminate undef output stores
st/mesa: call nir_opt_intrinsics for the GL_SELECT shader
st/mesa: call nir_opt_intrinsics slightly later
gallium/hud: don’t fclose stdout for GALLIUM_HUD=…,stdout
ac/nir: move aco_nir_op_supports_packed_math_16bit here
ac,radv: move opt_vectorize_callback to common code
nir/validate: don’t require offset src to be 0 if constant
nir: handle load_fs_input_interp_deltas in nir_is_input_load
nir: add shader_info::disable_input/output_offset_src_constant_folding
nir/opt_constant_folding: add nir_io_add_const_offset_to_base behavior
nir: remove nir_io_add_const_offset_to_base
nir/recompute_io_bases: don’t use safe iterators
nir/recompute_io_bases: move color input bases after all other inputs
nir/recompute_io_bases: report progress only if anything was changed
gallium: add a flag to finalize_nir to allow drivers to skip NIR opts
amd: rename most GFX115x definitions for released chips
nir/has_divergent_loop: require divergence metadata, check all function impls
winsys/amdgpu: retry the CS ioctl on -ENOMEM only if GDS OA is used
winsys/amdgpu: protect driver stats changes by a mutex
iris: add struct iris_scissor_state because pipe_scissor_state will be changed
panfrost: don’t expose 32K textures because st/mesa doesn’t support them
gallium: change pipe_scissor_state to 32 bit integer
gallium: change pipe_framebuffer_state width/height to 32-bit integer
gallium: declare pipe_resource::height0 as 32-bit integer for 64K textures
gallium/u_blitter: change width/height parameters to 32-bit integer
mesa: remove unused _mesa_total_texture_memory
mesa: remove unused mesa_store_cleartexsubimage, _mesa_store_compressed_teximage
mesa: remove unused make_null_texture
mesa: merge mostly duplicated mesa_format_image_size & mesa_format_image_size64
mesa: use size_t for image address computations
mesa: remove MaxTextureMbytes, use the cap instead
mesa: bump MAX_TEXTURE_RECT_SIZE, MAX_RENDERBUFFER_SIZE
mesa: raise MAX_TEXTURE_LEVELS to 17 to allow 64K mipmap textures
st/mesa: don’t use the PBO GetTexImage compute shader for 64K textures
st/mesa: disallow the PBO upload fragment shader
radeonsi: fix a few non-critical 64-bit integer overflows
radeonsi: reject textures that don’t fit in the CPU address space
radeonsi: allow 64K viewports
st/mesa: remove bogus framebuffer state assertions
radeonsi: enable 64K x 64K textures
zink/ci: update fixed tests
nir/lower_io_vars: don’t insert output stores for unrelated streams before emits
nir/gather_info: clear clip/cull_distance_array_size if the IO is not present
nir: split gathering array sizes from nir_lower_clip_cull_distance_array_vars
nir: give nir_lower_clip_cull_distance_array_vars a better name
nir: add FRAG_RESULT_DUAL_SRC_BLEND and an option to use it
radv,radeonsi: use FRAG_RESULT_DUAL_SRC_BLEND
ac/nir: allow smaller workgroups for GS
nir: fix the value of nir_io_use_frag_result_dual_src_blend
nir/print: print tex->sampler_dim
nir/lower_io: remove unused option nir_lower_io_lower_64bit_float_to_32
nir/lower_io: explain properly how nir_lower_io_lower_64bit_to_32* options work
nir/opt_cse: update potential future plans merging copy propagation with CSE
radeonsi: double pixel throughput in certain cases of PS without inputs
radeonsi: don’t load sampler states for buffer and MS samplers
radv: double pixel throughput in certain cases of PS without interpolated inputs
mesa: allow pipeline statistics in glCreateQueries
radeonsi: fix color interpolation when finalize_nir is called twice
radeonsi: assert that invalid FS inputs aren’t present
radeonsi: assert that IO bases don’t have holes & the same base isn’t used twice
radeonsi: remove unused FS input slots due to colors
radeonsi: don’t scalarize IO in finalize_nir
radeonsi: rename si_nir_scan_shader -> si_nir_gather_info, etc.
radeonsi: remove unnecessary NIR divergence analysis invocations
radeonsi: call si_nir_mark_divergent_texture_non_uniform later
Revert “radeonsi: use nir_opt_large_constants earlier”
radeonsi: update XFB info in the correct place after mediump IO lowering
radeonsi: lower nir_var_mem_shared later
radeonsi: fold nir_lower_compute_system_values_options into pass parameters
radeonsi: rename si_shader_info & si_shader_variant_info sysval fields
radeonsi: move CS user SGPR layout determination into si_shader_variant_info
radeonsi: move CS sysval si_shader_info fields into si_shader_variant_info
radeonsi: lower compute system values later
radeonsi: use si_preprocess/postprocess_nir function names
radeonsi/ci: update gfx12 flakes
radeonsi: move NIR callbacks to si_get.c
radeonsi: call nir_lower_fp16_casts in si_postprocess_nir
radeonsi: don’t set progress uselessly in si_postprocess_nir
radeonsi: call nir_opt_16bit_tex_image in si_postprocess_nir
radeonsi: use ac_nir_opt_vectorize_cb
radeonsi: call nir_lower_gs_intrinsics in si_preprocess_nir
radeonsi: lower task & mesh shader IO in si_preprocess_nir
radeonsi: move sparse intrinsic lowering to a separate file, call it later
radeonsi: remove glsl_tests subdirectory
radeonsi: move more lowering from si_lower_nir to si_preprocess_nir
radeonsi: remove the rest of si_lower_nir
radeonsi: call si_nir_lower_color_inputs_to_sysvals in si_preprocess_nir
radeonsi: merge 2 PS color input lowering passes for monolithic shaders
nir,radeonsi: simplify load_color0 & load_color1 intrinsics and shader_info
ac,radeonsi: move lowering to load_color0/1 to ac_nir_lower_ps_early
radeonsi: remove si_shader_selector::*_descriptors_index fields
radeonsi: move info fields from si_shader_selector to si_shader_info
rusticl: call nir_opt_intrinsics
radv: fix halved pixel throughput for a few non-blended 16bpp/32bpp formats
radeonsi: fix halved pixel throughput for a few non-blended 16bpp/32bpp formats
ac,radeonsi: move SX PS downconversion code into ac_formats.c
radv: use ac_set_sx_downconvert_state_for_mrt
nir/clip_cull_distance_utils: fix assertion failures with GL_EXT_mesh_shader
nir/clip_cull_distance_utils: add more assertions validating the type & sizes
ac/gpu_info: don’t read uninitialized dev_filename
ac/lower_ngg_mesh: fix a segfault accessing out_variables out of bounds
radeonsi: remove the PointSize output if it has no effect
radeonsi: fix slightly incorrect assertions in si_shader_ps
radeonsi: fix incorrect PS shader key with sample shading
radeonsi: fix clip/cull distance gathering for mesh shaders
amd: demystify various optimizations we already have for memory channels
ac/gpu_info: add #define AMD_MEMCHANNEL_INTERLEAVE_BYTES
ALL: use SHA1_DIGEST_LENGTH etc. instead of hardcoding the numbers
ALL: use #define and a copy helper to check and copy build_id
anv: use SHA1_DIGEST_LENGTH
util: use SHA1_DIGEST_STRING_LENGTH in fossilize_db
util: increase SHA1_DIGEST_LENGTH to 32 (BLAKE3_KEY_LEN)
util: remove SHA1, use BLAKE3 in its functions to switch everything to BLAKE3
gallium/util: print task/mesh statistics in util_end_pipestat_query
radv,radeonsi: don’t set LINE_STIPPLE_TEX_ENA on gfx12
ac: remove never enabled gfx12 HiS
radv: rename hiz_his to gfx12_*hiz
radeonsi: set WALK_ALIGN8_PRIM_FITS_ST=0 for 64K rendering
radeonsi: set FORCE_STENCIL_VALID less often on gfx12
radeonsi: rename hiz_his to gfx12_*hiz
radeonsi: use deprecated fb_cbufs and fb_zsbuf less
radeonsi: move most si_surface color fields into new si_cb_surface_info
radeonsi: move most si_surface z/s fields into new si_zs_surface_info
radeonsi: stop using si_surface::base
radeonsi: remove si_surface::dcc_incompatible
radeonsi: remove dead code in si_create_surface
radeonsi: move si_surface::width0/height0 code into si_initialize_color_surface
radeonsi: stop using create_surface
radeonsi: remove si_surface & create_surface
Mario Kleiner (12):
hk: Enable VK_KHR_present_id[2] and VK_KHR_present_wait[2]
wsi/display: Accept 0 nits for HDR light level properties for “undefined”
wsi/display: Initially set default HDR metadata from EDID for HDR modes
wsi/display: Allow atomic modeset for change of Colorspace or HDR properties
wsi/wayland: Zero min_luminance, max_luminance HDR light levels are valid.
util/format: Add util_format_is_unorm16()
dri,gallium: Add support for RGB[A]16_UNORM display formats.
egl/wayland: Support RGB[A]16_UNORM formats for display.
egl/drm: Support RGB[A]16_UNORM formats for display.
egl/surfaceless,device: Support RGB[A]16_UNORM formats for pbuffers.
ci/deqp: Pull in a fix for EGL render tests for rgba16 and rgb16 unorm
util/driconf: Disable EGL RGB[A]16 unorm configs on panfrost for now
Martin Roukala (né Peres) (12):
radv/ci: update the expectations of pre-merge jobs
zink/ci: update the expectations of RADV-based pre-merge jobs
ci: disable mupuf’s farm during the planned electric outage
Revert “ci: disable mupuf’s farm during the planned electric outage”
ci: disable mupuf’s farm
Revert “ci: disable mupuf’s farm”
freedreno/ci/a750: switch to the linux-firmware-provided gpu fw
freedreno/ci: update the a750 expectations
turnip/ci: update the vkd3d expectations
zink/ci: update the a750 expectations
ci: disable the valve-kws farm
Revert “ci: disable the valve-kws farm”
Mary Guillemard (30):
asahi/libagx: Stop exposing fake entrypoint _libagx_prefix_sum
asahi/libagx: Do not expose anything not used externally
nir: Rename stat_query_address_agx to stat_query_address_poly
compiler: rename vs.tes_agx bit to vs.tes_poly
asahi/gs: Remove agx_nir_* prefix around static functions
asahi: Move compiler preprocess out of agx_nir_lower_gs
asahi,nir: Stop relying on zero and scratch page in GS/TESS code
asahi/gs: Reuse GS shader compiler options
poly: Migrate AGX’s GS/TESS emulation to common code
mr-label-maker: Add poly
mr-label-maker: Remove mapi label
hk: Fix maxVariableDescriptorCount with inline uniform block
hk: Disable 1x in sampleLocationsSampleCounts
hk: Remove unused allocation in queue_submit
hk: Make width and height per block in HIC
hk: Allocate the temp tile buffer in copy_image_to_image_cpu
asahi: Update CI expectations
mailmap: Update my email
people: Update my email
nvk: Implement ISBE space sharing on vertex stage
panvk: Move FAU space info to panvk_compile_nir
panvk: Move late lowering to panvk_compile_nir()
nvk: Use rendering state attachment count when setting SET_CT_SELECT
docs/features: add anv to VK_EXT_shader_uniform_buffer_unsized_array
hk: Advertise VK_EXT_shader_uniform_buffer_unsized_array
docs/features: Update info on VK_KHR_pipeline_binary
hk: Use vk_device::mem_cache
hk: Advertise VK_KHR_pipeline_binary
hk: Hash the multiview mask for both vertex and fragment stages
nvk: Reenable compression support with nouveau 1.4.2
Matt Turner (2):
meson: Fix sysprof-capture-4 dependency
meson: Let -Ddraw-use-llvm=false work for R300 on non-x86
Mauro Rossi (3):
util: Fix gnu-empty-initializer error
radv/rt: Fix gnu-empty-initializer error
radv/rt: Fix gnu-empty-initializer error in radv_pipeline_rt.c
Maíra Canal (4):
teflon: Improve dumped graph formatting
teflon: List all supported operations on tflite_builtin_op_name()
docs/envvars: Document Teflon environment variables
docs/teflon: Update documentation with more recent output
Mel Henning (62):
nvk: Really fix maxVariableDescriptorCount w/ iub
nvk: VK_DEPENDENCY_ASYMMETRIC_EVENT_BIT_KHR
vulkan: Add vk_collect_dependency_info_src_stages
treewide: Use vk_collect_dependency_info_src_stages
docs/nvk: Add a list of external hardware docs
docs/nvk: Add some developer hardware docs
docs/nvk: Update hardware support
docs/nvk: Document NVK_DEBUG=trash_memory
docs/envvars: Remove references to nine
nak/nvdisasm_tests: Test plop3
nak/opt_lop: Don’t handle modifiers in dedup_srcs
nak/nvdisasm_tests: Turn sm_list() into a function
nak/nvdisasm_tests: Skip SM70 on cuda 13
docs/nvk: Fix description of supported GPUs
zink: Return zink_device in create_logical_device
zink: Make screen->queue_lock a pointer
zink: Create one queue lock per device
zink: Lock queue_lock in zink_destroy_screen
zink: Lock around screen_debug_marker_{begin,end}
nvk: Use the OS page size in nvk_AllocateMemory
nouveau/headers: Use 906f defines for nv_push.c
nouveau: Deduplicate drf.h
nouveau/headers: Use drf defines in nv_push.c
nouveau/headers: Use drf and cl906f.h in nv_push.h
nak: Split LegalizeBuilder into its own type
nak: DCE after legalize
nak/legalize: Use ConstTracker to skip some movs
nir: Add nir_deref_instr_is_arr() helper
treewide: Use nir_deref_instr_is_arr()
nir: Use instr_clone in rematerialize_deref_in_block
nak: Handle CS2R latencies in SSA form
nak: Add a Dst::file() helper function
nak: Set variable_latency=0 for !needs_scoreboard
nak: Add ShaderModelInfo
nak: Replace &dyn ShaderModel w/ &ShaderModelInfo
nak: Don’t box ShaderModelInfo
nak: Use the hardware’s max warps_per_sm value
nak: Factor out prev_multiple_of
nak: Reserve capacity in LiveSet::from_iter,extend
nak: Add a prepass instruction scheduler
nvk: Disable compression for image import/export
nvk: Set maxStorageBufferRange = maxBufferSize
nvk: Use semaphore helper for BufferMarker2AMD
nvk: Skip barriers if engine is not present
nouveau/winsys: nv_device_info.has_transfer_queue
nouveau/winsys: Set channel_alloc.tt_ctxdma_handle
nvk: Expose transfer-only queues
nak: impl fmt::Debug for SSAValue
nak: Take &ShaderModelInfo in instr_sched_common
nak: Use .file() helper in sm120_instr_latencies
util/rmq: Fix test upper bound
util/rmq: Fix uninitialized read in preprocess
util/rmq: Remove unused header
nouveau/drm-shim: Implement new getparam values
nak/copy_prop: Split out prop_to_ssa_values helper
nak: Copy-prop bindless cbuf handles
nak/instr_sched_prepass: Fix RegOut special case
nvk: Ignore meta ops in occlusion queries
nvk: Disable large pages for now
nvk: Initialize SET_ALPHA_TO_COVERAGE_OVERRIDE
nvk: Report additional host_image_copy layouts
zink: Emit float controls for preserve_denorms too
Michael Cheng (2):
anv: Add VMA allocator for shader binaries
anv: Switch shaders to dedicated VMA allocator
Michael Tretter (3):
r600: remove obsolete option for experimental NIR support
r600: fix documentation of preoptir
r600: remove LLVM dependency
Michal Krol (1):
lavapipe: Bump maxGeometryInputComponents to 128.
Michal Vanis (2):
glsl: replace gl ctx direct access
mesa: replace gl ctx direct access
Mike Blumenkrantz (28):
zink: consistently set/unset msrtss in begin_rendering
zink: disable primitiveFragmentShadingRateMeshShader feature
zink: set gfx_pipeline_state::mesh_pipeline when updating pipeline
zink: collapse gfx pipeline fetching and binding conditionals
zink: collapse mesh pipeline fetching and binding conditionals
zink: don’t destroy old push layout when enabling fbfetch descriptor
lavapipe: maintenance10
zink: return mesh pipeline when creating mesh pipelines
zink: add back atomics for internal refcounts
zink: correctly use GENERAL layout for dynamic texture clears
zink: allow rendering to emulated alpha images for clears
zink: flatten out params to nir_to_spirv()
zink: move xfb stride off zink_shader_info struct
zink: move ntv params into zink_shader_info
zink: move the ntv sparse checks into ntv
zink: use vk enum members for ntv util returns
zink: move ntv shader info to single-use screen member
zink: delete all the no-op checks when rewriting clears
zink: automatically rewrite clears where possible to avoid using format views
zink: rename msaa_expand to attachment_shadow
zink: improve checks for srgb mutability
zink: flag immutable handles as such when creating resources
zink: create new transient image if the sample count doesn’t match
zink: explicitly null pipe_resource::next when creating transients
zink: reuse transient attachments for format view shadowing
zink: re-allow transient images during blitting
ntv: emit demote extension/capability when emitting demote
ntv: emit ViewIndex with flat for fragment stage
Mohamed Ahmed (6):
nouveau/winsys: Store the nouveau kernel version
nouveau/winsys: Retrieve and store the PTE kind in the nouveau_ws_bo
nvk/nvkmd: Fix alignments
nil, nvk: Add plumbing for compression
nvk: Move non-sparse image plane VA allocation to bind time
nvk: Enable compression
Nanley Chery (17):
anv: Limit the SCANOUT flag to color images
anv: Allow modifiers on depth images
anv: Don’t allow STORAGE + CCS for Y_TILED mod
intel/isl: Only assert surface addresses on gfx9+
iris: Fix pipe control around fast-clears
iris: Add comments from Bspec fast-clear preamble page
intel/isl: Fix miptail selection for compressed textures
blorp: Fix Tile64 clear redescription assertion
intel/isl: Fix QPitch of arrayed MCS
iris: Set missing flags on clear color changes
iris: Use the CLEAR state on Xe2+ for MCS
anv: Update predicated resolve documentation
anv: Fix the fast clear type for FCV writes
anv: Don’t return the Xe2+ fast-clear type early
anv: Fix clear state of WSI blit sources during presentation
anv: Treat non-WSI PRESENT_SRC as TRANSFER_SRC
anv: Don’t set the display flag on WSI blit sources
Natalie Vock (69):
nir/lower_shader_calls: Repair SSA after wrap_instrs
aco: Add preload_preserved pseudo instruction
aco/ra: Add utility to clear PhysRegInterval
aco/ra: Also consider blocked registers as not containing temps
aco/ra: Skip blocked regs in get_reg_impl
aco/ra: Don’t clear fixed operand sources if they were blocked
aco/ra: Handle callee ABI preserved register constraints
aco/ra: Handle call ABI constraints
util/bitset: Wrap __size in braces
util: Add sparse bitset data structure
nir: Use sparse bitset for liveness information
radv: Fix PSO history with RT pipelines
aco/insert_nops: Consider s_setpc target susceptible to VALUReadSGPRHazard
radv/rt: Keep updated nodes always active
radv/rt: Correctly copy culling flags when updating to separate AS
radv: Move VMID reservation to vkCreateDevice
radv/rt: Refactor and split radv_nir_rt_shader.c
radv/rt: Use traversal vars for object origin/direction in ahit/isec
aco/live_var_analysis: Count linear VGPRs as always preserved by calls
aco: Remove unused p_reload_preserved def
aco: Record required call spills during live-var analysis
aco/spill: Handle calls
aco/spill: Reset scratch_rsrc on calls
aco/ra: Handle linear VGPRs allocated by p_startpgm
aco/spill: Create linear VGPRs for spilling ABI-preserved SGPRs
aco/spill: Restore registers spilled by call immediately
aco/lower_to_hw_instr: Add scratch size in call lowering
aco/util: Add aco::unordered_set
aco: Add pass for spilling call-related registers
radv/rt: Use subgroup invocation for stack index
radv/rt,aco: Always dispatch 1D workgroups for RT
aco: Swizzle ray launch IDs in the RT prolog
aco: Include arbitrarily fixed registers in max_reg_demand
aco/spill_preserved: Only reload linear VGPRs at end
aco: Don’t insert p_reload_preserved in loops
aco/lower_to_hw_instr: Preserve linearity of lowered linear VGPRs
aco/insert_waitcnt: Don’t determine linearity by reg number
aco/spill: Fix preserved reload operand update
aco/spill_preserved: Preserve linear VGPRs even if they aren’t p_spill operands
radv: Add traversal stack size to cache
aco/spill_preserved: Fix spilled VGPR overflow handling
nir/intrinsics: Add incoming/outgoing payload load/store instructions
aco/ra: Move register preservation logic in last block to p_return
aco: Remove bypass_reg_preservation
aco: Note if a parameter needs to be explicitly preserved
radv: Temporarily disable RT pipelines
radv: Refactor RT lowering decisions and add RADV_PERFTEST CPS override
radv/rt: Use function call structure in NIR lowering
radv,aco: Use function call structure for RT programs
radv: Re-enable RT pipelines
docs: Document RADV/ACO function calls
nir,aco: Clean up useless lowering of sbt_base_amd
radv: Use wave32 for RT on gfx11+
aco: Put boolean parameters inside SGPRs
aco: Tweak ABI register param limits
radv/rt: Don’t consider non-internal INTERSECTION shaders as the traversal shader
radv/nir: Add and use radv_nir_return_param_from_type helper
radv/nir: Make nir_lower_intersection_shader public
radv/rt: Fix terminate_ray handling for intersection shaders
radv/rt: Compile ahit/isec shaders to asm
radv/rt: Call ahit/isec shaders
aco: Add and use nir_abi_to_aco helper
aco: Add parameter assignment hints
aco: Use parameter assignment hints for any-hit shaders
aco: Fix parameter stack size calculation
radv/rt: Refactor shader group stack size calculation to include traversal stack
aco: Don’t exclude discardable parameters from register preservation
radv/rt: Fix some tail-call compatibility checks
radv/rt: Fix discardable attributes on chit and traversal shaders
Nick Hamilton (8):
pvr: Fix staging buffer realloc usage
pvr: Fix missing frees in error exit paths
pvr: Fix missing sample mask test instructions
pco: Fix encoding of branch to an empty block
pco: Fix for shadow sampler comparison not clamping the compare value
pvr: Temporarily disable the buffer device address extension
pco: Fix for atomic operations on an image buffer
pvr: Fix the isp samples per tile calculation
OPNA2608 (2):
vc4: Fix printing of get_tiling.modifier
rocket: Fix printing of rknpu_mem_create.dma_addr
Olivia Lee (15):
panfrost: fix cl_local_size for precompiled shaders
hk: fix data race when initializing poly_heap
panvk/csf: fix uninitialized read in draw context
panvk/csf: explicitly set ls_sb_slot in set_fbds_provoking_vertex
panvk/csf: put precomp syncobj behind PANLIB_BARRIER_CSF_SYNC option
panvk/csf: add PANLIB_BARRIER_CSF_WAIT, to insert WAIT after precomp
panvk/csf: factor out cs_match_iter_sb helper macro
panvk/csf: merge v10 and v11 paths in issue_fragment_jobs
poly: add messages to static_assert calls
panvk/csf: implement VK_EXT_primitives_generated_query except primitive restart
panvk/csf: implement dynamic precomp dispatch size
panvk/csf: implement VK_EXT_primitives_generated_query primitive restart
panvk: advertise VK_EXT_primitives_generated_query on v10+
Revert “panvk: advertise VK_EXT_primitives_generated_query on v10+”
hk: fix hk_passthrough_gs_key size computation
Patrick Lerda (12):
r600: fix r600_draw_rectangle refcnt imbalance
r600: update nplanes support
r600: limit pre-evergreen predicate ready size
r600: fix rv770 read scratch compatibility
r600: fix error filters compatibility
r600: improve cayman scissor 1x1 workaround
r600: fix cayman msaa shading behavior
r600: fix rv770 dot4 operations
r600: make vertex r10g10b10a2_sscaled conformant on palm and beyond
r600: fix rv770 clamp to max_texel_buffer_elements
r600: update cubearray imagesize calculation
r600: improve vs_as_ls switch reliability
Paul Gofman (1):
driconf: add a workaround for Investigation Stories : gunsound
Paulo Zanoni (9):
hasvk: restore anv_is_aligned()
blorp: fix argument indentation
blorp: replace magic ‘2’ with BLORP_NUM_BT_ENTRIES
blorp: reorganize struct blorp_params
intel/blorp: blorp_blit_vars_init() doesn’t need ‘key’
intel/blorp: generate the fast_clear_surf shaders later
intel/blorp: unionize blorp_params->wm_inputs
intel/blorp: add blorp_shaders.cl
meson: crocus and intel_hasvk now require clc
Pavel Ondračka (13):
r300/ci: update expectations
r300: fix dummy_vs leak
r300: fix overflow in r300_draw_elements_immediate
r300: fix locked_zbuffer leak
r300: fix constant remap table leak
r300/ci: asan testing
r300/ci: remove RV530 and RV380 non-asan deqp jobs
r300: program explicit scissor around viewport
r300: pop-free clipping
r300: enable guardband for draw
nir/opt_algebraic: improve dot product narrowing
r300: add explicit late lowering for a + -0
r300: invalidate texture cache when clearing texture bound for sampling
Peyton Lee (2):
radeonsi/vpe: correct tone mapping parameters
radeonsi/vpe: correct format setting
Pierre-Eric Pelloux-Prayer (25):
radeonsi: limit the sqtt buffer size
radeonsi: set VS dirty bit from si_vs_key_update_inputs
radeonsi: propagate shader updates for merged shaders
ac/virtio: remove dead code
ac/virtio: fix incorrect NULL check
ac/info: get vm_always_valid support through ac_linux_drm
radv: enable global BO list if vm_always_valid is supported
radeonsi/sqtt: clear out sqtt bo on resize
mesa: fix function prototype
mesa: remove unused image debug code
mesa: consider Attrib.MinLayer in do_blit_framebuffer
hud: only increase y if the pane contains graphs
hud: add new ‘dev’ pseudo-graph
ac/descriptors: account for num_storage_samples for gfx10
mesa: add assert to validate the no atomic path
Revert “glthread: mark internal bufferobjs for the ctx they belong to”
ci: enable shader-db test for radeonsi
ac/sdma: fix ac_sdma_get_tiled_header_dword for older gen
ac/sdma: fix src/dst pitch for sdma < 4
radeonsi: add a si_set_barrier_flags helper
radeonsi: fix references to sctx->flags in documentation
radeonsi: add a si_clear_and_set_barrier_flags helper
radeonsi: add extra flags param to si_emit_barrier_direct
radeonsi/sqtt: restore barrier_flags in si_sqtt_init_cs
radeonsi: add asserts to validate emit functions use of atoms
Piotr Masłowski (5):
hk: promote VK_EXT_robustness2 to VK_KHR_robustness2
hasvk: promote VK_EXT_robustness2 to VK_KHR_robustness2
nvk: promote VK_EXT_robustness2 to VK_KHR_robustness2
tu: promote VK_EXT_robustness2 to VK_KHR_robustness2
lvp: promote VK_EXT_robustness2 to VK_KHR_robustness2
Pohsiang (John) Hsu (24):
mediafoundation: add stats resource pool so we can use pool for QP map as well
mediafoundation: fix sporadic build failure with u_inlines.h not found on test target
mediafoundation: for low latency, change stats pool size to 2, because there is no synchronization between returning MF sample and ProcessInput
mediafoundation: periodic clang-format, no code changes
mediafoundation: setup wpp logging in more of the files and add some error handling on dpb manager and reference frame tracker
mediafoundation: add support for initial pool size and max pool size for stats pool
mediafoundation: periodic clang-format
mediafoundation: remove private CODECAPI_AVEncVideoEnableFramePsnrYuv as this is published
mediafoundation: remove unused code
mediafoundation: propagate input timestamp / duration to output
mediafoundation_frontend: update version to 1.08
mediafoundation: log warning if dx11 device is not created with multithread protected
d3d12: Fix lack of flushing when encoding h264 with SVC
mediafoundation: turn on slice auto on frames with dirty rect only
mediafoundation: propagate PrepareForEncode error up.
mediafoundation: add some end of function error logging for diagnosing error
mediafoundation: remove unneeded memset (~34KB for hevc)
mediafoundation: remove unused templ and small code cleanup
mediafoundation: handle the case where output sample is returning after MFT has been released.
d3d12: add missing updating of pMetadata
mediafoundation: add logging
mediafoundation: rename VideoEncodeReconstructedPicture to VideoEncodeD3D12ReconstructedPicture
d3d12: fix slice support for setting number of coding units per slice
mediafoundation: set rc mode in GetCodecPrivateData for 2 pass rc mode
Qiang Yu (74):
radeonsi: enlarge SI_NUM_SHADERS for mesh shader
radeonsi: handle mesh shader when si_create_shader
radeonsi: add context shader state for mesh shader
radeonsi: inline uniform support mesh shader
radeonsi: add si_mesh_resources_add_all_to_bo_list
radeonsi: add task/mesh shader info to si_shader_info
radeonsi: calc workgroup size for mesh shader
radeonsi: init mesh shader args
radeonsi: init pm4 state for mesh shader
radeonsi: no ngg culling for mesh shader
radeonsi: add task info to screen
radeonsi: lower task/mesh shader io to mem
radeonsi: kill outputs for mesh shader
radeonsi: share some vertex pipe function with mesh pipe
radeonsi: update scratch va for mesh shader
radeonsi: si_get_vs support mesh shader
radeonsi: simplify si_update_rasterized_prim while handle mesh shader
radeonsi: save mesh shader when blit
mesa,radeonsi: add comments about vertex and mesh pipeline shader states
gallium/blitter: no need to save TS state
mesa,gallium: not touch TS when internal draws
radeonsi: call si_shader_change_notify when vs bind
radeonsi: emit shader pointer for mesh shader
radeonsi: export si_set_user_data_base for mesh shader usage
radeonsi: add mesh shader state create/delete/bind
radeonsi: add mesh shader debug options
radeonsi: select key for mesh shader
radeonsi: support mesh shader per vertex output
radeonsi: si_get_output_prim_simplified support mesh shader
radeonsi: si_select_hw_stage support mesh shader
radeonsi: compile mesh shader with ACO only
radeonsi: dump shader key for mesh shader
radeonsi: add mesh shader bits for dirty_shaders_mask
radeonsi: compute vs_output_ps_input_cntl for mesh shader
radeonsi: support mesh shader per primitive output
radeonsi: support fragment shader per primitive input
radeonsi: handle primitive indices for mesh shader
radeonsi: lower mesh shader outputs
radeonsi: add si_emit_buffered_gfx_sh_regs_for_mesh
radeonsi: add radeon_emit_alt_hiz_packets for mesh shader
radeonsi: don’t put descs in user sgpr for task shader
radeonsi: init task shader args
radeonsi: change arg for si_cp_dma_prefetch
radeonsi: export si_setup_compute_scratch_buffer for task shader
radeonsi: add si_upload_shader_descriptos
radeonsi: add si_emit_task_shader_pointers
winsys/amdgpu: support gang submit for kernel queue
radeonsi: add task/mesh shader context states
radeonsi: implement task ring nir intrinsic lower
radeonsi: log cs support mesh shader
radeonsi: export si_init_compute_preamble_state for task shader
radeonsi: move shared_size to si_shader_variant_info
radeonsi: add si_create_compute_state_for_nir
radeonsi: init mesh shader ngg info
radeonsi: implement nir_intrinsic_load_ring_mesh_scratch_amd
radeonsi: increase task wait count when emit barrier
radeonsi: add task shader queries support
radeonsi: lower mesh shader local id and workgroup id
radeonsi: si_emit_buffered_compute_sh_regs support gang cs
radeonsi: compute culldist_mask and clipdist_mask for mesh shader
radeonsi: add si_update_shaders_shared_by_vertex_and_mesh_pipe
radeonsi: add si_update_shaders_for_mesh
radeonsi: add si_emit_rasterizer_prim_state_for_mesh
radeonsi: add mesh shader functions
radeonsi: handle maybe per primitive input for fragment shader
radeonsi: si_calculate_max_simd_waves support task and mesh shader
radeonsi: enable EXT_mesh_shader
doc: mark GL_EXT_mesh_shader as done
dri: avoid sending too many present requests when app start or pause
glsl: support barrier() for task and mesh shader
ac/llvm: workaround legacy fma intrinsic crash on gfx12
radeonsi: fix primitive restart gpu hang for pre gfx10
radv: fix primitive restart gpu hang for pre gfx10
radeonsi: fix mesh shader outputs kill
Radu Costas (1):
pvr: Add calculation for spill/scratch buffers
Reilly Brogan (1):
amd,compiler: fix const errors found with C23 glibc support
Rhys Perry (63):
amd/lower_mem_access_bit_sizes: don’t create subdword UBO loads with LLVM
amd/lower_mem_access_bit_sizes: improve subdword/unaligned SMEM lowering
amd/lower_mem_access_bit_sizes: be more careful with 8/16-bit scratch load
nir/lower_mem_access_bit_sizes: increase chunk limit
amd/lower_mem_access_bit_sizes: fix shared access when bytes<bit_size/8
ac/nir: stop using NIR_PASS in ac_nir_lower_ngg_nogs()
radv: remove NIR_PASS in radv_nir_lower_rt_abi
radv: stop rallocing objects which don’t belong to the shader under it
radv: remove NIR_PASS in insert_rt_case
nir/lower_shader_calls: reobtain impl after NIR_PASS
nir/lower_tex: optimize txd(coord, ddx/ddy(coord))
ac/nir: refactor move_coords_from_divergent_cf a bit
ac/nir: optimize txd(coord, ddx/ddy(coord))
radv,radeonsi: use optimize_txd
ac/nir: don’t consider quads incomplete inside loops
aco/scheduler: fix register demand check
ac/nir: add some tests for ac_nir_lower_mem_access_bit_sizes
aco/ra: copy vector_info to affinities
aco/ra: add first loop header phi operand to temp_to_phi_resources
aco: print large p_parallelcopy using several lines
ac/nir: fix calculation of aligned_new_size
ac/nir: fix check for increasing size of non-descriptor loads
ac/nir: don’t vectorize 16-bit shared loads to 8-bit
aco: micro-optimize ray launch ID swizzling
aco: use correct addition opcodes in gfx6-8 RT prolog
aco/ra: refactor update_renames slightly
aco/ra: omit renaming when necessary when moving copy definitions
aco: always run RA validation during tests
aco: add RA validation for p_call
aco/ra: remove dead code in split_blocking_vectors
aco/ra: discard tmp_file after get_regs_for_copies fails
aco/ra: fix operands when recreating blocking vectors
aco/ra: use original name for blocking vectors rename
aco/ra: update register file when recreating blocking vectors
aco/ra: fix split_blocking_vectors with some subdword vectors
aco/ra: emit p_split_vector after p_parallelcopy
aco/tests: add function call regalloc tests
radv/rt: cleanup phis after lowering parameter variables to SSA
radv/rt: lower non-return load_param to variable loads
aco: track number of post-RA spilled vgprs/sgprs
aco: don’t try to preserve SCC in callees
aco/ra: don’t use update_vgpr_sgpr_demand in increase_register_file
aco: move update(fixed_reg_demand) into update_vgpr_sgpr_demand
aco: increase max_reg_demand to help avoid preserved VGPRs
aco/ra: always prefer earlier regs in get_reg_impl() if costs are the same
aco/ra: always abort loop in get_regs_for_copies() if candidate is worse
aco/ra: refactor get_reg_impl and get_regs_for_copies using tuples
aco/ra: prefer clobbered registers in callees
aco/ra: prefer clobbered registers in get_reg_specified()
aco/ra: consider already-used preserved registers to be free
aco/sched: don’t use previously unused preserved registers
aco: don’t spill no-op copies of input parameters in preserved registers
lavapipe,nv50/ir,lima: run nir_opt_algebraic_late
nir: add fcanonicalize
aco/ra: copy precolor affinities to p_create_vector/p_split_vector
aco/ra: move split_blocking_vectors higher
aco/ra: split blocking vectors if needed when handling fixed operands
aco: remove dead p_call code in live_var_analysis
aco/tests: remove vcc definitions from p_call
aco: use size_t for monotonic_buffer_resource
aco: reduce memory usage of live_var_analysis
aco/insert_fp_mode: remove incorrect assertion
radv: fix when incomplete rt pipeline libraries are loaded from cache
Ritesh Raj Sarraf (3):
ci: Use Linux 6.17.3 for mesa gfx-ci
freedreno/ci: Drop KERNEL_TAG retargeting the new Linux 6.17.3
ci/virgl: Mark test job for Linux 6.16
Rob Clark (165):
freedreno/a6xx: Additional handle import logging
loader: Ignore empty override strings
freedreno: Move *_POWER_CNTL to raw_magic_regs
freedreno: Move TPL1_DBG_ECO_CNTL to raw_magic_regs
freedreno: Move GRAS_DBG_ECO_CNTL to raw_magic_regs
freedreno: Move SP_CHICKEN_BITS to raw_magic_regs
freedreno: Move UCHE_CLIENT_PF to raw_magic_regs
freedreno: Move PC_MODE_CNTL to raw_magic_regs
freedreno: Move SP_DBG_ECO_CNTL to raw_magic_regs
freedreno: Move HLSQ_DBG_ECO_CNTL to raw_magic_regs
freedreno: Move VPC_DBG_ECO_CNTL to raw_magic_regs
freedreno: Move UCHE_UNKNOWN_0E12 to raw_magic_regs
freedreno: Move RB_CCU_DBG_ECO_CNTL to raw_magic_regs
freedreno: Flatten fd_dev_info props
freedreno: Move magic/magic_raw out of props
freedreno: Collapse A6XXProps/A7XXProps
freedreno/a6xx: Fix UB in convert_color()
freedreno: Fix internal VBO reference leak
freedreno: Remove use of FDL_MIN_UBWC_WIDTH
freedreno/registers: Fix definition of CP_COND_EXEC
freedreno/crashdec: Dump cmdstream at end
freedreno/crashdec: Log IBs to snapshot
freedreno/registers: Convert events to hex
freedreno/registers: Event cleanups
freedreno/registers: Move FLAGS_REGID
freedreno/a6xx: Move VFD_RENDER_MODE emit
freedreno/a6xx: Use with_crb() helper
freedreno: flip template param order
freedreno/a6xx: genx helper for additional template param
freedreno/decode: Drop summary override for CRB
freedreno/a6xx: Add RB_DBG_ECO_MODE helper
freedreno: More ergonomic cs casting
freedreno/a6xx: Pass cs to fd6_clear_lrz()
freedreno/a6xx: Drop emit_marker6()
freedreno/a6xx: Drop fd6_emit_blit()
freedreno/a6xx: Rework where we emit ccu cache cntl
freedreno/a6xx: Emit RB buffer setup for sysmem too
freedreno/a6xx: Split preamble for gmem vs sysmem
freedreno/a6xx: Be more precise about CP_SET_MARKER
freedreno/a6xx: Actually use lrz fast clear
freedreno/a6xx: Add helper to set render mode
freedreno/a6xx: Add helpers for preamble const loads
freedreno/a6xx: Fix debug comment
freedreno/decode: Fix bindless descriptor dumping
freedreno/decode: Print mode for compute shaders
freedreno/decode: Add extra indent levels
freedreno/registers: Fix a few field names
freedreno/registers: Rename SP_HLSQ_MODE_CNTL
freedreno/registers: Name RB_LRZ_CNTL2
freedreno/registers: Name HYSTERESIS regs
freedreno/registers: Fix GRAS_LRZ_CNTL definition
freedreno: Add chip range template helpers
freedreno/registers: Extend ncrb builder for new gens
freedreno/lrz: Extend lrz fc helpers for gen8
freedreno/event: Extend event helpers for gen8
freedreno: Add gen8 device info
freedreno/common: Make max tile dimensions a param
freedreno/common: Add placeholder a8xx device
freedreno/drm-shim: Add a830
freedreno: Add gen8 chip template-fu
freedreno/registers: pm4 updates for gen8
freedreno/registers: Fix gen8 swizzle enum
ir3: Skip non-bindless ldc warmups
ir3: Fix gen8 instruction timings
ir3: Fix cat3 latency
ir3: Limit CS lock/unlock quirk
ir3: Extract out helper for nop flags
ir3: Add (sy) before end of preamble when necessary
ir3: Add disasm test macro for gen8
ir3: Add (eostsc)
ir3: Add cat1 (sat) bit
ir3: Add cat3 alt immed encoding
ir3: Add cat3 flut src encoding
ir3: Add mova .u bit
ir3: Use ldc.u in preamble
ir3: Add mova.r encoding
ir3: Fix gen8 ldc encoding
ir3: Add new cat2 instructions
ir3: dp2acc is removed in gen8
ir3: Add new cat3 instructions
freedreno/registers: Fix gen8 UBWC array pitch
freedreno/registers: Add TPL1_MODE_CNTL bitfields
freedreno/fdl: Fix gen8 TEX_LINE_OFFSET
freedreno/fdl: Fix gen8 buffer depth
freedreno/a6xx: Handle tess_bo size differences for gen8
freedreno/registers: More gen8 prep
freedreno/registers: gen8 support
freedreno/a6xx: Drop log_pipeline_stats()
freedreno/a6xx: Add gen8 query support
freedreno/a6xx: Fix VSC_BIN_SIZE for gen8
freedreno/computerator: gen8 support
freedreno: gen8 support
freedreno/common: Add A840 and X2-85
ir3: Fix early-preamble (sy)
gallium/aux: Add debug option to force u_upload rollover
gallium: Make upload_cb0 return a releasebuf
asahi: Set prefer_real_buffer_in_constbuf0
freedreno/devices: Add num_slices
freedreno/a6xx: Fix GRAS_LRZ_BUFFER_PITCH
freedreno/a6xx: Fix GRAS_LRZ_BUFFER_SLICE_PITCH
freedreno/lrz: Add gen8 lrz layout support
freedreno/a6xx: Fix layered lrz
freedreno/a6xx: gen8 lrz support
freedreno/a6xx: Set FD_BO_NO_HARDPIN from meson
freedreno/registers: Mark LOAD_IMMED as a5xx
freedreno/a6xx: Drop legacy CP_EVENT_WRITE builders
freedreno/registers: Move ‘unknown’ last
freedreno/registers: Reintroduce FD_NO_DEPRECATED_PACK
tu: Drop tu_cs_image_*_ref
tu: Drop use of legacy reg offset macros
tu: Rework pipeline stat queries
tu: Convert tu_clear_bit deprecated reg builders
tu: Convert tu_cmd_buffer deprecated reg builders
tu: Convert tu_shader deprecated reg builders
tu: Rework emit_xs_config()
tu: Rework emit_vpc()
tu: Convert rest of tu_pipeline deprecated reg builders
tu: Drop FD_NO_DEPRECATED_PACK
freedreno/a6xx: Move assert
freedreno/a6xx: Extract out GMEM cache helper
tu: Use GMEM cache helper
tu: Convert viewport state to CRB
tu: Convert emit_lrz_buffer to CRB
freedreno/fdl: Fix gen8 buffer descriptors
freedreno/fdl: Add STRUCTSIZETEXELS arg
tu: Replace A6XX_TEX_CONST_DWORDS
tu: Plumb CHIP thru descriptor set building
tu: Use more fdl6_buffer_view_init()
tu: Extract out descriptor helpers
tu: Fix TU_DRAW_STATE_VB size
tu: Fix zero length pkt4
freedreno/a6xx: Fix gen8 blitter resolve
freedreno/computerator: Use correct CP_SET_RENDER_MODE
tu: Use correct LRZ flush events on A7XX
tu: Track dirty TCS state
tu: Move PC_DS_PARAM emit after early-exit
tu: Move CP_SET_SUBDRAW_SIZE out of SDS
freedreno/rnn: Track min/max offset
freedreno/decode: Add regex support for query-mode
freedreno: Disable has_rt_workaround for gen8
freedreno: Disable supports_double_threadsize for gen8
tu: Convert foveat state to CRB
freedreno/fdl: Fix gen8 MUTABLEEN
freedreno/fdl: Fix gen8 sRGB buffers
freedreno/registers: Fix gen8 UV_PITCH
freedreno/registers: Add subpass fence events
freedreno/registers: Fix gen8 GRAS_SU_STEREO_CNTL
freedreno/registers: Fix gen8 TPL1_MODE_CNTL
freedreno/registers: Fix gen8 TPL1_A2D_BLT_CNTL
freedreno/registers: Fix GRAS_LRZ_CB_CNTL
freedreno/registers: Fix py array reg offsets
freedreno/registers: Update gen8 FDM regs
freedreno/registers: Update gen8 VRS registers
ir3: Avoid narrowing int conversions from GPR on SALU
ir3: Skip shading_rate lowering when unneeded
ir3: Limit 64b atomic 16b offset quirk to a7xx
tu: Support acceleration_structure for wave64
tu: gen8 descriptor support
tu: Add helper to set render mode
tu: gen8 sampler support
tu: gen8 support
freedreno/common: Fix gen8 EFU float control
freedreno: Force single wavesize if double threadsize is unsupported
freedreno/lrz: Correct lrz fc layout for gen8
freedreno/a6xx: Better program state size calc
Rohan Garg (2):
anv: program STATE_COMPUTE_MODE to flush the L1 cache
anv: implement resource barrier emissions
Roland Scheidegger (3):
llvmpipe: do bounds checking for shared memory
llvmpipe: implement strict d3d11 rules for centroid interpolation
llvmpipe: optimize the centroid implementation
Romaric Jodin (10):
pan/va: make valhall_parse_isa input explicit
aux/trace: remove -I argument
pan/bi: improve bi_alu_src_index to avoid bi_make_vec when possible
pan/bi: improve vectorization of 8bit alu
pan/bi: do not vectorize nir_op_f2{i,u}8
pan/bi: do not vectorize nir_op_f2fmp
pan/bi: fix destination of v4i8 instruction returning only v2i8
pan/bi: bi_alu_src_index: remove invalid assert
pan/va: Add missing 8bit widen swizzles
pan/bi: Keep vectorized phis
Rudi Heitbaum (1):
mesa: retain const qualifier from pointer
Ryan Houdek (2):
freedreno/fdl: Fix typo in tiled_to_linear_2cpp
freedreno/fdl: Optimize linear_to_tiled with avx2
Ryan Mckeever (10):
mailmap: update my name and email
people: update my name/email
nir: add support for pixel_local_storage variables
compiler/glsl: replace tabs with spaces
glapi: add EXT_shader_pixel_local_storage extension
glsl, mesa: add EXT_shader_pixel_local_storage extension
gallium, mesa: keep track of pixel local storage state
pan/bi: introduce EXT_shader_pixel_local_storage support to compiler
pan/lib: prepare for pixel local storage support
panfrost: enable EXT_shader_pixel_local_storage
Sagar Ghuge (18):
anv: Call brw_nir_lower_rt_intrinsics_pre_trace lowering pass
brw/rt: Move nir_build_vec3_mat_mult_col_major helper to header
brw/rt: fix ray_object_(direction|origin) for closest-hit shaders
vulkan/runtime: Fix typo in stack size calculation
anv: Use correct engine class for companion RCS
anv: Drop unwanted untyped flush for AS query
intel/common: Consider 0 threads while setting TG
intel/genxml: Update CS_CHICKEN1 register for gfx20
anv: Replay mode is only available on Gfx < 20
anv: Convert indirect to direct dispatch
vulkan/runtime: Account for pipeline libraries stage count
anv/rt: Increment block count only for valid children
blorp: Set persample_msaa_dispatch for render shader
blorp: Handle 2D MSAA array image copies on compute shader
anv: Stop using RCS companion for MSAA copy/clear on Xe3+
anv: Add host barrier while dumping out BVH data
anv/rt: Don’t always set disableOpacityCull bit
anv/rt: Drop atomic operations on opacity flags
Samuel Pitoiset (313):
radeonsi: use ac_emit_write_data_imm() more
radv: use ac_emit_cond_exec() more
amd,radv,radeonsi: add ac_emit_cp_set_predication()
amd: add a predicate parameter to ac_emit_cp_pfp_sync_me()
radv: use ac_emit_cp_pfp_sync_me() more
amd,radv,radeonsi: add ac_emit_cp_gfx11_ge_rings()
amd,radv,radeonsi: add ac_emit_cp_tess_rings()
amd,radv,radeonsi: add ac_emit_cp_gfx_scratch()
amd,radv,radeonsi: add ac_emit_cp_acquire_mem()
amd,radv,radeonsi: add ac_cmdbuf_flush_vgt_streamout()
radv/ci: uprev kernel to 6.17.3 + drm/buddy backported fixes for zerovram
radv/ci: use the custom 6.17.3 kernel for NAVI21/NAVI31
radv/ci: use the custom 6.17.3 kernel for POLARIS10
radv/ci: drop RADV_PERFTEST=video_decode,video_encode for NAVI31
radv/ci: bump number of deqp-runner jobs to 32 for GFX1201
radv/ci: set RADV_DEBUG=novideo for NAVI21
radv/ci: set RADV_DEBUG=novideo for NAVI31 too
radv: remove a useless check when destroying descriptor sets
radv: add a small helper to destroy descriptor pool entries
radv: simplify allocating pool entries for descriptor sets
radv: use vk_zalloc2() for allocating the descriptor pool
radv: simplify error handling when creating descriptor pools
radv: pass int_sel to radv_cs_emit_write_event_eop()
radv: remove useless parameter to gfx10_cs_emit_cache_flush()
radv: simplify L2 cache flushes on < GFX12
radv: remove an obsolete comment about SMEM stores
radv: use ac_emit_cp_copy_data() more for perfcounters
amd,radv: add ac_emit_cp_atomic_mem()
amd: add missing _cp_ to some emit helpers
amd,radv,radeonsi: add ac_emit_cp_nop()
amd,radv,radeonsi: add ac_emit_cp_load_context_reg_index()
amd: add a predicate parameter to ac_emit_cp_copy_data()
radv: use ac_emit_cp_copy_data() more
amd,radv,radeonsi: add ac_emit_cp_write_data_{head}()
amd,radv: move SDMA utility helpers to common code
amd: move CP emit helpers to ac_cmdbuf_cp.c/h
radv: gather push constant size from shaders for ESO
radv/rt: gather push constant size from shaders for RT
radv: gather push constant size from shaders for pipelines
radv: remove radv_shader_layout::push_constant_size
radv: remove radv_pipeline_layout::push_constant_size
radv: bump maxImageArrayLayers to 8192 on GFX10+
radv: bump maxImageDimension3D to 8192 on GFX10+
radv: initialize image properties earlier
radv: configure the screen scissor to the maximum image dimension
radv: bump image limit properties on GFX12
amd,radv,radeonsi: add ac_pm4_emit_commands()
radv/amdgpu: use common emit helpers in radv_amdgpu_cs_chain_dgc_ib()
amd,radv: add ac_emit_cp_indirect_buffer()
radv/amdgpu: remove now unused radeon_emit helpers
amd,radv,radeonsi: add and use more ac_cmdbuf_XXX helpers
amd,radv,radeonsi: add ac_emit_cp_inhibit_clockgating()
amd,radv,radeonsi: add ac_emit_cp_spi_config_cntl()
amd,radv,radeonsi: add ac_emit_spm_setup()
amd,radv,radeonsi: add ac_emit_cp_release_mem()
radv/ci: stop skipping dEQP-VK.descriptor_indexing.* on Cezanne
radv/ci: update comments around video failures
radv: dirty dynamic descriptors when required
radv: add radv_bind_{graphics,rt,compute}_pipeline() helpers
radv: use a linked-list for storing descriptor pool sets
radv: implement a new descriptor sets allocator
vulkan: update spec to 1.4.330
vulkan: exclude non-existent Shader64BitIndexingEXT SPIR-V capability
spirv: Update the JSON and headers
radv: use radv_buffer_get_va() more
radv/amdgpu: use radv_amdgpu_bo_va_op() for BOs from pointer
radv/amdgpu: add a way to wait for VM updates at alloc time
radv: add radv_wait_for_vm_map_updates drirc and enable for Forza Horizon 5
amd,radv,radeonsi: move some GFX12 emit helpers to common code
amd,radv,radeonsi: add ac_{gfx11_reg_pair,gfx12_reg}
amd,radv,radeonsi: add ac_buffered_sh_regs
amd,radv,radeonsi: move GFX12 push SH REGS helpers to common code
radv: advertise VK_EXT_shader_uniform_buffer_unsized_array
radv: remove some RADV_DEBUG deprecated options
radv: fix reserving enough space for emitting the SPM setup
radv: ignore dual-source blending when blending isn’t enabled for MRT0
radv: implement vkCmdEndRendering2KHR()
radv: allow NULL pSamplesMask with vkCmdSetSampleMaskEXT()
radv: add support for depth/stencil resolves with vkCmdResolve2()
radv: reverse the logic for NO_CONCURRENT_WRITES_BITS_MESA
radv: implement new input attachment information for dynamic rendering
radv: allow ds<->color copies on compute/transfer queues
radv: add support for controlling sRGB transfer function with resolves
radv: advertise VK_KHR_maintenance10
radv,vulkan: replace VK_RENDERING_INPUT_ATTACHMENT_NO_CONCURRENT_WRITES_BIT_MESA
radv: add a workaround for illegal depth/stencil descriptors with No Man’s Sky
radv: fix creating linked graphics ESOs with a compute shader
radv: use radv_get_shader_layout() more with ESO
radv/sqtt: do not try to resize the SQTT buffer for per-submit captures
aco: fix reserving VGPRs for 64-bit attributes in VS prologs
radv,aco: wait for all VMEM loads when the prolog loads large 64-bit attributes
amd,radeonsi: add GFX11 packed context registers helpers to common code
radv: add GFX11 packed context registers helpers
radv: add separate functions for emitting framebuffer on GFX11-11.5
radv: use GFX11 packed context regs
radv: support more tessellation parameters with TCS for ESO unlinked shaders
radv/ci: remove RADV_PERFTEST=video_encode,video_code for GFX6-7
radv/tests: use vkGetPipelineKeyKHR() instead of compiling pipelines
radv: move back ac_sqtt_{init,finish}() to the right places
ac/surface: ban 256KB swizzle modes for non-MSAA images on GFX11+
radv: add vk_wsi_disable_unordered_submits and enable for GTK
radv/meta: remove useless blit2d_src_temps
radv/meta: split radv_meta_blit2d() into two separate functions
radv/meta: remove radv_meta_blit2d_rect
radv/meta: remove multiple aspects in radv_gfx_copy_memory_to_image()
radv/meta: simplify radv_gfx_copy_memory_to_image() even more
radv/meta: simplify aspect/formats in radv_gfx_copy_image()
radv/meta: rework radv_meta_nir_texel_fetch_build_func
radv/meta: fuse depth/stencil aspects copy with the GFX path
radv/amdgpu: add a way to identify preamble/postamble when dumping CS
radv: add RADV_DEBUG=dumpibs to dump command buffers
ac/parse_ib: decode SDMA_OPCODE_POLL_REGMEM
radv: fix supporting more tess parameters with TCS for ESO unlinked shaders
radv: bump maxRayDispatchInvocationCount to 2^30
radv: fix gathering push constants from shaders with ESO
radv: add a workaround for color<->stencil only copies on SDMA4-5
spirv: Update the JSON and headers
vulkan: update spec to 1.4.333
ac,radv: add ac_emit_sdma_constant_fill()
ac,radv,radeonsi: add ac_emit_sdma_copy_linear()
radv: remove unnecessary handling of SDMA in radv_cs_emit_write_event_eop()
ac,radv,radeonsi: add ac_emit_sdma_copy_linear_sub_window()
ac,radv,radeonsi: add ac_emit_sdma_copy_tiled_sub_window()
ac,radv: add ac_emit_sdma_copy_t2t_sub_window()
radv: remove now unused SDMA helpers
vulkan: add support for vkCustomResolveCreateInfoEXT
radv: implement VK_EXT_custom_resolve
radv: advertise VK_EXT_custom_resolve
ci: build drm-shim for RADV tests in debian-vulkan
radv/tests: require drm-shim and use it instead of RADV_FORCE_FAMILY
radv: always use MALL for CP DMA operations on GFX12
radv: remove unreachable code for prefetch in radv_cs_emit_cp_dma()
radv: fix RB+ for depth-only with unused attachments
amd/drm-shim: export a function that allows to select a different device
meson: require drm-shim for ACO tests
aco/tests: switch to drm-shim
vulkan: stop excluding Shader64BitIndexingEXT SPIR-V cap
radv: allocate the SQTT BO in GTT for faster readback
ac/spm: add cache counters configuration for GFX12
ac/spm: adjust the granularity of SPM results on GFX12
ac/spm: use hardware names for performance counters
radv: enable RADV_THREAD_TRACE_CACHE_COUNTERS on GFX12
radv: remove the ability to create NULL devices with RADV_FORCE_FAMILY
ac/spm,radv,radeonsi: configure the SPM sample interval in common code
radv: only reset SPM when cache counters are enabled with RGP
ac,radv,radeonsi: add more SPM helpers to common code
radv: reformat debug/perftest options arrays
radv: use a separate parameter for radv_rt_wave64
radv: use a separate parameter for radv_disable_dcc
radv: ignore radv_disable_dcc{_mips} drirc options on GFX12
radv: fix per-submit RGP captures on video queues
radv: add a new dirty state for the VRS surface state on GFX11+
radv: implement VRS for flat shading on GFX11+
radv: enable VRS for flat shading on GFX11+
radv: fix resetting descriptor pool since the new descriptor sets allocator
radv: add radv_hide_rebar_on_dgpu and enable for Red Dead Redemption 2
radv: make sure to reset uses_fbfetch_output for NULL fragment shaders
radv: fix fbfetch output with ESO
ac/surface: do not use tile swizzle for replayable/aliased FMASK surfaces
radv: remove the workaround for DISPATCH_TASKMESH_INDIRECT_MULTI_ACE on GFX10.3
ci: uprev vkd3d
Revert “radv: remove the workaround for DISPATCH_TASKMESH_INDIRECT_MULTI_ACE on GFX10.3”
ci: uprev VKCTS main to 211e452358f5cafd14bdd76d78342b62741e94aa
radv: reduce maxTexelBufferElements to 1<<29
vulkan: update spec to 1.4.335
radv: add support for computeDerivativeGroupQuads on < GFX12
radv: enable conservativeRasterizationPostDepthCoverage on GFX10+ when possible
radv: remove redundant buffered regs emission for dispatches on GFX12+
radv: constify radv_gfx12_emit_buffered_regs()
radv: decouple RT and compute dispatches paths
radv: add radv_cmd_state::emitted_rt_pipeline
radv: only include executable size when capturing shaders with RGP
radv: fix race condition when getting the blit queue
radv: add RADV_DEBUG=vm option
radv: rename RADEON_FLAG_VA_UNCACHED to RADEON_FLAG_GL2_BYPASS
radv: constify radv_{cb,ds}_buffer_info parameters
radv/meta: inject image view usage info
radv/meta: stop passing a stencil attachment for depth decompress
radv: create descriptors for color/depth-stencil surfaces earlier
zink/ci: add two tests to the skip lists
ac,radeonsi: move si_tracked_reg to common code
ac/cmdbuf: add new slots to ac_tracked_reg
radv: switch to AC_TRACKED_xxx
ac,radv,radeonsi: add ac_tracked_regs
radeonsi: remove dead code in si_set_tracked_regs_to_clear_state()
ac,radv,radeonsi: add functions to initialize tracked regs
radv: remove redundant assertions in radeon_emit_{array}()
ac,radv: add more cmdbuf emit helpers
ac,radv: add ac_cmdbuf::context_roll and use it
ac,radv,radeonsi: add tracked register macros to common code
radv: add the SQTT relocated shaders BO to the cmdbuf list
radv/nir: fix front_face opts for points/lines and unknown prim
ac/perfcounter: add a separate group for GFX10.3
ac/perfcounter: adjust the number of events for TD on GFX10.3
ac/perfcounter: add GCEA block description on GFX10-11
ac/spm: adjust configuration of some GPU blocks
ac/spm: add an assertion to check the number of global instances
ac/spm: fix programming more than one counter slot
ac/spm: print an error message when a group is unknown
ac/spm: add an ID to raw performance counters
ac/spm: implement the new derived SPM chunk for performance counters
ac/spm: add support for new LDS counters in RGP 2.6
ac/spm: add support for new Memory bytes counters in RGP 2.6
ac/spm: add support for new Memory percentage counters in RGP 2.6
ac/spm: add support for Ray Tracing counters in RGP
ac/rgp: enable new performance counters for RGP 2.6 on GFX10-GFX11
radv: change the default value of RADV_TRACE_CACHE_COUNTERS on < GFX10
radv: fix capturing performance counters with SPM
amd,radv,radeonsi: add a new function to update windowed perf counters
ac/perfcounter: move configuration for GFX12 in a separate file
ac/perfcounter: define a distribution mode for all perf blocks on GFX12
ac/perfcounter: update the number of events for GRBME_SE on GFX12
ac/perfcounter: fix the number of static instances for some blocks on GFX12
ac/perfcounter: rework computing the number of block instances on GFX12
ac/perfcounter: update configuration of many blocks on GFX12
ac/spm: fix GRBM broadcasting for global blocks
ac/spm: select correct broadcasting mode for CPF/GCEA blocks
ac/spm: prevent selecting invalid broadcast mode for SPM blocks
radv: increase the reserved CS space size for SPM
ac/perfcounter: remove setting unused fields for GFX12 blocks
ac/perfcounter: fix configuration of SQ/SQ_WGP blocks on GFX12
ac/perfcounter: define new GPU blocks on GFX11+
ac/perfcounter: move configuration for GFX11 in a separate file
ac/perfcounter: define a distribution mode for all perf blocks on GFX11
ac/perfcounter: fix the number of static instances for some blocks on GFX11
ac/perfcounter: update configuration of many blocks on GFX11
ac/perfcounter: compute the number of block instance properly on GFX11
ac/surface: add RADEON_SURF_VIEW_3D_AS_2D_ARRAY
radv: use 2D swizzle modes for 3D CB render targets when optimal
radv: add new drirc radv_prefer_2d_swizzle_for_3d_storage
radv: enable radv_prefer_2d_swizzle_for_3d_storage for TLOU1
ac/perfcounter: fix number of instances for GCEA
ac/perfcounter: move configuration for GFX10.3 in a separate file
ac/perfcounter: define a distribution mode for all perf blocks on GFX10.3
ac/perfcounter: fix the number of static instances for some blocks on GFX10.3
ac/perfcounter: update configuration of many blocks on GFX10.3
ac/perfcounter: compute the number of block instance properly on GFX10.3
ac/perfcounter: move configuration for GFX10 in a separate file
ac/perfcounter: define a distribution mode for all perf blocks on GFX10
ac/perfcounter: fix the number of static instances for some blocks on GFX10
ac/perfcounter: update configuration of many blocks on GFX10
ac/perfcounter: compute the number of block instance properly on GFX10
ac/spm: fix a crash with the RT counters on GFX10
ac/perfcounter: add new GCEA_CPWD block definition on GFX12
ac/perfcounter: add new GCEA_SE block definition on GFX12
ac/spm: rework indexing of the derived groups/counters/components
ac/spm: update the cache group on GFX12
ac/spm: add support for new LDS counters in RGP 2.6 on GFX12
ac/spm: add support for new Memory bytes counters in RGP 2.6 on GFX12
ac/spm: add support for new Memory percentage counters in RGP 2.6 on GFX12
ac/spm: add support for Ray Tracing counters in RGP on GFX12
ac/rgp: enable the new derived SPM chunk for performance counters on GFX12
ac/spm: use GPU block distribution mode to determine broadcasting
ac/spm: use GPU block distribution mode to determine instances
ac/perfcounter: add num_{16,32}bit_spm_counters to GPU blocks
ac/perfcounter: rename ac_pc_block::num_instances to num_scoped_instances
ac/perfcounter: rename ac_pc_block::num_global_instances to num_instances
Revert “radv: allocate the SQTT BO in GTT for faster readback”
radv/sqtt: add a comment about the allocation strategy of the SQTT BO
ci: uprev vkd3d
radv: fix flushing gang semaphore with SDMA/ACE
radv/ci: document a regression with transfer queue on RENOIR
radv/rt: fix a compilation warning about uninitialized fields
radv: use UNREACHABLE for illegal texture filter
radv: remove extra instructions after UNREACHABLE
ac/spm: fix typo in one GPU perf block name
ac/spm: define new per-shader engine blocks
ac/perfcounter: fix number of 32-bit SPM counters
ac/perfcounter: fix computing number of 16-bit/32-bit SPM counters
ac/perfcounter: define more GPU blocks on GFX12
ac/perfcounter: re-order GPU perf blocks on GFX12
ac,radv,radeonsi: rename num_spm_counters to num_spm_modules
ac/perfcounter: add missing configuration for GCEA on GFX11
ac/perfcounter: fix number of scoped instances for RMI block
ac/perfcounter: define more GPU blocks on GFX11
ac/perfcounter: re-order GPU perf blocks on GFX11
radv/sqtt: use VkCommandBuffer objects for SQTT start/stop sequences
radv/sqtt: rework allocating the SQTT buffer
radv/sqtt: use a staging buffer for faster reads on dGPUs
radv/spm: rework allocating the SPM buffer
radv/spm: use a staging buffer for faster reads on dGPUs
ac,radv: sample and set correct shader/memory clocks for RGP
ac/perfcounter: use GFX11 definition for GFX11.5
radv: enable SPM for GFX11.5
ac/sdma: fix stencil only copies on GFX9
ci: uprev VKCTS main to 4d3bedc74e2258c483cf968753207cff84d9e4fc
zink/ci: document a GLX crash on RADV/POLARIS10
radv/sqtt: rework radv_emit_sqtt_userdata() to support gang CS
radv/sqtt: emit userdata in the gang CS when needed
radv: fix missing SQTT markers for task+mesh draws
radv/dgc: adjust task+mesh SQTT markers
ac/cmdbuf: disable ENABLE_PING_PONG_BIN_ORDER on GFX11.5
radv/sqtt: delay VMID reservation at capture time
radv/sqtt: fail if GPU clocks can’t be sampled
radv/meta: use 2D array for color resolves with compute
radv/meta: batch resolving all color image layers with compute
radv/meta: always use mip level 0 for source image resolves
ac/debug: add a function that dumps texture descriptors
radv/meta: fix layered depth stencil resolves with compute
radv: always fast-clear color image with comp-to-single on GFX11-11.5
radv/meta: add support for fast clearing color images with non-zero baseArrayLayer
radv: optimize layered fast clear colors when comp-to-single is supported
ac/nir: fix computing cube derivatives when the major axis is negative
vulkan: fix missing begin debug marker for HPLOC
radv: fix applying radv_ssbo_non_uniform=true for Crysis 2/3 remastered
radv: add a workaround for a synchronization bug in Strange Brigade Vulkan
radv: zero-initialize image view objects
radv: fix tracking of pipelines used in secondaries
radv: disable unordered submits when SQTT queue events are enabled
radv: emit pending flushes after late decompressions with fbfetch
radv/meta: fix the key for DCC decompress on compute
radv: fix late decompressions for fbfetch with more corner cases
radv/meta: fix CmdCopyBufferToImage2() on compute queue with compressed HTILE
Saroj Kumar (1):
radeonsi: Move binary upload, dump code to new file
Serdar Kocdemir (3):
gfxstream: Check host allocation mode for external memory
gfxstream: Enable VK_EXT_blend_operation_advanced
gfxstream: Add VK_EXT_frame_boundary support
Sergi Blanch Torne (18):
ci: disable Collabora’s farm due to maintenance
Revert “ci: disable Collabora’s farm due to maintenance”
ci: disable Collabora’s farm due to maintenance
Revert “ci: disable Collabora’s farm due to maintenance”
crnm: default wo coloring when unknown GitLab job status
crnm: clean uncolored job status
ci,piglit: update expectations from piglit nightly
ci,crnm: enable attempts ctr include status
ci,crnm: warning message when a job can’t be enabled
ci,crnm: enhancement within a GitLab job
ci,crnm: inhibit single target trace dump
ci,crnm: refresh_wait_job as argument
ci,tools: know if running within a GitLab job
ci,crnm: inhibit pretty_wait within a GitLab job
ci,crnm: fix round error in pretty_duration
ci: disable Collabora’s farm due to maintenance
Revert “ci: disable Collabora’s farm due to maintenance”
ci,piglit: update expectations from gc2000 piglit nightly
Sharjeel Khan (1):
gfxstream: [C++23] Fixes for C++23 issues
Silvio Vilerino (84):
p_video_codec::encode_bitstream_sliced: Add last_slice_completion_fence for PIPE_VIDEO_SLICE_MODE_AUTO
mediafoundation: Helpers ConfigureBitstreamOutputSampleAttributes/ConfigureStatsMetadataOutputSampleAttributes
mediafoundation: Add Resolve completion fence to stats IDXGIBuffers
mediafoundation: Set ConfigureBitstreamOutputSampleAttributes earlier, since async subregion notifications do not need resolved metadata for it
mediafoundation: Attach async stats DXGI buffers without CPU fence wait
mediafoundation: emit subregions samples before pAsyncFence wait to reduce latency
mediafoundation: Add support for setting CODECAPI_AVEncSliceGenerationMode
mediafoundation: Prepare for multi sample multi slice
mediafoundation: Emit multiple MFSamples per slice when CODECAPI_AVEncSliceGenerationMode = 1
mediafoundation: Add some more trace logging
mediafoundation: Attach stats deferred buffers to all samples for simplicity
d3d12: Implement last slice signal by splitting Encode/Resolve in two ECL
mediafoundation: Refactor frame, multi slice and combine slice IMFSample emission to make it simpler
mediafoundation: Add pLastSliceFence shortcircuit wait for auto slice mode async slices mode
mediafoundation: Only attach stats to last slice mfsample
d3d12: Use a separate queue for encode resolve operations
d3d12: Optimize d3d12_video_encoder_flush
d3d12: Remove multiple index calc in d3d12_video_encoder_begin_frame
d3d12: Remove multiple index calc in d3d12_video_encoder_prepare_input_buffers
d3d12: Cache ID3D12VideoDevice4 instance if supported
d3d12: Cache ID3D12VideoEncoderHeap1 instance if supported
d3d12: Cache ID3D12VideoEncodeCommandList4 instance if supported
d3d12: Remove per frame allocation slice_sizes(picture->num_slice_descriptors)
d3d12: Only call CheckFeatureSupport(D3D12_FEATURE_FORMAT_INFO) when video format changes
d3d12: Only check HEVC video caps if configuration changed between frames
d3d12: Remove unused d3d12_video_encoder::m_transitionsBeforeCloseCmdList
d3d12: Use cached heap allocations for barriers instead of allocating per frame
d3d12: Use cached heap allocations for output bitstreams instead of allocating per frame
d3d12: Video Encode - Make some parameters const & instead of by value
d3d12: Use readback heaps for staging bitstream allocations
d3d12: d3d12_video_encoder_get_slice_bitstream_data use regular Map/Unmap
d3d12: Only check H264 video caps if configuration changed between frames
d3d12: Make output metadata frame buffer READBACK and use direct Map() in get_feedback
d3d12: d3d12_promote_to_permanent_residency to accept res array batch
d3d12: Only check for GetDeviceRemovedReason in debug builds
mediafoundation: SliceGeneration=1: Zero copy IMFSample output with wrapped ID3D12Resource frame/slice buffers
mediafoundation: Only use sliced mode when CODECAPI_AVEncSliceGenerationMode is set, disregarding num slices configured
mediafoundation: Allocate pro-rated buffer sizes for multi-slice encoding
mediafoundation: Only wait on pSyncObjectQueue for stats completion if any stat was enabled
mediafoundation: Also set pSyncObjectQueue = m_spStagingQueue when DX11 input sample
d3d12: Fix d3d12_video_enc.cpp(4794,33): Error C4244: initializing: conversion from uint64_t to SIZE_T, possible loss of data
d3d12_video_encoder_nalu_writer_hevc: Reuse per frame scratch allocations
d3d12_video_encoder_nalu_writer_h264: Reuse per frame scratch allocations
d3d12: d3d12_video_encoder_references_manager_hevc remove double resize() and add reserve() to cached vectors
d3d12: Fix d3d12_promote_to_permanent_residency always making resident
mediafoundation: Fix width/height typo in alignment calculation
mediafoundation: encode.cpp: Remove redundant lock() and memset()
mediafoundation: Optimize STL usage in reference_frames_tracker_hevc.cpp
mediafoundation: Cleanup MaxL1References variables
mediafoundation: Remove unused AllocatePipeResourceFromAllocator
d3d12: Use EnqueueMakeResident with GPU Wait for video permanent residency promotions
d3d12: Remove redundant d3d12_promote_to_permanent_residency overload
d3d12: Video Encode - Remove unnecessary resource waits and syncs since we sync batch fence
d3d12: Video Encode - Remove redundant buffer barriers
d3d12: Video Encode - Flush the pipe context async while submitting encode
pipe: Add PIPE_VIDEO_CAP_ENC_READABLE_RECONSTRUCTED_PICTURE
d3d12: Implement PIPE_VIDEO_CAP_ENC_READABLE_RECONSTRUCTED_PICTURE
d3d12: d3d12_video_proc - Use async residency functions
d3d12: Add get_video_enc_last_slice_completion_fence interop
d3d12: Support d3d12_video_buffer_creation_mode::place_on_resource in d3d12_video_buffer_from_handle
d3d12: Optimize d3d12_video_proc heap allocations
d3d12: Support PIPE_BIND_SHARED resource creation
d3d12: video_processor: Use d3d12_video_buffer subresource indices
d3d12: d3d12_video_buffer - Expose associated data with subresource idx
mediafoundation: Add m_bHWSupportReadableReconstructedPicture
mediafoundation: Add AVEncVideoReconstructedPictureOutputMode and MFSampleExtension_VideoEncodeReconstructedPicture
d3d12: Fix hang in d3d12_video_encoder_extract_encode_metadata with PIPE_VIDEO_SLICE_MODE_AUTO
d3d12: Fix max slice worst case estimation for PIPE_VIDEO_SLICE_MODE_AUTO
mediafoundation: Fix num_output_buffers for PIPE_VIDEO_SLICE_MODE_AUTO
mediafoundation: Add a min slice buffer size stopgap
d3d12: Bump min size in d3d12_video_encoder_calculate_max_output_compressed_bitstream_size
mediafoundation: Remove stale call to MFCreateMemoryBuffer
d3d12: Add buffer size check to d3d12_video_encoder_get_slice_bitstream_data
mediafoundation: Check for PIPE_VIDEO_CODEC_UNIT_LOCATION_FLAG_MAX_SLICE_SIZE_OVERFLOW in calls to get_slice_bitstream_data
d3d12: Video Encode - Do not flush on direct buffer maps
d3d12: Video Encode - Reduce unnecessary syncs between encoder and context queues
mediafoundation: Move dpb_buffer_manager::get_read_only_handle into d3d12 driver and cache resource
mediafoundation: Remove redundant fence openings in ProcessInput
mediafoundation: Take m_EncoderLock only for work submission in ProcessInput
d3d12: Add video encode bitstream buffer full frame size check in get_feedback
mediafoundation: Copy and remove padding gaps in output IMFMediaBuffer if necessary
d3d12: Prefer video encode suballocated buffer mode for subregion notification mode
d3d12: Add missing using Microsoft::WRL::ComPtr in d3d12_context_common
d3d12: Add HAVE_GALLIUM_D3D12_VIDEO guards for d3d12_video_encoder_set_max_async_queue_depth/d3d12_video_encoder_get_last_slice_completion_fence
Simon McVittie (2):
vulkan: Don’t emit library_arch if the library_path is just a basename
vulkan: Optionally share one JSON manifest per driver between architectures
Simon Perretta (4):
nir: commonize barycentric intrinsic opt pass
pvr: temporarily disable gs_rta_support on all cores
pco: restrict shadow sampler comparator clamping to unorm formats
pco: update formatless skip check
Simon Richter (1):
anv, hasvk: Fix reported CPU page size
Steev Klimaszewski (1):
tu: Stop printing descriptor pool allocation failures
Sushma Venkatesh Reddy (7):
intel/dev: Add geometry, color and depth pipes count
intel/perf: Update perf scripts to get additional performance counters
drirc: Add anv_assume_full_subgroups for Detroit: Become Human
brw: Add BRW_TYPE_BF8 and BRW_TYPE_HF8 for float8
brw: Add EU assembler support for float8
compiler: Add FP8 types to GLSL type decoder
brw: Use lookup tables for Gfx12+ 3src type encoding/decoding
Sviatoslav Peleshko (4):
mesa,driconf: Add WA to initialize vertex program outputs to vec4(0,0,0,1)
driconf: Add vertex_program_default_out option for Penumbra: Overture
nir/normalize_cubemap_coords: Handle the projector before the normalization
mesa/main/ff_frag: Don’t generate the projector for cubemap sampling
Tapani Pälli (25):
anv: bring back some lost game drirc workarounds for subgroups
intel/dev: update mesa_defs.json from internal database
intel/genxml: add registers handling autostrip for gfx200
iris: implement autostrip disable for Wa_14024997852
anv: implement autostrip disable for Wa_14024997852
anv: fix issues found with indirect data stride
anv: throw anv_finishme warnings only on debug builds
anv: remove own GetRenderingAreaGranularityKHR
drirc/iris: add drirc to disable threaded context
drirc: set intel_disable_threaded_context for Amnesia The Bunker
compiler/glsl: validate input blocks with opaque/booleans
anv: add furmark workaround layer
anv: add vk_wsi_disable_unordered_submits and enable for GTK
crocus: add struct crocus_scissor_state to clamp values to 16bit
anv/drirc: disable Xe2 CCS drm modifiers for GTK engine
anv: hand over ANV_PIPE_RT_BTI_CHANGE to pipe control
crocus: make sure we have at least 1x1 surface to create null surf
anv: fix setting emitted_flush_bits
anv: fix queue check in anv_blorp_execute_on_companion on xe3
blorp: fix asserts hit with msaa blorp blits on xe3
anv: route clear operations on compute to companion
intel/genxml: add CHICKEN_RASTER_2 with required bit for Xe3
anv: set DisableAnyMCTRresponsefix to zero on init
iris: set DisableAnyMCTRresponsefix to zero on init
anv: skip compressed flag for bo if not supported by modifier
Taras Pisetskyi (1):
drirc/anv: force_vk_vendor=-1 for Wuthering Waves
Thong Thai (5):
meson: add libva wrap and fallback option
frontends/va: get libva api version from va_version.h
meson: add jpeg as a video-codec
meson: add mpeg12dec as a video-codec
frontends/va: include picture_*.c based on selected codec
Tim Van Patten (1):
docs/envvars: Add section: Android System Properties
Timothy Arceri (7):
mesa: skip redundant uniform update optimisation if unsafe
glsl: assign block indices in the order they appear
mesa: fix _mesa_update_texture_matrices()
util/driconf: Add linux version of Penumbra fixes
util/driconf: add Cursemark workaround
driconf: add a way to override GLX_CONTEXT_RESET_ISOLATION_BIT_ARB
util/driconf: add workaround for Interstellar Rift
Timur Kristóf (60):
ac/nir/ngg_mesh: Lower num_subgroups to constant
ac/nir/ngg: Fix scratch space for NGG GS streamout
ac/nir/ngg: Use align() instead of ALIGN()
radeonsi/ci, zink+radv/ci: Remove GS primitive_counter tests from flakes
radv: Disable sparse mapping when unsupported by VM
ac/gpu_info: Disable sparse VM mappings pre-Polaris, for now
radeonsi: Inline si_choose_spi_color_formats
radeonsi: Respect if rbplus is allowed when choosing color formats
radv, radeonsi: Move GFX6-7 CB clamp issue to ac_gpu_info
ac: Improve description of some HW workarounds
ac/gpu_info: Rename has_sparse_vm_mappings to has_sparse
ac/gpu_info: Fix determining when CP DMA supports sparse
ac/surface: Use ADDR_TM_PRT_TILED_THIN1 on GFX6-8
ac/gpu_info: Add different sparse features
radv: Advertise sparse features pre Polaris with perftest flag
radv: Check RADV_PERFTEST=sparse for image formats and sparse queue
aco: Use only VGPR offset on buffer atomics on GFX6-7
radv: Use zero-filled BO for GFX6 and GFX10 null index buffer bug
ac/nir/lower_taskmesh_io_to_mem: Don’t hardcode num_entries in shaders
ac/nir/lower_taskmesh_io_to_mem: Don’t hardcode payload entry size in shaders
radv, radeonsi: Don’t pass task ring info to mesh/task payload lowering
ac/nir/lower_taskmesh_io_to_mem: Use AC_TASK_DRAW_ENTRY_BYTES
radv: Bypass L2 for gang semaphore BO with SDMA/ACE
radv: Add function to determine if SDMA supports an image.
radv: Require gang submit and compute for transfer queues
radv: Update comments for gang semaphores
radv: Implement gang semaphores for transfer queues.
radv: Use SDMA fence packet when flushing gang semaphores
radv: Declare some gang submit functions in radv private header.
radv: Initialize transfer queue gang when needed
nir/opt_vectorize_io: Fix allow_holes option
radv: Lower 64-bit VS inputs to 32-bit
radv: Scalarize and re-vectorize unlinked shader I/O
radv: Only run some optimizations when scalarization made progress
radv: Don’t call nir_opt_combine_stores anymore
radv: Don’t call nir_compact_varyings anymore
radv: Don’t call nir_remove_unused_varyings anymore
radv: Don’t call nir_link_opt_varyings anymore
nir: Add new nir_remove_outputs pass
radv: Use nir_remove_outputs with the noop FS.
radv: Remove radv_remove_varyings.
radv: Add layout argument to transfer_copy_buffer_image.
radv: Use compute for transfer operations unsupported by SDMA
radv: Use compute copy for emulated formats
radv/ci: Adjust expected failures list for transfer queues
nir: Add pass to lower workgroup size
ac/cu_info: Add GFX6-7 SMEM OOB bug
ac/nir: Add pass to fixup SMEM on GFX6-7
radv/amdgpu: Add ability to pad BOs with a read-only VM page
radv: Mitigate GFX6-7 SMEM bug for NULL and mutable descriptors
radv: Mitigate GFX6-7 SMEM bug for robust OOB access
mesa: Require at least 512 variable invocations for ARB_compute_variable_group_size
radeonsi: Limit variable workgroup size to 256 for CS regalloc bug
radv: Lower larger workgroups to 256 for CS regalloc bug
radeonsi: Lower larger workgroups to 256 for CS regalloc bug
radv: Allow using compute queue with CS regalloc hang bug on GFX7
radeonsi: Allow using compute queue with regalloc hang bug on GFX7
radv: Remove previous mitigation of CS regalloc hang bug
radeonsi: Remove previous mitigation of CS regalloc hang bug
ac/gpu_info: Remove FIXME from regalloc hang description
Tomeu Vizoso (1):
dril: don’t build a rocket_dri.so
Tomoki Imai (1):
lavapipe: Support VkDrmFormatModifierPropertiesList2EXT
Utku Iseri (19):
panfrost,panvk: rename pan_fb_info::extent to draw_extent
panfrost,panvk: distinguish fbd bounding box from framebuffer size
panvk: prevent aliased images from using AFBC
panvk: only add storage usage without AFBC
panvk: explicit fallback to linear for legacy scanout images
panvk: change AFBC subresource layout pitches to byte sizes
pan/mod: allow non-tiled modifiers to be optimal
panvk: allow TILING_DRM_MODIFIER_EXT with AFBC
panvk: advertise support for AFBC WSI behind a debug flag
panvk: fix for clearing render targets with 8+ layers
panvk: set allow_forward_pixel_to_be_killed for draws
panfrost: add earlyzs FPK condition for v6-
zink: fix layer count with cubemaps
st/pbo: set src_type on the upload path
zink: set gfx_pipeline_state.dirty for blit rp changes
zink: don’t set gfx_pipeline_state.dirty if min_samples didn’t change
zink: don’t set pipeline_state.dirty for halfz with full_ds3
zink: use gfx_pipeline_state.dirty as a pipeline update condition
zink: handle split DS blits with zink_blit calls
Val Packett (1):
tu: support driconf option force_vk_vendor
Valentine Burley (48):
docs: Update LAVA caching setup
ci/deqp: Also print logs to logcat on Android
tu: Fix indexing with variable descriptor count
tu: Fix maxVariableDescriptorCount with inline uniform blocks
zink/ci: Document ANV flake
venus/ci: Skip slow test on ANV with Cuttlefish
ci: Update linux-firmware version to pick up more ARM firmware
panfrost/ci: Drop redundant KERNEL_IMAGE_NAME for rock-5b
panvk/ci: Add a VKCTS job on G925
panvk/ci: Add an ANGLE job on G925
panfrost/ci: Drop redundant PAN_MESA_DEBUG variables
panfrost: Don’t dump shader disassembly by default on CSF
panfrost/ci: Enable G610 piglit job
turnip/ci: Increase coverage of a660-vk job
freedreno/ci: Move a660-gl-cl job back to pre-merge
ci: Remove Piglit replayer from test-vk container/rootfs
venus/ci: Add missing Collabora farm rules to ANV jobs
ci/lava: Use a660_zap.mbn from linux-firmware
intel/ci: Drop timeout overrides for pre-merge jobs
lavapipe/ci: Run vkd3d job in parallel
anv/ci: Run vkd3d job in parallel
ci/android: Build zink for arm64 as well
egl: Disable kopper on Android
Revert “anv/ci: Run vkd3d job in parallel”
ci: Drop hardware-job prerequisite check jobs
anv/ci: Increase timeout for nightly JSL job
ci: Uprev VKCTS
ci: Uprev GL & GLES CTS
ci/deqp: Backport Android logcat commit
zink/ci: Document recent Turnip flakes
panfrost/ci: Fix GitLab rules after YAML split
ci: Allow PIGLIT_TAG to be unset in deqp-runner script
lavapipe/ci: Add a nightly ASAN job
zink/ci: Mark new TGL glx failures as flakes
ci/android: Update to Android 16
ci/android: Remove custom kernel
ci/android: Reduce Cuttlefish log verbosity
ci/android: Quieten extracting Mesa artifacts
Revert “ci/android: add sudo to EPHEMERAL deps for debian/x86_64_test-android.sh”
lavapipe/ci: Move android-angle-lavapipe-cts job to nightly
venus/ci: Switch Alder Lake job to Xe KMD
ci: Uprev Vulkan Validation Layers
ci: Update deqp-runner to pull in gtest suite support
radeonsi/ci: Convert libva-utils job to deqp-runner suite
radeonsi/ci: Remove redundant radeonsi-vaapi-fluster-rules
radeonsi/ci: Merge VA-API jobs
tu: Handle VkDrmFormatModifierPropertiesList2EXT
tu: Fix memory leak of patchpoints_ctx in dynamic rendering
Vinson Lee (6):
gfxstream: Fix GfxStreamVulkanMapper.cpp build error
bin/symbols-check: Fix undefined symbol detection on macOS
util/u_printf: Fix const correctness in util_printf_next_spec_pos
util/blob: Fix const correctness warning in blob_read_string
compiler/clc: Fix const correctness in libclc_add_generic_variants
freedreno/decode: Fix const correctness in get_tex_count
Xaver Hugl (2):
vulkan/wsi: require extended target volume support for scRGB
vulkan/wsi: remove support for VK_COLOR_SPACE_EXTENDED_SRGB_NONLINEAR_EXT
Yiwei Zhang (113):
panvk: fix to advance vs driver_set properly
panvk: fix to advance vs res_table properly
panvk: minor cleanup in cmd_prepare_push_uniforms
panvk: use cs_move_reg32 and lower to cs_add32 if needed
panvk: support VK_EXT_external_memory_acquire_unmodified
venus: skip feedback cmd record on incompatible queue families
venus: add vn_queue_family_can_feedback helper
venus: allow fence feedback to suspend and resume
venus: update sfb cmd lookup to follow ffb
venus: rename async_wait_mtx to counter_mtx
venus: allow timeline semaphore feedback to suspend and resume
venus: enable sparse only queue family along with feedback
panvk: use nir_log_shader to log NIR on Android
panvk: support VK_EXT_device_memory_report
panvk: fix sample shading of internal blend shader for MSAA
llvmpipe: zero is also a valid fd
llvmpipe: fix udmabuf mmap error check
llvmpipe: add a missing alloc error handling in fd import
llvmpipe: misc fixes for sparse binding
llvmpipe: support sparse resource with LLVMPIPE_MEMORY_FD_TYPE_DMABUF
llvmpipe: handle mmap failure for lp_texture
llvmpipe: handle os_dupfd_cloexec failure
llvmpipe: refactor dmabuf and opaque fd handling
util: add get_fd_header helper in os_memory_fd
util: add os_map_memory_fd_placed for placed mapping support
llvmpipe: add fd type INVALID and ANONYMOUS
llvmpipe: split sparse binding part to llvmpipe_resource_bind_sparse
llvmpipe: refactor llvmpipe_resource_bind_sparse
llvmpipe: support sparse resource with LLVMPIPE_MEMORY_FD_TYPE_OPAQUE
venus: enable sparse resource support on lavapipe
glcpp/meson: fix libglcpp generated header dependency
llvmpipe: add missing util/os_file.h header
panvk: fix mem alloc size for VkBuffer backed by imported blob AHB
pps/meson: amend missing util deps for os_get_option usage
pps/meson: minor refactor for pps_deps
venus: use seq_cst for ring cs and tail update ordering
venus: add a wsi image log
venus: avoid re-imported dma-buf to have a larger map size
venus: properly fix the blob mem mapping size
venus: add error log coverage for virtgpu backend
venus: fix racy semaphore feedback counter update
ci/venus: skip Android incremental and shared present tests
ci/venus: skip those causing oom killer to kill deqp
venus: sync to latest protocol for v1.4.334
venus: enable promoted VK_KHR_robustness2
docs: add VK_KHR_robustness2 and supported drivers
venus: add renderer support for placed mapping
venus: implement VK_EXT_map_memory_placed
venus: sync protocol for sorted VkCommandTypeEXT enum defines
venus: sync latest protocol for more shader extensions support
venus: support VK_KHR_cooperative_matrix
venus: support VK_KHR_shader_bfloat16
venus: support VK_KHR_shader_untyped_pointers
venus: support VK_EXT_shader_float8
venus: support VK_EXT_shader_uniform_buffer_unsized_array
venus: device create to filter promoted swapchain_maintenance1
venus: sync protocol for VK_EXT_mesh_shader support
venus: add VK_EXT_mesh_shader support
ci: uprev virglrenderer
pan: fix pan_blend_reads_dest to consider special min/max funcs
nir: suppress clang warnings for cooperative matrix lowering
venus: add missing VKAPI_ATTR/CALL
kk: add mtl_device_get_gpu_timestamp bridge
kk: support VK_(KHR|EXT)_calibrated_timestamps
vulkan: update ALLOWED_ANDROID_VERSION for api level 36
venus: hide unsupported wsi extensions on Windows
venus: hide unsupported device extensions on Windows
venus: hide unsupported external extensions on Windows
venus: disable TLS ring prio forwarding on Windows
venus: refactor meson to be more flexible for additions
venus: hide vtest from Windows build
venus: use vk_common_GetPhysicalDeviceCalibrateableTimeDomainsKHR
venus: adopt vk_common_GetCalibratedTimestampsKHR
venus: amend missing VKAPI_ATTR/CALL for render pass APIs
venus: track renderer driver version for driver workaround
venus: allow hw wsi for newer Nvidia proprietary driver
zink: tighten up export paths that require true dmabuf support
panvk: fix to defer disk cache init after vk_physical_device_init
venus: respect VK_SUBOPTIMAL_KHR returned from wsi image acquire
venus: remove TP in vn_ResetDescriptorPool
venus: refactor to avoid nesting vn_QueueSubmit entrypoint
venus: vn_GetFenceFdKHR no need to block wait
venus: add vn_wsi_sync_wait to handle implicit sync workaround
venus: track swapchains
venus: prepare chain access for async present
venus: add chain lock helpers for async present
venus: add back vn_QueuePresentKHR
venus: add a deep copy helper for VkPresentInfoKHR
venus: prepare to flush async queue present
venus: implement async present
venus: vn_wsi_sync_wait to relax chain acquire for async present
venus: enable async presentation along with a perf option
venus: vn_wsi_sync_wait to relax queue access for async present
venus: only preserve 12 bits for VkBufferCreateFlagBits
venus: refactor vn_buffer_get_cache_index
venus: cache VkBufferUsageFlags2CreateInfo
venus: amend missing logs for image format cache dump
venus: workaround to consider ALIAS for image mem req cache
venus: refactor vn_QueueSubmit2
venus: allow vtest to properly wait for present
venus: fix aliased image memory requirement caching
vulkan/wsi: avoid host stage when blit to foreign queue
vulkan/wsi: fix present wait support and present id creation condition
vulkan/wsi: rename khr_present_wait to has_present_wait
vulkan/wsi: improve present wait enablement tracking
venus: track prime blit dst buffer memory in the wsi image
venus: track dedicated image during mem alloc
venus: add vn_renderer_bo_export_sync_file helper
venus: refactor vn_AcquireNextImage2KHR
venus: properly handle wsi implicit in-fence
venus: refactor Android ANB tracking to avoid confusions with WSI
venus: remove obsolete asserts for ANB image creation
pan/kmod: drop pan_kmod_bo_check_import_flags validation
Yogesh Mohan Marimuthu (10):
winsys/amdgpu: use correct vm_timeline_point for userq creation
winsys/amdgpu: fwm packet pre-emption for gfx 11.5
winsys/amdgpu: add assert that if kernel fence passes then user fence must pass
winsys/amdgpu: enable userq reg shadowing for gfx11.5
ac: update amdgpu_drm.h for uq metadata query info
winsys/amdgpu,ac: get eop and csa size,alignment from kernel query
util/log: add MESA_LOG_FILE_AUTO to generate log file
winsys/amdgpu: use mesa_log functions instead of fprintf
winsys/amdgpu: print userq job info
winsys/amdgpu: userq job log fwm packet debug count
Yonggang Luo (36):
treewide: Use os_get_option_secure instead of secure_getenv
util: Add new function os_get_option_internal to improve os_get_option*
util: Add function os_get_option_dup and os_get_option_secure_dup for later use
d3d12/dozen: Use os_get_option_dup for passing to ID3D12SDKConfiguration_SetSDKVersion
util,vulkan,llvmpipe: Use os_get_option_dup instead of getenv
util: Add PRAGMA_DIAGNOSTIC_IGNORED_CLANG PRAGMA_DIAGNOSTIC_IGNORED_GCC for later use
nir: Disable gcc warning -Wstringop-overflow for nir_intrinsic_set_* for a later commit
freedreno: Do not use align as variable name, as it’s a function in u_math.h and will be used
freedreno: Use align64 instead of ALIGN for 64-bit input
panfrost/drm-shim: Use align_uintptr instead of ALIGN for size_t input
brw: Do not use align as variable name, as it’s a function in u_math.h and will be used
anv: use align/align64 instead of ALIGN, as the input is size_t/uint64_t
aco: Use align64 instead of ALIGN for 64-bit input
radeon/drm: use align64 for 64-bit input instead of ALIGN
radeon/drm: Replace all usage of ALIGN with align and remove ALIGN macro
treewide: Replace calls to function ALIGN with align
util: Remove unused ALIGN function to prevent future use
treewide: strip unneeded inc_gallium inc_gallium_aux
util: Make util_align_npot the same as ALIGN_NPOT so they can be merged later
ci/microsoft: Downgrading WinFlexBison.win_flex_bison to version 2.5.24
ci: MSVC 2019 is not supported anymore, remove it.
ci: update image tags for windows container
docs: Update the minimal MSVC version requirements
gfxstream: Use VK_DRIVER_FILES instead of VK_ICD_FILENAMES, as VK_ICD_FILENAMES has been deprecated for a while.
gfxstream: Use os_get_option_dup(VK_DRIVER_FILES)
util,asahi,vulkan,panfrost: Replace the remaining usage of getenv with os_get_option
util: Add function os_unset_option/os_set_option for later use
util: Update os_get_option* comments to match os_set_option
treewide: Replace the usage of setenv manually and #include "util/os_misc.h" when needed
treewide: Use regexp to replace usage of unsetenv with os_unset_option.
treewide: Use regexp to replace usage of setenv with os_set_option.
gfxstream: os_set_option can be used on windows now
util,wgl: Replace usage of putenv with os_unset_option,os_set_option
meson: Use /Zc:enumTypes to enable C++-conforming enum underlying type and enumerator type deduction
meson: Remove VK_ICD_FILENAMES totally from source tree.
meson: do not reconstruct ICD paths
Yurii Kolesnykov (2):
loader: Wrap nouveau_zink_predicate with HAVE_LIBDRM
apple_cgl.c: Fix error: call to undeclared function ‘os_get_option’
Yuxuan Shui (1):
wsi/display: Set atomic client cap in Acquire{Drm,Xlib}DisplayEXT as well.
Zan Dobersek (11):
tu: don’t advertise sample location support for VK_SAMPLE_COUNT_1_BIT
tu: emit PC_DGEN_SO_CNTL for any shader type during streamout setup
tu/a7xx: use DI_SRC_SEL_AUTO_XFB for CmdDrawIndirectByteCountEXT
tu: remove data size assert in tu_GetQueryPoolResults
driconf: use vk_dont_care_as_load workaround for Spilled!
tu: use application name matching for Yooka-Laylee driconf option
tu: enable storageBuffer8BitAccess on all a7xx hardware
freedreno/registers: add a custom build target for adreno_pm4.xml.gz
tu: handle DS_DEPTH_BOUNDS_TEST_BOUNDS state under TU_DYNAMIC_STATE_RB_DEPTH_CNTL
tu: allocate transient attachments used for LRZ
tu/kgsl: wait-only submit handling should not ignore sparse bind commands
anonymix007 (3):
venus: Expose deviceLUID in props if available
venus: Guard Linux-specific code against being compiled on Windows
tgsi/nir: Store output variables before each TGSI_OPCODE_RET
hmtheboy154 (1):
v3dv: add support for driconf
hwandy (2):
anv: fix a memory leak in slab allocator.
anv/tests: Add a slab test to cover the memory leak issue.
jaap aarts (1):
radv/sqtt: Prevent concurrent submit when sqtt is enabled
jglrxavpok (1):
radv/aco: Print source location debug info inside ACO disassembly if we have the information
leonperianu (2):
pvr: Change has_fbcdc_algorithm to 1-bit bit-field
pvr: feature promotion to core from derived
llyyr (4):
Revert “drirc/anv: force_vk_vendor=-1 for Wuthering Waves”
vulkan/wsi/headless: populate VkSurfacePresentModeCompatibilityKHR
vulkan/wsi/headless: add stub for VkSurfacePresentScalingCapabilitiesKHR
vulkan/wsi/headless: implement vkReleaseSwapchainImagesKHR for headless
spencer-lunarg (8):
llvmpipe: Fix warning casting 32-bit int to 8-bit
lavapipe: Remove trailing whitespace
llvmpipe: Remove trailing whitespace
llvmpipe: Remove unnecessary includes
lavapipe: Fix crash when using zero queues
lavapipe: Add VK_KHR_copy_memory_indirect formats
lavapipe: Expose EXT version of global_priority
lavapipe: Check for VkCopyMemoryIndirectCommandKHR::size of zero
stefan11111 (3):
glx: Add some NULL pointer checks
gallium/frontends/dri: Don’t force dri cursor buffers to be 64x64
gbm: Make documentation for `gbm_bo_map` more explicit
volodymyr (4):
mesa: ctx->API != API_OPENGL_COMPAT --> !_mesa_is_desktop_gl_compat(ctx)
mesa: ctx->API != API_OPENGLES --> !_mesa_is_gles1(ctx)
mesa: ctx->API != API_OPENGLES2 --> !_mesa_is_gles2(ctx)
mesa: ctx->API != API_OPENGL_CORE --> !_mesa_is_desktop_gl_core(ctx)