Mesa 25.0.0 Release Notes / 2025-02-19

Mesa 25.0.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 25.0.1.

Mesa 25.0.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
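
Because the reported version depends on the driver, an application may want to check the string it actually gets back rather than assume 4.6. The following is a minimal sketch that only parses a GL_VERSION-style string; it does not create a context or call into OpenGL. The helper names parse_gl_version and meets_requirement and the sample strings are illustrative assumptions, not part of Mesa.

```python
import re


def parse_gl_version(version_string):
    """Extract (major, minor) from a GL_VERSION-style string.

    Desktop GL strings begin with "<major>.<minor>"; OpenGL ES strings
    carry an "OpenGL ES " (or "OpenGL ES-CM ") prefix. Everything after
    the numbers is vendor-specific and ignored here.
    """
    match = re.match(r"(?:OpenGL ES(?:-\w+)? )?(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_requirement(version_string, required=(4, 6)):
    """Return True if the reported version is at least `required`."""
    return parse_gl_version(version_string) >= required


# Hypothetical strings of the kind a driver might report.
core = "4.6 (Core Profile) Mesa 25.0.0"
compat = "4.5 (Compatibility Profile) Mesa 25.0.0"
print(parse_gl_version(core), meets_requirement(core))
print(parse_gl_version(compat), meets_requirement(compat))
```

In a real application the string would come from glGetString(GL_VERSION) on a current context; querying GL_MAJOR_VERSION and GL_MINOR_VERSION directly avoids the parsing step.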

Mesa 25.0.0 implements the Vulkan 1.4 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
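
The apiVersion property is a packed 32-bit integer, so it has to be unpacked before it can be compared against 1.4. The sketch below mirrors the bit layout used by the Vulkan VK_MAKE_API_VERSION and VK_API_VERSION_* macros (3-bit variant, 7-bit major, 10-bit minor, 12-bit patch). The Python function names are illustrative assumptions, and the value decoded at the end is constructed locally rather than read from a device.

```python
def make_vk_api_version(variant, major, minor, patch):
    """Pack a version the same way VK_MAKE_API_VERSION does."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_vk_api_version(api_version):
    """Unpack a VkPhysicalDeviceProperties.apiVersion-style integer."""
    variant = api_version >> 29
    major = (api_version >> 22) & 0x7F
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return variant, major, minor, patch


def supports_vulkan(api_version, required=(1, 4)):
    """Return True if the decoded (major, minor) is at least `required`."""
    _, major, minor, _ = decode_vk_api_version(api_version)
    return (major, minor) >= required


# Illustrative value only: a driver reporting Vulkan 1.4.0.
reported = make_vk_api_version(0, 1, 4, 0)
print(decode_vk_api_version(reported), supports_vulkan(reported))
```

The patch field tracks the specification header revision, so a comparison against a required feature level normally looks at major and minor only.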

SHA checksums

SHA256: 96a53501fd59679654273258c6c6a1055a20e352ee1429f0b123516c7190e5b0  mesa-25.0.0.tar.xz
SHA512: 7f5b6674c40b6c8dcab7934512ff754b40a6a8a466422c90236f614d322033d4d465307ddcd983f9f3afb1310e132ec3186a085d261c95493a0c460b2ec59ce8  mesa-25.0.0.tar.xz
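
One way to check a downloaded tarball against the checksums above is sketched here using Python's standard hashlib module. The function names file_digest and verify are illustrative, and the path in the usage comment assumes the tarball sits in the current directory.

```python
import hashlib

# SHA256 value copied from the checksum list above.
EXPECTED_SHA256 = "96a53501fd59679654273258c6c6a1055a20e352ee1429f0b123516c7190e5b0"


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs are not loaded into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected, algorithm="sha256"):
    """Return True if the file's digest matches the expected hex string."""
    return file_digest(path, algorithm) == expected.strip().lower()


# Usage, assuming the tarball is in the current directory:
#   verify("mesa-25.0.0.tar.xz", EXPECTED_SHA256)
```

Passing algorithm="sha512" together with the SHA512 value from the list performs the same check with the longer digest.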

New features

  • cl_khr_depth_images in rusticl

  • Vulkan 1.4 on radv/gfx8+

  • VK_KHR_dedicated_allocation on panvk

  • VK_KHR_global_priority on panvk

  • VK_KHR_index_type_uint8 on panvk

  • VK_KHR_map_memory2 on panvk

  • VK_KHR_multiview on panvk/v10+

  • VK_KHR_shader_non_semantic_info on panvk

  • VK_KHR_shader_relaxed_extended_instruction on panvk

  • VK_KHR_vertex_attribute_divisor on panvk

  • VK_KHR_zero_initialize_workgroup_memory on panvk

  • VK_KHR_shader_draw_parameters on panvk

  • VK_KHR_shader_float16_int8 on panvk

  • VK_KHR_8bit_storage on panvk

  • VK_EXT_4444_formats on panvk

  • VK_EXT_global_priority on panvk

  • VK_EXT_global_priority_query on panvk

  • VK_EXT_host_query_reset on panvk

  • VK_EXT_image_robustness on panvk

  • VK_EXT_pipeline_robustness on panvk

  • VK_EXT_provoking_vertex on panvk

  • VK_EXT_queue_family_foreign on panvk

  • VK_EXT_sampler_filter_minmax on panvk

  • VK_EXT_scalar_block_layout on panvk

  • VK_EXT_tooling_info on panvk

  • depthClamp on panvk

  • depthBiasClamp on panvk

  • drawIndirectFirstInstance on panvk

  • fragmentStoresAndAtomics on panvk/v10+

  • sampleRateShading on panvk

  • occlusionQueryPrecise on panvk

  • shaderInt16 on panvk

  • shaderInt64 on panvk

  • imageCubeArray on panvk

  • VK_KHR_depth_clamp_zero_one on RADV

  • VK_KHR_maintenance8 on radv

  • VK_KHR_shader_subgroup_rotate on panvk/v10+

  • Vulkan 1.1 on panvk/v10+

  • VK_EXT_subgroup_size_control on panvk/v10+

  • initial GFX12 (RDNA4) support on RADV

Bug fixes

  • radeonsi: regression with running DaVinci Resolve under rusticl since 666a6eb871d5dec79362bdc5d16f15915eb52f96

  • [ANV][LNL] - Black Myth: Wukong (2358720) - Corruption is visible near the edge of water.

  • [ANV][LNL] - Hogwarts Legacy (990080) - Pixelated corruption is visible when looking out at the water.

  • radv/video/h265: pps.flags.transform_skip_enabled_flag = 1 randomly hangs GPU

  • [ANV][LNL] - Steel Rats (619700) - Game crashes after opening logos play before reaching main menu

  • nvk: Implement host-only descriptors

  • Gnome-shell Wayland fails to start with segfault at modifier-less driver

  • [ANV][LNL] - DYNASTY WARRIORS: ORIGINS (2384580) - Dithered transparency has vertical bands.

  • AMD Radeon R9 270 randomly causes video playback applications to crash with “amdgpu: The CS has been rejected”

  • Rendering issues on GravityMark with RadeonSI ACO

  • i915: multiple tests assert with tgsi_ureg.h:893: ureg_swizzle: Assertion `reg.File != TGSI_FILE_NULL’ failed.

  • shaders/closed/steam/deus-ex-mankind-divided/260.shader_test fails NIR validation

  • panvk : vk_pipeline_cache_object_deserialize: Assertion `reader.current == reader.end && !reader.overrun’ failed.

  • 46a8d5e7ef61735416d0c54886a7a9930621ae2c causes a permission denied spam

  • [BUILD] Build Failure: Implicit Function Declaration ‘timespec_sub_saturate’ (loader_wayland_helper.c)

  • intel genX_acceleration_structure: missing dependency to bvh/header.spv.h

  • KHR_subgroup glsl parsing broken

  • intel: add config options to disable ELK compiler bits

  • a618: godot-tps-gles3-high trace reproducible flakes

  • radv: mesh shader depth-only rendering is broken

  • anv: Enable VK_FORMAT_A4R4G4B4_UNORM_PACK16_EXT for Android 15

  • Using a buffer allocated on a rx 6800XT for scanout on a Ryzen 7950X results in glitches

  • Systemfreeze from mesa version 1:24.3.0-1-x86_64 and above with Chromium and derivatives [and more or less all other graphic related things]

  • msm_kgsl.h:560:21: error: expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘__attribute__’ before ‘*’ token

  • [radeonsi] VC1 hardware decoding over vaapi outputs green screen

  • consecutive glDrawPixels do not reflect a changed pixel mapping

  • Crashing while Processing Shaders in Marvel Rivals on Mesa 24.3.2 & Mesa 24.3.3

  • Assertion `nir_cf_node_get_function(&block->cf_node)->structured’ failed

  • r300: Conditional jump or move depends on uninitialised value in Xnine.mova test

  • anv: Mesh shaders with two OpSetMeshOutputsEXT instructions are not supported

  • hasvk: apps crash since “intel/compiler: Remove usage of variable length arrays”

  • nir_validate should check metadata

  • anv: vkcube(pp) segfault in multi-GPU config, apparent vkCreateSwapchainKHR failure

  • anv,regression: Black square artifacts in Fenyx Rising on BMG

  • [anv] Cyberpunk visual corruption on BMG

  • [ANV][LNL] - Cyberpunk 2077 (1091500) - Flickering mesh during benchmark.

  • Intel Arc A770: Crosshair in THE FINALS renders too large

  • 3d render issues in Chromium after 1:24.3.1-3 update over 1:24.2.7-1 of mesa package

  • intel/compiler: Out of bounds read in brw_eu_compact.c

  • egl,dri2: Segfault when running wayland clients on non-default GPU

  • anv,regression: Visual glitches in Ghost of Tsushima on BMG

  • anv, regression: Resident Evil 2 d3d12 freezes in main menu on a Arc b580

  • radeonsi: fails to build with libc++

  • Random mesa crashes in kwin_wayland on a 6600XT

  • enc->enc_pic.enc_pic_order_cnt_type always zero even if pic->pic_order_cnt_type non-zero that application set

  • [anv] Visual corruption in Cyberpunk on LNL and BMG

  • [anv] Borderlands 3 visual corruption on BMG

  • [ANV] LNL triangle corruption on clothing in HogwartsLegacy-trace-dx12-1080p-ultra

  • Intel: Dark graphical glitches on cars and characters on Disney Speedstorm

  • Regression in VA-API decoding

  • freedreno: fails to build with Android NDK 27c

  • hk_cmd_draw.c:3471:32: error: expression in static assertion is not constant

  • anv/gfx12: Enable non-zero fast clears for non-FCV CCS_E

  • gen12: 5% regression in factorio

  • 32-bit: error: format ‘%lx’ expects argument of type

  • regression;bisected;FTBFS: commit b13e2a495e9e3da56add7d852ca01b2cd7eef52d breaks x86_32 mesa build

  • glxext.c: error: ‘struct glx_screen’ has no member named ‘frontend_screen’

  • regression;bisected;FTBFS: commit ae76a6a04596bfdbd37bab165bc5f2a5ff60d389 breaks x86 mesa build

  • Can’t allocate dpb buffer on firefox

  • Segmentation fault resetting a query pool used to get BLAS properties

  • libvulkan_lvp link fails if glslangValidator is not installed

  • lvp acceleration structure broken on `main` but not on `staging/24.x`

  • radv: warning that “radv is not a conformant Vulkan implementation” on Navi 32

  • [anv][UHD630] DXVK 2.5 - 2.5.2 with DXVK_HUD=compiler or DXVK_HUD=fps freezes the game or the entire system (Works without compiler/fps HUD, DXVK 2.4.1 works fine)

  • Licenses seems incomplete/misleading

  • anv: Symbol clash in intel_batch_decoder build when expat not available

  • glcts failures on LNL/BMG

  • Lavapipe vulkan 1.4 support?

  • d3d12 vaapi: thread safety issues

  • anv: Missing textures and glitches in It Takes Two (game)

  • [anv][bisected] GravityMark segfault when enabling u-trace on RT workload

  • features.txt does not have a Vulkan 1.4 section despite some drivers already supporting the new version

  • Black screen bug that only affects AMD

  • Failure to correctly decode H.264, possibly specific to use of array output view

  • X1-85: Portal 2: Bottom of portal gun disappears

  • X-Plane 12: Prop disc rendering regression

  • Errors when enumerating devices create incorrect expectations

  • Resident evil 3 remake hanging - f8b584d6 regression

  • R6700XT: QP value doesn’t affect output when using CQP rate control w/ H264/H265 VAAPI encoders

  • Bug in Mesa headers: `error: redefinition of typedef ‘GLsync’`

  • nak: Crash when starting The First Descendant

  • [r300] Regression in f424ef18010 breaks wayland on RS480M

  • anv: Missing text in Age of Mythology Retold on a Arc b580

  • RustiCL and Clover broken with 9b7ea720c93 (!32713)

  • nvk: Artifact Classic crash at loading screen

  • radeonsi VAAPI - vc-1 interlaced decoding garbled on Polaris

  • VDPAU AV1 hardware decoding broken for Mesa 25.0.0-devel

  • mesa: st_glsl_to_nir call to nir_opt_fragdepth might not be valid with MSAA

  • rusticl: warning: pointers cannot be transmuted to integers during const eval

  • X1-85: Half Life 2 water rendering artifacts

  • crash on video playback

  • anv: Allow buffer compression for vkd3d by default?

  • anv: bellwright needs force_vk_vendor=-1 %command% to launch

  • [anv] Possible regression from !31269

  • Up to 60% perf drop in SynMark DrvRes benchmark

  • Memory leak on closing and re-opening X11 windows

  • SIVPE errors on GPU-based screen recording (Radeon 890M)

  • d3d12: va-api: build failure regression since 24.3.0-rc1 with MinGW GCC and clang

  • anv: Marvel Rivals XeSS crash, game needs force_vk_vendor=-1 env variable

  • anv: `MESA: warning: INTEL_HWCONFIG_MIN_GS_URB_ENTRIES (2) != devinfo->urb.min_entries[MESA_SHADER_GEOMETRY] (0)`

  • aco: two nir_shader_clock are miss optimized to one for GFX12

  • aco: opengl buffer blit test fail when using aco on GFX12

  • aco: nir_ddx/ddy v_interp optimization does not work on GFX12

  • VAAPI b_depth 2 causes “manage_dpb_before_encode UVD - Failed to find ref0” error

  • regression;bisected;FTBFS: commits 37d47913437e2e9f72283ea8bffce00efc40fce2 and e67e44522f4f5de4fcde53ad0fb75e396ef31f52 breaks x86 mesa build

  • anv: Enable storage image compression on TGL

  • zink: zink_create_quads_emulation_gs doesn’t write primitive ID

  • DZN/DXIL doesn’t validate GTK shaders

  • black screen and “Failed to add framebuffer” error in wayland compositors when not filtering dmabuf formats with ccs modifiers on intel graphics when upgrading to mesa 24.3.0

  • nir: nir_opt_if_merge_test fails validation with NIR_DEBUG=validate_ssa_dominance

  • radv: Vulkan AV1 video decode glitches

  • radv: support RGP captures for purely compute pipelines

  • regression;bisected: c49a71c03c9166b0814db92420eadac74cbc4b11 leads to artifacts if on top of launched game (in full screen mode) show list running apps (Hold Alt + Tab)

  • !32067 broke piglit “spec@egl_khr_create_context@no-error context gl”

  • Intel: Re-enable bo cache in iris driver (Xe2)

  • [amdgpu][regression] GPU Hang/Reset Triggered by Several Applications

  • ANV: X4 Foundations crashes with vkAllocateDescriptorSets -12

  • About twenty vulkan-samples cases will crash caused by the same error while running on PanVK

  • Firestorm crashes on startup with Mesa 24.3

  • anv: Use-after-free detected by AddressSanitizer while running dEQP-VK

  • GPU process crash via WebGPU shader - UAF in mesa gcm_schedule_early_instr at src/compiler/nir/nir_opt_gcm.c:477

  • radv: DCC causes glitches in Red Dead Redemption 2

  • A5xx rendering issues with firefox

  • [ANV][Regression] Broken rendering in Flycast + Per-Pixel Alpha Sorting

  • [TGL][anv] Performance regression in Dota 2 replay

  • vtn: OpTypeStruct in kernel parameters trigger assertion in glsl_types.h

  • anv: Assertion failure in `dEQP-VK.image.extended_usage_bit_compatibility.image_format_list.s8_uint_optimal_transfer_src_bit`

  • radv: Resident Evil 6 Benchmark Tool has artifacts on 7900 XTX when DCC is enabled, game launched on 4K monitor without scaling and with FullHD settings

  • [AMD RX 6700 XT] Artifacts while upscaling games in fullscreen mode

  • Distorted pixelated graphics with Radeon RX 7900 XT with some games

  • Total War Warhammer 2 Graphical Glitch

  • Glitching artifacts in tile shaped patterns on 6700 XT, when using upscaled fullscreen game on labwc

  • anv: Page fault when using MTL simulator in dEQP-VK.ray_tracing_pipeline.data_spill.report_intersection.float32

  • mesa_cache_db.c:316:33: error: call to undeclared function ‘mremap’

  • [trunk] shaders fail hard in openmw after cbfc225e2bda2c8627a4580fa3a9b63bfb7133e0

  • u_perfetto.h:33:9: error: unknown type name ‘clockid_t’; did you mean ‘clock_t’?

  • brw_fs_opt_copy_propagation incorrectly handles size changes of uniforms

  • RADV Command buffer reuse doesn’t reinitialize is_secondary

  • Virgl:Qcom sa8155 GL_MAX_FRAGMENT_SHADER_STORAGE_BLOCKS/GL_MAX_VERTEX_SHADER_STORAGE_BLOCKS is too small to run antutu benchmark apk

  • nouveau paraview msaa corruption 23.1 bisected regression

  • mesa fails to build due to missing SPV_ENV_UNIVERSAL_1_6 symbol

Changes

Aaron Ruby (6):

  • meson: Remove experimental from gfxstream driver build

  • gfxstream: Some cleanup in manual entrypoints

  • gfxstream: Remove VK_HOST_CONNECTION macro

  • gfxstream: Fix unused variable warnings in ResourceTracker.cpp

  • vulkan/util: Add c99_compat.h inclusion for cpp ‘restrict’ compatibility

  • gfxstream: Remove internal vk_util.h and vk_struct_id.h entirely

Adam Jackson (2):

  • docs/envvars: Remove mention of IRIS_ENABLE_CLOVER

  • docs/envvars: Combine WGL sections

Alejandro Piñeiro (1):

  • docs/features: mark VK_EXT_scalar_block_layout as supported for vc7+

Aleksi Sapon (9):

  • draw: primitive ID is per-patch

  • llvmpipe: spec@arb_tessellation_shader@execution@gs-primitiveid-instanced is fixed

  • zink: spec@arb_tessellation_shader@execution@gs-primitiveid-instanced is fixed

  • draw: front-face injection must check geometry shader primitive type

  • llvmpipe: PointCoord is offset when multisampling is enabled

  • meson: fix finding Python on Windows

  • llvmpipe: fix lp_test_arit on Windows

  • llvmpipe: LLVM v2f32 trunc/floor/ceil/nearbyint generates optimal x86 code since at least version 8

  • llvmpipe: disable anisotropic filtering for non-2D textures

Alyssa Rosenzweig (206):

  • nir/opt_algebraic: optimize patterns from Skia

  • nir/opt_algebraic: add more 64-bit patterns

  • nir/opt_algebraic: add another 64-bit pattern

  • nir: add amul flag

  • nir: add late_lower_int64 option

  • nir: add ilea_agx/ulea_agx opcodes

  • nir/builder: use amul over ishl on agx

  • nir/opt_algebraic: don’t lower amul if requested

  • nir/lower_uniforms_to_ubo: use amul

  • rusticl: respect late_lower_int64

  • agx: vectorize SSBOs

  • agx: model IC dispatch

  • agx: fix bfeil timing

  • hk: reduce max SSBO size

  • libagx: promote math to use AGX address mode

  • agx: rewrite address mode lowering

  • agx: change int conversion test

  • agx: add pseudo for signext

  • agx: optimize signext+iadd

  • agx: fold zext into int sources

  • agx: add tests for sign/zero-extend propagate

  • agx: fix atomics in tess count shaders

  • hk: don’t advertise impossible modifiers

  • agx: optimize signext imad

  • agx: fuse iadd+large shift into imad

  • agx: make imad+ishl rules actually work

  • hk: drop assert

  • hk: fix meta shader name

  • libagx: fix cl warning

  • libagx: drop branch

  • libagx: drop dead code

  • libagx: vectorize triangle def’n

  • libagx: drop Clockwise

  • libagx: simplify index patch expression

  • libagx: don’t key unroll to index size

  • libagx: fix unroll kernel constant qualifier

  • libagx: drop silliness in restart kernel

  • agx: fuse also 8-bit address math

  • asahi: extract agx_get_num_cores

  • asahi: correct core count, max freq

  • asahi: fix a2c with sample shading, harder

  • asahi: assert/cse resource valid

  • asahi: don’t take compiled_shader in agx_build_internal_usc

  • asahi: drop dead param

  • asahi: factor out more compiled shader

  • asahi: move agx_gather_device_key

  • util: add u_tristate data structure

  • panfrost: switch to u_tristate

  • agx: make needs_g13x_coherency a tri-state

  • nir/lower_convert_alu_types: use intrinsics_pass

  • nir/conversion_builder: avoid redundant uint->uint clamp

  • nir/opt_algebraic: optimize convert_uint_sat(ulong)

  • nir: add names to function parameters

  • nir/print: print function signature

  • nir/print: annotate entrypoints

  • nir/print: print parameter names in calls

  • vtn: gather function parameter names

  • vtn: use rzalloc in bindgen

  • vtn: use named parameters in bindgen

  • vtn: preserve name, is_return in bindings

  • nir: split off some definitions for OpenCL

  • compiler: make glsl_sampler_dim available to CL

  • nir/lower_system_values: add ID to 32-bit lowering

  • nir: add nir_fixup_is_exported pass

  • vtn: introduce vtn_bindgen tool

  • libagx: switch to vtn_bindgen

  • libagx: move out of lib/

  • libagx: DCE

  • asahi: drop dead ACCESS

  • asahi,agx: move texture lowering into the compiler

  • asahi: drop desc align alloc

  • asahi/decode: disasm 3D helper progs

  • asahi/clc: drop getopt

  • agx: vectorize scratch access

  • agx: gather workgroup size

  • asahi,hk: reenable rgb32 buffer textures

  • hk: generalize internal launch

  • hk: expose missing eds3 feature

  • hk: handle mismatching colour vs z/s dimensions

  • hk: implement EXT_depth_bias_control

  • hk: be robust against invalid MSAA inputs

  • hk: do not increment GS queries for passthru GS

  • hk: use common wg size

  • hk: add cmd buffer to hk_cs

  • hk: dce

  • libagx: fix return type

  • libagx: don’t export vertex_id_for_top

  • asahi/genxml: fix 0 encoding for groups

  • asahi/genxml: fix 128-bit in CL path

  • asahi/genxml: optimize out masking with shr

  • asahi/genxml: define missing macros

  • asahi: add XML for cdm stream link with return

  • asahi: refmt

  • vtn: ignore SpvFunctionParameterAttributeSret

  • nir/pack_bits: handle 8-bit vec8 -> 64-bit

  • nir: add nir_lower_calls_to_builtins pass

  • asahi/clc: switch to nir_lower_calls_to_builtins

  • nir: add nir_foreach_entrypoint macros

  • nir: add workgroup size to functions

  • vtn: plumb through OpEntryPoint

  • vtn: gather workgroup size in libraries

  • nir: add nir_function::pass_flags

  • nir: add nir_remove_entrypoints helper

  • nir: add nir_lower_constant_to_temp helper

  • nir: add helpers for precompiled shaders

  • asahi,vtn: precompile kernels

  • libagx: increase wg size for query copy

  • asahi: crash on fault

  • hk: fix incorrect index size translate

  • hk: fix z bias perf regression

  • hk: implement hack for layered no attachments

  • hk: clarify bounds check calculations

  • agx: disable bounds check optimization

  • agx: reduce preamble/main alignment

  • asahi: drop dead pool stuff

  • asahi: don’t leak rodata

  • hk,asahi,libagx: unify a bit of code

  • asahi: drop dead

  • asahi: fix page size alignment

  • asahi: fix u_blitter related leaks

  • asahi: label individual pools

  • asahi,hk: mmap BO on first use

  • asahi: add more asserts around bo add

  • asahi: fix agx_batch_add_bo

  • asahi: add =bodump debug help

  • asahi: fix agxdecode memory mapping

  • hk: implement timestamps

  • hk: claim 1.4

  • zink: fix gl_PrimitiveID reads with quads

  • nir/search_helpers: handle bcsel in is_only_used_as_float

  • nir/opt_algebraic: optimize sign bit manipulation

  • nir/opt_load_store_vectorize: match amul like imul

  • nir,asahi: make argument alignment configurable

  • mesa_clc: add depfile support

  • libagx: switch to depfile support

  • libagx: remove redundant source files

  • vulkan: rename depth bias graphics states

  • vulkan: bump layer api versions

  • nir: add printf_abort intrinsic

  • nir/lower_printf: allow fixed address

  • nir/lower_printf: lower aborts

  • nir/lower_printf: use unsigned math

  • nir/lower_printf: use 64-bit math

  • util/printf: be robust against truncated buffers

  • util/printf: add context-ful helpers

  • vulkan: add vk_check_printf_status helper

  • nir/lower_point_size: skip non-var derefs

  • clc: plumb cl_khr_subgroup_ballot

  • libcl: add a common header for CPU/GPU stuff

  • libcl: add VkDraw(Indexed)IndirectCommand definitions

  • util/bitpack_helpers: make partially CL safe

  • asahi: allow c23 extensions

  • asahi/clc: remap __FILE__

  • asahi,hk: wire up printf, abort

  • agx: implement halts

  • libagx: drop pointless helper

  • libagx: port to common libcl.h

  • compiler: use libcl.h for CL

  • compiler: add mesa_prim_has_adjacency helper

  • asahi: use mesa_prim_has_adjacency

  • nir: add lower_scratch_to_var pass

  • compiler/glsl_types: add glsl_get_word_size_align_bytes

  • agx: optimize scratch access

  • radeonsi: use mesa_prim_has_adjacency

  • asahi: fix mmap’ing imported BOs

  • hk,libagx: move hk_draw to the gpu

  • asahi: use common draw

  • libagx: add missing agx_vdm_return

  • agx: add more 8-bit address fusing rules

  • asahi: reformat

  • agx: match another address pattern

  • libagx: move index size helpers to the gpu

  • libagx: refactor index buffer code

  • libagx: factor out load/store_index

  • hk: use index buffer overflow check

  • hk: factor out hk_draw_as_indexed_indirect

  • hk,libagx: accelerate index buffer robustness

  • hk,libagx: handle adjacency without a GS

  • libagx,hk: handle pipeline stats queries without a GS

  • libagx: use designated initializers

  • hk: avoid compiling unneeded VS->GS variants

  • hk: fix primitive restart dirty tracking

  • glsl: fix glsl_get_word_size_align_bytes

  • nir: pass a callback to nir_lower_robust_access

  • nir/lower_robust_access: fix robustness with atomic swap

  • libagx: add agx_barrier enum

  • nir,asahi,hk: add barrier argument to MESA_DISPATCH_PRECOMP

  • intel: set max_buffer_size to nir_lower_printf

  • nir/lower_printf: drop null check

  • nir/lower_printf: drop default max buffer size

  • nir,util: move printf serializing into util

  • util: add u_printf_hash helper

  • util/u_printf: add singleton implementation

  • util/u_printf: allow printing from singleton

  • nir/lower_printf: add option to hash format strings

  • nir/lower_printf: support dynamic buffer size

  • nir: add nir_lower_printf_buffer pass

  • agx: defer printf address lowering

  • nir/lower_printf: drop static buffer addr lowering

  • util,vulkan,asahi,hk: hash format strings

  • nir/lower_robust_access: do not preserve control flow

  • nir: fix O(N^2) behaviour in nir_remove_dead_variables

  • meson: project-wide fs = import('fs')

  • clc,libagx: drop --in for mesa_clc

  • clc,libagx: automatically set lang version

  • nir/serialize: strip function names

Antonino Maniscalco (1):

  • nir,zink,asahi: support passing through gl_PrimitiveID

Antonio Ospite (53):

  • ci/deqp: replace local android patches with upstream solution

  • docs/android: update docs/android.rst after libgallium_dri updates

  • docs/android: improve documentation about building llvmpipe for Android

  • docs: remove leftover mention of meson dri3 option

  • ci/android: unset compiler env vars in debian/android_build.sh

  • ci/android: add a script to build LLVM libraries for Android

  • ci/container: remove S3_JWT_FILE when container_job_trampoline.sh exits

  • ci: set GIT_COMMITTER_DATE in a locale-agnostic format

  • ci/deqp: refresh some patches to apply on top of recent VK-GL-CTS

  • ci/deqp: cherry-pick fixes for building GL and GLES deqp on Android

  • ci/deqp: enable building testlog tools on Android too

  • ci/deqp: collect the mustpass lists also for the android target

  • ci/android: fix problem with deqp version file when building for Android

  • ci/android: build deqp for DEQP_API=VK

  • ci/android: build llvmpipe driver for Android by forcing llvm fallback

  • ci/android: don’t copy the DRI drivers which are not needed anymore

  • ci/android: restart all services after copying the new mesa libraries

  • ci/android: handle premature exit of .gitlab-ci/cuttlefish-runner.sh

  • ci/android: update version of cuttlefish host tools

  • ci/android: add sudo to EPHEMERAL deps for debian/x86_64_test-android.sh

  • ci/android: get custom cuttlefish images from the S3

  • ci/android: make cuttlefish-runner.sh more robust against different Android images

  • ci/android: better separate host and guest mesa artifacts

  • ci/android: use a custom kernel when launching cuttlefish

  • ci/android: fix warning when using chown

  • ci/android: fix result dir for Android guest execution of deqp-runner

  • ci/android: don’t call cuttlefish-host-resources script

  • ci/android: reorder PATH and LD_LIBRARY_PATH values to clarify priority

  • ci/android: also copy mesa vulkan libraries to the Android guest

  • ci/android: update list of deqp files pushed to the guest system

  • ci/android: use a native adb connection

  • ci/android: set XDG_CACHE_HOME and pass --shader-cache-dir to deqp-runner

  • ci/android: use a /data/deqp subdirectory on guest to store dEQP files

  • ci/android: set VK_DRIVER_FILES before launching cuttlefish

  • ci/android: add ci rules to test llvmpipe on Android

  • ci/android: add ci rules to test venus on Android

  • ci/android: upgrade DEBIAN_TEST_ANDROID_TAG

  • ci/android: fix meson C++ cross-compiler argument detection

  • ci/android: update ANDROID_NDK and ANDROID_SDK_VERSION

  • ci/android: use ANDROID_SDK_VERSION when building deqp components

  • ci/android: use ANDROID_SDK_VERSION for debian-android job too

  • ci/android: rename variable ANDROID_NDK to ANDROID_NDK_VERSION

  • docs/android: bump suggested platform-sdk-version to 34

  • freedreno/meson: remove C++ cross-build arguments HACKs

  • freedreno/meson: sort list of options passed to get_supported_arguments()

  • ci/android: update CUTTLEFISH_BUILD_NUMBER

  • ci/android: define an INSTALL var for the source of mesa artifacts

  • ci/android: improve handling of expectation files

  • ci/android: fix pulling results from Android device

  • ci/android: post-process testlog XML and create a junit.xml

  • ci/android: pass --max-fails to deqp-runner in cuttlefish-runner.sh

  • ci/android: pass --allow-downgrades when installing cuttlefish host tools

  • ci/android: stop pushing libglapi.so since it’s not available anymore

Arseny Kapoulkine (1):

  • radv: On GFX11, use box sorting heuristic based on ray flags

Arvind Yadav (1):

  • amd: Add amdgpu userqueue IOCTL functions

Asahi Lina (16):

  • asahi: Add pipe bind flags to resource debug

  • asahi: Add PIPE_BIND_SHARED to imported resources

  • asahi: Extract agx_decompress_inplace()

  • asahi: Introduce batch->feedback to disable compression in PBE

  • asahi: In-place decompress shared resources for feedback loops

  • hk: Add virtio implicit sync support

  • hk: Fix DRM modifier selection for compressed surfaces

  • hk: Enable missing swapchainMaintenance1 support

  • asahi: Use 64bit size fields

  • hk: Bump up max buffer size

  • asahi: UAPI update to add GET_TIME & cleanup

  • asahi: Fix agx_gpu_time_to_ns & implement DRM_ASAHI_GET_TIME

  • asahi: UAPI update to add support for user timestamp buffers

  • asahi: Add timestamp buffer ops

  • asahi: Virt UABI update

  • asahi: hk: Enable timestamps for virt

Autumn Ashton (1):

  • radv/video: Fix bitstreamStartOffset including dstBufferOffset

Bas Nieuwenhuizen (1):

  • util/perf: Fix some warnings.

Benjamin Cheng (4):

  • ac/vcn: allow sq signature package to be skipped

  • radv/video: support event for pre-VCN4 encode queues

  • radv/video: support event for pre-VCN4 decode queues

  • radv/video: enable by default on vcn2/3 with latest fw

Benjamin Lee (36):

  • panvk: inherit sample count in secondary cmdbufs

  • nir: clamp small W in nir_lower_viewport_transform

  • nir: document order requirement for nir_lower_viewport_transform

  • panvk: refactor fbinfo into a temp var in get_tiler_desc

  • panvk: treat provoking vertex as dynamic state

  • panvk: set provoking vertex in fbinfo

  • panvk: advertise VK_EXT_provoking_vertex

  • nir: handle arbitrary per-view outputs in nir_lower_multiview

  • nir: document index semantics in nir_lower_multiview

  • nir: treat per-view outputs as arrayed IO

  • nir: add option to use compact view indices

  • panvk: implement multiview support

  • panvk: only clear enabled views

  • panvk: disable position fifo optimization when multiview enabled

  • panvk: advertise multiview support on v10+

  • panvk: add note about pan_lower_store_component requirements

  • nir: update docs for nir_get_io_arrayed_index_src

  • panvk: set uses_sample_shading NIR flag when sample shading is forced

  • panvk: fix sample position when sample shading is disabled

  • panvk/csf: fix alpha-to-coverage

  • panfrost: add intrinsic to load frag coord at a barycentric

  • panfrost: add nir pass to lower noperspective varyings

  • panfrost: collect noperspective varyings in shader info

  • panvk: pass noperspective_varyings sysval as a push constant

  • panfrost: add pass to lower noperspective varyings to a constant

  • panvk: use static noperspective when statically linking VS and FS

  • panfrost: factor FS shader key into a helper function

  • panfrost: specialize VS on FS interpolation qualifiers

  • panvk: handle sample mask writes on 1-sample targets

  • panvk: remove load_multisampled_pan sysval

  • panfrost/va: add FLUSH instruction

  • panfrost/va: implement fquantizetf16 ftz

  • panvk: disable round_to_nearest_even for NEAREST-filtered samplers

  • panfrost: remove incorrect usage of MALI_PIXEL_KILL_STRONG_EARLY

  • panfrost: fix hang by using MALI_PIXEL_KILL_WEAK_EARLY in color preload

  • panfrost: remove is_blit flag

Benjamin Otte (1):

  • vulkan/wsi: Support alpha swapchains on win32

Benjamin ROBIN (1):

  • util/disk_cache: Do not try to delete old cache if cache is disabled

Bo Hu (5):

  • gfxstream: snapshot: avoid double boxing dispatchable handle

  • gfxstream: snapshot: DescriptorSet allocate and update

  • gfxstream-guest: update offset to correct value

  • update decoder.py to clean up un-used ApiCallInfo

  • remove the mReconstructionMutex in load

Boris Brezillon (103):

  • panvk: Enable CI on G610

  • pan/ci: Move g610-vk jobs to post-merge CI

  • panvk: Change the prototype of panvk_select_tiler_hierarchy_mask()

  • panvk: Kill unused fields in panvk_cmd_graphics_state

  • panvk: Move the panvk_cmd_graphics_state definition to panvk_cmd_draw.h

  • panvk: Move panvk_cmd_compute_state to a common place

  • panvk: Move is_dirty() to panvk_cmd_draw.h and rename it

  • panvk: Don’t link the VS and FS shaders on v10

  • panvk: Sanitize the driver-internal dirty state tracking

  • panvk: Move common gfx bits to a new source file in the common dir

  • panvk: Cache the fs_required() result

  • panvk/csf: Fix a wait-LS operation in finish_cs()

  • panvk/cs: Poison cmdbuf registers when PANVK_DEBUG=cs is set

  • panvk/ci: Update CI expectations to have a green CI

  • panfrost: Increase AFBC body alignment requirement on v6+

  • panfrost: Add a helper to expose the maximum effective tile size

  • panfrost: Add the concept of render block

  • panfrost: Add support for AFBC(split)

  • panfrost: Advertise support for AFBC(32x8,sparse,split)

  • pan/decode: Flush the dump file before crashing

  • panvk/csf: Keep a cache of the CS reg file at the panvk_queue level

  • panvk/csf: Fix cross command buffer render pass suspend/resume

  • panvk/csf: Explain why the tiler is set to 0xdeadbeefdeadbeef

  • panvk: Fix panvk_plane_index() for D32_SFLOAT_S8_UINT

  • pan/cs: Add cs_exception_handler_ctx

  • pan/cs: Align exception handlers with NOPs

  • pan/cs: Add dynamic save_reg to exception handler

  • pan/cs: Add block macro for exception handler

  • panvk/csf: Fix register overlap in issue_fragment_jobs()

  • pan/cs: Return the dump region size when an exception handler is defined

  • pan/cs: Return exception handler size/address

  • panfrost: Add cs_exception_handler_def() to the ForEachMacros list

  • panvk/csf: Use the information returned by cs_exception_handler_def()

  • panfrost: Use the handler size returned by cs_exception_handler_def()

  • panvk: Filter out input-attachment usage on non renderable formats

  • pan/decode: Untangle CS disassembling and interpretation

  • pan/decode: s/interpret_ceu/interpret_cs/

  • pan/decode: Rename pandecode_cs() into pandecode_interpret_cs()

  • pan/decode: Add a helper to print CS binaries without interpreting them

  • pan/decode: Provide a helper to print messages outside of the decoding path

  • pan/cs: Add a LOAD_IP pseudo instruction

  • pan/cs: Add an event-based tracing mechanism

  • panvk/csf: Use event-based CS tracing

  • panvk/csf: Don’t disable SIMULTANEOUS_USE when tracing is enabled

  • panvk: Add a flag to force SIMULTANEOUS_USE

  • pan/texture: Move the plane info retrieval logic to a helper function

  • pan/texture: Stop passing the view format around

  • pan/texture: s/index/plane_index/ in panfrost_emit_plane()

  • pan/texture: Stop passing a layout to panfrost_emit_plane()

  • pan/texture: Pass pan_image_section_info around

  • nir: Let nir_lower_texcoord_replace_late() report progress

  • panfrost: s/NIR_PASS_V/NIR_PASS/

  • panfrost: Use nir_shader_intrinsics_pass() for the line_smooth lowering pass

  • panvk: s/NIR_PASS_V/NIR_PASS/

  • pan: s/NIR_PASS_V/NIR_PASS/

  • panvk: Move the descriptors preparation out of CreateImageView()

  • vk/meta: Pass depth/stencil attachments only when a clear is requested

  • panvk: Ignore the view aspects when dealing with depth/stencil attachments

  • pan/cs: Fix cs_builder allocation failure robustness

  • panvk: Wrap our descriptor lowering passes in NIR_PASS()

  • panvk: Stop using magic values for the sysval push constant offset/range

  • panvk: Automate sysval access from NIR shaders

  • panvk: Lower dynamic push_constant loads in desc_copy logic

  • panvk: Lower load_push_constant with dynamic offset to global loads

  • pan/bi: Get rid of bi_lower_load_push_const_with_dyn_offset()

  • panvk: Don’t define push_constant range/base when we don’t have to

  • pan/indirect: Don’t use .base to pass the push_constant offset

  • pan/mi: Don’t pretend we support push constants

  • pan/bi: Disallow non-zero .{range,base} on load_push_constant instructions

  • pan/bi: Fix mem_access_size_align_cb() for push constants

  • panvk: Don’t lower load_base_vertex

  • panvk: Fix first_vertex/base_instance types

  • pan: Don’t pretend we support load_{vertex_id_zero_base,first_vertex}

  • panvk: Don’t lower load_blend_const_color_rgba

  • panvk: Factor-out the sysvals initialization logic

  • panvk: Pass a cmdbuf to blend_emit_descs()

  • panvk: Pack push constants

  • panfrost: Kill the mali_ptr typedef

  • panfrost: Kill the uXX typedefs

  • panfrost: Move MALI_EXTRACT_INDEX to pan_format.h

  • panfrost: Move MAX_{MIP_LEVELS,IMAGE_PLANES} to pan_texture.h

  • panfrost: Kill panfrost-job.h

  • panvk: Don’t invalidate the viewport on cull mode updates

  • panvk/jm: Fix depth clipping with small viewport depth range

  • panvk: Fix an alignment issue on x86

  • panvk: Fix panvk_priv_mem_bo() on 32-bit platforms

  • panfrost/ci: Add panvk and panfrost to the debian-x86_32 job

  • pan/genxml: s/PAN_PAN_HELPERS_H/PAN_PACK_HELPERS_H/

  • pan/genxml: Include pan_pack_helpers.h instead of copying it

  • pan/genxml: Generate MALI_XXX_PACKED_T macros

  • panfrost: Fix instanced draws when attributes have a non-zero divisor

  • pan/cs: Fix the tracepoint register dump loops

  • pan/cs: Allow undefined value if condition=always in cs_branch_label()

  • pan/cs: cs_{break,continue} are not for_each macros

  • panvk/csf: Make all sync operations on the CSG scope

  • panvk/csf: Use cs_sr_reg64() instead of cs_reg64() when setting the OQ pointer

  • panvk/csf: Rework the occlusion query logic to avoid draw flushes

  • panvk/csf: Fix add_memory_dependency() for input attachment access

  • panvk/csf: Add a knob to force texture cache invalidation on RUN_FRAGMENT

  • panvk: Don’t clobber registers if the render pass was suspended

  • pan/decode: Fix the blend_count mask

  • panvk/csf: Don’t free the resources twice when init_render_desc_ringbuf() fails

  • panvk: Initialize device virtual address space after the VM creation

Brad Smith (1):

  • util: Support elf_aux_info() on OpenBSD arm and ppc

Brian Paul (2):

  • svga: add svga_resource_create_with_modifiers() function

  • svga: fix printing 64-bit value for 32-bit build

Caio Oliveira (90):

  • intel/executor: Fix exec_size in @read macro for Xe2

  • intel/brw: Add test for combining SWSB dependencies in SENDs

  • intel/brw: Allow extra SWSB encodings for Xe2

  • intel/common: Properly dispose resources in mi_builder tests

  • intel/common: Prepare mi_builder tests to support Xe KMD

  • intel/common: Implement Xe KMD in mi_builder tests

  • intel/common: Enable mi_builder test for PTL

  • intel/brw: Add SHADER_OPCODE_BALLOT

  • intel/brw: Add SHADER_OPCODE_QUAD_SWAP

  • intel/brw: Omit type and region in payload sources when printing IR

  • intel/brw: Use <V,W,H> notation for FIXED_GRF and ARF source when printing IR

  • intel/executor: Enable PTL

  • intel/brw: Fix decoding of cond_modifier and saturate in EU validation

  • intel/brw: Fix SWSB output when printing IR

  • intel/brw: Dump IR after lower scoreboard pass

  • util/ra: Remove unimplemented function declaration

  • intel/brw: Add is_control_source for the new subgroup ops

  • mr-label-maker: Rules for intel/executor

  • intel/brw: Enable EU validation and compaction tests for PTL

  • intel/brw: Dump errors when brw_assemble() fails EU validation

  • intel/compiler: Use #pragma once instead of header guards

  • intel/brw: Remove overloads for brw_print_instruction/s functions

  • intel/brw: Consider if SEND is gather variant when setting ex_desc

  • intel/brw: Add TGL_PIPE_SCALAR value

  • intel/brw: Add assembly support for ARF scalar register

  • intel/brw: Add validation for ARF scalar register

  • intel/executor: Add example using scalar register and send gather

  • intel/brw: Skip some regioning EU validation for Vx1 and VxH modes

  • intel/brw: Extract format enum in EU validation code

  • intel/brw: Add validation for some Xe2 register regioning restrictions

  • intel/brw: Add some tests for new Xe2 register regioning restrictions

  • intel/brw: Add SHADER_OPCODE_READ_FROM_CHANNEL and LIVE_CHANNEL

  • intel/brw: Disallow cmod in some cases of ARF scalar as destination

  • intel/brw: Use variable instead of manually count the passes

  • intel/brw: Rename brw_inst.h to brw_eu_inst.h

  • intel/brw: Rename brw_inst to brw_eu_inst

  • intel/brw: Rename brw_compact_inst to brw_eu_compact_inst

  • intel/brw: Rename brw_inst_bits/set_bits to brw_eu_inst_bits/set_bits

  • intel/brw: Rename brw_inst_* helpers to brw_eu_inst_*

  • intel/brw: Rename brw_compact_inst_* helpers to brw_eu_compact_inst_*

  • intel/brw: Gather brw_reg related implementations in brw_reg.cpp

  • intel/brw: Add missing call to invalidate analysis

  • intel/brw: Move two NIR passes to brw_nir.c

  • gallium/meson: Ensure all needed sym_config are set.

  • intel/brw: Remove ‘fs’ prefix from passes filenames

  • intel/brw: Remove ‘fs’ prefix from passes and related functions

  • intel/brw: Add missing bits in 3-src SWSB encoding for Xe2+

  • intel/brw/xe2+: Do not use $.dst or $.src SWSB annotations in SENDs

  • intel/compiler: Use INFINITY spill cost to represent no_spill

  • util: Add operator new[] to linear context helper declarations

  • intel/compiler: Use linear allocator for ACP trees in copy-prop

  • intel/brw: Remove uses of VLAs

  • intel/elk: Add ELK_MAX_MRF_ALL for static allocating arrays

  • intel/elk: Remove uses of VLAs

  • intel/elk: Fix typo in assertion

  • util/ra: Move less used data out of ra_node

  • util/ra: Don’t store a pointer to graph per ra_node

  • util/ra: Bump the initial size of adjacency lists

  • util/ra: Don’t store a pointer to a ra_regs per ra_reg

  • intel/brw: Rename brw_fs_validate to brw_validate

  • docs: Update syntax on Performance tips page

  • intel/brw: Rename brw_fs_generator.cpp to brw_generator.cpp

  • intel/brw: Add brw_generator.h header

  • intel/brw: Rename fs_generator to brw_generator

  • intel/brw: Add missing cases to flags_written()

  • intel/brw: Remove extra wrapping around fs_visitor in tests

  • intel/brw: Rename brw_fs_builder.h to brw_builder.h

  • intel/brw: Rename fs_builder to brw_builder

  • intel/brw: Stop using namespace for brw_builder

  • intel/brw: Move a few builder helpers to brw_builder.h/cpp

  • intel/brw: Move shuffle_from_32bit_read implementation to brw_builder

  • intel/brw: Apply conventions to lower_src_modifiers helper

  • intel/brw: Rename brw_fs_reg_allocate.cpp to brw_reg_allocate.cpp

  • intel/brw: Remove ‘fs’ prefix from reg alloc code

  • intel/brw: Rely on existing helper for dispatch width of geometry stages

  • intel/elk: Fix wrong destination to memset

  • intel/brw: Use brw prefix for some schedule instructions identifiers

  • intel/brw: Use brw prefix instead of namespace in dynamic_msaa_flags()

  • intel/brw: Remove unused enum

  • intel/executor: Fix typo when copying result into Lua table

  • intel/tools: Use idep_libintel_common in meson

  • intel/tools: Add helpers for decoder_init/disasm

  • intel/tools: Merge libaub into libintel_tools

  • intel: Add meson option -Dintel-elk

  • intel/brw: Add scoreboard support for scalar register

  • intel/brw: Plumb through generator whether SEND is gather variant

  • intel/brw: Add SHADER_OPCODE_SEND_GATHER

  • intel/brw: Add lowering for SHADER_OPCODE_SEND_GATHER

  • intel/brw: Use SHADER_OPCODE_SEND_GATHER in Xe3

  • intel/brw: Fallback to SEND from SEND_GATHER if possible

Caleb Callaway (2):

  • docs: Intel GPU performance tips

  • docs: clarify ASPM performance tips

Casey Bowman (1):

  • vulkan/screenshot-layer: Add region command option

Caterina Shablia (9):

  • pan/bi: fix a typo

  • pan/va: fix WMASK packing

  • pan/bi: handle read_invocation

  • pan/bi: handle ballot, ballot_relaxed and as_uniform

  • pan/bi: lower some subgroup intrinsics

  • pan/bi: lower the rest of subgroup ops using nir_lower_subgroups

  • pan/bi: add a MEMORY_BARRIER pseudo-instruction

  • pan/bi: handle barriers with SUBGROUP scope

  • panvk: enable subgroupSizeControl

Chen, Phoebe (1):

  • amd/vpelib: Refactor YUV format check

Chia-I Wu (69):

  • panvk: ensure res table is restored after meta

  • panvk: add memory mmap/munmap helpers

  • panvk: do not leak mapped memory

  • panvk: update CI expectations

  • panvk: add get_subqueue_stages

  • panvk: rework collect_cache_flush_info

  • panvk: rework collect_cs_deps

  • panvk: always skip frag->tiler subqueue wait

  • panvk: skip frag subqueue self-wait within a render pass

  • panvk: skip tiler subqueue self-wait within a render pass

  • panvk: improve should_split_render_pass

  • panvk: fix a missing cache invalidation

  • panvk: update expectations for G610

  • vulkan: include host write in expanded dst access flags

  • panvk: add normalize_dependency

  • panvk: improve VK_QUEUE_FAMILY_EXTERNAL support

  • panvk: add support for VK_EXT_queue_family_foreign

  • panvk: fix base_workgroup_id sysval

  • ci: update the comment on MESA_VK_ABORT_ON_DEVICE_LOSS

  • panvk: report queue lost timely when PANVK_DEBUG=sync

  • panvk: implement check_status on v10+

  • panvk: no need to map IB internally on valhall

  • panvk: clang-format issue_fragment_jobs

  • panvk: fix frag_completed for layered rendering

  • panvk: minor clean up to prepare_blend

  • panvk: fix dirty check for prepare_blend

  • panvk: expand top-of-pipe and bottom-of-pipe

  • panvk: use u_foreach_bit to loop over mask bits

  • panvk: fix vs image support

  • panvk: add panvk_queue_submit_init

  • panvk: add panvk_queue_submit_init_storage

  • panvk: add panvk_queue_submit_init_waits

  • panvk: add panvk_queue_submit_init_cmdbufs

  • panvk: add panvk_queue_submit_init_signals

  • panvk: add panvk_queue_submit_ioctl

  • panvk: add panvk_queue_submit_process_signals

  • panvk: add panvk_queue_submit_process_debug

  • panvk: clean up panvk_queue_submit

  • panvk: move pandecode_next_frame a bit earlier

  • panvk/csf: fix SIMULTANEOUS_USE gpu faults

  • panvk/csf: fix subqueue ctx memory pool

  • panvk: use cs_tracing_ctx::enabled for exception handler

  • panvk: add u_trace_context to panvk_device

  • panvk: define cmdbuf begin/end tracepoints

  • panvk/csf: add CS_REG_SCRATCH_COUNT

  • panvk/csf: add u_trace to panvk_cmd_buffer

  • panvk/csf: add vk_sync to panvk_queue

  • panvk/csf: flush and process trace events for one-time cmdbufs

  • panvk/csf: flush and process trace events for all cmdbufs

  • panvk: improve C++ compat for perfetto

  • panvk: add u_trace perfetto support

  • panvk: silence a perfetto init warning

  • vulkan: add vk_device_get_timestamp

  • vulkan: add common GetPhysicalDeviceCalibrateableTimeDomainsKHR

  • vulkan: add common GetCalibratedTimestampsKHR

  • anv: use common calibrated timestamp support partially

  • hasvk: use common calibrated timestamp support

  • radv: use common calibrated timestamp support

  • tu: use common calibrated timestamp support

  • nvk: use common calibrated timestamp support

  • hk: remove calibrated timestamp support

  • panvk: no need to zero availability on query create

  • panvk: no need to check query count on query create

  • panvk: no need to zero results on query reset

  • panvk/csf: no need to sb wait on query begin

  • panvk/csf: no need to sb wait on query end

  • panvk/csf: no need to sb wait on query copy

  • panvk/csf: no need to flush caches after query copy

  • panvk/csf: add a comment on query synchronization

Christian Gmeiner (20):

  • broadcom/common: Make v3d_device_info.h usable for C++

  • v3d: Move v3d_ioctl(..) to src/broadcom/common

  • v3dv: Switch to v3d_ioctl(..)

  • v3d: Move v3d_X(..) to src/broadcom/common

  • v3dv: Switch to v3d_X(..)

  • broadcom: Add perfcount library

  • v3d: Switch to use libbroadcom_perfcntrs

  • v3dv: Switch to use libbroadcom_perfcntr

  • etnaviv: blt: Add DBG(..) why blt usage was not possible

  • etnaviv: rs: Add DBG(..) why blt usage was not possible

  • v3d: Sync v3d_drm.h with drm-misc-next

  • broadcom: Add perfetto data source

  • pps: Add support for v3d ds

  • perfetto: Add v3d data sources to system.cfg

  • perfetto: Add v3d data sources to gpu.cfg

  • docs: Update perfetto with the latest status

  • etnaviv: isa: Support src2 for texld

  • etnaviv: isa: Support src2 for texldb and texldl

  • egl/meson: Specify which symbols to export

  • v3dv: Add some CPU tracepoints

Christopher Michael (5):

  • v3d: Add check to see if v3d supports cpu_queue

  • v3d: Add check to see if v3d supports multisync

  • v3d: Add support for timestamp queries

  • v3d: Add support for time elapsed queries

  • v3d: Add support for PIPE_QUERY_TIMESTAMP_DISJOINT

Collabora’s Gfx CI Team (5):

  • Uprev Piglit to eebe1b555f51dbb702f696d08ad5ae8153bcdcdd

  • Uprev Piglit to d04d6fff00849a2a8e29ef3251c6ca04a2f68dc7

  • Uprev Piglit to 468221c722481c470e6a23760b914c33143c2af6

  • Uprev Piglit to 4c0fd15fd956ec70c5509bedee219d602b334464

  • Uprev Piglit to 631b72944f56e688f56a08d26c8a9f3988801a08

Connor Abbott (55):

  • vulkan/runtime: Add driver callbacks for BVH building

  • vulkan/runtime,radv: Add shared BVH building framework

  • vulkan/runtime,radv: Add shared BVH building framework

  • ir3: Fix reload_live_out() in shared RA

  • tu: Add Vulkan 1.4 features and properties

  • tu: Expose Vulkan 1.4 on a7xx

  • tu: Move queue-related code to a new file

  • tu: Refactor the submit path

  • tu/kgsl: Make wait_timestamp_safe() return VkResult

  • tu/knl: Move u_trace fence handling to generic code

  • tu: Rename bo_list to submit_bo_list

  • util/dynarray: Add macro for appending an array

  • tu: Make userspace RD dump generic

  • freedreno/fdl: Make tiled r8g8 images have 4k alignment

  • tu: Re-enable tiled non-ubwc R8G8 images

  • freedreno/fdl: Fix 3d mipmapping height alignment

  • freedreno/fdl, tu: Make mutable part of the image layout

  • freedreno/fdl: Don’t enable r8g8 special case for mutable images

  • freedreno/fdl, tu: Allow swaps with mutable tiled images

  • tu: Allow UBWC with images with swapped formats.

  • vk/bvh: Fix clang build error with turnip

  • ir3: Allow collect sources to be undef

  • ir3: Support assembling/disassembling ray_intersection and resbase

  • ir3: Plumb through two-dimensional UAV loads

  • ir3: Plumb through ray_intersection intrinsic

  • tu: Implement cmd_fill_buffer_addr internal function

  • tu: Implement buffer_write_cp

  • freedreno: CP_SCRATCH_WRITE exists on a7xx too

  • freedreno: Add new a7xx CP_REG_RMW and CP_REG_TO_SCRATCH fields

  • freedreno/a7xx: Document partial workgroup register

  • tu: Stop emitting HLSQ_CS_KERNEL_GROUP_*

  • tu/a7xx: Emit HLSQ_CS_LAST_LOCAL_SIZE dynamically

  • tu: Implement unaligned dispatches

  • tu: Add common define for maxTexelBufferElements

  • tu: Create meta device

  • freedreno: Introduce ray tracing features

  • tu/kgsl: Bump uapi header

  • tu: Plumb through raytracing fuse

  • tu: Move fd_dev_info() before name generation

  • tu: Display when raytracing is disabled in device string

  • tu: Support VK_KHR_acceleration_structure

  • tu: Support VK_KHR_ray_query

  • tu: Expose VK_KHR_ray_tracing_maintenance1

  • tu, ir3: Implement a750 RT workaround

  • ir3: Use nir_split_struct_vars for temporaries

  • vk/bvh: Add default stubs for unsupported entrypoints

  • anv: Delete acceleration structure stubs

  • radv: Delete acceleration structure stubs

  • tu: Use image view format for sysmem resolves

  • tu: Handle non-identity GMEM swaps when resolving

  • tu: Handle non-identity GMEM swaps for input attachments

  • tu, freedreno: Write PC_DGEN_SU_CONSERVATIVE_RAS_CNTL

  • tu: Stop setting binning fields on a7xx

  • tu: Support VK_EXT_conservative_rasterization on a7xx

  • tu: Add missing assignment to shared_viewport

Constantine Shablia (23):

  • panvk: move samplerAnisotropy in the order it appears in struct definition

  • panvk: enable shaderInt64

  • panvk: elaborate the comment on the maxMemoryAllocationCount limit

  • panvk: adjust maxSamplerAllocationCount limit

  • nir: introduce instance_index system value

  • nir: lower INSTANCE_{ID,INDEX} to an offset load_instance_{index,id} respectively

  • Revert “nir: lower INSTANCE_{ID,INDEX} to an offset load_instance_{index,id} respectively”

  • Revert “nir: introduce instance_index system value”

  • panvk: replace vkGetBufferMemoryRequirements2 with vkGetDeviceBufferMemoryRequirements

  • panvk: never prefer or require dedicated allocation for buffers

  • panvk: never require dedicated allocation for images

  • panvk: add panvk_image_init helper

  • panvk: implement vkGetDeviceImageMemoryRequirements

  • panvk: enable shaderInt8, VK_KHR_8bit_storage and VK_KHR_shader_float16_int8

  • pan/util: sort files in meson.build

  • panvk: order KHR extension enables alphabetically

  • panvk/csf: use gfx_state_set_dirty instead of touching state directly

  • pan,nir: introduce load_attribute_pan

  • pan/bi: handle load_attribute_pan

  • panvk: Fix base_{instance,vertex} handling

  • panvk: lower drawid to zero

  • panvk: enable shaderDrawParameters

  • panvk: enable drawIndirectFirstInstance

Corentin Noël (6):

  • virgl: Propagate the GL_MAX_stage_SHADER_STORAGE_BLOCKS for each stage

  • virgl: Simply loop over the resources to figure-out if it is already added

  • virgl: Update virgl_hw.h from virglrenderer

  • virgl: Use MAX_SAMPLERS instead of MAX_SHADER_SAMPLER_VIEWS

  • virgl/ci: Remove screen size arguments

  • virgl/ci: Re-enable virgl-traces

Daniel Schürmann (49):

  • aco/ra: set Pseudo_instruction::scratch_sgpr to SCC if it doesn’t need to be preserved

  • aco/ra: use bitset for sgpr_operands_alias_defs

  • aco/ra: explicitly assign scratch SGPR for linear phis

  • aco: remove Pseudo_instruction::tmp_in_scc

  • aco/insert_NOPs: implement vector-based RegCounterMap as replacement for VGPRCounterMap

  • aco/insert_NOPs: use RegCounterMap as replacement for the CounterMap implementation

  • aco/insert_NOPs: add early exit to handle_valu_partial_forwarding_hazard_instr

  • aco/print_asm: allow for empty blocks with arbitrary offsets

  • aco/assembler: constify assembly functions

  • aco/assembler: Actually insert s_inst_prefetch instructions when aligning blocks for loops

  • aco/assembler: change ctx.loop_header to uint32_t instead of Block*

  • aco/assembler: chain branches instead of emitting long jumps

  • aco: remove definition from SOPP branch instructions

  • aco: remove definition from Pseudo branch instructions

  • aco/assembler: Don’t emit target basic block index when chaining branches

  • aco/print_ir: don’t print disconnected empty blocks

  • aco/optimizer_postRA: set branch()->never_taken if exec is constant non-zero

  • aco: move try_optimize_branching_sequence() to postRA optimizations

  • aco/jump_threading: remove branch sequence optimization

  • aco: move branch lowering optimization into separate file ‘aco_lower_branches.cpp’

  • aco/lower_branches: remove edges between blocks if there is no direct branch

  • ac/lower_ngg: Fix collecting buffer offsets from 4 lanes on gfx12

  • ac/lower_ngg: move break blocks after loop in streamout code generation for gfx12/ACO

  • ac/lower_ngg: move readlane into break blocks in streamout code generation for gfx12/ACO

  • nir/divergence: change nir_has_divergent_loop() to return true only for divergent breaks

  • aco/jump_threading: don’t remove loop preheaders

  • aco/assembler: Find loop exits using the successor’s loop nest depth

  • aco: consider s_cbranch_exec* instructions in needs_exec_mask()

  • aco/lower_branches: do eliminate_useless_exec_writes_in_block() during branch lowering.

  • aco/lower_branches: implement try_remove_simple_block() in lower_branches()

  • aco: move try_merge_break_with_continue() to lower_branches()

  • aco/lower_branches: allow for non-fallthrough loop exits in try_merge_break_with_continue()

  • aco: delete aco_jump_threading.cpp

  • aco/lower_branches: stitch linear blocks if there is exactly one successor with one predecessor

  • nir/from_ssa: only consider divergence if requested

  • Revert “nir: add nir_clear_divergence_info, use it in nir_opt_varyings”

  • aco/insert_NOPs: refactor VALUReadSGPRHazard detection

  • aco/insert_NOPs: implement VALU -> VALU case for VALUReadSGPRHazard on GFX12

  • nir/loop_analyze: only iterate loop header phis in compute_induction_information()

  • nir/loop_analyze: remove nir_loop_variable::in_if_branch and nir_loop_variable::in_nested_loop

  • nir/loop_analyze: remove nir_loop_variable::in_loop

  • nir/loop_analyze: directly record induction variables into nir_loop_info

  • nir/loop_analyze: don’t initialize nir_loop_variable separately

  • nir/loop_analyze: replace nir_loop_variable array with hash table

  • nir/loop_analyze: insert only induction vars into hash map

  • nir/loop_analyze: ignore terminating induction variable in guess_loop_limit()

  • nir/loop_analyze: re-use the same nir_loop_variable struct before and after the increment

  • nir/loop_analyze: store nir_loop_induction_variable hash table in loop_info

  • nir/loop_analyze: stack-allocate loop_info_state

Daniel Stone (22):

  • ci: Don’t run Meson tests in critical-path jobs

  • ci: Slash ASan and UBSan build coverage

  • ci: Give much more time to ASan and UBSan jobs

  • ci: Let rootfs builds run for 2 hours (!)

  • pipe_loader: Fix pipe_i915 with the dynamic loader

  • ci: Disable Werror on wrapped subprojects

  • ci: Remove obsolete compiler-wrapper

  • ci: Move build containers above test containers

  • ci/fedora: Install which into build image

  • ci: Define LLVM_VERSION as a container property

  • ci: Require LLVM_VERSION to be set explicitly

  • ci/debian: Upgrade Debian images to LLVM 19

  • ci: Fix dependency on lint job

  • ci: Fix kernel section nesting

  • ci: Move dEQP message into section

  • ci: Pass build targets to dEQP CMake

  • ci: Don’t build Vulkan for GL dEQP

  • ci: Trim down VVL external builds

  • ci: Capture Ninja log

  • ci: Only build Perfetto in build-test jobs

  • ci: Only build what we use for testing jobs

  • ci: Move r300/nine/nvk builds out of critical path

Danylo Piliaiev (31):

  • ir3/parser: Print the line where parsing error occurred

  • nir/nir_opt_offsets: Do not fold load/store with const offset > max

  • freedreno/registers: Define Fragment Shading Rate registers

  • ir3,tu: Add support for Fragment Shading Rate and plumb it into Turnip

  • tu/a7xx: Implement VK_KHR_fragment_shading_rate

  • ir3/parser: Add fullnop and fullsync sections for debugging

  • tu: Enable UBWC for 3D images without mipmaps

  • freedreno/fdl: Pass fd_dev_info to fdl6_layout

  • tu,freedreno: Enable linear mipmap tail for UBWC images

  • tu: Disable fragmentShadingRateWithShaderSampleMask due to issues

  • tu,ir3: Add workaround for reading shading rate on A7XX gen1,gen2

  • tu: Handle cmdbuf and rp_blit flags of TU_DEBUG_STALE_REGS_FLAGS

  • tu/perfetto: Always emit submission event and time it

  • tu/perfetto: Add app and engine names to the command buffer tracepoint

  • ir3: Make allocation of consts more generic and order independent

  • ir3: Use generic consts alloc for driver params

  • tu,ir3: Make push consts be able to start from higher than c0.x offsets

  • ir3: Use generic const alloc for everything and call it once

  • tu: Allocate consts for driver params as early as possible

  • tu: Do not re-calculate static blend LRZ state

  • freedreno/regs: Set correct shr for GRAS_LRZ_BUFFER_PITCH.ARRAY_PITCH

  • tu: Fix LRZ for arrayed depth

  • tu: Handle 8x MSAA for LRZ

  • freedreno,tu: Unify LRZ layout calculations

  • tu: Track at which draw call LRZ is disabled

  • tu: Do not disable LRZ for whole RP if it is disabled in RP

  • ir3: Consider const alloc alignment in free space size calcs

  • tu: Fix stale A7XX_GRAS_LRZ_CNTL2 in 3d blits or !valid lrz case

  • tu/a7xx: Always have depth/stencil in corresponding resolve groups

  • tu: Get correct src view when storing gmem attachment

  • tu: Handle mismatched mutability when resolving from GMEM

Dave Airlie (9):

  • nir/functions: force inlining for barriers.

  • v3dv: report correct error on failure to probe

  • venus: handle device probing properly.

  • vulkan: update to 302 headers for av1 encode

  • lavapipe: fix beta build due to changes in AMDX ext

  • radv/video: set max slice counts to 1 for h264/5 encode

  • anv: add default av1 tables from media-driver

  • genxml: add av1 fields

  • anv: add initial support for AV1 decoding

David (Ming Qiang) Wu (3):

  • frontends/va: adding PIPE_FORMAT_P012

  • frontends/va: add PIPE_VIDEO_PROFILE_AV1_PROFILE2

  • radeonsi/vcn: support 12bit YUV420 AV1 decoding

David Heidelberg (14):

  • util: Drop 3Dnow optimisation leftovers

  • util: Remove MMX/MMXext detection code

  • util: Drop ancient Intel CPU detection

  • util: drop XOP detection code

  • llvmpipe: align with u_cpu_detect struct changes

  • compiler/rust: drop duplicated bindgen check

  • ci/freedreno: update Adreno 306 expectations

  • ci/freedreno: increase Adreno 618 timeout to 1h

  • docs: remove deprecated component list and licenses

  • docs: Clarify project name and include Mesa3D

  • docs: move license(s) to licenses directory

  • c11: use SPDX-License-Identifier header

  • licenses: add missing licenses

  • drm-uapi: update licenses statement

David Rosca (148):

  • radeonsi/vcn: Fix coding AV1 render size

  • frontends/va: Add minus_1 to AV1 render_width/height

  • gallium: Add PIPE_VIDEO_CAP_SKIP_CLEAR_SURFACE

  • frontends/va: Support skip clear on surface creation

  • frontends/vdpau: Support skip clear on surface creation

  • radeonsi: Support PIPE_VIDEO_CAP_SKIP_CLEAR_SURFACE

  • radeonsi/vcn: Stop clearing decode internal buffers

  • radv/video: Fix H264 slice control

  • radv/video: Fix HEVC slice control

  • radv/video: Report correct encodeInputPictureGranularity

  • radv/video: Avoid selecting rc layer over maximum

  • radv/video: Use 64x16 alignment for HEVC encode

  • radv/video: Override pic_init_qp_minus26 in PPS

  • radeonsi/vcn: Use correct frame context buffer for preencode on VCN5

  • radeonsi: Check all supported formats in si_vid_is_target_buffer_supported

  • frontends/va: Create surfaces with correct fourcc for RT format

  • frontends/va: Stop reallocating to preferred format in EndPicture

  • frontends/va: Stop reallocating from progressive to interlaced in EndPicture

  • frontends/va: Stop reallocating buffers for protected playback

  • frontends/va: Stop reallocating according to JPEG sampling factor

  • frontends/va: Check if target buffer is supported in EndPicture

  • frontends/va: Stop reallocating buffers in EndPicture

  • frontends/va: Use compositor blit with different number of planes

  • frontends/va: Only use interlaced surfaces when progressive is not supported

  • pipe: Remove video update_decoder_target

  • radeonsi/vpe: Set correct surface swizzle mode

  • radeonsi/vpe: Don’t allow DCC surfaces

  • frontends/va: Return correct pixel formats in surface attributes query

  • frontends/va: Change default fourcc for RGB 10bit to X2R10G10B10

  • gallium/vl: Implement rendering to 3-plane YUV formats

  • gallium/vl: Don’t support planar RGB as video format

  • frontends/va: Enable 3-plane YUV formats as postproc output

  • radeonsi/vcn: Support tiling for JPEG decode

  • radv/video: Fix IB signature checksum

  • radv/video: Always use setup reference slot when valid

  • ac/surface: Add RADEON_SURF_VIDEO_REFERENCE

  • radeonsi: Support PIPE_BIND_VIDEO_DECODE/ENCODE_DPB

  • radeonsi/vcn: Create decode DPB surfaces with PIPE_BIND_VIDEO_DECODE_DPB

  • radeonsi/vcn: Create encode DPB surfaces with PIPE_BIND_VIDEO_ENCODE_DPB

  • frontends/va: Add support for VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_3

  • frontends/va: Store picture type for buffers in encode DPB

  • radeonsi/vcn: Don’t allow encoding H264 B-frame references

  • frontends/va: Move mjpeg sampling_factor to pipe_mjpeg_picture_desc

  • radeonsi/vcn: Remove code handling buffer_get_virtual_address failure

  • radeonsi/vcn: Unmap bitstream buffer in radeon_dec_destroy

  • radeonsi/vcn: Gracefully handle decode errors and report to frontend

  • radeonsi/vcn: Make sure JPEG target buffer format matches sampling factor

  • radeonsi/vcn: Cleanup JPEG supported formats

  • radeonsi/vpe: Silence expected errors with unsupported output format

  • gallium/vl: Add plane order for Y8_400 format

  • gallium/vl: Fix plane order for IYUV format

  • frontends/va: Stop converting formats in Put/GetImage

  • radeonsi: Update minimum supported encode size for VCN5

  • radeonsi/vcn: Align bitstream buffer to 128 when resizing

  • radeonsi/uvd: Align bitstream buffer to 128 when resizing

  • radeonsi/vcn: Enable write combine for decode

  • radeonsi/vcn: Don’t keep last fence

  • radeonsi/vcn: Use local variable for destroy fence

  • pipe: Remove PIPE_DEFAULT_DECODER_FEEDBACK_TIMEOUT_NS

  • frontends/va: Get AV1 decode subsampling_x/y

  • radeonsi/vcn: Return error when decoding 12bit VP9 and 4:2:2/4:4:4 AV1

  • frontends/va: Fix decoding VC1 interlaced video

  • frontends/va: Don’t allow Render/EndPicture without BeginPicture

  • frontends/va: Don’t allow EndPicture without calling driver begin_frame

  • ac/parse_ib: Parse VCN IB_COMMON_OP_WRITEMEMORY

  • radv/amdgpu: Set VCN version for ac_parse_ib

  • frontends/va: Fix deinterlace filter

  • radeonsi/vcn: Change required FW version for rc_per_pic_ex on VCN3

  • radv/video: Fix DPB tier2 surface params

  • radv/video: Use correct array index for decode target and DPB images

  • radv/video: Remove dt_field_mode handling code

  • radv: Fix sampling from image layers of video decode target

  • ac/surface: Don’t force linear for VIDEO_REFERENCE with emulated image opcodes

  • frontends/va: Get buffer feedback with locked mutex in MapBuffer

  • radeonsi/vcn: Use compute only context

  • gallium/vl: Fix unbinding sampler views

  • gallium/vl: Create sampler state also when gfx is not supported

  • gallium/vl: Add rgba compute shader

  • gallium/vl: Add param to create compute only vl_compositor

  • gallium: Add param to create compute only multimedia context

  • frontends/va: Use compute only context if driver prefers compute

  • radeonsi/vcn: Fix crash when failing to allocate internal buffers

  • frontends/va: Only report surface alignment when non-zero

  • frontends/va: Allow creating DRM PRIME surfaces without surface descriptor

  • frontends/va: Set csc matrix in PutSurface

  • gallium/vl: Fix creating buffers with auxiliary planes

  • radeonsi: Add radeon_bitstream and use it in radeon_vcn_enc

  • radeonsi/vce: Remove support for FW 50 and older

  • radeonsi/vce: Set more header params

  • radeonsi/vce: Move dual pipe context to offset 0 of CPB

  • radeonsi/vce: Use app DPB management

  • radeonsi/vce: Support slice encoding

  • radeonsi/vce: Support VBAQ

  • radeonsi/vce: Support quality presets

  • radeonsi/vce: Support min/max QP and max frame size

  • radeonsi/vce: Support intra refresh

  • radeonsi/vce: Support raw packed headers

  • radeonsi/vce: Set input pic swizzle mode on GFX9

  • radeonsi/vce: Cleanup

  • radeonsi/uvd: Stop clearing decode internal buffers

  • radeonsi/uvd: Optimize bitstream buffer resizing

  • radeonsi/uvd: Set decode target swizzle mode on GFX9

  • radeonsi/uvd_enc: Rework DPB allocation

  • radeonsi/uvd_enc: Use app DPB management

  • radeonsi/uvd_enc: Consider input surface size for padding

  • radeonsi/uvd_enc: Support Pre-Encode

  • radeonsi/uvd_enc: Support VBAQ

  • radeonsi/uvd_enc: Support quality presets

  • radeonsi/uvd_enc: Support slice encoding

  • radeonsi/uvd_enc: Support intra refresh

  • radeonsi/uvd_enc: Support temporal layer rate control

  • radeonsi/uvd_enc: Support min/max QP and max frame size

  • radeonsi/uvd_enc: Support dynamic rate control changes

  • radeonsi/uvd_enc: Support raw packed headers

  • radeonsi/uvd_enc: Set input pic swizzle mode on GFX9

  • radeonsi: Enable implemented VCE/UVD encode features

  • gallium/vl: Fix sampler view components for Y8_400 format

  • gallium/vl: Add vl compositor layer mirror

  • gallium/vl: Clear remaining planes in YUV conversion

  • gallium/vl: Use matrix for scale and crop in cs compositor

  • gallium/vl: Implement rotation and mirror in cs compositor

  • frontends/va: Simplify format check in PutSurface

  • frontends/va: Disable color conversion for luma-only source formats

  • frontends/va: Stop using util_compute_blit

  • frontends/va: Refactor vlVaPostProcCompositor to be usable outside processing

  • frontends/va: Support rotation and mirror for processing

  • frontends/va: Implement format conversions in PutImage/GetImage

  • gallium/auxiliary: Remove util_compute_blit

  • radeonsi: Fix reporting support for AV1 Profile2

  • radeonsi/vcn: Fix AV1 coded size for VCN 5.0

  • radeonsi: Report surface alignment for AV1 encode

  • gallium/vl: Add compute shader deinterlace filter

  • frontends/va: Stop using extra context for deinterlacing

  • frontends/va: Implement QuerySurfaceStatus as SyncSurface with 0 timeout

  • frontends/va: Don’t flush before resource_get_handle

  • frontends/va: Remove vlVaBuffer derived_image_buffer

  • frontends/va: Add surface pipe_fence for vl_compositor rendering

  • gallium/vl: Don’t flush in vl_compositor yuv_deint and rgb_to_yuv

  • frontends/va: Add context mutex

  • frontends/va: Unlock driver mutex for SyncSurface/Buffer fence wait

  • frontends/va: Fix decoding VC1 streams with multiple slices

  • ac/vcn_dec: Fix AV1 film grain on VCN5

  • radeonsi/video: Avoid stream handle duplicates in PID namespace

  • frontends/vdpau: Set H264 chroma_format_idc

  • radeonsi/vcn: Set correct chroma format for H264 decode

  • radeonsi/uvd: Set correct chroma format for H264 decode

  • radv/video: Fix setting balanced preset for HEVC encode with SAO enabled

  • radv/video: Move IB header from begin/end to encode_video

David Tobolik (2):

  • rusticl/style: use Arc::clone instead of .clone()

  • rusticl/style: add util for conversion with err

Deborah Brouwer (36):

  • freedreno/ci: add prefix for a630-vk-asan tests

  • ci: Remove duplicate slash before $RESULTS_DIR

  • ci/b2c: update RESULTS_DIR for .b2c-test jobs

  • ci: add a tool to summarize a failed pipeline

  • ci/pipeline_message: add unit tests for tool

  • ci: move pipeline_summary tool to .marge/hooks

  • ci: debian/x86_64_pyutils remove redundant rules

  • ci: python-test rename artifacts

  • ci: yaml-toml-shell-test: use pyutils container

  • ci: separate python tests and artifacts

  • ci: post gantt: use logging instead of print

  • ci: add some static typing to the gantt scripts

  • ci: make the gantt scripts available as modules

  • ci: post gantt: add --marge-user-id option

  • ci: post gantt: add --project-id option

  • ci: post gantt: add pipeline-id to gantt filename

  • ci: post gantt: ignore pipeline_summary message

  • ci: gantt chart: include in-progress jobs

  • ci: add --ci-timeout option for gantt scripts

  • ci: add pytests for the gantt chart scripts

  • ci: update token retrieval method for gantt charts

  • ci: collapse yamllint and shellcheck sections

  • ci: run-pytest.sh: allow script to run locally

  • ci: add .flake8 linting to ci scripts and tests

  • ci: update_traces_checksum: fix E501 line too long

  • ci: update the pyutils container

  • ci: stop using a venv for run-pytest.sh

  • ci: set python version 3.11 for run-pytest.sh

  • ci: pipeline_message: catch module loading errors

  • ci: pipeline_message: improve job list formatting

  • ci: pipeline_message: add test to parse error logs

  • ci: pipeline_message: ignore `error_type` errors

  • ci: pipeline_message: ignore harmless build logs

  • ci: pipeline_message: ignore `generated` errors

  • ci: pipeline_message: parse `fatal` messages

  • ci: pipeline_message: reset empty errors

Derek Foreman (3):

  • vulkan/wsi/wayland: Fix time calculation

  • vulkan/wsi/wayland: Avoid spurious discard event at startup

  • vulkan/wsi/wayland: Move timing calculations to the swapchain

Detlev Casanova (3):

  • ci/fluster/lava: Add fluster in LAVA rootfs

  • ci/fluster: Add radeonsi-raven-vaapi-fluster jobs

  • ci/deqp-runner: uprev from 0.20.2 to 0.20.3

Dylan Baker (25):

  • VERSION: bump to 25.0

  • docs: reset new_features.txt

  • docs/release-calendar: update one more time for pushed back release

  • docs: add release notes for 24.3.0

  • docs/relnotes/24.3.0: Add SHA sums

  • docs/release-calendar: remove 24.3 RC dates

  • docs: Add calendar entries for 24.3 release.

  • anv: advertise Vulkan 1.4

  • anv: bump max number of push constants to 256

  • anv: Add new Vulkan 1.4 features and properties

  • anv: bump conformance version to 1.4

  • maintainer-scripts: Bump Vulkan release version to 1.4

  • docs: add release notes for 24.3.1

  • docs: Add SHA sums for 24.3.1

  • docs: update calendar for 24.3.1

  • clc: Tell clang to track imported dependencies

  • docs: add release notes for 24.3.2

  • docs: Update checksums for 24.3.2

  • docs: update calendar for 24.3.2

  • docs/release-calendar: Move next release to January 2nd

  • intel/tests: Fix coverity warning about possibly leaked memory

  • intel/tests: Fix missing assignment of error condition

  • docs: add release notes for 24.3.3

  • docs: Add SHA sums to 24.3.3 release notes

  • docs: update calendar for 24.3.3

Eric Engestrom (139):

  • meson: bump spirv-tools version needed to v2022.1

  • radeonsi/ci: add more flakes seen recently

  • radv/ci: add more flakes seen recently

  • broadcom/ci: add more flakes seen recently

  • freedreno/ci: add more flakes seen recently

  • ci: upgrade the fedora image from 38 to 41

  • ci/build: drop “verify after bump to F39” as that did not help

  • ci/build: add workaround for incorrect maybe-uninitialized error

  • ci: move error handling functions at the end

  • ci: use quiet alias for commands

  • ci: make error handling quieter

  • broadcom/ci: add flakes seen recently

  • freedreno/ci: add flakes seen recently

  • nvk+zink/ci: add flakes seen recently

  • radv+zink/ci: add flakes seen recently

  • ci: raise priority of release manager pipelines

  • ci: reduce priority of nightly pipeline jobs from 50 to 45

  • meson: move openmp block out of the middle of the x11 deps block

  • meson: define only once the versions of the x11 deps

  • radv/ci: document flakes seen recently

  • broadcom/ci: document flakes seen recently

  • nvk/ci: document flakes seen recently

  • freedreno/ci: document flakes seen recently

  • docs: update calendar for 24.2.7

  • docs: add release notes for 24.2.7

  • docs: add sha sum for 24.2.7

  • turnip/ci: document regression

  • ci/crosvm: remove noise inside deqp-runner output

  • v3dv/ci: mark whole group as flaky

  • docs: fix invalid expression in new pipe cap

  • docs: fix invalid expression in teflon docs

  • intel/ci: disable CML jobs because of networking issues

  • intel/ci: add missing .intel-common-manual-rules to .{iris,crocus,i915g}-manual-rules

  • ci/build: drop mold wrapper for `ninja install`

  • ci: drop override forcing ld to be gold (and forcing gold to be installed everywhere)

  • ci: when installing mold, make its use automatic

  • ci: bump image tags

  • radeonsi/ci: drop two failures that are mysteriously fixed by using mold?

  • ci/container: move deqp build section into the script itself

  • ci/container: move apitrace build section into the script itself

  • ci/container: move crosvm build section into the script itself

  • ci/container: move deqp-runner build section into the script itself

  • ci/container: move fossilize build section into the script itself

  • ci/container: move gfxreconstruct build section into the script itself

  • ci/container: move kdl build section into the script itself

  • ci/container: move libclc build section into the script itself

  • ci/container: move llvm-spirv build section into the script itself

  • ci/container: move mold build section into the script itself

  • ci/container: move ninetests build section into the script itself

  • ci/container: move piglit build section into the script itself

  • ci/container: move rust build section into the script itself

  • ci/container: move vkd3d-proton build section into the script itself

  • ci/container: move vulkan-validation build section into the script itself

  • ci/container: move wayland build section into the script itself

  • ci/container: add sections around the other build scripts

  • ci/container: close debian_{setup,cleanup} sections

  • ci/lava: add setup-test-env.sh to the rootfs

  • ci/container: add section around strip-rootfs.sh

  • ci: bump image tags

  • zink+nvk/ci: fix deqp binary used for gles tests

  • zink+radv/ci: fix deqp binary used for gles tests

  • ci/deqp: move testlog-to-* tools to /deqp

  • ci/deqp: only compress caselists when they exist

  • ci/deqp: build testlog tools on android

  • ci/deqp: fetch & checkout exactly the commit/tag/branch requested

  • ci/deqp: avoid downloading 1.47 GiB multiple times

  • ci/deqp: error out in case of invalid build API

  • ci/deqp: build glcts in gles build, for gles*-khr tests

  • ci/deqp: add build of `main` branch

  • ci/deqp: make sure the main commit is actually from the main branch

  • ci/deqp: fully isolate deqp builds

  • ci: bump image tags

  • ci/container: setup sections in all image builds

  • radv/ci: document regression of test_shader_sm66_is_helper_lane in 7469f99e…25b8f4f7

  • meson: simplify logic a bit

  • meson: drop unused variables

  • meson: reuse variable

  • meson/megadriver: s/_/-/ in an argument name to be consistent

  • meson/megadriver: simplify setting common megadriver arguments

  • meson/megadriver: support various lib suffixes

  • ci/deqp: simplify paths since we are already in /deqp-$deqp_api/

  • ci/deqp: fix the “is this a build on main?” check

  • ci/deqp: support having commit backports and local patches for main too

  • ci/deqp: simplify generating the version description file

  • ci/deqp: mention the deqp api in the version string

  • ci/deqp: only print the commit list header when the list is not empty

  • ci/lava: turn the $BUILD_VK check into a proper if block

  • ci/deqp: add a deqp-vk build on the `main` branch

  • ci: bump image tags

  • radv/ci: use deqp-vk-main in radv jobs

  • docs: update calendar for 24.2.8

  • docs: add release notes for 24.2.8

  • docs: add sha sum for 24.2.8

  • ci/meson: make meson wrap fallback list more readable

  • ci/meson: add FORCE_FALLBACK_FOR variable for build jobs to use

  • docs/release-calendar: add 25.0 branchpoint and RCs schedule

  • docs/release-calendar: fixup sed fail

  • docs/release-calendar: push the 25.0 branchpoint back by 2 weeks

  • docs: update calendar for 24.3.4

  • docs: add release notes for 24.3.4

  • docs: add sha sum for 24.3.4

  • docs/release-calendar: push back the 24.3.x releases by one week

  • docs: update url to vulkan features & extensions

  • anv,gfxstream,panvk,zink: update urls to vulkan docs

  • radv,lvp: fix url to VkAabbPositionsKHR docs

  • ci: make linker warnings fatal

  • VERSION: bump for 25.0.0-rc1

  • [25.0-only] hk: comment out dead variable

  • .pick_status.json: Update to 5b856a741d6dc18d409a0c06ad6492cc3ee9a6bd

  • .pick_status.json: Mark 0ee5015da4c386c0ef8b6ff12fd2bb34022d86a6 as denominated

  • .pick_status.json: Update to e49df902b4c1b98569921d8b858e6e3855bf10e0

  • .pick_status.json: Update to e192d7d615dec9c9c04447c4b9ab0244d6380944

  • .pick_status.json: Mark 39969409f6fb60b21aea36be4d5424718fcc26b8 as denominated

  • VERSION: bump for 25.0.0-rc2

  • .pick_status.json: Update to fdaf7c7b9647874e66e79653050f9d0999dc9134

  • docs/android: drop libglapi.so now that it’s gone

  • .pick_status.json: Mark 5f54beb30728f6510ce50071ddaef5f9157b16ef as denominated

  • gfxstream: fix signedness of shifts

  • gfxstream: drop dead variables

  • gfxstream: use `range` variable for its intended purpose

  • gfxstream: mark unused variables as such

  • .pick_status.json: Update to ee9edd46254884ab7fe6c96518e23d421d5f5344

  • llvmpipe/tests: include math.h for INFINITY

  • ci: don’t run on tag pipelines

  • ci: only trigger the CI for release managers when pushing to staging branch

  • .pick_status.json: Update to 18f0807408425da11cb1d8cd1d73de369317440d

  • .pick_status.json: Update to 30a3d567c8b996fde86b07d2bad018013a54ff44

  • ci: run containers builds on staging branches

  • .pick_status.json: Mark 13e987669ccee373948753e113e9ce7e9bdbef55 as denominated

  • VERSION: bump for 25.0.0-rc3

  • .pick_status.json: Update to e41438275e005bbb20fc9c8115d7d29343c292d8

  • ci: debian-testing-ubsan is used by tests

  • ci/yaml-toml-shell-py-test: don’t run on post-merge pipelines

  • ci/yaml-toml-shell-py-test: run on direct push pipelines

  • .pick_status.json: Update to a9b6a54a8cce0aab44c81ea4821ee564b939ea51

  • .pick_status.json: Update to 06d8afff640c66e51517bf4bebd2a58abb2fa055

  • .pick_status.json: Update to 2361ed27f34774f0a73324915a9ddb57f43e112a

  • .pick_status.json: Update to 56aac9fdecad0f7d335f82653832927486f07d44

  • .pick_status.json: Update to 6b20b0658489afe745a28b8f09c57067e45b47f3

Eric R. Smith (28):

  • util: rename PIPE_FORMAT_Y8_U8V8_422_UNORM

  • dri, mesa: fix NV16 texture format

  • egl, mesa: add support for NV15 and NV20 textures

  • dri: fix NV15 and NV20 definitions to make sure they will be used

  • panfrost: add panfrost support for NV15, NV16 and NV20

  • panvk: fix depth bias calculation

  • panfrost: add a perf warning when resources need to be converted

  • panfrost: convert resources before binding them to images

  • panfrost: check afbc status in panfrost_query_compression_modifiers

  • mesa: when blitting between formats clear any unused components

  • aux: add support for dumping the swizzle in pipe_blit_info

  • mesa: update more drivers to handle pipe_blit_info swizzle_enable

  • format: Add R8_G8B8_422_UNORM format

  • panvk: update feature support

  • panvk: split device and instance version numbers

  • panvk: advertise version 1.1 support

  • panfrost: fix read/write resource confusion in afbc_pack

  • panfrost: fix potential memory leak

  • panvk: fix fs_required()

  • panfrost: apply DEPTH_STENCIL flag consistently

  • panfrost: Allow ATEST input to be a FAU index

  • panfrost: ensure sample_mask is written before color

  • panvk: re-enable fragmentStoresAndAtomics for v10

  • drm-uapi: update drm_fourcc.h to latest version

  • panfrost: support MTK 16L32S detiling

  • panfrost: avoid potential divide by 0 calculating timer_resolution

  • panfrost: fix YUV center information for 422

  • panfrost: fix backward propagation of values in loops

Erico Nunes (2):

  • ci/lima: update piglit ci expectations

  • ci/lima: enable again

Erik Faye-Lund (134):

  • panvk: drop unused include

  • panfrost: use mesa_log infra instead of stdio

  • glx: avoid null-deref

  • panfrost: use 64-bits for layout calculations

  • panvk: set correct max extents for images

  • panvk: support binding swapchain memory

  • panvk: wire up swapchain image creation

  • panvk: remove duplicate property

  • panvk: implement sampleRateShading

  • panvk: check for maxResourceSize-overflow in vkCreateImage

  • panvk: document reason for maxResourceSize-limit

  • docs: mark GL_ARB_shader_subroutine as always supported

  • docs: mark GL_ARB_get_program_binary as always supported

  • docs: update GL_OES_shader_image_atomic support

  • docs: update GL_ARB_multi_draw_indirect support

  • docs: refer to panfrost by version

  • docs: fixup a few mistakes with panfrost

  • docs: add missing panfrost extensions

  • lima: fixup typo

  • lima: add assert to validate list-length

  • lima: avoid memleak on error

  • panfrost: sanity-check alignment

  • panvk: correct signedness of timestamps

  • panvk: widen type before multiplying

  • mesa/main: properly check for EXT_memory_object

  • mesa/main: properly check for EXT_memory_object_fd

  • mesa/main: properly check for EXT_memory_object_win32

  • mesa/main: properly check for EXT_semaphore

  • mesa/main: properly check for EXT_semaphore_win32

  • st/mesa: check requirements for MESA_texture_const_bandwidth

  • mesa: error-check GL_TEXTURE_TILING_EXT params

  • panvk: report minmax-support for sampled formats

  • panvk: expose KHR_dedicated_allocation

  • vulkan/meta: plug a couple of memory leaks

  • panvk: free preload-shaders after compiling

  • panvk, nvk: spell width correctly

  • panvk/ci: correct name of skips-file

  • panvk/ci: remove duplicate skips

  • panvk/ci: add some missing skips

  • panvk/ci: update ci results for g610

  • panvk/ci: add a few flakes

  • panvk/ci: add a full panvk job

  • panfrost: match 4-bit format order

  • panfrost: add missing 4-bit formats

  • panvk: expose EXT_4444_formats

  • panvk/ci: update g52 results

  • panvk/ci: update g610 results

  • panvk: expose scalarBlockLayout

  • panvk/ci: remove duplicate skips

  • panvk/ci: update g52 results

  • panvk/ci: update g52-vk-full job

  • panvk: do not expose subgroup support

  • panvk: disable imageCubeArray on bifrost

  • panvk: soften the language around opt-in

  • panvk: do not require opt-in for panvk on v10

  • panvk/ci: correct timeouts as crash

  • panvk/ci: fixup g52 skip sorting

  • panvk/ci: add a few more g52 skips

  • panvk: fixup bad indent

  • panvk: only validate the push-sets that we update

  • panvk: back out of vk 1.1 support

  • panvk: make vk-version helper internal to source

  • docs: add new panvk features

  • panvk: fix image size for cube-arrays on bifrost

  • Revert “panvk: disable imageCubeArray on bifrost”

  • st/mesa: document ARB_texture_float quirk

  • pan/cs: fix broken allocation-failure check

  • panfrost: clean up mmap-diagnostics

  • panfrost: report errors from panfrost_bo_mmap

  • panfrost: handle mmap failures

  • panfrost: handle NULL-batches

  • panfrost: propagate cs_builder error instead of asserting

  • panfrost: handle pool-allocation errors

  • panfrost: handle errors allocating csf oom-handler

  • panfrost: try to survive start-up alloc fails

  • pan/ci: update t860 ci xfails

  • panvk: drop fragmentStoresAndAtomics support for now

  • vulkan: add vk_descriptor_type_is_dynamic helper

  • v3dv: use vk_descriptor_type_is_dynamic

  • turnip: use vk_descriptor_type_is_dynamic

  • dozen: use vk_descriptor_type_is_dynamic

  • panvk: use vk_descriptor_type_is_dynamic

  • radv: use vk_descriptor_type_is_dynamic

  • asahi: use vk_descriptor_type_is_dynamic

  • turnip: use vk_descriptor_type_is_dynamic

  • pvr: use vk_descriptor_type_is_dynamic

  • panvk: use vk_descriptor_type_is_dynamic

  • lavapipe: use vk_descriptor_type_is_dynamic

  • anv: use vk_descriptor_type_is_dynamic

  • hasvk: use vk_descriptor_type_is_dynamic

  • dozen: use vk_descriptor_type_is_dynamic

  • nvk: use vk_descriptor_type_is_dynamic

  • panvk/ci: update expected failures

  • docs: fixup broken markup

  • docs: fixup link in radv docs

  • docs/ci: treat warnings as errors

  • docs: update panvk status

  • panvk/ci: drop needless envvar

  • Revert “panfrost: Disable CRC by default”

  • pan/ci: update t760 checksum

  • pan/ci: update opencl expectations

  • docs/panfrost: document vulkan support

  • docs: update panvk status

  • docs/features: fixup panvk KHR_shader_draw_parameters-support

  • pan/va: fix base-level for nir_texop_lod

  • pan/ci: add some occasional flakes

  • docs/features: add a few missing extensions

  • docs/features: mark panfrost as supporting GL_OES_texture_view

  • pan/ci: drop empty trailing variables-list

  • panfrost: reuse tiler hierarchy mask selection from panvk

  • panfrost: limit maximum texture size

  • panfrost: do not artificially limit texture-sizes

  • pan/midgard: use macros for mir_prev_op / mir_next_op

  • pan/midgard: constify pointers

  • pan/compiler: don’t pass midgard_instruction by value

  • panvk: expose subgroup operations

  • panvk: expose vk1.1 on v10 hardware

  • pan/bi: bump iter_count to 2000

  • panvk: do not expose EXT_subgroup_size_control on bifrost

  • panvk/ci: update expected failures

  • panfrost: mark helper as static

  • panfrost: handle allocation errors when afbc-packing

  • panfrost: unify emit_tls and emit_fbd

  • panfrost: propagate allocation scratchpad allocation errors

  • panfrost: propagate errors from panfrost_batch_create_bo

  • panfrost: in-place map/unmap shouldn’t grow

  • gallium/aux: do not assert on map-failures

  • meson: build panvk by default on arm

  • panvk: fix line-rasterization of bifrost

  • panvk/ci: add back incorrectly removed crash

  • pan/ci: add flaky tests to the flake-list

  • pan/ci: add fail from llvm 19 upgrade

  • panvk: correct number of read bytes for dynamic buffers

  • panvk: report passing the VK CTS

Ernst Persson (1):

  • intel/vulkan: Add bvh build dependency

Evan (1):

  • amd/vpelib: Shaper Refactor

Faith Ekstrand (27):

  • vulkan: Allow the same item to show up twice in core version <requires>

  • vulkan: Add Vulkan 1.4 feature aliases

  • treewide: Stop putting enum in front of Vulkan enum types

  • vulkan: Update XML and headers to 1.4.303

  • nvk: Increase push constant space to 256B

  • nvk: No-op implement VK_KHR_global_priority

  • nvk: Add new Vulkan 1.4 features and properties

  • nvk: Advertise Vulkan 1.4

  • nvk: Only support Vulkan 1.4 on Turing+

  • nvk: Move Vulkan 1.4 features to the 1.4 section

  • nvk: Move Vulkan 1.4 properties to the 1.4 section

  • nvk: Set a command buffer error if pushbuf alloc fails

  • nvk: Call nir_opt_access

  • nak: Use ldc.constant for load_global when CAN_REORDER is set

  • nvk: Handle pCounterBuffers == NULL in Begin/EndTransformFeedback

  • nvk: Fix scissor bounds

  • nvk: Rename nvk_descriptor_set::mapped_ptr

  • nvk: Respect VK_DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT_EXT

  • nvk: Implement descriptorBufferPushDescriptors

  • nvk: Pull shaders from the state command buffer in nvk_cmd_process_cmds()

  • nvk: Handle shader==NULL in nvk_cmd_upload_qmd()

  • nvk: Allow sparse loads on EDB buffers

  • nak: Handle sparse texops with unused color destinations

  • nvk: Use suld for EDB uniform texel buffers

  • nvk: Align UBO/SSBO addresses down rather than up

  • nak: Use suld.constant when ACCESS_CAN_REORDER is set

  • nvk: Use suld.constant for EDB uniform texel buffers

Felix DeGrood (6):

  • iris: Use vfg distribution mode = RR_STRICT for Xe2+

  • anv: Use vfg distribution mode = RR_STRICT for Xe2+

  • anv: allow compressed buffers types on vkd3d titles

  • anv: remove unnecessary driconf entries for anv_enable_buffer_comp

  • vk/overlay-layer: defer log creation to swapchain creation

  • intel/perf: add new perf consts to support more metrics

Feng Jiang (2):

  • virgl: Ensure that PIPE_SHADER_CAP_MAX_CONST_BUFFERS is less than PIPE_MAX_CONSTANT_BUFFERS

  • radv/rt: Fix memleak in radv_init_header()

Francisco Jerez (27):

  • intel/fs/xe2: Fix up subdword integer region restriction with strided byte src and packed byte dst.

  • intel/brw/xe3+: Relax SEND EOT register assignment restrictions.

  • intel/brw: Saturate shifted subgroup index to avoid reading past the end of register file.

  • intel/brw: Use urb_read_length instead of nr_attribute_slots to calculate VS first_non_payload_grf.

  • intel/brw/xe3+: Mask subgroup shuffle index to be within valid range to avoid VRT hangs.

  • anv/gfx12.5: Request subgroup size 8 for RT trampoline shader.

  • intel/brw: Allow specifying a required subgroup size for fragment shaders.

  • intel/blorp: Specify a subgroup size requirement of 16 for fast clear or repclear shaders.

  • intel/common/xe2+: Allow SIMD32 PS for all multisample cases.

  • intel/brw/xe3: Define XE3_MAX_GRF.

  • intel/brw/xe3: Extend regalloc sets to maximum Xe3 GRF size.

  • intel/brw/xe3+: Bump number of SBID tokens for Xe3.

  • intel/brw/xe3+: Disable round-robin allocation heuristic on Xe3+.

  • intel/brw: Indent body of brw_compile_fs() not applicable to xe3+.

  • intel/brw: Indent conditional block from brw_compile_fs() not applicable to Xe2+.

  • intel/brw: Exit early from run_fs() if compilation failed before optimization loop.

  • intel/brw/xe3+: brw_compile_fs() implementation for Xe3+.

  • intel/brw/xe3+: Optimize CS/TASK/MESH compile time optimistically assuming SIMD32.

  • intel/brw: Report number of GRF registers used in brw_stage_prog_data.

  • intel/brw: Define ptl_register_blocks() helper.

  • intel/genxml/xe3+: Update definitions for shader state setup.

  • iris/xe3+: Set RegistersPerThread during shader state setup based on prog_data.

  • intel/blorp/xe3+: Set RegistersPerThread during shader state setup based on prog_data.

  • anv/xe3+: Set RegistersPerThread during shader state setup based on prog_data.

  • anv/xe3+: Set RegistersPerThread for bindless shader dispatch.

  • iris/xe3+: Enable VRT.

  • anv/xe3+: Enable VRT.

Frank Binns (2):

  • pvr: add TI j721s2 as a supported device

  • pvr: add 36.53.104.796 (BXS-4-64) to the list of supported GPUs

Friedrich Vock (15):

  • vulkan/rmv: Correctly set heap size

  • vulkan/runtime/bvh: Set leaf_node_count for updates

  • radv,driconf: Apply DOOM Eternal/idTech workarounds for Indiana Jones

  • aco/lower_to_hw_instr: Check the right instruction’s opcode

  • radv/rt: Remove nir_intrinsic_execute_callable instrs in monolithic mode

  • aco: Fix dead instruction/index handling for try_insert_saveexec_out_of_loop

  • nir: Serialize all parameter attributes

  • nir,vtn: Add return info to parameters

  • nir: Add parameter divergence info

  • vtn: Set parameter type in glsl_type_add_to_function_params

  • nir: Add indirect calls

  • nir: Apply passes to all functions

  • nir: Add nir_instr_is_before helper

  • nir: Free liveness info when invalidating metadata

  • nir: Add indirect call optimizations

GKraats (1):

  • i915g: fix glClearColor using a 1 byte color format

Georg Lehmann (79):

  • radv: run copy prop before vectorizing

  • nir/opt_16bit_tex_image: optimize extract half sources

  • nir: add nir_def_all_uses_ignore_sign_bit

  • pan/bi: use nir_def_all_uses_ignore_sign_bit

  • aco: use nir_def_all_uses_ignore_sign_bit

  • nir: handle fmul(a,a)/ffma(a,a,b) in nir_def_all_uses_ignore_sign_bit

  • aco/gfx8: use ds_swizzle_b32 rotate mode

  • nir: return def for debug info in nir_instr_def

  • nir/instr_set: replace nir_instr_get_def_def with nir_instr_def

  • nir/instr_set: support instrs with no def

  • nir: cse terminate/demote

  • nir/opt_undef: replace undef in a separate pass

  • nir/opt_undef: use some nir helpers

  • nir/opt_undef: keep undefs used by partial undef vectors

  • nir/opt_undef: handle unpack/pack like mov/vec

  • aco/isel: use undef Operands for p_create_vector created from nir vecs

  • util: add BITSET_LAST_BIT_BEFORE

  • nir/move_discards_to_top: single final iteration

  • nir/move_discards_to_top: don’t move across is_helper_invocation

  • radv/ci: document test_shader_sm66_is_helper_lane as fixed

  • freedreno/ci: update a630 KSP checksum

  • nir/opt_intrinsic: rework sample mask opt with vector alu

  • nir/opt_intrinsic: fix sample mask opt with demote

  • radv: optimize sample mask comparisons

  • aco/optimizer: label fcanonicalize like a copy if there is nothing to flush

  • nir/opt_algebraic: optimize ffma(b2f, b2f, c)

  • nir/opt_algebraic: optimize d3d9 ftrunc

  • nir/opt_algebraic: optimize d3d9 ceil

  • nir/opt_algebraic: mark a - ffract(a) as nan incorrect.

  • radv: fix reporting mesh/task/rt as supported dgc indirect stages

  • radv: rework vk_property initialization

  • aco/gfx12: disable vinterp ddx/ddy optimization

  • aco/gfx12+: do not use v_pack_b32_f16 to pack untyped data

  • radeonsi/ci: add vangogh ubo fail

  • zink: spec@ext_framebuffer_multisample@blit-mismatched-formats was fixed

  • aco/gfx11+: use v_and_b32 to extract local id 0

  • radv: track holes in the clip/cull masks

  • nir: add constant clip/cull distance optimization

  • radv: use nir_opt_clip_cull_const

  • nir/uub: properly limit float support to 32bit

  • nir: add unsigned upper bound support for f2i32

  • nir: add unsigned upper bound support for fsat

  • aco/gfx12: don’t assume memory operations complete in order

  • aco/ra: don’t write to exec/ttmp with mulk/addk/cmovk

  • aco/ra: disallow s_cmpk with scc operand

  • aco/ra: don’t write to scc/ttmp with s_fmac

  • nir/opt_remove_phis: rematerialize equal alu

  • nir/opt_algebraic: optimize min(max(a, b), a)

  • nir: optimize unpacking 8bit values from a 64bit source

  • aco/isel: skip and(exec) for top level demote_if/terminate_if

  • aco: rename p_early_exit_if to if_not

  • aco: allow p_exit_early_if_not with exec condition

  • aco/insert_exec: exit shader using exec for top level discard

  • aco: create v_cmpx with s_andn2(exec, v_cmp)

  • nir: sink/move alu with two identical, non constant sources.

  • amd: switch to FRONT_FACE_ALL_BITS(0)

  • nir: add load_front_face_fsign

  • amd: support load_front_face_fsign

  • nir: add nir_alu_srcs_negative_equal_typed

  • nir,amd: optimize front_face ? a : -a

  • aco/optimizer: fix signed extract of sub dword temps with SDWA

  • aco/insert_exec: reset top exec for p_discard_if

  • radv: run peephole_select in optimize_nir_algebraic

  • nir/peephole_select: allow load_vector/scalar_arg_amd

  • aco: guard small_vector move/copy operator against self assignment

  • aco: support less trivial component types in small_vec

  • aco: implement some more std::vector functions for small_vec

  • nir/opt_algebraic: convert fadd(a, a) to a * 2.0

  • aco: update is_dual_issue_capable for gfx11.5+

  • aco/sched_ilp: continue open clauses

  • aco/sched_ilp: add dependencies of later clause instrs more aggressively

  • aco/sched_ilp: only remove WaW/WaR for inter clause dependencies

  • aco/sched_ilp: reorder VINTRP

  • aco/sched_ilp: new latency heuristic

  • aco/sched_ilp: rename priority to wait_cycles

  • aco/sched_ilp: use more realistic memory latencies

  • aco/sched_ilp: base latency and issue cycles on aco_statistics

  • nir: fix range analysis for frcp

  • nir: fix frsq range analysis

Gert Wollny (6):

  • virgl/vtest: take handle from host when using protocol version >=3

  • virgl/vtest: When trying to use protocol 3 check host feature

  • virgl/vtest: change interface of virgl_vtest_submit_cmd

  • virgl/vtest: Add support for creating blob resources

  • ci: Uprev virglrenderer version

  • radeon/evergreen: ensure equal sizes for depth-stencil npot textures

Guilherme Gallo (9):

  • ci/lava: Set default exit code to 1 for failed jobs

  • ci/lava: Improve exception handling for job failures

  • ci/lava: Uprev freezegun

  • ci/intel: Set HWCI modules for puff DUT

  • ci/iris: Force UART for puff boards

  • ci/iris: Rebalance iris-cml-deqp jobs

  • ci/iris: Fix iris-cml-traces expectations

  • ci/iris: Update iris-cml-deqp CI expectations

  • ci/container: set up S3_JWT_FILE also for container jobs

Gurchetan Singh (17):

  • util: add c++ guards to u_mm.h

  • gfxstream: move isHostVisible function

  • gfxstream: nuke android::base::SubAllocator

  • gfxstream: use vulkan_lite_runtime

  • gfxstream: nuke EntityManager.h include

  • gfxstream: aemu: vendor it

  • gfxstream: modify libaemu for Mesa use case

  • gfxstream: guest: use internal version of AEMU headers + impls

  • gfxstream: use canonical Mesa dependencies

  • gfxstream: conditionals for using gfxstream::aemu

  • gfxstream: delete qemu_pipe target

  • gfxstream: for Android, look for the autogenerated files

  • gfxstream: change output location

  • gfxstream: remove abort()

  • gfxstream: fix issues with VK1.4 build

  • gfxstream: remove references to Fuchsia Goldfish

  • gfxstream: fix some integration bugs

Hans-Kristian Arntzen (11):

  • vulkan/wsi/wayland: Use X11-style image count strategy when using FIFO.

  • radv: Fix missing gang barriers for task shaders.

  • radv/winsys: Report VA mappings in bo_log too.

  • radv: Add sparse mappings to radv_check_va.py.

  • wsi/x11: Do not use allocation callbacks on a thread.

  • wsi/wayland: Only use commit timing protocol alongside present time.

  • wsi/wayland: Don’t fallback to broken legacy throttling with FIFO

  • wsi/wayland: Handle FIFO -> MAILBOX transitions correctly

  • wsi/wayland: Remove unused present_mode member.

  • wsi/wayland: Add forward progress guarantee for present wait.

  • radv: Add radv_invariant_geom=true for Indiana Jones.

Hsieh, Mike (1):

  • amd/vpelib: Refactor 3D LUT parameters

Hyunjun Ko (10):

  • anv: define ANV_VIDEO_H264_MAX_DPB_SLOTS

  • anv: Enable remapping picture ID

  • anv: handle negative value of slot index for h265 decoding.

  • intel/genxml: define MEMORYADDRESSATTRIBUTES for Gen12.5 with TILEF

  • anv/video: Fix to return supported video format correctly.

  • anv: calculate global parameters correctly for AV1 decoding

  • anv: support in-loop super resolution for AV1 decoding

  • anv: fix to set default cdf buf correctly.

  • anv: change bool to VkResult

  • anv: Fix to set CDEF filter flag correctly for AV1 decoding

Iago Toral Quiroga (15):

  • v3d: add a V3D_DEBUG option to force synchronous execution of jobs

  • broadcom: handle double buffer on V3D 7.1 tile size calculations

  • v3d: group tile spec into a struct inside the job

  • v3d: save a pointer to the TILE_BINNING_MODE_CFG packet in the CL

  • v3d: do tile state BO allocation later

  • v3d: only enable double-buffer for jobs where it might make sense

  • v3dv: add missing support for double-buffer on V3D 7.x

  • v3d: drop blank line

  • v3d: store size of qpu program for compiled shaders

  • broadcom: add helpers for double-buffer heuristic

  • v3d: use heuristic to enable double-buffer mode

  • v3dv: use the double buffer heuristic helpers

  • broadcom: move double-buffer heuristic helpers to the compiler

  • v3dv: fix missing access bit flag when checking for texel buffer reads

  • v3dv: fix crash on 32-bit builds

Ian Romanick (57):

  • brw/emit: Add correct 3-source instruction assertions for each platform

  • brw/copy: Don’t copy propagate through smaller entry dest size

  • brw/cse: Don’t eliminate instructions that write flags

  • brw/lower: Don’t emit spurious moves to or from NULL register

  • brw/opt: Always do copy prop, DCE, and register coalesce after lower_regioning

  • brw/opt: Always do both kinds of copy propagation before lower_load_payload

  • brw/build: Add scalar_group() helper

  • brw/lower: Lower invalid source conversion to better code

  • Fix copy-and-paste bug in nir_lower_aapoint_impl

  • brw/lower: Don’t “fix” regioning of broadcast

  • brw: Use resize_sources several more places

  • brw/build: Use SIMD8 temporaries in emit_uniformize

  • brw/copy: Allow copy prop into src1 of broadcast

  • nir/algebraic: Optimize some trivial bfi

  • brw/algebraic: Fix ADD constant folding

  • brw/algebraic: Fix MUL constant folding

  • brw/emit: Fix typo in recently added ADD3 assertion

  • brw/algebraic: Partial constant folding of ADD3

  • brw/const: Allow mixing signed and unsigned immediate sources

  • brw/copy: Don’t try to be clever about ADD3 constant propagation

  • brw: Emit immediate value for MAD in canonical position

  • brw/copy: Commute immediates for MAD multiplicands

  • brw/algebraic: Constant fold multiplicands of MAD

  • brw/algebraic: Don’t restrict MAD(a, b, 1) optimization to float32

  • brw/const: Refactor checking whether an immediate source is allowed

  • brw/const: Allow constants in integer MAD

  • brw/const: Allow HF constants in MAD on Gfx11

  • brw/const: Remove TODO that isn’t allowed by the hardware

  • brw/algebraic: Pull brw_constant_fold_instruction out of the switch statement

  • brw/emit: Fix BROADCAST when value is uniform and index is immediate

  • brw: Add devinfo parameter to fs_inst::regs_read

  • brw: Basic infrastructure to store convergent values as scalars

  • brw/lower: Allow uniform and scalar sources to many kinds of SEND

  • brw/nir: Fix up handling of sources that might be convergent vectors

  • brw/lower: Adjust source stride on DF is_scalar sources to MAD on Gfx9

  • brw/lower: Properly handle UNIFORM globals address in lower_trace_ray_logical_send

  • brw/emit: Allow scalar sources to HF math instructions on Xe2

  • brw/nir: Prepare try_rebuild_source for scalar values

  • brw/build: Prepare BROADCAST for scalar values

  • brw/nir: Treat load_const as convergent

  • brw/nir: Treat some load_uniform as convergent

  • brw/nir: Treat load_workgroup_id as convergent

  • brw/nir: Treat some ALU results as convergent

  • brw/nir: Treat some load_ubo as convergent

  • brw/nir: Treat load_inline_data_intel as convergent

  • brw/nir: Treat load_reloc_const_intel as convergent

  • brw/nir: Treat load_btd_{global,local}_arg_addr_intel and load_btd_shader_type_intel as convergent

  • brw/nir: Treat load_*_uniform_block_intel as convergent

  • brw/nir: Treat some resource_intel as convergent

  • brw/nir: Eliminate nir_to_brw_state::uniform_values

  • brw/nir: Don’t try optimize around emit_uniformize

  • brw/nir: Simplify get_nir_image_intrinsic_image and get_nir_buffer_intrinsic_index

  • brw/nir: Treat some ballot as convergent

  • brw/nir: Don’t generate scalar byte to float conversions on DG2+ in optimize_extract_to_float

  • iris: Add missing nir_metadata_preserve in iris_lower_storage_image_derefs

  • crocus: Add missing nir_metadata_preserve in crocus_lower_storage_image_derefs

  • brw/copy: Fix handling of offset in extract_imm

Icenowy Zheng (4):

  • zink: do not set transform feedback bits when not available

  • meson: prefer ‘python3’ to ‘python’ when finding python3

  • zink: emit consts as uint only on IMG proprietary drivers

  • zink: use lazy descriptors for IMG proprietary drivers

Igor Torrente (2):

  • Zink: Add NVK to the non `driver_workarounds.implicit_sync` list

  • NVK: Enable RW DMA-BUF export

Ivan Avdeev (1):

  • radv: add a flag to indicate ray tracing support

Iván Briano (6):

  • intel/rt: fix ray_query stack address calculation

  • intel/decoder: fix INTEL_DEBUG=bat

  • anv: remove unused/misleading/wrong parameters from the RT trampoline

  • vulkan: calculate remaining layers of 2d view of 3d image correctly

  • anv: disable logic op for float/srgb formats

  • hasvk: disable logic op for float/srgb formats

James Hogan (3):

  • glsl: Expose gl_ViewID_OVR back to GLSL 1.30

  • mesa: Fix multiview attachment completeness check

  • mesa: Fix FramebufferTextureMultiviewOVR num_views check

Janne Grunau (1):

  • panvk: Silence warning on incompatible DRM render devices

Jason Macnak (3):

  • Simplify ApiInfo

  • Pass VkSnapshotApiCallInfo-s through VkDecoderGlobalState

  • Update VkDecoderSnapshot locking

Jesse Natalie (4):

  • microsoft/compiler: Put holes in driver_location based on I/O variable sizes

  • microsoft/clc: Initialize printf buffer for tests

  • microsoft/compiler: Skip POS for io compaction

  • microsoft/compiler: Update clip/cull split pass to handle clip/cull getting merged

Jianxun Zhang (5):

  • anv,hasvk,genxml: Rename genxml files using verx10

  • isl: Refactor WA 22015614752

  • iris: Allow compression on multi-sampled stencil (xe2)

  • isl: Allow CCS in more cases (xe2)

  • isl: Move a CCS restriction in GFX 12.x

Job Noorman (87):

  • ir3/ra: prevent moving source intervals for shared collects

  • ir3,tu: include ir3 debug flags in shader hash key

  • ir3,tu: filter debug flags included in the hash key

  • ir3: fold shared movs into other movs

  • nir: add ir3-specific bitwise triop opcodes

  • nir/search: make is_only_used_by_iadd reusable

  • nir/search: add is_only_used_by_{iand,ior} helpers

  • ir3: fix backend support for bitwise triops

  • ir3: add codegen for bitwise triops

  • ir3: add pass to select bitwise triops

  • ir3/isa: allow rpt6/rpt7

  • ir3: add workaround for predication hardware bug

  • nir/lower_subgroups: support unknown subgroup size

  • ir3: use generic lowering for 64b scan/reduce

  • ir3: remove unused ir3_nir_lower_64b_subgroups

  • nir: add read_getlast_ir3 intrinsic

  • ir3: add codegen for read_getlast_ir3

  • ir3: add helper to get the subgroup size

  • ir3: rename cluster_size to brcst_cluster_size

  • nir/lower_subgroups: add extra filter data to options

  • nir/lower_subgroups: disable boolean reduce when not supported

  • ir3: add support for clustered subgroup reductions

  • tu: advertise VK_SUBGROUP_FEATURE_CLUSTERED_BIT

  • nir/lower_subgroups: add option to only lower clustered rotates

  • ir3: lower clustered rotates to shuffles

  • tu: advertise VK_SUBGROUP_FEATURE_ROTATE_CLUSTERED_BIT_KHR

  • ir3: don’t update builder cursor for IR3_CURSOR_AFTER_BLOCK

  • ir3: add ir3_after_instr_and_phis helper

  • ir3: use generic INSTR0 implementation for ir3_NOP

  • ir3: refactor builders to use ir3_builder API

  • ir3: reformat after refactoring in previous commit

  • ir3: add reformatting commits to .git-blame-ignore-revs

  • ir3/isa: fix conflict between stib.b and stsc

  • ir3/isa: fix cat3-alt immed src

  • ir3/isa: fix isaspec for sad.s32

  • ir3: teach backend about sad

  • ir3: add codegen for sad

  • ir3/cp: only mark mad srcs as swapped when swap succeeded

  • ir3/cp: extract common src swapping code

  • ir3/cp: make try_swap_mad_two_srcs more generic

  • ir3/cp: add support for swapping srcs of sad

  • ir3/validate: print file/line info

  • ir3,freedreno: remove binning outputs after vs ucp lowering

  • ir3/cp: swap back correct srcs when swap failed

  • ir3: always set wrmask for movmsk

  • ir3: emit uniform iadd3 as two adds

  • ir3: output early-preamble stat as integer

  • ir3/ra: fix non-trivial collect detection

  • ir3/ra: allocate shared collects dst over its srcs when possible

  • ir3/parser: fix parsing integer as float

  • ir3/a7xx: properly handle alias scope and type

  • ir3/a7xx: disasm halfness of alias dst

  • ir3/a7xx: implement and document unknown alias field

  • ir3/a7xx: handle alias.rt dst

  • ir3/a7xx: document alias.rt

  • ir3/print: add support for alias

  • ir3: teach backend about alias

  • ir3: introduce alias groups

  • ir3: add validation for alias

  • ir3: add ir3_compiler::has_alias

  • ir3: add support for alias.tex

  • ir3: optimize alias register allocation by reusing GPRs

  • ir3/legalize: insert (ss) to read consts after stc

  • ir3/legalize: insert (sy) to read consts after ldc.k

  • ir3/dce: support partial writes from collects

  • ir3: add some preamble helpers

  • ir3: make find_end a global helper

  • tu,ir3: inform ir3 of dynamically remapped FS slots

  • ir3: make shader output struct non-anonymous

  • ir3: reuse ir3_find_output in ir3_find_output_regid

  • tu: add chip param to tu6_emit_fs_outputs

  • tu: add support for aliased render target components

  • freedreno: add chip param to emit_fs_output

  • freedreno: add support for aliased render target components

  • ir3: add support for alias.rt

  • ir3: disable alias.rt pre-a750

  • ir3: account for inserted nops in delay calculation

  • freedreno: move ForEachMacros into freedreno

  • freedreno: remove unused entries from ForEachMacros

  • freedreno: add missing entries to ForEachMacros

  • ir3: schedule alias.rt at the end of the preamble

  • ir3: rematerialize preamble defs in block dominated by sources

  • ir3: add helper to calculate src read delay

  • ir3: make delay slots a compiler property

  • ir3/a7xx: update delay slots

  • ir3/a7xx: enable delayed src2 read for all cat3 instructions

  • ir3: fix emitting descriptor prefetches at end of preamble

John Anthony (2):

  • panvk: Enable storageBuffer16BitAccess

  • panvk: Enable VK_KHR_vertex_attribute_divisor

Jordan Justen (6):

  • intel/dev: Add PTL 0xb0b0 PCI ID

  • intel/dev: Split hwconfig warning check into hwconfig_item_warning()

  • intel/dev: Split apply and check paths for hwconfig

  • intel/dev: Don’t process hwconfig table to apply items when not required

  • intel/dev: Add intel_check_hwconfig_items()

  • iris: Check that mem_fence_bo was created

Jose Maria Casanova Crespo (9):

  • v3d: Enable Early-Z with discards when depth updates are disabled

  • rpi4/ci: mark another flaky timeline_semaphore test

  • rpi4/ci: another detected flaky timeline_semaphore test

  • vc4/ci: fails update after last piglit uprev

  • rpi4/ci: Increase timeout for rusticl jobs.

  • v3d: Don’t load/store if rasterizer discard is enabled

  • v3d/ci: update rpi expectations by last piglit uprev

  • v3d: Apply FBO resources invalidations on job creation

  • Revert “ci: take igalia farm offline”

Joshua Duong (1):

  • gfxstream: update auto-generated comments.

José Roberto de Souza (16):

  • intel/dev/xe: Fix access to eu_per_dss_mask

  • intel/dev/xe: Fix size of eu_per_dss_mask

  • intel/genxml/xe2: Add STATE_SYSTEM_MEM_FENCE_ADDRESS instruction

  • anv: Always create anv_async_submit in init_copy_video_queue_state()

  • anv: Emit STATE_SYSTEM_MEM_FENCE_ADDRESS

  • iris: Emit STATE_SYSTEM_MEM_FENCE_ADDRESS

  • iris: Add support for damage region

  • anv: Allow larger SLM sizes for task and mesh shader

  • anv: Check VkResult of perf query batch buffer

  • anv: Check VkResult main batch buffer before start companion batch buffer

  • iris: Drop BO_ALLOC_COHERENT from iris_utrace_create_ts_buffer()

  • iris: Rename BO_ALLOC_COHERENT to BO_ALLOC_CACHED_COHERENT

  • anv: Return scanout PAT entry for scanout and external buffers in discrete GPUs

  • anv: Allow WSI blit_src Image to be kept compressed when transitioning to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR

  • iris: Make sure an uncached heap is chosen for scanout and shared buffers when LLC is not available

  • iris: Pick scanout PAT entry for scanout buffers

Juan A. Suarez Romero (26):

  • util/format: nr_channels is always <= 4

  • v3dv: remove unused assignments

  • v3dv: fix BO allocation

  • v3dv: free pointers on multisync error

  • v3dv: ensure there is always a perfmon and counter

  • broadcom/compiler: ensure offset source exists

  • broadcom/compiler: fix fp16 conversion operations

  • v3d: make v3d_flush_resource reallocate non-shareable resources

  • vc4: ensure sharing tiled resources are of proper format

  • v3d: fix BO allocation

  • v3d: remove intermediate variable

  • v3d: find linear modifier when required

  • vc4: find linear modifier when required

  • v3d/ci: clean some asan failures

  • v3d: avoid 0-size variable length array

  • v3dv: fix assigned value is garbage or undefined

  • vc4: initialize variable

  • v3dv: check requirements for USAGE_INPUT_ATTACHMENT

  • freedreno: a2xx: fix maybe uninitialized variable

  • radeonsi/vcn: fix maybe uninitialized

  • v3d: fix format overflow error

  • virgl: fix member access to a NULL pointer struct

  • etnaviv: cast assertion

  • ci/build: add ubsan build jobs

  • broadcom/ci: add ubsan jobs for broadcom drivers

  • ci: take igalia farm offline

Jung-uk Kim (1):

  • FreeBSD: Disable support for “-mtls-dialect” for FreeBSD

Juston Li (1):

  • util/cache_test: Fix racy Cache.List test

Kai Wasserbäch (1):

  • fix(FTBFS): clc/clover: pass a VFS instance explicitly

Karmjit Mahil (21):

  • tu: Fix push_set host memory leak on command buffer reset

  • tu: Fix potential alloc of 0 size

  • nir: Fix `no_lower_set` leak on early return

  • tu: Fix memory leaks on VK_PIPELINE_COMPILE_REQUIRED

  • nir/algebraic: turn `u{ge,lt} a, 1` to `i{ne,eq} a, 0`

  • nir,ir3: Add icsel_eqz

  • nir: Fix the spelling of compare

  • freedreno/rddecompiler: clang-format fix

  • freedreno/rddecompiler: Fix some unused function warnings

  • ir3: Fix some Wsign-compare when compiling a generate-rd.cc

  • util/idalloc: Fix util_idalloc_foreach() build issue

  • util/idalloc: Minor refactor of util_idalloc_foreach()

  • tu: Fix `clear_values` leak

  • tu: Fix FDM patchpoint memory leak

  • tu: Fix leaking of some descriptor sets

  • tu: Initialize tu_tiling_config even when tiling isn’t possible

  • tu: Free pre_chain patchpoint data

  • util/simple_mtx: Add ASSERTED to parameter used only in an assert

  • vulkan: Add initial vram-report-limit layer

  • freedreno/replay: Define __user for msm_kgsl

  • loader/wayland: Fix missing timespec.h include

Karol Herbst (77):

  • nv/codegen: Do not use a zero immediate for tex instructions

  • nvc0: return NULL instead of asserting in nvc0_resource_from_user_memory

  • clover: drop support for nir drivers

  • gallium: drop PIPE_SHADER_IR_NIR_SERIALIZED

  • rusticl/kernel: fix kernel variant selection

  • vtn: handle struct kernel arguments passed by value

  • nir/lower_cl_images: lower scalar image_loads to vec4

  • rusticl/mem: add restrictions for CL_DEPTH, CL_DEPTH_STENCIL and msaa images

  • rusticl/image: fix clEnqueueFillImage for CL_DEPTH

  • rusticl/device: advertize cl_khr_depth_images if supported

  • rusticl: enable cl_khr_depth_images

  • rusticl: check for overrun status when deserializing

  • rusticl/kernel: convert name and type_name to Option<CString>

  • rusticl/mesa: make driver_name() return a &CStr

  • rusticl/program: check if provided binary pointers are null

  • rusticl: rework query APIs

  • rusticl/api: add a write_len_only variant for writing API properties

  • rusticl/api: add a write_iter variant for writing API properties

  • rusticl/program: use write_len_only for CL_PROGRAM_BINARIES

  • rusticl/program: use write_iter for CL_PROGRAM_DEVICES

  • rusticl/program: pass the slice directly for CL_PROGRAM_IL

  • rusticl/program: use write_len_only for CL_PROGRAM_IL

  • rusticl/platform: pass the slice directly for CL_PLATFORM_EXTENSIONS_WITH_VERSION

  • rusticl/api: use constant arrays instead of Vecs for queries

  • rusticl/context: use write_iter for CL_DEVICES_FOR_GL_CONTEXT_KHR

  • rusticl/proc: make generated entry points unsafe

  • rusticl/api: mark get_info and get_info_obj as unsafe

  • rusticl/util: add Properties::is_empty() and len()

  • rusticl/util: add Properties::iter()

  • rusticl/util: make Properties::props private

  • rusticl/util: reimplement Properties over Vec of scalars

  • rusticl/api: simplify CLProp implementation of Properties

  • rusticl/api: use Properties for 0 terminated arrays consistently

  • rusticl/util: make Properties::from_ptr unsafe

  • rusticl/api: remove Option around Properties

  • rusticl/util: rename Properties::from_ptr to new

  • rusticl/util: fix duplicate key detection in Properties::new

  • rusticl/platform: silence static_mut_refs warning

  • rusticl/util: fix ptr_to_integer_transmute_in_consts warning

  • rusticl: fix clippy::needless-lifetimes

  • rusticl: fix clippy::doc-lazy-continuation

  • rusticl/queue: add a life check to prevent applications dead locking

  • rusticl: stop using system headers for CL and GL

  • include: Update the OpenCL headers to latest

  • rusticl/mesa: remove PipeTransfer::res

  • rusticl/mem: remove mem_type argument from new_image

  • rusticl/device: remove unused functions

  • rusticl/mesa/context: use Default for pipe_grid_info initialization

  • rusticl/mesa: add missing files to meson.build

  • rusticl/queue: make QueueContext::dev public

  • rusticl/mem: pass around QueueContext instead of PipeContext

  • rusticl/mesa/resource: port to NonNull

  • rusticl/device: fix CL_DEVICE_HALF_FP_CONFIG query

  • rusticl/device: fix default device enumeration

  • rusticl/kernel: take set kernel arguments into account for CL_KERNEL_LOCAL_MEM_SIZE

  • rusticl/kernel: fix image_size of 1D buffer images

  • rusticl/mesa: set take_ownership to true for set_sampler_views

  • rusticl/mesa: add PipeSamplerView wrapper

  • rusticl/mesa: use PipeSamplerView over the raw type

  • rusticl/kernel: create the sampler views earlier

  • rusticl/mem: add functions to create sampler and image views to Image

  • rusticl/mesa: rework image and sampler view creation APIs

  • rusticl/kernel: store memory arguments as Weak references

  • rusticl/device: add unsynchronized mapping functions to helper context

  • rusticl/mem: simplify is_svm implementation

  • rusticl/mem: add Allocation type

  • rusticl/mem: reimplement has_same_parent and rename it to backing_memory_eq

  • rusticl/mem: rework last user of get_parent() and remove it

  • rusticl/mem: add Allocation::is_user_alloc_for_dev

  • rusticl/mem: use get_res_for_access instead of get_res_of_dev

  • trace: copy pipe_caps

  • trace: add get_compute_state_info

  • rusticl/mem: set bind flags for gl imports

  • rusticl/mesa: add PipeContext::device_reset_status

  • rusticl/queue: check device error status

  • rusticl/kernel: call nir_lower_variable_initializers earlier

  • rusticl/mem: do not apply offset with in copy_image_to_buffer

Kenneth Graunke (35):

  • brw: Fix emit_a64_oword_block_header UNIFORM -> VGRF copies

  • brw: Fix try_rebuild_source’s ult32/ushr handling to use unsigned types

  • nir: Use load_global_constant for reorderable nir_var_mem_global access

  • nir/algebraic: Reassociate fadd into fmul in DP4-like pattern

  • brw: Drop image deref handling from brw_analyze_ubo_ranges

  • brw: Drop “regular uniform” concept from UBO push analysis

  • brw: Drop a few crocus references in comments

  • brw: Use nir_combined_align in brw_nir_should_vectorize_mem

  • brw: Only consider components read for UBO loads

  • brw: Only consider components read for UBO push analysis

  • brw: Simplify choose_oword_block_size_dwords()

  • nir: Allow large overfetching holes in the load store vectorizer

  • anv: Don’t consider nir_var_mem_global for vectorizer robustness checks

  • brw: Tune vectorizer conditions to allow overfetching with holes

  • brw: Fix register unit calculation in SIMD32 LOAD_PAYLOAD lowering

  • brw: Allow SIMD32 math instructions on Xe2

  • brw: Combine convergent texture buffer fetches into fewer loads

  • iris: Tune the BO cache’s bucket sizes

  • brw: Don’t rely on SIMD splitting in opt_combine_convergent_txfs

  • brw: Limit maximum push UBO ranges to 64 registers in the NIR pass.

  • brw: Don’t shrink UBO push ranges in the backend

  • brw: Delete pull constant lowering

  • brw: Delete assign_constant_locations and push_constant_loc[]

  • brw: Fix vectorizer hole_size condition after signedness change

  • nir: Add a nir_def_first_component_read() helper

  • brw: Add more safeguards against misaligned OWord Block messages

  • brw: Skip fetching unread leading components of UBO loads

  • brw: Make get_nir_src_imm() usable for non-32-bit-sizes.

  • brw: Skip unnecessary work for trivial emit_uniformize of IMMs

  • brw: Skip unread leading/trailing components in convergent block loads

  • brw: Add a new MEMORY_MODE_CONSTANT option

  • brw: Allow CSE of MEMORY_MODE_CONSTANT loads

  • brw: Align and combine constant-offset UBO loads in NIR

  • brw: Always use MEMORY_LOAD for load_ubo_uniform_block_intel intrinsics

  • brw: Fix Xe2 spilling code to limit to SIMD32 rather than SIMD16

Kevin Chuang (3):

  • anv: Implement encode shader to fit in ANV BVH

  • anv: Add INTEL_DEBUG for bvh dump and visualization tools

  • anv/bvh: Dump BVH synchronously upon command buffer completion

Kevron Rees (1):

  • anv, drirc: Add workaround to speed up Spiderman reg allocation

Konstantin (5):

  • nir/lower_non_uniform_access: Group accesses using the same resource

  • radv/printf: Guard against helper invocations

  • radv: Do not overwrite VRS rates when doing fast clears

  • vulkan/meta: Add a pipeline cache

  • vulkan: Fix the argument order of update_as

Konstantin Seurer (39):

  • util: Fix some brackets in util_dynarray_.*_ptr

  • nir: Add missing access flags to print_access

  • radv: Lower non-uniform access after vectorization

  • amd: Add ac_shader_debug_info

  • aco: Handle nir_debug_info_instr

  • aco: Pass debug information to the driver

  • radv: Add a helper for accessing the shader binary

  • radv: Store debug info inside radv_shader

  • radv: Dump nir shaders before compiling

  • nir: Add a first_line parameter to gather_debug_info

  • nir: Do not gather source locations for phis

  • radv: Add RADV_DEBUG=nirdebuginfo

  • gallivm: Add float operation behavior flags to lp_type

  • gallivm: Preserve -0 and nan

  • lavapipe: Implement VK_KHR_shader_float_controls2

  • gallivm: Use an accurate log2 implementation for lodq

  • lavapipe: Implement VK_KHR_compute_shader_derivatives

  • radv: Fix encoding empty acceleration structures

  • llvmpipe: Disable anisotropic filtering for explicit lod

  • llvmpipe: Use a simpler and faster AF implementation

  • llvmpipe: Remove unused AF code

  • llvmpipe: Move max_anisotropy to static sampler state

  • lavapipe: Advertise vulkan 1.4

  • meson: Require glslangValidator when building lavapipe

  • lavapipe: Check the pool type in handle_reset_query_pool

  • meson: Include the loader subdir when building lavapipe

  • gallivm: Take helper invocations into account when skipping branches

  • nir/print: Print less unused shader info

  • nir/tests: Improve shader creation

  • nir/tests: Add a helper for comparing a shader against a string

  • nir/tests: Add reference shaders

  • nir: Add a test runner

  • nir/print: Do not print trailing spaces after preds/succs

  • docs: Add documentation for NIR unit testing

  • llvmpipe: Fix half-pixel sample offset with AF

  • llvmpipe: Avoid a crash when using 5 coords with AF

  • radv/rmv: Use radv_rmv_log_resource_destroy more

  • radv/meta: Stop using strings for meta keys

  • gallivm: Remove loop limiting

Koo, Anthony (1):

  • amd/vpelib: Add system event logging

Lars-Ivar Hesselberg Simonsen (26):

  • panvk: Set fs.multisampled sysval for v10+

  • panvk: Add frag->frag barrier before resolve

  • panvk: update expectations for G610

  • pan/genxml: Fix decode of exception_handler 0x0

  • pan/cs: Add mask support for reg_perm

  • panvk: Build cmd_fb_preload on explicit fb_info

  • panvk: Add incremental rendering support on v10+

  • panfrost: Disable AFRC texture/sampler reswizzle

  • panvk: Disable AFBC for mutable formats on v7

  • panfrost: Only allow AFBC(RGB) and AFBC(BGR) on v7

  • panfrost: Limit reswizzle to AFBC formats

  • panfrost: Decouple reswizzling from texture build

  • panfrost: Standardize naming of sampler reswizzle

  • panvk: Remove ZS texture_swizzle_replicate_x

  • panvk: Fix descriptor decode

  • panvk: Fix valgrind issue in nir_lower_descriptors

  • panvk: Fix valgrind issue in panvk_compile_shaders

  • pan/genxml: Fix vertex_packet Attribute on v9+

  • panvk: Use LD_VAR[_IMM] + ADs for varyings

  • panvk: Limit AD allocation to max var loads in v9+

  • panvk: Use LD_VAR_BUF[_IMM] when possible

  • panvk: Fix barriers in secondary cmdbufs w/o rp’s

  • panfrost: Do not evaluate_per_sample for non-MSAA

  • Revert “panfrost: remove is_blit flag”

  • Revert “panfrost: fix hang by using MALI_PIXEL_KILL_WEAK_EARLY in color preload”

  • panvk: Set missing shader_modifies_coverage flag

Leder, Brendan Steve (2):

  • amd/vpelib: Refactor OCSC and update missing check

  • amd/vpelib: Move bg color

Leonard Göhrs (1):

  • ci/lava: update lavacli from version 1.5.2 to 2.2.0

Lina Versace (3):

  • anv: Sort extensions in enablement table

  • anv: Update features.txt

  • anv: Fix feature pipelineProtectedAccess

LingMan (10):

  • mesa: Bump required Rust version to 1.78

  • nak/hw_test: Use std::mem::offset_of!()

  • compiler/rust: Use std::mem::offset_of!()

  • mesa: Add rustfmt.toml

  • rusticl: Use C-string literals

  • rusticl: Use C-string literals for spirv extension names

  • rusticl/cl_prop: Use C-string literals

  • rusticl/core: Use C-string literals for XPlatManager::get_proc_address_func

  • rusticl: Use C-string literals for NirShader::add_var

  • rusticl: Use C-string literals for DiskCache::new
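
The LingMan entries above adopt two Rust language features, C-string literals and `std::mem::offset_of!`, both stable since Rust 1.77. The following is a minimal sketch of what those features look like in isolation, not code from Mesa: the `Header` struct and the `SPV_KHR_example` name are hypothetical and exist only for illustration.

```rust
use std::ffi::CStr;
use std::mem::offset_of;

// Hypothetical struct for illustration only; not a Mesa type.
#[allow(dead_code)]
#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
}

// A C-string literal is NUL-terminated and has type &'static CStr,
// so no runtime CString allocation or manual "\0" suffix is needed.
// The name itself is a made-up placeholder.
const EXT_NAME: &CStr = c"SPV_KHR_example";

// offset_of! yields the byte offset of a field within its struct.
fn len_offset() -> usize {
    offset_of!(Header, len)
}

fn main() {
    println!("{:?}: {} bytes before the NUL", EXT_NAME, EXT_NAME.to_bytes().len());
    println!("offset of Header::len: {}", len_offset());
}
```

The literal can be passed to FFI as `EXT_NAME.as_ptr()`, which is presumably the kind of call site these commits simplify.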

Lionel Landwerlin (96):

  • anv: fix extent computation in image->image host copies

  • anv: update shader descriptor resource limits

  • anv: split generated draw flags from mocs/dword-count

  • intel: make sure intel_wa.h can be included by opencl code

  • anv: implement Wa_16011107343/22018402687 for generated draws

  • brw: allocate physical register sizes for spilling

  • anv: fix descriptor asserts

  • anv: fix incorrect aspect flag for depth/stencil formats

  • anv: fix missing push constant reallocation

  • anv: prevent access to destroyed vk_sync objects post submission

  • anv: track allocated descriptor pool sizes

  • anv: indent driconf code

  • anv: add a workaround for X4 Foundations

  • anv: document the X4 Foundations workaround a bit more

  • anv: move helpers out of genX_pipeline.c/anv_private.h

  • anv: remove 3DSTATE_RASTER from pipeline

  • anv: remove 3DSTATE_MULTISAMPLE from the pipeline

  • anv: remove 3DSTATE_VF_STATISTICS from pipeline

  • anv: pass anv_device to batch_set_preemption

  • anv: rework vertex input helper

  • anv: split vertex buffer emission in a different function

  • anv: move gfx tracking values to anv_cmd_graphics_state

  • anv: move tracking of tcs_input_vertices/fs_msaa_flags to hw state

  • anv: split runtime flushing code for reuse

  • brw: change fs_msaa flags checks to test compiled flag first

  • brw: rename brw_sometimes to intel_sometimes

  • brw: move barycentric_mode enum to intel_shader_enums.h

  • brw: move fs_msaa_flags logic to intel_shader_enums.h

  • fix

  • Revert in correct commit “fix”

  • anv: move primitive_topology to anv_gfx_dynamic_state

  • anv: try to avoid using cmd_buffer in gfx runtime flushing

  • anv: reuse device local variable in hw state emission

  • anv: rework Wa_18038825448 to track state on anv_gfx_dynamic_state

  • anv: avoid using cmd_buffer for TBIMR state computation

  • anv: avoid using cmd_buffer for flushing runtime

  • anv/iris: leave 4k alignments for clear colors with modifiers

  • brw: use transpose unspill messages when possible

  • anv: report formats supported by the common bvh framework

  • anv: fix missing bindings valid dynamic state change check

  • anv: set pipeline flags correct for imported libs

  • vulkan: make acceleration structure debug markers virtual

  • vulkan: add an enum for the build step

  • vulkan: track encode step of the BVH building

  • anv: add BVH building tracking through u_trace

  • intel/decoder: fix COMPUTE_WALKER handling

  • anv: document UBO descriptor range alignments

  • blorp: use 2D dimension for 1D tiled images

  • hk: fix timeline value type

  • anv: fix index buffer size changes

  • anv: limit the memcpy data for push constants

  • vulkan/runtime: avoid emitting empty build_leaves

  • anv: add tracepoints timestamp mode for empty dispatches

  • anv: rework tbimr push constant workaround

  • anv: ensure null-rt bit in compiler isn’t used when there is ds attachment

  • anv: use the correct MOCS for depth destinations

  • intel: fix generation shader on Gfx9

  • brw: introduce a new register type for the address register

  • brw: use phys_nr() more in generation

  • brw: split validation iteration into blocks

  • brw: add infra to make use of the address register in the IR

  • brw: add scheduler support for address registers

  • brw: avoid having the scratch surface handle partially written

  • brw: move final send lowering up into the IR

  • brw: fix coarse_z computation on Xe2+

  • brw: handle load_printf_buffer_size intrinsic

  • anv: handle printf buffer size relocations

  • nir: make lower-level printf helper respect buffer size

  • anv: update debug printf example code

  • anv: remove print lowering

  • blorp: disable PS shaders with depth/stencil HiZ ops

  • brw: fix CSE with negation

  • anv: don’t look at pipelines to figure out CPS values

  • compiler: add VARYING_BIT_PRIMITIVE_COUNT

  • anv/Wa_18019110168: copy the primitive count writes

  • anv/brw: rework primitive count writing

  • libcl: add MIN2/MAX2 macros

  • libcl_vk: add some vulkan enums/structures for DGC

  • spirv: build vtn_bindgen for Anv/Iris

  • brw/elk: move internal kernel parsing out of intel_clc

  • meson: build mesa_clc for Anv/Iris

  • intel/cl: switch to SPIRV as shader storage

  • meson: rework mesa-clc=system handling

  • intel: rework CL pre-compile

  • meson: required SPIRV-Tools LLVM workaround on LLVM17+

  • intel: fix dependency for internal CL shaders

  • anv: use flags for format capabilities

  • anv: pass physical device to format helpers

  • anv: add a drirc to disable border colors without format

  • anv: expose A4B4G4R4_UNORM_PACK16 support with CBCWF is disabled

  • anv: dirty pipeline & push constants after internal CS shaders

  • anv: reduce alignment for small heaps

  • brw: fixup scoreboarding for find_live_channels

  • anv,driconf: Add sampler coordinate precision workaround for Dynasty Warriors

  • anv: disable VF statistics for memcpy

  • anv: ensure Wa_16012775297 interacts correctly with Wa_18020335297

Lorenzo Rossi (1):

  • nvk: fix preprocess buffer alignment

Louis-Francis Ratté-Boulianne (3):

  • panfrost: Split up allocation and packing of tiler descriptor

  • panfrost: Select the effective tile size as part of pan_fb_info

  • panfrost: Re-emit texture descriptor if the data size has changed

Lu Yao (1):

  • zink: fix decomposed_attrs val error when zink_vs_key->size is 4

Lucas De Marchi (1):

  • intel/tools: Fix Xe KMD error dump parser

Lucas Stach (26):

  • etnaviv: drm: properly handle BO list member

  • etnaviv: drm: assert mutual exclusivity between cache and zombie list

  • etnaviv: drm: use list_first_entry

  • etnaviv: stall after RS/BLT operation when draw_stall debug option is enabled

  • etnaviv: Update headers from rnndb

  • etnaviv: add debug switch to disable texture descriptor usage

  • etnaviv: fix polygon offset for 24bpp depth buffers

  • ci/etnaviv: drop gl-1.4-polygon-offset fail

  • etnaviv: isa: fix typo in SRC2_USE map

  • etnaviv: Update headers from rnndb

  • etnaviv: clean up component use setting in linker

  • etnaviv: fix flatshading

  • etnaviv: emit full varying component use

  • ci/etnaviv: drop GC2000 flat shading fails

  • etnaviv: split dummy RT backing store from reloc

  • etnaviv: fix rendering without vertex buffers/attributes

  • ci/etnaviv: drop failures caused by missing vertex attributes

  • etnaviv: fix polygon offset disable

  • etnaviv: memcpy varying setup from stack

  • etnaviv: emit varying interpolation state on halti5

  • etnaviv: fix flatshading on halti5 GPUs

  • etnaviv: only emit used PA_SHADER_ATTRIBUTES states

  • etnaviv: track TS flushed status as bool

  • etnaviv: dynamically partition the constant memory in unified uniform mode

  • etnaviv: allow more constants in unified uniform mode

  • etnaviv: hwdb: fix lookup of GC3000 in i.MX6QP

Lukas Lipp (1):

  • wsi: Fix wrong function name for lvp wsi metal surface

M Henning (6):

  • nvk/cmd_buffer: Pass count to set_root_array

  • nvk: Fix invalidation of NVK_CBUF_TYPE_DYNAMIC_UBO

  • nvk: Remove params for dirty_cbufs_for_descriptors

  • nvk: Fix two typos in comments

  • nvk: Fix uninitialized var warnings in host_copy

  • nak/hw_runner: Skip copy call for empty buffer

Manuel (1):

  • gfxstream: Avoid repeated functionality

Manuel Dun (4):

  • gfxstream: Using DETECT_OS_ANDROID from util instead of __ANDROID__

  • gfxstream: Using DETECT_OS_FUCHSIA from util instead of __Fuchsia__

  • gfxstream: Using DETECT_OS_LINUX from util instead of __linux__

  • Gfxstream: Initial mingw “compilable” Windows version of mesa/gfxstream

Marc Herbert (5):

  • docs: add “apt-get build-dep” and “dnf builddep”

  • docs: cross-compile: add useful “apt” and “dnf” builddep commands

  • docs: show how to use ccache when cross-compiling

  • docs: show which pkg-config Fedora uses for cross-compilation

  • docs: move cross c*_args from [properties] to [built-in options]

Marek Olšák (353):

  • gallium/radeon: import libdrm_radeon source code, drop the dependency

  • aco: remove unused TCS fields from aco_shader_info

  • ac/nir: get pass_tessfactors_by_reg from nir_gather_tcs_info

  • radeonsi: fix passing TCS wave ID from LS to HS for monolithic LS+HS

  • radeonsi: don’t overwrite info.tess._primitive_mode when it can be correct

  • radeonsi: get the value for load_tcs_primitive_mode_amd from shader info

  • radeonsi: replace are_tessfactors_def_in_all_invocs with nir_gather_tcs_info

  • radeonsi: reduce si_shader_key_ge::tes_prim_mode size to 2 bits

  • radeonsi: remove unused function si_get_tcs_out_patch_stride

  • radeonsi: don’t set tess level outputs in patch_outputs_written unconditionally

  • radeonsi: remove unused si_shader_info::output_readmask

  • radeonsi: set *outputs_written in scan_io_usage instead of later

  • radeonsi: split outputs_written_before_tes_gs into ls_es_* and tcs_* masks

  • radeonsi/ci: update navi31 failures

  • glsl: add a helper for duplicated code calling nir_opt_varyings

  • gallium: use struct nir_shader * type in finalize_nir instead of void *

  • st/mesa: call pipe_screen::finalize_nir outside of st_finalize_nir

  • gallium: add PIPE_CAP_CALL_FINALIZE_NIR_IN_LINKER

  • st/mesa: add ST_DEBUG=xfb printing xfb info

  • mesa: capture shaders to disk before invoking the linker

  • nir/opt_varyings: add nir_io_always_interpolate_convergent_fs_inputs

  • nir/opt_varyings: add nir_io_compaction_rotates_color_channels

  • nir/opt_varyings: fix packing color varyings

  • nir/opt_varyings: implement compaction without flexible interpolation

  • nir/opt_varyings: don’t count the cost of the same instruction multiple times

  • radeonsi: fix buffer_size for emulated GS statistics

  • radeonsi: fix an assertion failure in si_shader_ps with AMD_DEBUG=mono

  • radeonsi: handle nir_intrinsic_component in kill_ps_outputs

  • radeonsi: fix gl_FrontFace elimination when one side is culled

  • radeonsi/ci: add options to test llvmpipe, softpipe, virgl, zink

  • nir/print: print fb_fetch_output for variables

  • nir/lower_pntc_ytransform: handle lowered IO

  • nir/lower_clip: fixes for lowered IO without compact arrays

  • nir/lower_clip: rewrite find_output to handle vec2/3 and make it readable

  • nir/lower_fragcoord_wtrans: handle trimmed fragcoord loads

  • nir/lower_two_sided_color: fix for lowered IO

  • nir: add nir_io_semantics::fb_fetch_output_coherent

  • nir: rename nir_io_glsl_opt_varyings to nir_io_dont_optimize and deprecate it

  • nir: add nir_io_separate_clip_cull_distance_arrays to replace PIPE_CAP

  • vc4/lower_blend: don’t read non-existent channels

  • nir: make use_interpolated_input_intrinsics a nir_lower_io parameter

  • ac/surface: adjust HiZ enablement

  • radeonsi: prepare for making SI_NGG_CULL_TRIANGLES/LINES VS only, rename them

  • radeonsi: optionally return MESA_PRIM_UNKNOWN from si_get_input_prim

  • radeonsi: rewrite/replace gfx10_ngg_get_vertices_per_prim

  • radeonsi: return a better value for load_initial_edgeflags_amd

  • radeonsi: clean up and rename gfx10_edgeflags_have_effect

  • radeonsi: add helper si_shader_culling_enabled

  • radeonsi: only compute and use min_direct_count on gfx7-8

  • radeonsi: enable NGG culling for non-monolithic TES and GS

  • radeonsi: don’t use nir_io_dont_optimize because it’s deprecated

  • r300: don’t lower sin/cos in finalize_nir

  • nir/opt_varyings: use a hash table to make cloning SSA faster

  • amd: import libdrm_amdgpu ioctl wrappers

  • util,amd: add inlinable versions of drmIoctl/drmCommandWrite*

  • nir: allow cloning indirect array derefs in nir_clone_deref_instr

  • nir/lower_io_to_temporaries: fix interp_deref_at_* lowering

  • radeonsi: don’t call set_framebuffer_state in si_destroy_context

  • radeonsi: handle a failure to create gfx_cs

  • winsys/amdgpu: fix FD mismatch

  • Revert “gbm: mark surface buffers as explicit flushed”

  • nir/lower_clip: don’t set cursor to fix crashes due to removed instructions

  • nir/lower_clip: separate code for IO variables and intrinsics

  • nir/lower_clip: set clip_distance_array_size outside of create_clipdist_vars

  • nir/lower_clip: convert nir_lower_clip_gs to nir_shader_intrinsics_pass

  • nir/lower_clip: implement ClipVertex lowering for GS + lowered IO correctly

  • vc4: lower clip planes in st/mesa

  • nir/opt_varyings: always call remove_dead_varyings in init_linkage

  • nir/opt_varyings: add a default callback for varying_estimate_instr_cost

  • nir/opt_varyings: replace options::lower_varying_from_uniform with a cost number

  • nir/algebraic: use is_used_once in a few iand/ior patterns

  • nir/algebraic: optimize (a & b) & (a & c) ==> (a & b) & c

  • nir/algebraic: optimize (a | b) | (a | c) ==> (a | b) | c

  • nir/algebraic: optimize (a & b) | (a | c) => a | c, (a & b) & (a | c) => a & b

  • gallium: replace PIPE_SHADER_CAP_INDIRECT_INPUT/OUTPUT_ADDR with NIR options

  • st/mesa: replace EmitNoIndirectInput / EmitNoIndirectOutput with NIR options

  • util/bitset_test: test the return value of BITSET_TEST_RANGE_INSIDE_WORD better

  • util/bitset: add BITSET_GET_RANGE_INSIDE_WORD

  • nir/linking_helpers: don’t promote interpolated varyings to flat

  • nir/opt_varyings: remove redundant conditions from a while loop

  • nir/opt_varyings: fix compaction with sparse indirect FS inputs

  • nir/opt_varyings: count the number of unused components for compaction correctly

  • nir/opt_varyings: fix max_slot for color varying compaction

  • nir/opt_varyings: make top-level compaction code for TES, TCS, GS separate

  • nir/opt_varyings: change try_move_postdominator param to nir_instr type

  • amd,zink: remove options.varying_estimate_instr_cost callbacks

  • nir/opt_varyings: propagate indirect uniform/UBO loads into the next shader

  • nir/opt_varyings: add inter-shader code motion for uniform/UBO indexing

  • nir/opt_varyings: fix getting deref variables for sysvals

  • nir/opt_varyings: remove rare dead output stores after inter-shader code motion

  • nir/opt_varyings: fix compile failures in the disabled PRINT code

  • amd/ci: add piglit failures due to an overzealous test

  • nir/lower_io_passes: lower indirect IO for TCS

  • radeonsi: pass cull face state via user SGPRs for shader culling

  • radeonsi: revert to always returning true for load_cull_any_enabled_amd

  • radeonsi: try to fix Navi14 regression in debug builds

  • radeonsi: don’t compute total_direct_count in si_draw if it’s unused

  • radeonsi/ci: handle glinfo errors better

  • radeonsi/ci: stop using a global flakes list, only use a per-chip flakes list

  • radeonsi/ci: remove most flakes and some skips, update navi31 failures

  • radeonsi/ci: remove --slow

  • radeonsi/ci: update navi31 failures

  • r600: fix a constant buffer memory leak for u_blitter

  • ac/lower_ngg: improve streamout code generation for gfx12/ACO to match LLVM

  • ac: update SPI_GRP_LAUNCH_GUARANTEE_* register values for gfx12

  • ac/surface/gfx12: enable DCC 256B compressed blocks and reorder modifiers

  • radeonsi/gfx12: set DB_RENDER_OVERRIDE based on stencil state

  • radeonsi/gfx12: adjust HiZ/HiS logic

  • ac/nir: reserve the first LDS vec4 for the HS tf0/1 group vote in TCS

  • ac/nir: use s_sendmsg(HS_TESSFACTOR) to optimize writing tess factors for gfx11

  • ac/nir: allow a TCS input to be available from both VGPRs and LDS

  • ac,radv,radeonsi: enable TCS input reads from VGPRs for all compatible loads

  • ac/nir: add new helpers for computing the TCS LDS/offchip size accurately

  • radeonsi: remove unused parameter tcs_vgpr_only_inputs from si_get_nir_shader

  • radeonsi: switch to the new TCS LDS/offchip size computation

  • radv: switch to the new TCS LDS/offchip size computation

  • ac/nir: call nir_gather_tcs_info only once for RADV

  • nir/opt_varyings: set all IO types to float to facilitate full vectorization

  • nir/opt_varyings: clear info->clip/cull_distance_array_size if relocated

  • st/mesa: don’t use nir_opt_fragdepth because it’s incorrect with MSAA

  • mesa: set correct XFB prim mode for draw validation after resuming XFB

  • mesa: fix printing _NEW_* flags

  • gallium: pass XFB primitive mode to set_stream_output_targets

  • st/mesa: add a pass that unlowers IO intrinsics to variables

  • glsl,st/mesa: always lower IO for GLSL, unlower IO for drivers

  • v3d: enable uniform expression propagation from outputs to the next shader

  • ci: update fail lists and trace checksums

  • virgl/ci: disable virgl-traces because it doesn’t upload results

  • radeonsi/ci: don’t copy skips.csv to the results directory

  • radeonsi/ci: update failures and flakes

  • radeonsi: fix a gfx10.3 regression due to a gfx12 change

  • radeonsi: kill Z and stencil PS outputs if depth or stencil is disabled

  • radeonsi/gfx11: fix alpha-to-coverage + alpha-to-one used together

  • radeonsi: fix alpha-to-coverage + alpha-to-one used together for gfx6-10.3

  • radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass

  • radeonsi: eliminate shader code computing killed Z/S/samplemask PS outputs

  • radeonsi: make NGG streamout output primitive type known at compile time

  • radeonsi/gfx12: fix DrawTransformFeedback(stream != 0)

  • radeonsi/gfx12: tune streamout performance

  • radeonsi: make nir->info and si_shader_info::base identical

  • radeonsi: remove some uses of enum pipe_shader_type

  • radeonsi: make si_init_shader_args static

  • radeonsi: call si_init_shader_args in si_get_nir_shader

  • radeonsi: use nir->info instead of sel->info.base

  • radeonsi: disable luminance alpha formats on gfx6

  • radeonsi,radv: fix incorrect min_esverts for NGG subgroup calculation

  • ac: remove unused code

  • ac/llvm: remove unused code

  • radeonsi/ci: update failures

  • radeonsi: fix a TCS regression

  • radeonsi: switch si_get_blitter_vs to IO intrinsics

  • radeonsi: remove unused code

  • amd: update addrlib

  • radeonsi: fix a front face regression (crash)

  • nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads

  • radv: reduce maxGeometryShaderInvocations to 32

  • ac/nir: handle disabled PS VGPRs in ac_nir_load_arg_at_offset

  • amd: lower load_pixel_coord in NIR

  • amd: lower load_frag_coord in NIR

  • amd: lower load_local_invocation_id in NIR

  • amd: lower load_first_vertex/base_instance/draw_id/view_index in NIR

  • amd: lower load_invocation_id in NIR

  • amd: lower load_sample_id in NIR

  • amd: lower load_sample_pos in NIR

  • amd: lower load_frag_shading_rate in NIR

  • amd: lower load_front_face in NIR

  • ac,radeonsi: move load_vector_arg flags to common code

  • amd: lower load_barycentric_pixel/centroid/sample in NIR

  • amd: lower load_barycentric_at_offset in NIR

  • amd: lower load_gs_wave_id_amd in NIR

  • amd: lower load_vertex_id/instance_id and overwrite_vs_arguments in NIR

  • radeonsi: don’t return 0 from si_get_max_workgroup_size

  • ac/nir: extract a load_subgroup_id lowered helper

  • amd: lower load_local_invocation_index in NIR

  • amd: lower load_subgroup_invocation in NIR

  • amd: lower load_tess_rel_patch_id/primitive_id/tess_coord and overwrite.. in NIR

  • ac/llvm: remove already lowered cases

  • ac/nir: lower more loads in ac_nir_lower_intrinsics_to_args instead of drivers

  • ac/nir: clean up ac_nir_lower_indirect_derefs

  • ac/nir: add helper ac_nir_load_arg_upper_bound

  • ac/nir: set arg_upper_bound_u32 for vs_rel_patch_id

  • ac/nir: split local_invocation_ids to 3 separate VGPR inputs

  • ac/nir: set upper ranges for range analysis while lowering system values

  • radeonsi: lower sysval intrinsics as late as possible

  • amd: optimize atomics before lowering intrinsics

  • radeonsi: use nir_opt_sink

  • radeonsi: use nir_opt_move

  • vulkan: silence an unused variable warning

  • llvmpipe: silence an unused result warning

  • util/disk_cache: silence unused result warnings

  • nir: set nir_io_semantics::num_slots to at least 1 in build helpers

  • nir: set src_type and dest_type to float implicitly for IO build helpers

  • nir: don’t set num_slots/src/dest_type/write_mask when they’re set automatically

  • nir: flip the early exit condition in nir_lower_io_temporaries

  • nir: remove redundant option linker_ignore_precision

  • nir: use IO intrinsics in nir_lower_bitmap

  • nir: use IO intrinsics in nir_lower_drawpixels

  • mesa: remove unused PROGRAM_SYSTEM_VALUE

  • mesa: remove unused PROGRAM_WRITE_ONLY

  • st/mesa: fold st_translate_prog_to_nir into prog_to_nir

  • st/mesa: run DCE before st_unlower_io_to_vars

  • st/mesa: use IO intrinsics in st_nir_lower_fog

  • st/mesa: use IO intrinsics in st_nir_lower_position_invariant

  • st/mesa: switch ATI_fs to IO intrinsics

  • st/mesa: unlower IO for internal shaders if needed

  • st/mesa: switch Z/S DrawPixels shaders to IO intrinsics

  • st/mesa: switch GL_SELECT shader to IO intrinsics

  • st/mesa: switch st_nir_make_passthrough_shader to IO intrinsics

  • st/mesa: switch st_pbo_create_vs and st_pbo_create_gs to IO intrinsics

  • st/mesa: switch PBO create_fs to IO intrinsics

  • st/mesa: switch st_nir_make_clearcolor_shader to IO intrinsics

  • st/mesa: don’t use nir_copy_var

  • st/mesa: recompute IO bases for ARB_vp/fp

  • glsl: remove unused code

  • glsl: fix corruption due to blake3 hash not being set for nir_opt_undef

  • radeonsi: ignore PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY for TC-compatible HTILE

  • radeonsi: simplify and fix enable_tc_compatible_htile_next_clear logic

  • radeonsi: re-enable non-TC-compatible HTILE for write-only Z/S

  • mesa: switch ARB_vp/fp to IO intrinsics

  • mesa: switch fixed-func fragment program to IO intrinsics

  • nir/algebraic: use is_used_once for comparison patterns

  • nir/algebraic: add and improve pack/unpack patterns

  • nir/algebraic: optimize pack_split(unpack(a).x, unpack(a).y) -> a

  • radeonsi: fix a perf regression due to slow reply from GEM_WAIT_IDLE for timeout=0

  • radeonsi: always use RADEON_USAGE_DISALLOW_SLOW_REPLY

  • ac: update ATOMIC_MEM definitions

  • ac/nir: sort xfb info to facilitate vectorization of xfb stores

  • ac/nir: vectorize streamout stores for legacy pipeline optimally

  • ac/nir/ngg: vectorize streamout stores for NGG optimally

  • ac/nir/ngg: fold so_vertex_index * so_stride into immediate offset

  • ac/nir/ngg: export positions after streamout to improve performance

  • ac,radeonsi: scalarize overfetching loads

  • radeonsi: lower descriptors sooner to allow vectorizing descriptor loads

  • amd: vectorize SMEM loads aggressively, allow overfetching for ACO

  • radeonsi: don’t set BREAK_PRIMGRP/WAVE_AT_EOI when tessellation is disabled

  • radeonsi: only set BREAK_PRIMGRP/WAVE_AT_EOI when TES/GS need PrimID sysval after TES

  • radeonsi/gfx12: enable alt_hiz_logic

  • radeonsi/gfx12: set DIS_PG_SIZE_ADJUST_FOR_STRIP after shader compilation

  • radeonsi/gfx12: use ACO if LLVM is 19 or older

  • radeonsi/gfx12: use ACO for streamout because it’s faster

  • mesa: rework enablement of force_gl_names_reuse

  • mesa: enable GL name reuse by default for all drivers except virgl

  • ac/nir: remove broadcast_last_cbuf because it can be deduced from NIR

  • ac/nir: split ac_nir_lower_ps into 2 passes

  • nir: add barycentric coordinates src to load_point_coord_maybe_flipped

  • ac: use Z_EXPORT_FORMAT=32_AR for Z + Alpha mrtz exports

  • ac/llvm: lower vector load_const in NIR

  • ac/llvm: remove the low-optimizing compiler option

  • radeonsi: add si_screen::use_aco to shader cache key to fix shader cache failures

  • radeonsi: remove unused variables from si_shader_context (LLVM)

  • radeonsi: make many shader functions static or move them to .c files

  • radeonsi: remove unused functions

  • nir: add next_stage param to nir_slot_is_varying & nir_remove_sysval_output

  • Revert “ac/llvm: enable wqm for ac_build_quad_swizzle from ac_build_fs_interp_mov”

  • nir: add a pass that moves output stores to the end of the shader

  • st/mesa: move VS & TES output stores to the end before unlowering IO

  • mesa: switch fixed-func vertex program to IO intrinsics

  • st/mesa: assert that all incoming shaders use lowered IO

  • st/mesa: remove dead/no-op code due to IO being always lowered

  • glsl: remove dead code due to IO being always lowered

  • glsl: simplify nir_lower_io_to_temporaries logic

  • nir: remove dead code due to IO being always lowered in st/mesa

  • st/mesa: inline st_finalize_nir_before_variants

  • nir: remove handling IO variables from passes used by st/mesa

  • gallium/u_threaded: move tc_batch_execute after all call functions

  • gallium/u_threaded: make the execute function table private

  • gallium/u_threaded: use TC_END_BATCH to terminate the loop

  • gallium/u_threaded: replace the function table with a switch and direct calls

  • gallium/u_threaded: inline all tc_call functions

  • gallium/u_threaded: sort cases in batch_execute by their occurrence

  • zink/ci: skip KHR-Single-GL46…SizedDeclarationsPrimitive due to random timeout

  • dri: put shared-glapi into libgallium.*.so

  • glapi: stop using the remap table

  • glapi: remove the remap table

  • loader: improve the existing loader-libgallium non-matching version error

  • glapi: rename exported symbols so as not to conflict with old libglapi

  • freedreno/ci: skip a dmat3 div test timing out

  • radv: don’t call ac_nir_lower_ps_early

  • ac/nir: optimize front_face in ac_nir_lower_ps_early

  • ac/nir: lower sample_pos in ac_nir_lower_ps_early

  • ac/nir: lower barycentric_at_offset/sample in ac_nir_lower_ps_early

  • ac/nir: lower fbfetch_output in ac_nir_lower_ps_early

  • ac/nir: return progress from ac_nir_lower_ps_early

  • ac/nir: return progress from ac_nir_lower_ps_late

  • ac/nir: handle FRAG_RESULT_COLOR with dual src blending in ac_nir_lower_ps_early

  • ac/nir: switch passes to use nir_shader_intrinsics_pass

  • ac/nir: drop 16x EQAA support from ac_get_ps_iter_mask

  • ac/nir: clamp vertex color outputs in the right place

  • radeonsi: sample shading state fixes

  • ac,aco,radeonsi: replace SampleMaskIn with 1 << SampleID if full sample shading

  • ac/nir: simplify force_*_sample_interp options in ac_nir_lower_ps_early

  • ac/nir: simplify force_*_center_interp options in ac_nir_lower_ps_early

  • ac/nir: optimize barycentric_at_sample(sample_id) in ac_lower_ps_early

  • ac/nir: optimize frag_coord <-> pixel_coord in ac_nir_lower_ps_early

  • ac/nir: eliminate sample_mask_in without MSAA in ac_nir_lower_ps_early

  • ac/nir: cosmetic stuff for ac_nir_lower_ps

  • aco: implement replacing frag_coord with pixel_coord in PS prolog

  • aco: simplify how broadcast_last_cbuf is implemented in PS epilog

  • aco: implement replacement of sample_mask_in with helper_invocation in PS prolog

  • ac/nir: compute ddx/ddy for barycentric_at_offset at the beginning of shaders

  • ac/nir: lower sample_pos to load_sample_positions_amd when frag_coord is center

  • nir/opt_varyings: handle user barycentrics

  • mesa: enable GL name reuse for virgl

  • radeonsi: disallow compute queues on Raven/Raven2 due to hangs

  • ac/nir: clamp vertex color outputs in the right place

  • radeonsi: get sample positions from user SGPRs instead of memory

  • radeonsi: fix PS prolog not counting used fragcoord VGPRs correctly

  • radeonsi: implement replacing frag_coord with pixel_coord at draw time

  • radeonsi: don’t set the alpha ref user SGPR if alpha test doesn’t use it

  • radeonsi: simplify how broadcast_last_cbuf is implemented for PS epilogs

  • radeonsi: use load_pixel_coord for polygon stipple lowering

  • radeonsi: remove si_nir_kill_ps_outputs and use ac_nir_lower_ps_early instead

  • radeonsi: add load_polygon_stipple_buffer_amd instead of using si_shader_args

  • radeonsi: call si_init_gs_output_info in si_get_nir_shader

  • radeonsi: add si_nir_shader_ctx holding parameters from si_get_nir_shader

  • radeonsi: call si_nir_late_opts unconditionally

  • radeonsi: set the “first” parameter of si_nir_opts correctly

  • radeonsi: simplify how the NIR name of shader variants is modified

  • radeonsi: cosmetic changes in get_nir_shader

  • radeonsi: reorder NIR passes in get_nir_shader (part 1)

  • radeonsi: reorder NIR passes in get_nir_shader (part 2)

  • radeonsi: reorder NIR passes in get_nir_shader (part 3)

  • radeonsi: split and restructure get_nir_shader

  • radeonsi: get LS+HS and ES+GS together in get_nir_shader instead of separately

  • radeonsi: set uses_vmem_load/sampler in get_nir_shaders

  • radeonsi: move/rewrite PS color input gathering for shader variants

  • radeonsi: use barycentrics from load_point_coord_maybe_flipped

  • radeonsi: lower indirect indexing sooner

  • radeonsi: move spi_ps_input_config functions up

  • radeonsi: split si_fixup_spi_ps_input_config

  • radeonsi: get SPI_PS_INPUT_ENA from shader variant NIR for ACO

  • radeonsi: minor restructuring of si_llvm_compile_shader

  • radeonsi: verify that SPI_PS_INPUT_ENA from LLVM is equal to ACO

  • radeonsi: remove ac_shader_config from si_shader_part

  • radeonsi: precompute COMPUTE_PGM_RSRC3

  • radeonsi: set SHARED_VGPR_CNT for compute for ACO

  • radeonsi: set SHARED_VGPR_CNT for gfx shaders for ACO

  • radeonsi: gather PS inputs from shader variant NIR

  • radeonsi: don’t set BASE in si_nir_lower_ps_color_input

  • radeonsi: remove si_shader_info code that is no longer needed

  • radeonsi: implement replacement of sample_mask_in with helper_invocation

  • radeonsi: ignore pipe_rasterizer_state::force_persample_interp

  • radeonsi: fix interpolateAt* with non-GL4 ARB_sample_shading

  • radeonsi/ci: add more gfx11 flakes

  • radeonsi: set gl_FragCoord to pixel center to fix GLCTS failures

  • radeonsi: validate BITSET_TEST_RANGE_INSIDE_WORD assertion at compile time

  • radeonsi: remove SI_TRACKED__UNUSED_GAP

  • radeonsi: dead code removal and move some code out of headers

  • radeonsi: remove redundant divergence analysis and smem flagging

  • radeonsi: remove an incorrectly defined modifier

  • winsys/amdgpu: disable DCC for gfx12 when using AMD_FORCE_FAMILY

  • ac/fake_hw_db: deobfuscate GPU name strings

  • gallium,st/mesa: allow reporting compile failures from create_vs/fs/.._state
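
The nir/algebraic entries in the list above name four bitwise rewrites. As a quick sanity check of those identities (a sketch only, unrelated to how NIR's algebraic pass itself is implemented), they can be verified exhaustively over small integers. Bitwise operations act on each bit independently, so a 4-bit range is more than enough. The function name `identities_hold` is hypothetical.

```rust
/// Returns true if the four rewrites quoted in the commit titles hold
/// for every a, b, c in 0..16.
fn identities_hold() -> bool {
    for a in 0u8..16 {
        for b in 0u8..16 {
            for c in 0u8..16 {
                let ok =
                    // (a & b) & (a & c) ==> (a & b) & c
                    ((a & b) & (a & c)) == ((a & b) & c)
                    // (a | b) | (a | c) ==> (a | b) | c
                    && ((a | b) | (a | c)) == ((a | b) | c)
                    // (a & b) | (a | c) ==> a | c
                    && ((a & b) | (a | c)) == (a | c)
                    // (a & b) & (a | c) ==> a & b
                    && ((a & b) & (a | c)) == (a & b);
                if !ok {
                    return false;
                }
            }
        }
    }
    true
}

fn main() {
    println!("identities hold: {}", identities_hold());
}
```

Each rewrite removes one operation; the last two follow from `a & b` being a subset of the bits of `a`, which is in turn a subset of `a | c`.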

Mark Collins (5):

  • util: Add file modification notifier utility

  • tu/util: Support toggling TU_DEBUG options at runtime

  • tu/lrz: Check for TU_DEBUG(nolrz) late

  • freedreno/docs: Document TU_DEBUG_FILE

  • util/u_debug: Ignore newlines in `parse_*_string`

Martin Krastev (7):

  • svga/ci: enable vmware farm

  • svga/ci: set vmware piglit job parallelism to 2

  • svga/ci: triage piglit failures

  • svga/ci: update svga/ci KERNEL_TAG

  • svga/ci: drop FDO_CI_CONCURRENT to 1

  • svga/ci: disable vmware farm

  • svga/ci: enable vmware farm

Martin Roukala (né Peres) (39):

  • zink/ci: document new-ish vangogh flakes

  • ci: disable mupuf’s farm

  • Revert “ci: disable mupuf’s farm”

  • ci: disable mupuf’s farm

  • Revert “ci: disable mupuf’s farm”

  • freedreno-ci: document more a618-gl flakes

  • freedreno-ci: document a a750-gl flake

  • turnip/ci: document the a750-vkcts expectations

  • turnip/ci: bump the vkcts a750 timeout by 15 minutes

  • turnip/ci: skip a vkd3d test that causes a GPU hang on a750

  • nvk/ci: update the ga106 expectations

  • zink/ci: update the nvk-ga106 expectations

  • zink/ci: update the radv expectations

  • radv/ci: update the vkcts expectations

  • ci/test: make the .b2c-${arch}-test-* jobs provide a default b2c

  • ci/tests: de-duplicate the b2c version between architectures

  • ci/test: uprev to b2c v0.9.14

  • freedreno/ci: use the default b2c

  • r300/ci: use the default b2c

  • i915g/ci: use the default b2c version

  • ci/b2c: modernize the job description to use run_*

  • ci/b2c: run the machine registration check before the test container

  • radeonsi/ci: update the vangogh expectations

  • radeonsi/ci: run on ACO changes

  • radeonsi/ci: run a fraction of glcts-vangogh in pre-merge

  • ci/init-stage2: use the common scripts from the build artifact

  • ci/b2c: use the runner description rather than ID

  • ci/b2c: allow defining a boot watchdog

  • freedreno/ci: use the boot watchdog to ensure the a750 boots

  • zink/ci: update nvk expectations

  • zink/ci: update RADV expectations

  • radeonsi/ci: update the vangogh expectations

  • ci/b2c: allow jobs to select a file in the dtb url

  • ci/b2c: allow using another initrd that contains firmware

  • freedreno/ci: uprev the a750 kernel to msm-next

  • ci: fix the artifact name

  • zink/ci: use the debian-built-testing for nvk

  • ci/b2c: fix the S3 artifact for amd64 manual vk/gl

  • turnip/ci: re-introduce the `multiviewport` flakes

Mary Guillemard (56):

  • agx: Add support for EGL_NV_context_priority_realtime

  • panfrost: Report default value for GROUP_PRIORITIES_INFO in drm-shim

  • pan/kmod: Expose medium priority on panfrost

  • panvk: Implement global priority extensions

  • panvk: Advertise VK_EXT_tooling_info

  • panvk: Advertise VK_KHR_shader_non_semantic_info

  • panvk: Advertise VK_KHR_shader_relaxed_extended_instruction

  • panvk: Implement VK_KHR_zero_initialize_workgroup_memory

  • bi: Execute nir_opt_algebraic after nir_lower_pack

  • panvk: Implement VK_EXT_sampler_filter_minmax for v10

  • panvk: Only flag rw_nc pool as uncached on v10+

  • panvk: Take rasterization samples into account in draw

  • panfrost: Remove faulty assert in cs_loop_conditional_*

  • panvk: Wire occlusion queries to internals

  • panvk: Implement occlusion queries for JM

  • panvk: Implement occlusion queries for CSF

  • panvk: Expose precise occlusion queries

  • panvk: Advertise VK_EXT_host_query_reset

  • panvk: Enable depthClamp and depthBiasClamp

  • panvk: Enable shaderInt16

  • panvk: Advertise VK_KHR_index_type_uint8

  • panvk: Advertise VK_KHR_map_memory2

  • panvk: Disable integer array indices clamping

  • panvk: Advertise VK_EXT_image_robustness

  • panvk: Advertise VK_EXT_pipeline_robustness

  • panvk: Call vk_free on queue array instead of vk_object_free

  • panvk: Use vk_zalloc for queue array allocation

  • panvk: Update Mali-G52 CI baseline

  • panvk: Add a nightly job for Mali-G52

  • nak: Fix 8-bit selection for vectors

  • nak: Simplify 16-bit vector selection to not use try_from

  • meson: Add mesa-clc and install-mesa-clc options

  • meson: Add precomp-compiler and install-precomp-compiler options

  • asahi: Remove unneeded dependencies for asahi_clc

  • util/bitpack_helpers: Use UINT64_MAX instead of ~0ULL

  • util/bitpack_helpers: Make fixed packs CL safe

  • nir,agx: Allow nir_precomp_print_blob to print a static array

  • libcl: Respect NDEBUG for assert

  • panfrost: Update ForEachMacros

  • pan/genxml: Move pack_header to an external file

  • libcl: Add VkQueryType and VkQueryResultFlagBits definitions

  • pan/genxml: Switch unpack to use uint32_t

  • pan/genxml: Emit struct details before pack function

  • pan/genxml: Move [un]pack internals to use packed structs

  • pan/genxml: Enforce explicit packed types on pan_[un]pack

  • pan/genxml: Switch pan_section_ptr to cast to packed type

  • pan/genxml: Switch [un]pack codegen to macros

  • pan/genxml: Switch __gen_unpack to macros

  • panfrost: Fix group priorities in drm-shim

  • panfrost: Fix PROGRESS_LOAD destination register

  • pan/bi: Properly encode LEA_BUF_IMM

  • pan/bi: Remove shift lanes invalid encodings

  • pan/bi: Fix invalid CLPER encoding

  • pan/bi: Use 2D dimension with TEX_FETCH with CUBE on Valhall

  • pan/decode: Fix indirect branch calculation for 64-bit

  • panvk: Disallow unknown GPU models early in physical device init

Matt Turner (16):

  • anv: Align anv_descriptor_pool::host_mem

  • vulkan: Skip memcpy() call if passed null pointers

  • anv: Protect memcpy/memset/qsort calls against NULL arguments

  • anv: Avoid null ptr dereference

  • intel: Avoid unaligned pointer access

  • vulkan: Avoid pointer aliasing

  • nir: Get correct number of components

  • intel/decoder: Avoid duplicate symbols when expat is not available

  • brw: Avoid reading past the end of `p->store`

  • brw: Pass brw_codegen to next_offset

  • brw: Bounds check access to `p->store`

  • brw: Pass number and sizeof separately to calloc

  • elk: Avoid reading past the end of `p->store`

  • elk: Pass brw_codegen to next_offset

  • elk: Bounds check access to `p->store`

  • elk: Pass number and sizeof separately to calloc

Matthew Brost (1):

  • anv/xe: Bind queue per anv_queue

Mauro Rossi (4):

  • nvk/android: Avoid building error in nak bindings

  • nvk/android: Advertise Vulkan 1.1 for Android 12L and lower

  • nvk/android: Add support for ANDROID_native_buffer

  • android: remove shared-glapi building rules

Maíra Canal (3):

  • v3dv: Check multiple DRM primary nodes before picking the display fd

  • v3dv: delete `v3dv_debug.h`

  • v3dv: use Mesa log infrastructure instead of using stderr

Mel Henning (27):

  • nak: Fix two warnings of elided_named_lifetimes

  • gallium/winsys/nouveau: Don’t mark the api PUBLIC

  • nak: Add nak_nir_mark_lcssa_invariants

  • compiler/rust/bitset: Fix the bitset iterator

  • compiler/rust: Fix running tests

  • compiler/rust/bitset: Add a basic test

  • compiler/rust/bitset: Removed unused start param

  • compiler/rust/bitset: Make BitSetIter private

  • compiler/rust/bitset: impl FromIterator

  • compiler/rust/bitset: Remove impl Not

  • compiler/rust/bitset: Add a lazy expression API

  • compiler/rust/bitset: Take a stream in union_with

  • nak: Migrate liveness to new bitset expression api

  • compiler/rust/bitset: Don’t expose words

  • compiler/rust/bitset: Test next_unset()

  • nak: Add ShaderModel::hw_reserved_gprs()

  • nak: Add gpr_limit_from_local_size

  • nir_validate: Handle unstructured control flow

  • nak: lower_load_ssbo_descriptor modifies cf

  • nir: Update num_blocks in sort_unstructured_blocks

  • nvk: Fix an assertion in nvk_slm_area_ensure

  • nak: Return VK_ERROR_UNKNOWN on assertion failure

  • nak: Fix a spelling error

  • nak/opt_copy_prop: Fix IAdd3 overflow check

  • nak/opt_copy_prop: Add force_alu_src_type

  • nak/opt_copy_prop: Force alu src for IAdd2X/IAdd3X

  • driconf: force_vk_vendor on Deep Rock Galactic+NVK

Mi, Yanfeng (2):

  • anv: Fix memory grow calculation overflow issue

  • anv: increase instruction heap to 3Gb

Michael Cheng (2):

  • anv: Add tracepoint for as_build

  • intel: Expose Shader hashes for utrace and Perfetto

Michel Dänzer (4):

  • Revert “util/mesa-db: Further simplify mesa_db_compact”

  • Revert “util: Use persistent array of index entries”

  • Revert “winsys/amdgpu: fix FD mismatch”

  • winsys/amdgpu: Always use amdgpu_device_get_fd for aws->fd

Michel Zou (1):

  • ac/gpu_info: Fix missing prototype mingw error

Mike Blumenkrantz (38):

  • zink: restrict implicit feedback loop detection using miplevels/layers

  • mesa: use default params for clearbuffer functions

  • zink: rework query result checking

  • zink: use internal map flag for qbos

  • glsl: make gl_ViewID_OVR visible to all shader stages

  • glsl: enable OVR_multiview if OVR_multiview2 is enabled

  • lavapipe: stop storing texture handle for samplers

  • vk/sampler: split out sampler init from create

  • lavapipe: split out sampler init from create

  • lavapipe: split out bda descriptor function params from struct

  • lavapipe: fix bitmask type for sampler updating

  • lavapipe: move workgraph lowering up and delete pipeline param

  • lavapipe: unsupport NV_device_generated_commands

  • lavapipe: stop using pipeline layouts in some places

  • lavapipe: handle VK_REMAINING_ARRAY_LAYERS with HIC

  • lavapipe: fix 3D->2D blitting

  • lavapipe: abort on unsupported depth copy ops

  • lavapipe: support zs<->color copies

  • lavapipe: maintenance8

  • zink: enable maintenance8

  • glsl: plumb num_views down to shader_info::view_mask

  • zink: fix viewport detection when switching last stage shaders

  • zink: add radv ci fail

  • zink: disable shader objects when viewmask is set

  • zink: fix replacing incompatible pipelines

  • egl: never select swrast for vmwgfx

  • zink: deduplicate VkDevice and VkInstance

  • aco: exclude novalidateir from codegen flags

  • zink: check for bound gfx stages before dereferencing

  • zink: add zink_resource_reference() util function

  • zink: refcount needs_present resource

  • ci: mark radv-raven-traces-restricted with allow_failure

  • zink: emit SpvCapabilityDemoteToHelperInvocation for IsHelperInvocation

  • zink: also refcount needs_present from frontbuffer flush

  • zink: guard rebar check against fallback heap detection

  • radv: fix error reporting for VkExternalMemoryTypeFlagBitsKHR

  • zink: only enable unsynchronized_texture_subdata with HIC

  • zink: never try to oom flush during unsync texture upload

Mike Lothian (1):

  • gallium/radeon: Fix r600_pci_ids.h include

Mykhailo Skorokhodov (1):

  • drirc/anv: force_vk_vendor=-1 for Bellwright

Nanley Chery (22):

  • anv: Support non-0/1 sRGB fast-clear colors on gfx9

  • anv: Store fast-clear colors with the view swizzle

  • anv: Drop fast-clear value conversion check

  • intel/blorp: Assert 3D Ys fast-clear restriction

  • intel/isl: Allow CCS on 3D 64bpp+ Tile64

  • intel: Allow CCS on 3D surfaces for gfx120

  • intel/isl: Fix DecompressInL3 assignment on gfx12.5

  • anv: Enable storage accesses with modifiers on gfx12+

  • anv: Enable more storage compression on gfx12+

  • anv: Only consider R32 image formats as supporting atomics

  • anv: Allow compressed memtypes with default buffer types

  • anv: Slow clear if fast-clear cost is not mitigated

  • iris: Reduce fast-clear post-amble flushes

  • iris: Use L3 Fabric flush in fast-clear post-amble on TGL

  • anv: Reduce fast-clear post-amble synchronization

  • anv: Use L3 Fabric flush in fast-clear post-amble on TGL

  • anv: Drop bpc check for non-zero fast clears

  • Revert “anv: turn off non zero fast clears for CCS_E”

  • anv: Inline can_fast_clear_with_non_zero_color

  • anv: Allow more single subresource fast-clears with FCV

  • anv: Drop can_fast_clear_with_non_zero_color()

  • anv: Limit slow clear heuristic to ACM and prior

Patrick Lerda (8):

  • r600: fix the evergreen sampler when the minification and the magnification are not identical

  • r600: restructure r600_create_vertex_fetch_shader() to remove memcpy()

  • r600: ensure that the last vertex is always processed on evergreen

  • r600: evergreen stencil/depth mipmap blit workaround

  • r600: reverse fix spec ext_packed_depth_stencil getteximage

  • winsys/radeon: fix radeon_winsys_bo_from_handle() related race condition

  • r600: fix r600_init_screen_caps() has_streamout issue

  • r600: fix r600_init_shader_caps() has_atomics issue

Paulo Zanoni (3):

  • brw: don’t forget the base when emitting SHADER_OPCODE_MOV_RELOC_IMM

  • brw: don’t read past the end of old_src buffer in resize_sources()

  • brw: increase brw_reg::subnr size to 6 bits

Pavel Ondračka (27):

  • r300: group KIL for R300/R400

  • r300: run nir_opt_algebraic in the backend

  • r300: always transform sin/cos input for fs

  • r300/ci: update RV410 CI expectations

  • ci: bring back some i915g testing

  • i915/ci: update CI expectations

  • r300: disable ATI2N textures on R400

  • r300: disable microtiling for scanout buffers

  • r300/ci: update CI expectations

  • r300: fix uninitialized use in transform_vertex_ROUND

  • nir: add support for clamping in nir_lower_tex_shadow

  • etnaviv: always clamp shadow sampler comparison reference value

  • r300: fix presubtract assert

  • r300: move shadow lowering to NIR

  • r300: reswizzle some shadow texture calculations to use w channel

  • r300: delete backend shadow lowering code

  • r300: use ssa-like form for gl_FragCoord transformation

  • r300: add some more nir cleanup compiler passes

  • r300: use ssa-like form for backend texture lowering

  • r300: don’t allocate fs registers when translating from NIR

  • r300: get rid of the register rename pass

  • r300: get rid of some texture fixups

  • r300: remove support for register arrays from nir_to_rc

  • r300: fix memory leak in constant remapping

  • ci: fix debian-build-testing BUILDTYPE

  • i915/ci: use debian-build-testing instead of debian-testing

  • i915: rework shader compile failures reporting

Peyton Lee (5):

  • frontends/va: add support for VAProcColorStandardExplicit

  • frontends/va: add support for VAProcColorStandardExplicit

  • frontends/va: function process_frame has return value

  • radeonsi/vpe: optimize software functions

  • radeonsi/vpe: add destroy_fence function

Philipp Zabel (11):

  • teflon: Use correct convolution params struct

  • teflon: Mark dilated convolutions and fused activation as not supported

  • teflon: Support fused ReLU activation

  • etnaviv/nn: Enable fused ReLU activation

  • teflon: Add is_signed parameter to ml_subgraph_invoke and ml_subgraph_read_output

  • etnaviv/nn: Add support for signed 8-bit tensors

  • teflon/tests: prep test executor for signed convolutions

  • teflon/tests: Enable int8 tests

  • etnaviv/ml: Create combined input tensors for addition first

  • teflon: Reject per-axis quantization

  • teflon: Support fused ReLU6 activation via output saturation

Pierre-Eric Pelloux-Prayer (40):

  • radv: set info->family_overridden when RADV_FORCE_FAMILY is used

  • ac/surface: add flags to surface metadata

  • radeonsi: refuse to import texture with family_overridden being set

  • ac: rename ac_surface_test_common -> ac_fake_hw_db

  • ac: add ‘polaris12’ gpu to ac_fake_hw_db

  • ac: switch AMD_FORCE_FAMILY handling to using ac_fake_hw_db

  • radeonsi/tests: update expected results

  • ac/perfcounter: fix buffer overflow

  • dri: Remove unused function

  • radeonsi/gfx12: disable display dcc for front buffer rendering

  • radeonsi: disable DCC for PIPE_BIND_USE_FRONT_RENDERING

  • glx: return BadMatch for invalid reset notification strategy

  • ac/nir: remove prim_stride_ret arg from ngg_build_streamout_buffer_info

  • radeonsi: use bytes units in streamout

  • DEPENDENCY: ac/llvm: fix sparse code handling

  • radeonsi: fallback to util_blitter_draw_rectangle

  • radeonsi/tests: update results

  • gl/spirv: update subgroup_size if GroupNonUniform is used

  • amd: move all uses of libdrm_amdgpu to ac_linux_drm

  • amd: amdgpu-virtio implementation

  • ac/virtio: disable userptr and local buffers

  • ac/virtio: disable timeline syncobj support

  • radeonsi: enable virtio native context support

  • radv: enable virtio native context support

  • radv/virtio: disable syncobj timeline support

  • ac/virtio: add virtio-only AMDGPU_GEM_CREATE flag

  • radeonsi, radv, virtio: use AMDGPU_GEM_CREATE_VIRTIO_SHARED

  • radeonsi: clear the debug callback on ctx destroy

  • ttn: init source_blake3 and name from tgsi_shader_info

  • ac/llvm: add wqm param to ac_build_quad_swizzle

  • ac/llvm: enable wqm for ac_build_quad_swizzle from ac_build_fs_interp_mov

  • radeonsi: do not use std::max

  • glx: fix glx-create-context-invalid-es-version

  • dri: use _checked variants of xcb requests

  • dri: deal with ARGB1555

  • egl/wayland: validate dri_screen_display_gpu before use

  • amd: add ac_drm_device_get_cookie

  • radeonsi: use ac_drm_device_get_cookie

  • radeonsi: update si_need_gfx_cs_space upper bound

  • radeonsi: disable dcc when external shader stores are used

Qiang Yu (81):

  • ac/surface/tests: support all block sizes

  • ac/surf: add more modifiers to gfx12 supported list

  • radeonsi: disable use_gfx12_xfb_intrinsic when use ACO

  • util/blake3: add _mesa_blake3_from_printed_string

  • radeonsi: add AMD_FORCE_SHADER_USE_ACO for debug

  • nir: do not generate b2i64 when driver want to lower it

  • aco: enable gfx12 support for radeonsi

  • radeonsi: fix unigine heaven crash when use aco on gfx8/9

  • aco: fix voffset missing when buffer store base >=4096

  • radeonsi: fix OpenCL shader compile fail

  • ac/nir: lower access for shared and scratch memory

  • ac,radv: move ac_nir_lower_bit_size_callback to common place

  • radeonsi: fix OpenCL piglit tests fails when using ACO

  • radeonsi: replace ac_nir_lower_subdword_loads

  • ac: remove ac_nir_lower_subdword_loads

  • radeonsi: fix global access ACO compile fail when OpenCL

  • radeonsi: enable ACO by default for pre-GFX10 GPUs

  • radeonsi: unify disk cache id no matter use_aco or not

  • gallium: add pipe_caps struct definition

  • gallium: add u_init_pipe_screen_caps

  • asahi: add agx_init_screen_caps

  • crocus: add crocus_init_screen_caps

  • d3d12: add d3d12_init_screen_caps

  • etnaviv: add etna_init_screen_caps

  • freedreno: add fd_init_screen_caps

  • i915: add i915_init_screen_caps

  • iris: add iris_init_screen_caps

  • lima: add lima_init_screen_caps

  • llvmpipe: add llvmpipe_init_screen_caps

  • nouveau/nv30: add nv30_init_screen_caps

  • nouveau/nv50: add nv50_init_screen_caps

  • nouveau/nvc0: add nvc0_init_screen_caps

  • panfrost: add panfrost_init_screen_caps

  • r300: add r300_init_screen_caps

  • r600: add r600_init_screen_caps

  • radeonsi: add si_init_screen_caps

  • softpipe: add softpipe_init_screen_caps

  • svga: add svga_init_screen_caps

  • tegra: init screen caps

  • v3d: add v3d_init_screen_caps

  • vc4: add vc4_init_screen_caps

  • virgl: add virgl_init_screen_caps

  • zink: add zink_init_screen_caps

  • nine: change cap macros to use pipe_caps access

  • egl,gallium,glx: replace dri_get_screen_param with pipe_caps access

  • mesa/st: enable extension use pipe_caps access

  • egl,gallium,gbm,mesa: replace get_param with pipe_caps access

  • gallium,mesa: replace get_paramf with pipe_caps access

  • rusticl: use pipe_caps access

  • asahi: remove agx_get_param and agx_get_paramf

  • crocus: remove crocus_get_param and crocus_get_shader_paramf

  • d3d12: remove d3d12_get_param and d3d12_get_paramf

  • etnaviv: remove etna_screen_get_param and etna_screen_get_paramf

  • freedreno: remove fd_screen_get_param and fd_screen_get_paramf

  • i915: remove i915_get_param and i915_get_paramf

  • iris: remove iris_get_param and iris_get_paramf

  • lima: remove lima_screen_get_param and lima_screen_get_paramf

  • llvmpipe: remove llvmpipe_get_param and llvmpipe_get_paramf

  • nouveau/nv30: remove nv30_screen_get_param and nv30_screen_get_paramf

  • nouveau/nv50: remove nv50_screen_get_param and nv50_screen_get_paramf

  • nouveau/nvc0: remove nvc0_screen_get_param and nvc0_screen_get_paramf

  • panfrost: remove panfrost_get_param and panfrost_get_paramf

  • r300: remove r300_get_param and r300_get_paramf

  • r600: remove r600_get_param and r600_get_paramf

  • radeonsi: remove si_get_param and si_get_paramf

  • softpipe: remove softpipe_get_param and softpipe_get_paramf

  • svga: remove svga_get_param and svga_get_paramf

  • tegra: remove tegra_screen_get_param and tegra_screen_get_paramf

  • v3d: remove v3d_screen_get_param and v3d_screen_get_paramf

  • vc4: remove vc4_screen_get_param and vc4_screen_get_paramf

  • virgl: remove virgl_get_param and virgl_get_paramf

  • zink: remove zink_get_param and zink_get_paramf

  • gallium: remove get_param and get_paramf

  • docs,src: replace doc and comments for PIPE_CAP with pipe_caps

  • gallium,mesa: remove uint suffix from pipe_caps

  • radeonsi: remove si_screen.max_texel_buffer_elements

  • etnaviv: remove min/max_texture_gather_offset init

  • lavapipe: fix min_vertex_pipeline_param

  • gallium: fix ddebug and noop screen caps init

  • radeonsi: fix has_non_uniform_tex_access info

  • radeonsi: fix GravityMark corruption when use aco

Rebecca Mckeever (14):

  • panvk: Use vk_image::drm_format_mod instead of pan_image::layout.modifier

  • panvk: Replace tab with spaces

  • panvk: Enable multiplane images and image views

  • pan/texture: s/pan_image_view_get_zs_image/pan_image_view_get_zs_plane/

  • pan/texture: s/pan_image_view_get_rt_image/pan_image_view_get_color_plane/

  • pan/texture: Accept holes in the pan_image_view::planes array

  • pan/desc: Pass an image to pan_force_clean_write_rt()

  • pan/desc: Add a pan_image_view_get_s_plane() helper and use it

  • panvk: Support D32_S8 as a multiplanar format

  • pan/format: Use HW version to determine siting for YUV 422 formats

  • pan/texture: Only use plane_chroma_2p for chroma planes

  • util/hash_table: Add _mesa_hash_table_u64_replace()

  • panvk: Allow a 32-bit binding value in desc id key and use 64-bit keys

  • panvk: Fix assertion in is_disjoint()

Rhys Perry (72):

  • nir: add more intrinsics to nir_intrinsic_can_reorder

  • nir/algebraic: optimize bcsel(ieq(b, 0), a, shift(a, b))

  • nir/algebraic: optimize ushr(a, ishl(iand(b, 3), 3))

  • ac/nir: add ACCESS_CAN_REORDER to lowered load_global_constant

  • aco: optimize nir_op_shfr with <32 src1

  • nir,aco,ac/llvm: add nir_op_alignbyte_amd

  • nir_lower_mem_access_bit_sizes: support 64-bit offsets

  • nir_lower_mem_access_bit_sizes: add nir_mem_access_shift_method

  • nir_lower_mem_access_bit_sizes: pass access to callback

  • nir_lower_mem_access_bit_sizes: support load_constant

  • aco,ac/nir: flag loads to use smem in NIR

  • radv,ac/nir: lower sub-dword loads using nir_lower_mem_access_bit_sizes

  • aco: remove load byte_align

  • radv,ac/nir: split global access using nir_lower_mem_access_bit_sizes

  • nir/algebraic: fix iabs(ishr(iabs(a), b)) optimization

  • nir/algebraic: check bit sizes in lowered unpack(pack()) optimization

  • nir/lcssa: fix premature exit of loop after rematerializing derefs

  • glsl/list: add comments above foreach macros

  • glsl/list: add and use helpers in foreach_list_typed macros

  • glsl/list: remove parenthesis in foreach_list_typed macros

  • glsl/list: remove underscores in foreach_list_typed macros

  • nir/opt_move_discards_to_top: use nir_tex_instr_has_implicit_derivative

  • nir: fix return value of nir_instr_move for some cases

  • nir/opt_move_discards_to_top: remove recursion

  • nir/opt_move_discards_to_top: update variable name

  • nir/opt_move_discards_to_top: use nir_intrinsic_can_reorder

  • nir/opt_move_discards_to_top: add more intrinsics to add_src_to_worklist

  • nir/opt_move_discards_to_top: allow multiple discards to be moved

  • nir/lcssa: use nir_intrinsic_can_reorder

  • nir/algebraic: add ddxy to is_only_used_as_float

  • nir/algebraic: add is_used_once to bcsel(, bcsel()) opts

  • nir/algebraic: optimize more bcsel(, bcsel())

  • aco: add SSA repair pass

  • aco: use repair pass for LCSSA workaround

  • aco: require WQM after demote in control flow

  • aco: skip code if exec is empty

  • aco/tests: add tests for empty exec masks

  • aco: don’t use uniform continues if exec might be empty

  • aco: make small_vec copyable

  • aco: use small_vec in RegCounterMap

  • nir/tests: fix SSA dominance in opt_if_merge tests

  • aco/gfx12: insert wait between VMEM WaW

  • aco: force linear for event_vmem_sample and event_vmem_bvh

  • aco: don’t CSE p_shader_cycles_hi_lo_hi

  • radv: constant fold after lowering memory accesses

  • radv: fix expanded push constant loads when all are inlined

  • radv: skip loading unused push constants

  • ac/nir: have ac_nir_lower_mem_access_bit_sizes preserve >128 bit SMEM

  • nir: make load_helper_invocation non-reorderable

  • nir/move_discards_to_top: don’t move across more intrinsics

  • nir: make ballot ALU and mbcnt_amd operations reorderable

  • aco: fix max_workgroup_count[0]

  • aco: decrease max_workgroup_size

  • radv: increase maxComputeWorkGroupCount[0]

  • aco/tests: fix skip_lines=True with remaining characters in matches

  • aco/util: fix bit_reference::operator&=

  • aco: use VOP3 v_mov_b16 if necessary

  • v3dv: fix SSA dominance error

  • microsoft/compiler: invalidate loop analysis in dxil_nir_lower_double_math

  • microsoft/compiler: repair SSA in dxil_nir_split_tess_ctrl

  • d3d12: fix phi handling in d3d12_lower_primitive_id

  • d3d12: store only once in d3d12_emit_points

  • nir: rerun loop analysis if the parameters change

  • nir/loop_analyze: use a sparse array and stop indexing SSA defs

  • nir/gcm: stop preserving nir_metadata_loop_analysis

  • nir/liveness: stop requiring instr indices

  • nir/validate: validate metadata

  • nir/validate: preserve dominance during SSA validation

  • nir/validate: validate ssa dominance by default

  • radv: set has_image_bvh_intersect_ray for null winsys

  • aco: don’t use divergence information for most ALU defs

  • nir/divergence: assume all instructions are loop invariant if no continues

Rob Clark (11):

  • vdrm+tu+fd: Make cross-device optional

  • freedreno/registers: Add GMU_CORE_FW_VERSION

  • freedreno/a6xx: Align lrz setup with tu

  • freedreno/a6xx: Add nolrzfc debug option

  • freedreno/a6xx: Align lrz height to 32

  • tu: Align lrz height to 32

  • freedreno/a6xx: Use LATE_Z with OC + discard

  • freedreno/a6xx: Fix timestamp emit

  • ir3: Add preamble instr count metric

  • freedreno/pps: Fix multiple counter collection runs

  • tu: Fix raytracing query with vdrm

Robert Mader (2):

  • v3d: Support SAND128 base modifier

  • freedreno: Support offset query for multi-planar planes

Rohan Garg (5):

  • intel/compiler: disable mesh autostrip for WA 16020916187

  • iris: use CALLOC_STRUCT instead of calloc for readability

  • isl: disable aux when creating uncompressed TileY/Tile64 surfaces from compressed ones

  • anv: refactor choose_isl_tiling_flags to pass fewer arguments

  • iris: assert that we’re not exporting a TILE64 surface

Roland Scheidegger (1):

  • llvmpipe: Fix overflow issues calculating loop iterations for aniso

Roman Stratiienko (1):

  • v3dv/android: Suppress AHB-related log spam

Ruijing Dong (2):

  • radeonsi/vcn: enable EFC for VCN5.0+ when gfx >= 12

  • radeonsi/vcn: center mv map buffer changed in vcn5.x

Russell Greene (1):

  • perfetto: fix macos compile

Sagar Ghuge (30):

  • anv: Enable MCS_CCS compression on Gfx12+

  • blorp: Use the calculated execution mask

  • anv: Update include dir for anv_tests

  • anv: Split GRL code path in separate file

  • anv: Add header to track BVH data structures

  • anv: Add shader to build BVH header

  • anv: Add shader to copy acceleration structures

  • anv: Implement cmd_fill_buffer_addr callback

  • anv: Move update buffer code in helper

  • anv: Implement write_buffer_cp callback

  • anv: Implement flush_buffer_write_cp callback

  • anv: Implement cmd_dispatch_unaligned callback

  • anv: Implement acceleration structure API

  • anv: Add helper to copy data from src to dest anv_address

  • intel: Use the common RT BVH framework

  • intel/compiler: Extend nir_intrinsic_load_topology_id_intel for xe3

  • intel/genxml: Drop morton walk field from Xe2

  • intel/genxml: Update COMPUTE_WALKER_BODY

  • intel: Use Morton compute walk order

  • intel/genxml: Update SAMPLER_STATE structure

  • anv: Switch to ANISOTROPIC_FAST filter mode

  • iris: Switch to ANISOTROPIC_FAST filter mode

  • intel: Set correct maxComputeSharedMemorySize for Xe3+

  • intel/genxml: Add coarse pixel related changes

  • anv: Add pipelined coarse pixel state

  • intel/genxml: Update URB related instructions and structures

  • iris: Use 3DSTATE_URB_ALLOC_* instructions

  • blorp: Use 3DSTATE_URB_ALLOC_* instructions

  • anv: Use 3DSTATE_URB_ALLOC_* instructions

  • intel/brw/xe3+: Don’t compile SIMD32 if there are ray queries

Sam Lantinga (1):

  • util: Fixed crash in HEVC encoding on 32-bit systems

Samuel Pitoiset (241):

  • aco: cleanup using fixed registers in the trap handler shader

  • aco: save/restore SCC in the trap handler shader

  • aco: use scalar buffer stores for dumping SGPRS from the trap on GFX8

  • aco: add a helper to dump SGPR to memory for the trap handler

  • aco: fix storing SQ_WAVE_STATUS in the trap handler shader

  • aco: declare phys regs for tba_hi/tma_hi

  • radv,aco: dump m0 and exec from the trap handler

  • vulkan/runtime: return same cmdbuf level from the command pool freelist

  • docs: add missing documentation for RADV_DEBUG=psocachestats

  • radv: remove unused parameter to radv_fill_nir_compiler_options()

  • radv: dump the trap handler shader with RADV_DEBUG=dump_trap_handler

  • aco: do not reorder s_trap instructions

  • radv: cleanup printing SGPRS dumped from the trap handler

  • radv,aco: dump more SQ_WAVE regs from the trap handler

  • radv,aco: add a separate function to compile the trap handler shader

  • aco: simplify postprocessing the trap handler shader

  • radv,aco: use the trap handler layout struct while compiling the shader

  • radv: fix the TMA descriptor size

  • radv: compute the TMA BO size instead of using a constant

  • radv,aco: save/restore overwritten VGPRs in the trap handler shader

  • nir: add nir_intrinsic_debug_break instruction

  • spirv: handle NonSemantic.DebugBreak to emit nir_debug_break()

  • aco: emit nir_intrinsic_debug_break

  • radv: emit nir_debug_break instructions when the trap handler is enabled

  • radv: do not always invalidate L2 for GPUs with non-coherent RBs on GFX10+

  • radv: move the GFX11 special case for mips to radv_image_is_pipe_misaligned()

  • radv: determine the first mip that is pipe misaligned on GFX10+

  • radv: use vk_image_view_subresource_range() when possible

  • radv: pass the image subresource range to radv_{src,dst}_access_flush()

  • radv: optimize the pipe misaligned L2 cache invalidation on GFX11

  • aco: fix saving/restoring VGPRS in the trap handler on GFX9

  • aco: use a 64-bit mov to save exec in the trap handler shader

  • aco: add a new variant for vop1() with two operands

  • aco: fix validation for v_movrels_b32 and friends

  • aco: restore m0/exec before exiting the trap handler

  • aco: use all invocations from the current wave in the trap handler

  • aco: save/restore VGPRS on GFX8 in the trap handler shader

  • aco: drop the second M0 operand for s_set_gpr_idx_on

  • radv,aco: dump VGPRS from the trap handler shader

  • radv: mark live invocations when dumping VGPRS with the trap handler

  • radv: dump SPIR-V and NIR for the faulty shader detected with the trap

  • radv: fix ignoring src stage mask when dst stage mask is BOTTOM_OF_PIPE

  • radv: consider VK_PIPELINE_STAGE_2_NONE like BOTTOM_OF_PIPE

  • radv: destroy meta resources properly when creating the device failed

  • radv: add a helper to destroy a logical device

  • radv: add a new drirc option to disable DCC for mips and enable it for RDR2

  • radv,aco: dump LDS from the trap handler

  • radv: remove VK_VALVE_descriptor_set_host_mapping

  • radv: fix skipping on-disk shaders cache when not useful

  • radv: mark VERDE (GFX6) as Vulkan 1.3 conformant

  • radv: fix dumping debug/perftest options when there are holes

  • radv: add a pipeline helper to skip shaders cache

  • radv: fix dumping the trap handler shader disassembly

  • radv: fix printing with RADV_DEBUG=psocachestats

  • radv: only pass relevant stages when emitting DGC push constants

  • radv: capture shader executable info at shader creation time

  • radv: allow shaders caching with RADV_DEBUG=hang and the trap handler

  • vulkan: add MESA_VK_TRACE_PER_SUBMIT

  • radv: finish tools after cleaning meta resources

  • radv: add new start/stop sqtt helpers for capturing with SQTT

  • radv: add support for capturing RGP per-submit

  • radv: add address binding report support for BOs imported with a fd

  • radv: add address binding report support for BOs imported with a ptr

  • radv: add a small helper to dump VM fault with the GPU hang report

  • radv: dump address binding report with RADV_DEBUG=hang

  • radv: try to detect use-after-free with address binding report

  • zink/ci: skip one more modifier test on POLARIS10

  • radv: promote VK_KHR_dynamic_rendering_local_read to core 1.4 API

  • radv: promote VK_KHR_global_priority to core 1.4 API

  • radv: promote VK_KHR_index_type_uint8 to core 1.4 API

  • radv: promote VK_KHR_line_rasterization to core 1.4 API

  • radv: promote VK_KHR_maintenance5 to core 1.4 API

  • radv: promote VK_KHR_maintenance6 to core 1.4 API

  • radv: promote VK_KHR_map_memory2 to core 1.4 API

  • radv: promote VK_KHR_push_descriptor to core 1.4 API

  • radv: promote VK_KHR_shader_subgroup_rotate to core 1.4 API

  • radv: promote VK_EXT_pipeline_robustness to core 1.4 API

  • radv: add new Vulkan 1.4 features/properties

  • radv: advertise Vulkan 1.4 on GFX8+

  • radv: bump VKCTS conformance version to 1.4.0.0 for some GFX8+ GPUs

  • radv/ci: mark few tests as expected failures

  • ac/parse_ib: fix parsing SDMA CONSTANT_FILL packet

  • ac/parse_ib: print VA for the SDMA CONSTANT_FILL/WRITE packets

  • radv: fix stencil only copies of depth/stencil images with SDMA

  • radv: enable DGC IES for compute with ESO

  • radv: fix initializing HTILE when the image has VRS rates

  • ci: update VKCTS main to a9f7069b9a5ba94715a175cb1818ed504add0107

  • radv: remove redundant drirc for incorrect dual-source blending

  • radv: add radv_disable_dcc_stores and enable for Indiana Jones: The Great Circle

  • radv: only dump device name info on Linux with RADV_DEBUG=hang

  • radv: dump the Mesa version with RADV_DEBUG=hang

  • radv/meta: add missing vk_meta_device_finish()

  • radv/meta: move vk_meta_device_init() to radv_device_init_meta()

  • radv: disable alphaToOne except for Zink

  • ac/nir: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11

  • aco: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11

  • radv: fix alpha-to-coverage with alpha-to-one when MRTZ is also exported

  • radv: remove remaining discard to demote options

  • radv: fix disabling DCC for stores with drirc

  • radv: simplify determining some fragment shader info with epilogs

  • radv: fix alpha-to-coverage with alpha-to-one without MRTZ

  • Revert “radv: disable alphaToOne except for Zink”

  • spirv: add an option to lower SpvOpTerminateInvocation to OpKill

  • radv: add radv_lower_terminate_to_discard and enable for Indiana Jones

  • radv: mark HAWAII (GFX7) as Vulkan 1.3 conformant

  • radv: report same buffer alignment for DGC preprocessed buffer

  • Revert “radv: fix creating unlinked shaders with ESO when nextStage is 0”

  • radv/ci: fix expected list of failures for TAHITI

  • radv: fix missing variants for the last VGT stage with shader object

  • ci: uprev vkd3d-proton to c965c1351fd6915a65bb7f647319536252a24a93

  • radv: fix capturing RT pipelines that return VK_OPERATION_DEFERRED_KHR for RGP

  • radv: reorganize query code by adding separate begin/end helpers

  • radv: remove dead code in radv_CmdCopyQueryPoolResults()

  • radv: add few more query helpers for copying results

  • radv: only enable emulated mesh/task shader queries on GFX10.3

  • radv/nir: fix checking if task shader invocations query is enabled

  • radv: fix getting the number of vertices per prim for the last VGT stage

  • radv: rename GDS queries to emulated queries

  • radv/nir: simplify lowering of query intrinsics

  • radv: cleanup enabling the global BO list when BDA is used

  • radv: check descriptor indexing features for enabling the global BO list

  • radv: rework emitting SPI_SHADER_Z_FORMAT

  • radv: rename color output state to fragment output state

  • radv: add support for VK_PRIMITIVE_TOPOLOGY_META_RECT_LIST_MESA

  • radv: use VK_PRIMITIVE_TOPOLOGY_META_RECT_LIST_MESA for meta pipelines

  • radv: pass extra graphics pipeline create info using pNext

  • radv/meta: rework creating meta pipelines for query resolves

  • radv/meta: convert the copy/fill pipelines to vk_meta

  • radv/meta: convert the copy VRS to HTILE pipelines to vk_meta

  • radv/meta: convert the FMASK expand pipelines to vk_meta

  • radv/meta: convert the FMASK copy pipelines to vk_meta

  • radv/meta: convert the DCC retile pipelines to vk_meta

  • radv/meta: convert the HTILE expand CS pipelines to vk_meta

  • radv/meta: convert the DCC decompress CS pipelines to vk_meta

  • radv/meta: convert the clear HTILE mask pipelines to vk_meta

  • radv/meta: convert the DCC comp-to-single pipelines to vk_meta

  • radv/meta: convert DGC pipeline layout to vk_meta

  • radv/meta: convert the query resolve pipelines to vk_meta

  • radv/meta: convert the image-to-buffer pipelines to vk_meta

  • radv/meta: convert the buffer-to-image pipelines to vk_meta

  • radv/meta: convert the image-to-image pipelines to vk_meta

  • radv/meta: convert the clear image pipelines to vk_meta

  • radv/meta: convert the compute resolve pipelines to vk_meta

  • radv/meta: remove radv_meta_create_compute_pipeline()

  • vulkan: add a new vk_meta option to use the rect list pipeline path

  • vulkan: use the meta pipeline cache for graphics pipelines

  • radv/meta: convert the HTILE expand GFX pipelines to vk_meta

  • radv/meta: convert the HW resolve GFX pipelines to vk_meta

  • radv/meta: convert the fast-clear GFX pipelines to vk_meta

  • radv/meta: convert the blit GFX pipelines to vk_meta

  • radv/meta: convert the clear GFX pipelines to vk_meta

  • radv/meta: convert the resolve GFX pipelines to vk_meta

  • radv/meta: use only one push constant range for blit2d pipelines

  • radv/meta: convert the blit2d GFX pipelines to vk_meta

  • radv/meta: remove unused radv_meta_create_xxx() helpers

  • radv: fix destroying DGC pipelines

  • radv: disable RT with LLVM completely

  • radv/meta: remove a workaround for building accel structs with LLVM

  • radv/meta: always initialize emulated etc2 on-demand

  • radv/meta: move initializing emulated astc to radv_device_init_meta()

  • radv/meta: stop initializing RT accel structs

  • radv: fix adding the BO to cmdbuf list when emitting buffer markers

  • radv/meta: fix loading the meta pipeline cache

  • radv/meta: reduce length of some cache keys

  • radv/meta: add radv_meta_get_noop_pipeline_layout()

  • radv/meta: do not create redundant pipeline layout objects

  • radv: disable logic op for float/srgb formats

  • ac/descriptors: fix configuring NBC views on GFX12

  • aco: fix VS prologs on GFX12

  • radv: disable VRS coarse shading with 8x MSAA on GFX12

  • radv: configure the VRS surface swizzle mode on GFX12

  • radv: fix programming WALK_ALIGN8_PRIM_FITS_ST on GFX12

  • radv: program DB_RENDER_OVERRIDE correctly on GFX12

  • ac/nir: fix lowering subgroup ID for compute shaders on GFX12

  • ac/nir: fix a comment typo in load_subgroup_id_lowered()

  • ac/gpu_info: add cp_dma_use_L2

  • radv: fix CP DMA clears/copies on GFX12

  • aco: always use ds_bpermute for shuffle/rotate on GFX12

  • radv: fix configuring the attribute ring size on GFX12

  • radv: rename attr_ring to ge_rings

  • radv: change the BASE_HI field for VGT_TF_MEMORY_BASE_HI on GFX12

  • ac/surface: honor RADEON_SURF_PREFER_xxx_ALIGNMENT on GFX12

  • radv: advertise VK_MESA_image_alignment_control on GFX12

  • radv: fix emitting SPI_SHADER_GS_OUT_CONFIG_PS with NULL FS on GFX12

  • radv: fail to initialize when the AMD GPU generation is unsupported

  • radv: mark AMD CDNA as unsupported

  • radv: add GFX12 support to the null winsys

  • ac/nir: fix skipping streamout when no buffers are bound on GFX12

  • vulkan: Update XML and headers to 1.4.305

  • radv: promote VK_EXT_depth_clamp_zero_one to KHR

  • radv: bump maxViewportDimensions to 32K on GFX12

  • radv: add a helper to report if cooperative matrix is enabled

  • zink/ci: add lists for RADV/GFX1200

  • radv: remove duplicate definition of SQTT_BUFFER_ALIGN_SHIFT

  • ac/sqtt: update programming SQTT on GFX12

  • radv: add support for VkMemoryBarrierAccessFlags3KHR

  • radv: adjust the source aspect for color to depth/stencil image copies

  • radv: advertise VK_KHR_maintenance8

  • radv: do not overallocate the number of exports for streamout on GFX12

  • radv: fix transform feedback on GFX12

  • radv: declare a new user SGPR for emulating queries on GFX12

  • radv: lower emulated queries with global atomics on GFX12

  • radv: allocate memory for the shader query buffer on GFX12

  • radv: emit the shader buffer query VA on GFX12

  • radv: use global atomics for generated/written primitives query on GFX12

  • radv: re-emit streamout state for GFX12 when the user SGPR changes

  • radv: exclude layer when recomputing FS input bases

  • ac/cmdbuf: program SPI_SHADER_GS_MESHLET_CTRL to 0 in the GFX12 preamble

  • radv: program COMPUTE_DISPATCH_INTERLEAVE on GFX12

  • radv: add support for BO metadata on GFX12

  • radv: add a new helper to set image BO metadata

  • ac/gpu_info: add gfx12_supports_display_dcc

  • radv: fix an assertion about DCC and modifier on GFX12

  • radv: fix the number of drm modifier planes for DCC on GFX12

  • ci: update VKCTS main to a9988483c0864d7190e5e6264ccead95423dfd00

  • radv/ci: update descriptor buffer skipped tests

  • radv: fix disabling logic op for srgb/float formats when blending is enabled

  • radv: disable video support on GFX12

  • radv: disable VK_KHR_cooperative_matrix on GFX12

  • radv: fix programming pitches for LINEAR_SUB_WINDOW on GFX12

  • radv: fix programming mip level for TILED_SUB_WINDOWS on GFX12

  • radv/ci: add expected list of failures for GFX1200

  • radeonsi: fix programming DCC for SDMA on GFX12

  • radv: use stage instead of entrypoint to determine valid gfx stages

  • docs: add a note about GFX12 (RDNA4) on RADV

  • ac,radeonsi: add SDMA DCC tiling for GFX12+

  • ac/descriptors: allow to configure DCC for buffer descriptors

  • radv/amdgpu: add support for AMDGPU_GEM_CREATE_GFX12_DCC

  • radv/meta: add missing pipeline lookups

  • radv/meta: stop using string keys also for DGC and query objects

  • util/disk_cache: add a new helper to create a disk cache

  • vulkan/runtime: allow to use a different disk cache

  • radv: fix caching on-demand meta shaders

  • radv: fix adding the BO to cmdbuf list when starting conditional rendering

  • radv: fix fetching draw vertex data from counter buffers with transform feedback

  • radv/meta: disable conditional rendering for fill/update buffer operations

  • radv: fix adding the VRS image BO to the cmdbuf list on GFX11

  • ac,radv,radeonsi: add new GFX12_DCC_WRITE_COMPRESS_DISABLE tiling flag

  • ac/gpu_info: add gfx12_supports_dcc_write_compress_disable

  • radv: add initial DCC support on GFX12

  • radv: fix adding the BO for unaligned SDMA copies to the cmdbuf list

Saroj Kumar (1):

  • ac/surface: fix missing NULL check in gfx12_select_swizle_mode()

Sathishkumar S (1):

  • radeonsi/vcn: enable roi decode and rgb targets on JPEG_5_0_1

Scott Moreau (1):

  • dri: Fix hardware cursor for cards without modifier support

Serdar Kocdemir (4):

  • Change C style cast on extension structs

  • Wrap queue related functions on codegen

  • The BumpPool of VkStream is not freeAll’ed

  • gfxstream: add VK_DRIVER_FILES to devenv

Sergi Blanch Torne (6):

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci: disable Collabora’s farm due to unexpected power cut

  • Revert “ci: disable Collabora’s farm due to unexpected power cut”

Shashank Sharma (1):

  • amd: add new AMDGPU_INFO subquery for userqueue metadata

Sil Vilerino (26):

  • vl/vl_winsys: Add missing include for function declaration

  • u_dynarray.h: Fix warning C4267 conversion from ‘size_t’ to ‘type’, possible loss of data

  • u_math.h: Change power of two assert to fix warning C4146: unary minus operator applied to unsigned type, result still unsigned

  • src/gallium/auxiliary/util/u_draw.h: Fix C4244 ‘argument’ : conversion from ‘type1’ to ‘type2’, possible loss of data

  • util: Fix warning C4244 ‘argument’ : conversion from ‘type1’ to ‘type2’, possible loss of data

  • src/compiler: Fix warning C4244 ‘argument’ : conversion from ‘type1’ to ‘type2’, possible loss of data

  • src/compiler: Fix warning C4389: An == or != operation involved signed and unsigned variables. This could result in a loss of data.

  • d3d12: Fix warning C4267 conversion from ‘size_t’ to ‘type’, possible loss of data

  • d3d12: Fix warning C4244 ‘argument’ : conversion from ‘type1’ to ‘type2’, possible loss of data

  • d3d12: Fix warning C4389: An == or != operation involved signed and unsigned variables. This could result in a loss of data.

  • d3d12: Fix warning C4018 signed/unsigned mismatch

  • d3d12: Add offset limit check to d3d12_resource_from_memobj

  • d3d12_bufmgr.cpp: Fix warning C4244 for x86 builds assign uint64_t to size_t

  • util: cpu_detect.c Fix warning C5274: behavior change: _Alignas no longer applies to the type ‘<unnamed-tag>’ (only applies to declared data objects)

  • d3d12_video_encoder_bitstream_builder_h264: Fix warning C4244 for x86 builds assign uint64_t to size_t

  • d3d12_resource: Fix warning C4244 for x86 builds assign uint64_t to uintptr_t

  • d3d12_video_dec_h264: Fix warning C4244 uint64_t to size_t cast

  • d3d12_video_dec_vp9.cpp: Fix warning C4244: ‘argument’: conversion from ‘uint64_t’ to ‘const unsigned int’, possible loss of data

  • d3d12_video_dec_hevc.cpp: Fix warning C4244: ‘argument’: conversion from ‘uint64_t’ to ‘const unsigned int’, possible loss of data

  • d3d12_video_proc.h/cpp: Fix warning C4244: ‘argument’: conversion from ‘uint64_t’ to ‘const unsigned int’, possible loss of data

  • d3d12_video_enc_av1.cpp: Fix warning C4244: ‘argument’: conversion from ‘uint64_t’ to ‘unsigned int’, possible loss of data

  • d3d12_video_enc_h264.cpp: Fix warning C4244: ‘argument’: conversion from ‘uint64_t’ to ‘unsigned int’, possible loss of data

  • d3d12_video_enc_hevc.cpp: Fix warning C4244: ‘argument’: conversion from ‘uint64_t’ to ‘unsigned int’, possible loss of data

  • d3d12_video_dec.h/cpp: Fix warning C4244: ‘argument’: conversion from ‘uint64_t’ to ‘unsigned int’, possible loss of data

  • d3d12_video_enc.h/cpp: Fix warning C4244: ‘argument’: conversion from ‘uint64_t’ to ‘unsigned int’, possible loss of data

  • d3d12: Enable Warnings C4267, C4996, C4146, C4244, C4389, C4838, C4302, C4018 in src/gallium/drivers/d3d12 subtree

Simon Perretta (70):

  • pvr: add initial pco stub/boilerplate

  • pvr, pco: Add new compiler framework and shader gen stubs

  • pco: add env debug option parsing

  • pco: stubs for SPIR-V/NIR compilation options

  • pvr: connect basic pco functions to the driver

  • pvr: remove pipeline shader hard-coding support

  • pvr: add device info and functions for calculating available temps

  • pvr: add shader compilation stubs

  • pvr: track pipeline flags

  • pvr: add device info for additional iterator features

  • pvr: fix GetInstanceProcAddr ubsan warning when _instance == NULL

  • pvr: drop PVRX macro

  • pco: suppress warning for functions passing structs

  • pco: pygen stubs

  • pco, pygen: enum emit support, define some enums and op/ref mods/types

  • pco, pygen: define basic isa field types

  • pco, pygen: define and emit isa instruction group header variant fields

  • pco, pygen: isa instruction group header validation and encoding support

  • pco, pygen: isa lower source definitions

  • pco, pygen: isa upper sources definitions

  • pco, pygen: isa internal source selector definitions

  • pco, pygen: isa destination definitions

  • pco, pygen: isa main alu ops

  • pco, pygen: isa backend alu ops

  • pco, pygen: isa bitwise alu ops

  • pco, pygen: isa control alu ops

  • pco, pygen: query bytes required for each variant

  • pco, pygen: generate op and mod info

  • pco: define data structures and basic builder implementation with ops

  • pco: NIR translation and PCO IR pass boilerplate

  • pco: printing and validation boilerplate

  • pco, pygen: generate string representations of enum elements

  • pco: basic instruction printing

  • pco, pygen: move unnamed tuple structs into classes

  • pco, pygen: add bitset support for op mods

  • pco, pygen: common underscore replacement for op names

  • pco: add verbose printing debug option

  • pco, pygen: distinguish hw ops that are built directly into instruction groups

  • pco, pygen: instruction to instruction group mapping, printing

  • pco: additional ref functions

  • pco: boilerplate nir lowering passes

  • pco, pygen: add initial uvsw op boilerplate

  • pco, pygen: add better exception messages

  • pco: adjust align padding to be per-function instead of per-shader

  • pco, pygen: support querying ref mods, if op/ref mods have been set

  • pco: set up and tear down glsl type singleton with context

  • pco, pygen: add support for instructions with variable srcs/dests

  • pco, pygen: re-order some mods to match their evaluation order

  • pco: print ranges of non-ssa refs with >1 channel, datatypes for immediates

  • pco, pygen: drop unspecified bit sizes for references

  • pco, pygen: add defs and mappings for common ops

  • pco, pygen: restructure igrp alu components into arrays

  • pco, pygen: amend bitfield assertion messages

  • pco, pygen: isa ditr op

  • pco, pygen: isa itrsmp op

  • pco: initial implementation of translation and passes

  • pco: add public print wrappers

  • pco: vector component tracking, vector collation when ingesting NIR

  • pco: re-indexing debug option and additional vector and component tracking

  • pco: add mappings and translation for ditr

  • pco: temporarily add hardcoded vs/fs I/O for testing, BXS-4-64 iteration support

  • pco: add helpers for overriding ref chans and offsetting vals

  • pco: vec coalescing improvement to register allocation

  • pco: add opt subpass for propagating comps referencing hw regs

  • pco: track the number of bytes encoded for each function

  • pvr, pco: rewrite compiler/driver interface for vs & fs I/O

  • pco: modifier propagation optimization, shared opt context boilerplate

  • pco: initial validation boilerplate and SSA checks

  • CODEOWNERS: update for new pco compiler tree

  • pco: fix x86 build

Simon Ser (6):

  • dri: revert INVALID modifier special-casing

  • llvmpipe: handle llvmpipe_resource_map() errors

  • dri: don’t fetch X11 modifiers if we don’t support them

  • egl/wayland: only supply LINEAR modifier when supported

  • egl/wayland: fallback to implicit modifiers if advertised by compositor

  • gbm: fix get_back_bo() failure with gbm_surface and implicit modifiers

Sonny Jiang (1):

  • radeonsi/vcn: Add vcn_5_0_1 support

Tapani Pälli (21):

  • intel/dev: update mesa_defs.json from workaround database

  • anv: utilize ray query bo per queue for Wa_14022863161

  • anv: extend Wa_14017794102 with lineage Wa_14023061436

  • isl: modify existing assert by allowing CCS_E aux usage

  • intel/dev: update mesa_defs.json from workaround database

  • intel/dev: lower amount of max gs threads for Wa_18040209780

  • anv/android: always create 2 graphics and compute capable queues

  • iris: allow bo cache for compressed bos on verx10 == 200

  • drirc/anv: force_vk_vendor=-1 for Marvel Rivals

  • intel/dev: update mesa_defs.json from internal database

  • dri: remove GLsync typedef

  • anv: handle mesh in sbe_primitive_id_override

  • iris: initialize whole pipe_box struct for memcmp

  • intel/compiler: take reg_unit size into account with ubo ranges

  • anv: set dependency between SF_CLIP and CC_PTR states

  • mesa/st: take pixelmaps into account in drawpixels cache

  • intel/dev: update mesa_defs.json from internal database

  • isl: use workaround framework for Wa_1207137018

  • mesa: enable GL_EXT_conservative_depth extension

  • anv: tighten condition for changing barrier layouts

  • anv: apply cache flushes on pipeline select with gfx20

Thomas H.P. Andersen (2):

  • drirc/nvk: force_vk_vendor=-1 for Artifact Classic

  • nvk: follow naming convention for devices

Tim Huang (1):

  • amd: add GFX v11.5.3 support

Tim Keller (1):

  • dril: Check for null config in dril_target.c

Timothy Arceri (24):

  • glsl/nir: fix function cloning at link time

  • glsl: fix compiler global temp collisions

  • glsl: tidy up glsl_to_nir() params

  • glsl: remove unused member

  • Revert “glsl: Move ForceGLSLAbsSqrt handling to glsl-to-nir.”

  • glsl: remove more now unused params from glsl_to_nir()

  • glsl: don’t copy symbol table to shaders

  • glsl: drop _mesa_glsl_copy_symbols_from_table()

  • glsl: use symbol table directly for builtin functions

  • glsl: drop unused symbol table from gl_shader

  • glsl: disable function return lowering in glsl ir

  • glsl: remove return lowering from glsl ir

  • glsl: drop last remaining lower jump test

  • glsl: remove now unused ir reader

  • glsl: move _mesa_glsl_compile_shader() declaration

  • glsl: remove glsl/program.h

  • nir: allow loops with unknown induction var initialiser to unroll

  • glsl: drop unused ir_equals.cpp

  • glsl: drop unused array refcount code and tests

  • glsl: drop opt_dead_code_local

  • glsl: enable layout qualifier if OVR_multiview enabled

  • glsl: fix num_views validation message

  • glsl: fix num_views linker error

  • glsl: fix return value for subgroupBallot()

Timur Kristóf (109):

  • radv: Mark GS copy shaders as internal.

  • radv: Add ability to dump shaders based on stage.

  • aco: Separate options for printing IR and recording disassembly.

  • radv: Separate option to dump NIR.

  • radv: Separate option to print shader disassembly.

  • radv: Separate option to dump backend IR.

  • radv: Refactor RADV_DEBUG=shaders to be a combination of other options.

  • radv: Slightly reword preoptir debug flag.

  • radv: Also allow filtering SPIR-V dump per stage.

  • radv: Set dump flags in a smarter way by default.

  • amd: Rename GFX1103_R1/R2 to PHOENIX/2

  • radv: Add a flush postamble on GFX6.

  • radv: Don’t flush at the end of each command buffer on GFX6.

  • ac/nir/ngg: Don’t emit dead code with dot_op.

  • ac/nir/ngg: Trade 1 VALU shift for 2 SALU add.

  • ac/nir/cull: Slightly refactor control flow for small primitive culling.

  • ac/nir/ngg: Slightly refactor workgroup scan.

  • ac/nir/ngg: Pass wg_repack_result as pointer instead of returning it.

  • ac/nir/ngg: Workgroup scan over two bools.

  • ac/nir/ngg: Implement optional primitive compaction.

  • ac/nir/ngg: Remove erroneous NUW addition from workgroup scan.

  • radv: Reorder potentially per-primitive FS builtins.

  • radv: Slightly simplify potentially per-primitive FS inputs.

  • radv, aco: Consolidate num_interp + num_prim_interp into num_inputs.

  • radv: Emit SPI_PS_IN_CONTROL when emitting PS inputs on GFX10.3.

  • radv: Remove now unused num_prim_interp from shader_info.

  • radv: Use default 0 for undefined builtin PS inputs.

  • radv: Only set NGG_DISABLE_PROVOK_REUSE for VS.

  • ac/nir/ngg: Add ability to store primitive ID as per-primitive.

  • radv: Reorder FS primitive ID input after layer and viewport.

  • radv: Configure implicit VS primitive ID to be per-primitive.

  • ac/nir/ngg: Use ac_nir_prerast_out in mesh shader lowering.

  • ac/nir/ngg: Simplify updating mesh shader output info.

  • ac/nir: Pass ac_nir_prerast_out to ac_nir_export_parameters.

  • ac/nir: Pass ac_nir_prerast_out to ac_nir_export_position.

  • ac/nir: Introduce ac_nir_store_parameters_to_attr_ring.

  • ac/nir/ngg: Refactor VS/TES attribute ring stores.

  • ac/nir/ngg: Refactor GS attribute ring stores.

  • ac/nir/ngg: Refactor export_pos0_wait_attr_ring.

  • ac/nir/ngg: Remove dead code for attribute ring stores.

  • ac/nir/ngg: Move wait attr ring workaround for GS to better place.

  • ac/nir/ngg: Move emitting GS vertex param exports to if.

  • ac/nir/ngg: Refactor storing per-primitive primitive ID to attribute ring.

  • ac/nir: Mark when pre-rast output is used as varying or sysval.

  • ac/nir: Split GS output usage masks to varying and sysval masks.

  • ac/nir: Only export positions when they are really system values.

  • ac/nir: Only export parameters when they are actually varying.

  • ac/nir: Only store params to attribute ring that are varying.

  • aco: Update documentation

  • radv: Add some documentation.

  • radv: Implement FS layer ID input as a system value.

  • Revert “nir/opt_varyings: Add workaround for RADV mesh shader multiview.”

  • ac/nir/ngg: Don’t mark multiview layer output as varying.

  • amd: Set lower_layer_fs_input_to_sysval in common code, not in drivers.

  • radv: Rename layer_input to reads_layer in PS info.

  • radv: Only print “testing use only” message on GFX12+.

  • ac/nir: Move ac_nir_lower_bit_size_callback to ac_nir.c

  • ac/nir: Move ac_nir_get_mem_access_flags to ac_nir.c

  • ac/nir: Move ac_nir callback functions to ac_nir.c

  • ac/nir: Move ac_set_nir_options to ac_nir.c

  • ac: Stop including nir.h in ac_shader_util.h

  • ac/nir: Rename emit_streamout to ac_nir_emit_legacy_streamout

  • ac: Move ac_nir_config struct to ac_nir.h

  • ac/nir: Move ac_nir_create_gs_copy_shader to separate file.

  • ac/nir: Expose ac_nir_unpack_value in ac_nir_helpers.h

  • ac/nir: Move ac_nir_lower_intrinsics_to_args to separate file.

  • ac/nir: Move ac_nir_lower_legacy_vs to separate file.

  • ac/nir: Move ac_nir_lower_legacy_gs to separate file.

  • ac/nir: Move ac_nir_gs_shader_query declaration to ac_nir_helpers.h

  • ac/nir: Move ac_nir_opt_pack_half to separate file.

  • ac/nir: Move ac_nir_lower_mem_access_bit_sizes to separate file.

  • ac/nir: Move ac_nir_lower_sin_cos to separate file.

  • ac/nir: Move pre-rasterization related utilities in separate file.

  • ac/nir: Rename ac_nir_lower_ngg_ms to ac_nir_lower_ngg_mesh.

  • ac/nir: Move ac_nir_lower_ngg_mesh to separate file.

  • ac: Move AC_HS_MSG_VOTE_LDS_BYTES to ac_shader_util.h

  • ac: Stop including ac_nir.h from ac_shader_util.c

  • ac/nir: Move all ac_nir_* files to a new folder.

  • radv: Lower array derefs of vectors outside of shader linking.

  • ac/nir/ngg: Mitigate NGG fully culled bug when GS output is compile-time zero.

  • ac/nir/ngg: Mitigate attribute ring wait bug when primitive ID is per-primitive.

  • aco: Move NGG pos export scheduling determination to drivers.

  • ac/nir/ngg: Remove some superfluous variables from culling code.

  • ac/nir/ngg: Add a few comments explaining some variables.

  • ac/nir/ngg: Remove unused vs_output struct.

  • ac/nir/ngg: Carve out ac_nir_ngg_alloc_vertices_and_primitives.

  • ac/nir/ngg: Use ac_nir_ngg_alloc_vertices_and_primitives in mesh shader lowering.

  • ac/nir/ngg: Carve out ac_nir_create_output_phis.

  • ac/nir/ngg: Carve out NGG streamout code.

  • ac/nir/ngg: Carve out ac_nir_repack_invocations_in_workgroup.

  • ac/nir/ngg: Slightly refactor emitting vertex parameters.

  • ac/nir/ngg: Add radeon_info to NGG lowering options.

  • ac/nir/ngg: Add and use a has_attr_ring_wait_bug field to ac_gpu_info.

  • ac/nir/ngg: Add and use a has_attr_ring field to ac_gpu_info.

  • ac/nir/ngg: Add and use a has_ngg_fully_culled_bug field to ac_gpu_info.

  • ac/nir/ngg: Add and use a has_ngg_passthru_no_msg field to ac_gpu_info.

  • ac/nir/ngg: Use gfx_level from radeon_info.

  • ac/nir/ngg: Remove gfx_level and family from NGG lowering options.

  • ac/nir/ngg: Pass radeon_info to mesh shader lowering.

  • ac/nir/ngg: Use has_attr_ring and has_attr_ring_wait_bug in mesh shader lowering too.

  • ac/nir/ngg: Rework attribute ring wait workaround in VS/TES.

  • ac/nir/ngg: Carve out ngg_gs_process_out_primitive.

  • ac/nir/ngg: Carve out ngg_gs_process_out_vertex.

  • ac/nir/ngg: Rework GS output code for better attribute ring handling.

  • ac/nir/ngg: Remove now unused export_pos0_wait_attr_ring.

  • ac/nir/ngg: Don’t call has_input_primitive in GS lowering.

  • ac/nir/ngg: Move GS lowering to separate file.

  • radv, radeonsi: Disable early prim export on GFX11+.

  • ac/nir/ngg: Use SALU to calculate which threads store to attribute ring in GS.

Tomeu Vizoso (42):

  • etnaviv/ml: Fix includes

  • etnaviv/nn: Fix use of etna_core_info

  • etnaviv/ci: Add expectation files for the VIPNano-SI+ NPU

  • etnaviv/ml: Rework the dumping of tensors

  • etnaviv: Add script to decode weights in Huffman format

  • etnaviv/ml: Split V7 coefficient encoding to a new file

  • etnaviv/ml: Add encoding of coefficients for V8

  • etnaviv/ml: Fix padding for convolutions in V8

  • etnaviv/ml: Implement tiling for V8

  • etnaviv/ml: Set two bits in the NN instruction for V8

  • etnaviv/ml: Disable caching on V8

  • etnaviv/ml: Fix reshuffle TP jobs on V8

  • etnaviv/ml: Only reshuffle when needed on V8

  • etnaviv/ml: Make use of the new depthwise support in V8

  • etnaviv/ci: Update expectations for the NPU in the A311D

  • etnaviv/ml: Zero out the NN config

  • etnaviv/ml: Zero all BOs

  • teflon: Support multiple graph inputs and outputs

  • etnaviv/ml: Adapt to changes in teflon regarding multiple inputs

  • etnaviv/ml: Support addition operations on V8

  • teflon: Add files mentioned in the docs for image classification

  • teflon/docs: Update performance measurements on LibreComputer Alta

  • teflon/docs: Add i.MX8MP to list of supported NPUs

  • teflon/docs: Clarify smoke test instructions

  • teflon: Add tests for the YOLOX model

  • teflon: Support tests with inputs with less than 4 dims

  • teflon: Rename model tests so they aren’t skipped by gtest-runner

  • teflon: Don’t crash when a tensor isn’t quantized

  • teflon/tests: Add support for models with float inputs and outputs

  • teflon/tests: Also use the cache for models in the test suite

  • etnaviv/ml: Specify which of the input tensors need transposing.

  • etnaviv/ml: Fix in_image_slice in transposes when width != height

  • etnaviv/ml: Take offsets into account in TP operations

  • teflon: Add support for tensor split and concatenation operations

  • etnaviv/ml: Add support for tensor split and concatenation operations

  • teflon: Limit support for Add to two unpopulated tensors

  • etna/ml: Write out the size of the requested tensor

  • teflon: Add support for tensor padding operations

  • etnaviv/ml: Add support for tensor padding operations

  • teflon: Add support for FullyConnected

  • teflon: Add tests for FullyConnected

  • etnaviv/ml: Implement FullyConnected

Valentine Burley (99):

  • amd/ci: Drop x86_64 suffix from job names

  • amd/ci: Merge and convert Raven piglit testing

  • amd/ci: Convert LAVA RADV jobs to deqp-runner suites

  • amd/ci: Increase fraction for radeonsi-raven-piglit

  • panfrost/ci: Turn redundant GLESCTS-full run into disabled Piglit job

  • svga/ci: Convert to deqp-runner suite

  • panfrost/ci: Convert to deqp-runner suite

  • ci: Drop lava-piglit:(x86_64|arm64) definitions

  • radv/ci: Convert Valve RADV jobs to deqp-runner suites

  • turnip/ci: Bump the number of tests per group for a618

  • turnip/ci: Bump the number of tests per group for a630

  • turnip/ci: Bump the number of tests per group for a660

  • turnip/ci: Decrease fraction for a630-vk-asan

  • turnip/ci: Adjust some timeouts

  • turnip/ci: Remove a630-vk-asan skip

  • turnip/ci: Update expectations

  • freedreno/ci: Drop redundant DEQP_VER

  • turnip/ci: Only increase hangcheck timer for spilling tests on a630

  • lavapipe/ci: Convert lavapipe-vk-asan to a deqp-runner suite

  • etnaviv/ci: Convert to deqp-runner suites

  • softpipe/ci: Convert softpipe-asan-gles31 to a deqp-runner suite

  • radv/ci: Use deqp-vk-main in Raven and Stoney RADV jobs

  • turnip/ci: Enable ASan leak detection in a630-vk-asan

  • ci/deqp: Remove non-suite support

  • llvmpipe/ci: Move Piglit timeout inside the suite

  • ci/deqp: Simplify conditional arguments

  • ci/deqp: Add a DEQP_FORCE_ASAN option

  • llvmpipe/ci: Actually enable ASan testing for llvmpipe-deqp-asan

  • anv/ci: Fix GPU_VERSION configuration for anv-jsl and anv-jsl-full

  • anv/ci: Bump the number of tests per group for ADL

  • anv/ci: Bump the number of tests per group for JSL

  • anv/ci: Bump the number of tests per group for TGL

  • anv/ci: Re-enable TGL and JSL manual jobs

  • anv/ci: Remove fails that are in .gitlab-ci/all-skips.txt

  • anv/ci: Update expectations

  • ci/lava: Use CI_JOB_TIMEOUT instead of separate variable

  • ci/windows: Bump the number of tests per group

  • ci/windows: Add a manual full job

  • ci/windows: Update expectations

  • turnip/ci: Update expectations

  • ci/windows: Always include windows-msvc in scheduled pipelines

  • panvk/ci: Move the fractions out of suites

  • panvk/ci: Bump the number of tests per group for G52

  • lavapipe/ci: Bump the number of tests per group

  • lavapipe/ci: Update expectations

  • venus/ci: Bump the number of tests per group

  • venus/ci: Update expectations

  • angle/ci: Update expectations

  • zink/ci: Update expectations for ANV

  • turnip/ci: Document flake

  • lavapipe/ci: Update expectations

  • lavapipe/ci: Re-enable lavapipe-vk-asan

  • ci: Uprev vkd3d-proton to b121e6d746341e0aaba7663e3d85f3194e8e20e1

  • virgl/ci: Disable virgl-iris-traces-performance

  • virgl/ci: Migrate the two iris jobs to 1130g7-volteer

  • anv/ci: Increase anv-tgl-angle parallelism to 2

  • zink/ci: Migrate the two TGL traces jobs to 1130g7-volteer

  • zink/ci: Increase zink-anv-tgl parallelism to 4

  • ci: Add Valentine to the restricted traces access list

  • freedreno/ci: Update a630-traces-restricted checksums

  • zink/ci: Skip crashing trace in zink-anv-tgl-traces-restricted

  • turnip/ci: Decrease the fraction on a660-vk-full

  • ci: Fix trace update script reading GitLab token from default location

  • pan/ci: Document some flakes

  • android/ci: Allow specifying Vulkan driver in cuttlefish-runner.sh

  • android/ci: Build ANV for Android

  • freedreno/ci: Update expectations

  • panfrost/ci: Revert to 6.6 kernel on G57

  • amd/ci: Add lava-hp-x360-14a-cb0001xx-zork and use it for VA-API testing

  • amd/ci: Run full radeonsi-raven-va job pre-merge

  • freedreno/ci: Update expectations again

  • turnip/ci: Bump the number of tests per group for a630-vk-asan

  • anv/ci: Move a test to common anv-skips

  • ci: Uprev VKCTS to 1.4.1.0

  • pan/ci: Properly wire up DRIVER_NAME

  • panvk/ci: Skip waived tests

  • ci: Uprev VKCTS to 1.4.1.1

  • ci: Skip broken PenumbraOverture trace for zink and freedreno

  • zink/ci: Update checksum for Osmos trace on TGL

  • anv/ci: Revert to 6.6 kernel on anv-jsl

  • iris/ci: Decrease iris-glk-deqp parallelism

  • panfrost/ci: Move panfrost-g52-piglit to nightly

  • zink/ci: Increase zink-anv-adl parallelism

  • turnip/ci: Increase a660-vk fraction

  • freedreno/ci: Decrease a660-gl parallelism

  • freedreno/ci: Disable a618-gl, a618-egl, and a618-piglit

  • turnip/ci: Disable a630-vk

  • freedreno/ci: Decrease a630-gl parallelism

  • freedreno/ci: Re-enable some traces on a618 and disable a630-traces

  • zink/ci: Increase parallelism of zink-tu-a618

  • freedreno/ci: Don’t automatically retry manual jobs

  • freedreno/ci: Migrate a618-piglit-full to kingoftown

  • amd/ci: Migrate amd-raven-skqp from lenovo-zork to hp-zork

  • anv/ci: Decrease anv-jsl-angle parallelism

  • virgl/ci: Skip flaky trace

  • amd/ci: Increase amd-raven-skqp parallelism

  • freedreno/ci: Document flakes

  • venus/ci: Skip flaky test due to intermittent timeouts

  • amd/ci: Revert to 6.6 kernel on Raven

Vignesh Raman (6):

  • ci: Uprev crosvm

  • ci: Force db410c to host mode

  • ci: Uprev kernel to 6.13

  • ci: update expectation files

  • ci: export RESULTS_DIR in crosvm-script.sh

  • ci: use CI_PROJECT_NAME for artifacts name

Vinson Lee (4):

  • hk: Fix hk_ia_update arguments order

  • vulkan: Add missing va_end

  • intel/elk: Fix assert with side effect

  • hk: Fix build error with static_assert

Visan, Tiberiu (3):

  • amd/vpelib: patch to match shader (#456)

  • amd/vpelib: remove luma offset (#459)

  • amd/vpelib: fixed file headers for Palamida scan

Vldly (1):

  • freedreno: Fix resource tracking on repeated map with discard

Xaver Hugl (1):

  • vulkan/wsi: unset GAMMA_LUT, CTM and DEGAMMA_LUT when doing a modeset

Yinjie Yao (3):

  • radeonsi/vcn: Indentation fix

  • radeonsi/vcn: Fix compile warnings with previously uninitialized variables.

  • radeonsi/vcn: Disable 2pass encode for VCN 5.0.

Yiwei Zhang (4):

  • venus: enable VK_EXT_external_memory_acquire_unmodified if needed

  • venus: use dedicated allocation for ANB image memory import

  • venus: fix to handle pipeline flags2 from maint5

  • venus: fix maintenance5 props init and create flags2

Yogesh Mohan Marimuthu (25):

  • amd: update amdgpu_drm.h for new userq ioctl

  • amd: include amdgpu_drm.h from mesa instead of system for ac_fake_hw_db.h

  • winsys/amdgpu: add DOORBELL domain to bo

  • winsys/amdgpu: add CLEAR_VRAM flag to zero vram when creating bo

  • winsys/amdgpu: add userq helper functions

  • ac/gpuinfo: add use_userq and AMD_USERQ variable

  • winsys/amdgpu: call userq init and destroy functions

  • ac: add new userq signal and wait packet id

  • ac: add inherit vmid field to indirect buffer packet

  • winsys/amdgpu: use bo_va_op_raw() function instead of bo_va_op()

  • winsys/amdgpu: use timeline syncobj for userq vm operations

  • winsys/amdgpu: destroy bo_fence_lock late in do_winsys_deinit()

  • winsys/amdgpu: pass job fences to VM ioctl

  • winsys/amdgpu: wait for vm syncobj before creating userq

  • winsys/amdgpu: move noop and ib_bytes adjustment to cs_flush

  • winsys/amdgpu: move legacy chunk init and submission to new function

  • winsys/amdgpu: add userq cmd submission support in amdgpu_cs_submit_ib()

  • winsys/amdgpu: don’t add fence dependency of other queues for userq

  • winsys/amdgpu: send hdp flush packet for userq

  • winsys/amdgpu: keep has_local_buffers true for userq

  • winsys/amdgpu: use VM_ALWAYS_VALID for all VRAM and GTT allocations

  • ac/gpu_info: populate fw info using new fw info ioctl for userq

  • winsys/amdgpu: ring doorbell before calling userq_signal ioctl

  • winsys/amdgpu: use next_wptr as cache for userq

  • winsys/amdgpu: ensure strict order in updating mqd wptr and doorbell

You, Min-Hsuan (1):

  • amd/vpelib: fix coverity defects

Zan Dobersek (8):

  • fd/pps: specify counter group for each countable

  • fd/pps: provide derived counters on a7xx

  • freedreno/registers: update RB_BLIT_INFO, RB_CCU_CNTL

  • tu/a7xx: use concurrent resolve groups

  • tu: ensure completion of generic-clear resolves for color, depth/stencil clears

  • tu/a7xx: support 8x MSAA

  • freedreno/registers: fix RBBM_PRIMCTR understanding and usage

  • freedreno/a7xx: fix fd_lrzfc_layout

Zhao, Jiali (1):

  • amd/vpelib: 420 and 422 Output Single Segment cositing support

Zoltán Böszörményi (3):

  • features.txt: Add Vulkan 1.4 section

  • docs/features: Mark VK_EXT_host_image_copy as implemented on Turnip

  • docs/features: Mark more Vulkan 1.4 features as done for drivers

duncan.hopkins (9):

  • glx: change `#if` guard around `dri_common.h` to stop missing ‘driDestroyConfigs’ symbol on MacOS builds.

  • glx: ignore zink check for has_explicit_modifiers and DRI3 on MacOS.

  • kopper: Add ‘#if’ guard around `loader_dri3_get_pixmap_buffer` to stop missing symbol on MacOS.

  • glx: Guard some of the bind_extensions() code with the same conditions as the frontend_screen member of `glx_screen`.

  • glx: Add back in `applegl_create_display()` so the OpenGL.framework pointer gets set up on MacOS.

  • zink: MoltenVK has conditional VK_DYNAMIC_STATE_VERTEX_INPUT_BINDING_STRIDE support.

  • zink: Avoid optimalDeviceAccess on MoltenVK when creating depth targets.

  • zink, kopper: Conditionally add VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT to swap chain imageUsage.

  • zink: stop zink_set_primitive_emulation_keys producing geometry shaders on platforms that do not support them.

liuqiang (2):

  • lavapipe: Resolved write to pointer after free

  • d3d10umd: Modify comment

nyanmisaka (1):

  • frontends/vdpau: Get AV1 decode subsampling_x/y

sergiuferentz (1):

  • Use try_unbox in VkDescriptorBufferInfo