Mesa 24.2.0 Release Notes / 2024-08-14

Mesa 24.2.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 24.2.1.

Mesa 24.2.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
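Because the reported version depends on the driver, an application may want to check it before relying on 4.6 features. The following is a minimal sketch of parsing the string that glGetString(GL_VERSION) returns. It assumes the desktop OpenGL layout, where the string starts with "major.minor" followed by vendor-specific text. The helper names and the sample string are hypothetical, and no GL context is created here.

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) from a desktop GL_VERSION string.

    Assumes the string starts with "<major>.<minor>". OpenGL ES strings
    carry an "OpenGL ES" prefix and are deliberately rejected here.
    """
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports_gl(version_string, major, minor):
    """True if the reported version is at least major.minor."""
    return parse_gl_version(version_string) >= (major, minor)


# Hypothetical sample value, not captured from a real driver.
sample = "4.6 (Core Profile) Mesa 24.2.0"
print(parse_gl_version(sample))
print(supports_gl(sample, 4, 6))
```

In a real program the string would come from the GL context the application created, so the result reflects the context type (core or compatibility) that was requested.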

Mesa 24.2.0 implements the Vulkan 1.3 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
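The apiVersion value is a packed 32-bit integer, not a string. As an illustration, the sketch below decodes it using the bit layout of the Vulkan version macros (variant in bits 31-29, major in bits 28-22, minor in bits 21-12, patch in bits 11-0). The function names are hypothetical helpers; the value would normally be read from VkPhysicalDeviceProperties via a Vulkan binding, which is not shown here.

```python
def make_api_version(variant, major, minor, patch):
    """Pack a version the same way VK_MAKE_API_VERSION does."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_api_version(api_version):
    """Split a packed Vulkan version into (variant, major, minor, patch)."""
    variant = api_version >> 29
    major = (api_version >> 22) & 0x7F    # 7 bits
    minor = (api_version >> 12) & 0x3FF   # 10 bits
    patch = api_version & 0xFFF           # 12 bits
    return variant, major, minor, patch


def is_at_least(api_version, major, minor):
    """True if the driver reports at least Vulkan major.minor."""
    _, got_major, got_minor, _ = decode_api_version(api_version)
    return (got_major, got_minor) >= (major, minor)


# Round-trip a Vulkan 1.3.0 value as a self-check.
packed = make_api_version(0, 1, 3, 0)
print(packed, decode_api_version(packed))
```

A check such as is_at_least(props_api_version, 1, 3), where props_api_version is whatever the driver reported, lets an application fall back gracefully on drivers that report a lower version.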

SHA256 checksum

c02bb72cea290f78b11895a0c95c7c92394f180d7ff66d4a762ec6950a58addf  mesa-24.2.0.tar.xz
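One way to check a downloaded tarball against the checksum above is sketched here with Python's standard hashlib module. The helper names are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib

# Checksum published above for mesa-24.2.0.tar.xz.
EXPECTED = "c02bb72cea290f78b11895a0c95c7c92394f180d7ff66d4a762ec6950a58addf"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex=EXPECTED):
    """Compare the file's digest with the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling matches_checksum("mesa-24.2.0.tar.xz") returns True only if the archive is byte-for-byte identical to the released file.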

New features

  • VK_KHR_dynamic_rendering_local_read on RADV

  • VK_EXT_legacy_vertex_attributes on lavapipe, ANV, Turnip and RADV

  • VK_MESA_image_alignment_control on RADV

  • VK_EXT_shader_replicated_composites on ANV, dozen, hasvk, lavapipe, nvk, RADV, and Turnip

  • VK_KHR_maintenance5 on v3dv

  • VK_KHR_maintenance7 on RADV

  • VK_EXT_depth_clamp_zero_one on v3dv

  • GL_ARB_depth_clamp on v3d

  • The shader cache now defaults to a new implementation that reduces filesystem overhead.
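For anyone who needs to compare behaviour against the previous cache layout, the sketch below prepares an environment for a child process. It assumes that setting MESA_DISK_CACHE_MULTI_FILE=1 selects the older multi-file cache, as described in Mesa's environment-variable documentation. That variable name is an assumption here and should be checked against the documentation of the Mesa build in use. The helper name is hypothetical.

```python
import os


def env_with_multi_file_cache(base_env=None):
    """Return a copy of the environment requesting the multi-file cache.

    ASSUMPTION: MESA_DISK_CACHE_MULTI_FILE=1 is the switch that opts out
    of the new default cache; verify it for your Mesa version.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_DISK_CACHE_MULTI_FILE"] = "1"
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when launching the application under test, which leaves the parent process environment untouched.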

Bug fixes

  • u_debug_stack_test.capture_not_overwritten fails on i386

  • turnip: vulkaninfo crashed

  • turnip-freereno Build error (/usr/local/include/xf86drm.h:40:10: fatal error: drm.h: No such file or directory)

  • tu: compilation failing when compiling turnip with only kgsl and X11 enabled

  • samplerCube constructor in fragment shader no longer converting uvec2 bindless texture handles (segmentation fault, core dumped)

  • anv: gen9.5 flakiness in dEQP-VK.multiview.dynamic_rendering.depth.*

  • vaapi decoding corruption with green blocks

  • interpolateAt precision lowering unhandled by glsl_to_nir()

  • [anv] CS2 crashes on LNL

  • [anv] Dota2 does not start on LNL

  • [radeonsi][bisected][regression] glClientWaitSync() quickly times out with INT64_MAX timeout

  • d3d10umd: Build regression in 24.2.0-devel

  • zink/tu: glcts flake on a750

  • nouveau: advertises GL_EXT_memory_object without implementing `*UID` callbacks

  • LIBGL_DRIVERS_PATH gone

  • [Regression][Vulkan][TGL][Bisected] vkCmdCopyQueryPoolResults failed to write buffer with compute pipeline on Mesa 24.1

  • Worms Revolution: not rendering explosion effects

  • crash on pushbuf_validate nvc0_blit do_blit_framebuffer

  • piglit: cl-api-build-program crashes

  • i915g: glGenerateMipmap() fails with 2048×2048 textures

  • [radeonsi] Asterix & Obelix XXL: Romastered: river misrendered (completely black)

  • Build fails without Vulkan

  • No dependency check for PyYAML in meson.build

  • GPU Hang in Metal Gear Rising Revengeance

  • VK_ERROR_DEVICE_LOST A770 DXVK Fallout 3

  • [Bisected] Recent compile issue in libnak

  • anv: Wrong push constant values for bytes?

  • anv: dEQP-VK.protected_memory tests GPU hang on MTL

  • RustiCL (or maybe not…): radeonsi freezes after 2 hours of simulation, zink works just fine

  • ci_run_n_monitor.py doesn’t monitor manual jobs

  • Crash in util_idalloc_resize due to glBindTexture with a way-too-large ID

  • mesa-24.1.2 fails to compile: ast_to_hir.cpp:5371:39: error: ‘%s’ directive argument is null

  • [regression][bisected] VMware Xv video displays as black rectangle

  • Blender 4.2, 4.3 crashes when rendering with motion blur on RDNA3 cards (OpenGL/radeonsi)

  • nvk: regression with multiple games crashing

  • Transparent background in Blender 3D view with nouveau

  • turnip: latest git does not build anymore

  • ACO Unimplemented intrinsic instr

  • RADV/ACO: assert on per-sample interpolation

  • radv: large descriptor layout creation is slow

  • Gnome shell (wayland) crashes when opening any window

  • DRI Intel drivers fix a problem in Redhat 7 (Mesa 18), but are not included for Redhat 8 (Mesa versions v23, v24)

  • Vulkan: ../src/nouveau/vulkan/nvk_physical_device.c:1109: VK_ERROR_INCOMPATIBLE_DRIVER

  • RADV: Smooth lines affect triangle rendering

  • [armhf build error][regression] error: ‘StringMapIterator’ was not declared in this scope; did you mean ‘llvm::StringMapIterator’?

  • Build fails with latest llvm 19: error: no matching function for call to unwrap(LLVMOpaqueModule*&)

  • tu: support KHR_8bit_storage

  • Incorrect colours on desktop and apps

  • nir: Incorrect nir_opt_algebraic semantics for signed integer constants causing end-to-end miscompiles

  • src/gallium/frontends/clover/meson.build:93:40: ERROR: Unknown variable “idep_mesaclc”.

  • panfrost: mpv is broken on T604

  • Nightly CI is broken

  • [radv] GPU hang in Starfield on RX 5700 XT

  • anv, isl, iris: Clarify and improve CCS + FCV on gfx12

  • isl: CPCB horizontal and vertical alignment requirements unknown

  • Indika: flickering black artifacting on the snow

  • intel/isl: Split Xe2 changes into new files of Xe2.

  • rusticl: Generated rusticl_mesa_bindings.c fails to find include

  • isl: Remove 512B pitch requirement for non-displayable CCS

  • MESA 24.1 - broken zink OpenGL under Windows

  • Blue flickering rectangles on AMD RX 7600

  • GPU hangs on AMD Radeon RX 6400 on a fragment shader

  • v3dv: vkcube-wayland crashes

  • intel/brw: scoreboarding regression

  • regression in !29436 for radv+angle on stoney

  • [radv][regression] Starfield invisible terrain on a 7900 XTX

  • free_zombie_shaders() leave context in a bad state (access violation occurs)

  • r300: X11 fails to start with the modesetting driver (glamor is broken with R300/R400 gpus).

  • [NINE] Far Cry 1 trees flicker regression [bisected][traces]

  • Vulkan: Most sync2 implementations are missing new access flags

  • Incorrect buffer_list advance when writing disjoint image descriptors

  • ANV: Block shadows in Cyberpunk on Intel A770

  • ACO ERROR: Temporary never defined or are defined after use

  • [ANV] Graphics memory allocation in Total War: Warhammer 3

  • DG2: God of War trace fails to play

  • Borderlands trace fails to play on dg2

  • NVK: Vulkan apps simply terminated with segfault under wayland and Xwayland

  • NVK: VK_ERROR_OUT_OF_DEVICE_MEMORY on swapchain creation

  • anv/zink regression: piglit.spec.arb_fragment_layer_viewport.layer-no-gs

  • [anv] failures when upgrading vulkancts 1.3.6 -> 1.3.7 on intel mesa ci

  • RustiCL: deadlock when calling clGetProfilingInfo() on callbacks

  • [Intel][Vulkan][Gen12] Vulkan compute shader is 3x slower than the same OpenCL kernel

  • turnip: Broken AHB support

  • zink: nir validation failures in Sparse code

  • nir: nir_opt_varyings uses more stack than musl libc has

  • dEQP-VK.pipeline.pipeline_library.shader_module_identifier.pipeline_from_id.graphics regression

  • freedreno + perfetto missing dependency on adreno_common.xml.h

  • anv: unbounded shader cache

  • radv: Crash due to nir validation fail in Enshrouded

  • vulkan/wsi/wayland: valgrind reports invalid read in `vk_free` call in `wsi_wl_surface_analytics_fini`

  • android: sRGB configs no longer exist after !27709

  • bisected: turnip: deqp regressions

  • aco: Radeonsi unable to use rusticl

  • anv: clean up default_pipeline_cache in anv_device

  • [24.1-rc4] fatal error: intel/dev/intel_wa.h: No such file or directory

  • Turnip driver is crashing since turnip: ANB/AHB support got merged

  • vcn: rewinding attached video in Totem cause [mmhub] page fault

  • When using amd gpu deinterlace, tv bt709 properties mapping to 2 chroma

  • a530: ir3_context_error assertion (unknown vertex shader output name: VARYING_SLOT_EDGE)

  • VCN decoding freezes the whole system

  • [RDNA2] [AV1] [VAAPI] hw decoding glitches in Thorium 123.0.6312.133 after https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28960

  • radv regression bisected: Flickering textures (vega)

  • [Regression][Bisected] EGL/Wayland: QT applications terminated by SIGSEGV (Address boundary error) when using dGPU

  • WSI: Support VK_IMAGE_ASPECT_MEMORY_PLANE_i_BIT_EXT for DRM Modifiers in Vulkan

  • nvk: Tracker issue for gamescope support

  • nvk: Implement VK_EXT_image_drm_format_modifier

  • nvk: NVK_DEBUG=zero_memory is hitting an assert

  • nvk: Implement VK_EXT_conservative_rasterization

  • zink sparse: Improper semaphore handling

  • zink sparse: Reference to mip tails should be refcounted

  • radv: Enshrouded GPU hang on RX 6800

  • NVK Zink: Wrong color in Unigine Valley benchmark

  • intel vulkan incremental build takes forever

  • 24.0.6: build fails

  • shader with multidimensional array in shader storage buffer takes too long to compile

  • panfrost: T604 issue with using u32 for flat varyings

  • lp_screen.c:601:4: error: ‘snprintf’ will always be truncated; specified size is 16, but format string expands to at least 17

  • [anv] FINISHME: support YUV colorspace with DRM format modifiers

  • gen9/11 test became flaky: piglit.spec.!opengl 1_4.blendminmax

  • mesa 24 intel A770 KOTOR black shadow smoke scenes

  • nvk: Implement VK_EXT_pipeline_robustness

  • [bisected][regression] kitty fails to start due to `glfwWindowHint(GLFW_SRGB_CAPABLE,true)`

  • r600: bisected 5eb0136a3c561 breaks a number of piglits

  • [bdw][bisected][regression] assertion failure in nir_validate.c

  • Graphical glitches in RPCS3 after updating Vulkan Intel drivers

  • [R600] OpenGL and VDPAU regression in Mesa 23.3.0 - some bitmaps get distorted.

  • VAAPI radeonsi: VBAQ broken with HEVC

  • tu: weird fail in packing tests

  • radv/video: 10-bit support

  • radv: vkCmdWaitEvents2 is broken

  • anv: add support for EDS3::extendedDynamicState3AlphaToCoverageEnable

  • ci: switch from CI_JOB_JWT to id_tokens

  • Zink: enabled extensions and features may not match

  • anv: share embedded samplers

Changes

Adam Jackson (8):

  • mesa: Enable EXT_shadow_samplers for GLES2

  • gallium: Rename ${target}/target.c to ${target}/${target}_target.c

  • treewide: Include mesa_interface.h not dri_interface.h

  • mesa_interface: Set ourselves free

  • mesa_interface: Move out of GL/internal/

  • gallium/dril: Compatibility stub for the legacy DRI loader interface

  • dri: Let dril handle the DRI driver link farm

  • gallium/meson: Deconflate swrast/softpipe/llvmpipe

Adrian Perez de Castro (1):

  • Revert “egl/wayland: Remove EGL_WL_create_wayland_buffer_from_image”

Alejandro Piñeiro (29):

  • v3dv/cmd_buffer: always bind pipeline static state

  • v3dv/ci: dEQP-VK.dynamic_state.*.double_static_bind are fixed now

  • v3dv: port dynamic state tracking to use Mesa Vulkan

  • v3dv: provide implementation for vkCmdBindVertexBuffers2

  • v3dv: provide implementation for CmdSetViewportWithCount

  • v3dv: CullMode and FrontFace are dynamic now

  • v3dv: DepthBoundsTestEnable is dynamic now

  • v3dv: move depth CFG bits setting to cmd buffer emission

  • v3dv: ez_state/incompatible_ez_test could be recomputed at cmd_buffer

  • v3dv: PrimitiveTopology is now dynamic

  • v3dv: StencilOp and StencilTestEnable are now dynamic

  • v3dv/ci: update expected list due VK_EXT_extended_dynamic_state

  • v3dv: enable VK_EXT_extended_dynamic_state

  • v3dv/cmd_buffer: missing updates due PrimitiveTopology being dynamic now

  • v3dv: fixes StencilTestEnable handling

  • v3dv: PrimitiveRestartEnable is now dynamic.

  • v3dv: DepthBiasEnable is dynamic now

  • v3dv: SetRasterizerDiscardEnable is dynamic now

  • v3dv: enable VK_EXT_extended_dynamic_state2

  • v3dv: add debug option to disable custom pipeline caches for meta operations

  • v3dv/meta_clear: take into account multiview for the custom clear pipeline caches

  • v3dv/meta_clear: use v3dv_renderpass used as parameter

  • v3dv/device: compute maxDescriptorSet*Limits multiplying per-stage by 4

  • v3dv/device: set DescriptorUpdateAfterBind limits

  • v3d/devinfo: unify comment style

  • broadcom: move HW-dependent constants to v3d_device_info

  • v3d,v3dv: document cl_emit_with_prepacked

  • v3dv/pipeline: ensure vk_graphics_pipeline_all_state alive when still needed

  • drm-shim: stub synobj_timeline_wait and query ioctl

Aleksi Sapon (5):

  • lavapipe: fixes for macOS support

  • lavapipe: build “Windows” check should use the host machine, not the `platforms` option.

  • util: fix memory related OS calls on macOS

  • wsi: fix compilation on macOS

  • util: macOS support for cnd_monotonic

Alessandro Astone (1):

  • egl/gbm: Walk device list to initialize DRM platform

Alex Deucher (1):

  • ac/surface: fix version check for gfx12 DCC

Alexandre Marquet (2):

  • pan/mdg: quirk to disable auto32

  • panfrost: implement SFBD raw format support on v4

Alexandros Frantzis (2):

  • egl/wayland: Pass dri2_wl_formats to create_dri_image

  • egl/wayland: Fail EGL surface creation if opaque format is unsupported

Ali Homafar (1):

  • lavapipe: Set ICD api_version to 1.3

Alyssa Rosenzweig (222):

  • vulkan: optimize vk_dynamic_graphics_state_any_dirty

  • vulkan: add helper to fill out spirv caps automatically

  • nir/lower_subgroups: add filter

  • nir/lower_subgroups: add generic scan/reduce lower

  • nir/lower_subgroups: relax ballot_type_to_uint

  • nir/lower_robust_access: also handle image derefs

  • docs: add header-stub for vk_enum_to_str

  • vulkan: add vk_debug_ignored_stype helper

  • nvk: use common stype debug

  • broadcom: use common stype debug

  • pvr: use common stype debug

  • anv,hasvk: use common stype debug

  • dzn: use common stype debug

  • nir: add is_first_fan_agx sysval

  • nir: add texops for AGX border colour emulation

  • nir: add quad_ballot_agx intrinsic

  • nir,agx: add depth=never workaround

  • nir,agx: fix load_active_subgroup_index

  • compiler: add ACCESS_IN_BOUNDS_AGX

  • agx: optimize and/or with booleans

  • agx: enable more lowering

  • agx: fix query LOD of array

  • agx: fix some ms texture packing

  • agx: handle cross-workgroup memory barriers

  • agx: allow 8-bit bcsel

  • agx: fix phi translation corruption

  • agx: fix load_helper_invocation with sample shading

  • agx: fix frag sidefx with sample shading

  • agx: handle subgroup barriers

  • agx: fix spilling inside sample loop

  • agx: switch to demote internally

  • agx: lower nir_intrinsic_load_num_subgroups

  • agx: delete unreachable blocks

  • agx: model more subgroup ops

  • agx: lower shuffle

  • agx: handle non-immediate shuffles in divergent CF

  • agx: handle quad_broadcast

  • agx: handle quad swaps

  • agx: add missing b2b16 implementation

  • agx: forbid uniforms on ballots

  • agx: lower 8-bit subgroups

  • agx: flesh out subgroup lowering

  • agx: report uses_txf

  • agx: expose agx_link_libagx

  • agx: document another sample_mask restriction

  • agx: reserve scratch registers for mem<-->mem swaps

  • agx: optimize txf with lod 0

  • agx: fix bogus unit test

  • agx: stash early_fragment_test info

  • agx: handle quad reduce

  • agx: implement quad_ballot

  • agx: lower more quad ops

  • agx: optimize elect()

  • agx: fix UB in cursor comparison

  • ail: constify everything

  • asahi: mark eMRT loads as in-bounds

  • asahi: calculate validity when unpacking

  • asahi: agx_translate_sample_count

  • asahi: assert bo size > 0

  • asahi: unwrap pointless null check

  • asahi: implement PIPE_CAP_QUERY_MEMORY_INFO

  • asahi: rm unused #include

  • asahi: resize key

  • asahi: cleanup fs epilog link info

  • asahi: move agx_link_varyings_vs_fs

  • asahi: fix prolog emit

  • asahi: pack UVS key properly

  • asahi: plumb shader stage into info

  • asahi: get debug in common

  • asahi: rm deadcode

  • asahi: drop rgb10a2_sint rendering

  • asahi: add missing rgba4 format

  • asahi: fix 1D array atomics

  • asahi: fix txf/image_load robustness with arrays

  • asahi: rework VBO lower for divisor=0

  • asahi: mv AGX_MAX_OCCLUSION_QUERIES define

  • asahi: handle agx_ppp_fragment_face_2 with no info

  • asahi: clarify format code in image lowering

  • asahi: fix rgb565 blending

  • asahi: fix store_output component/offset

  • asahi: fix sample ID with multiblock epilogs

  • asahi: lower texture instructions with epilogs

  • asahi: fix cull unknown bits

  • asahi: simplify image atomic lowering

  • asahi: move primitive MSAA field

  • asahi: free libagx if we don’t use a ralloc memctx

  • asahi: eliminate troublesome empty uniforms

  • asahi: rearrange VS uniforms

  • asahi: set src_type for store_output

  • asahi: rm dead code

  • asahi: add agx_index_size_to_B helper

  • asahi: move some GS lowering into lower_gs

  • asahi: don’t use load_num_vertices in geometry shaders

  • asahi: mv vertex_id_for_topology_class into GS lowering

  • asahi: rm another num_vertices use

  • asahi: rm dated comment

  • asahi: rm unused lower

  • asahi: rm num_vertices uses

  • asahi: rm redundant input_vertices

  • asahi: mv initialization of grid z for indirect GS

  • asahi: rm more dead lowering

  • asahi: rm always true param

  • asahi: update comment

  • asahi: update comment for maint5

  • asahi: eliminate num_workgroups for VS->GS + VS->TCS

  • asahi: drop bogus assertion

  • asahi: pack tilebuffer usc word ahead-of-time

  • asahi: add agx_ppp_push_merged helper

  • asahi: use ppp_merge

  • asahi: don’t allocate varyings ourselves

  • asahi: don’t allocate for ppp updates

  • asahi: extend varying linking for tri fan weirdness

  • asahi: plumb tri fan flatshading through common

  • asahi: don’t ralloc in agx_fast_link

  • asahi: extend epilog key for force early frag handling

  • asahi: don’t reserve extra UVS space for layer

  • libagx: use sub_group_scan_inclusive_add

  • libagx: add query copy kernel

  • libagx: don’t use get_group_id()

  • asahi/decode: QoL improvements

  • asahi: track imports for decode

  • asahi: clean up bg/eot counts

  • asahi: rename meta -> bg/eot

  • asahi: don’t allocate for USC words

  • asahi: split frag shader words

  • asahi: split CDM Launch words

  • asahi: unify naming for COUNTS structs

  • nir/tex_instr_result_size: handle subpass_ms

  • nir/lower_point_size: support lowered i/o

  • asahi/decode: drop Apple-specific decode check

  • libagx: rm unused field

  • libagx: fix static assert

  • libagx: fix triangle fan + prim restart + GS/XFB

  • libagx: drop unused !indexed path

  • libagx: add libagx_copy_xfb_counters helper

  • asahi: be robust against out of sync shader info

  • agx: fix draw param gather for sw vs

  • asahi: split param structs for GS internal kernel

  • agx: rework libagx I/O lowering

  • asahi: add missing lowerings

  • asahi: force bindless for eMRT

  • asahi: bounds check eMRT stores

  • asahi: support bigger buffer textures

  • asahi: add AGX_TEXTURE_FLAG_CLAMP_TO_0 flag

  • agx: handle discard with force early tests

  • asahi: pack blend key

  • agx: switch to combined clip/cull

  • asahi: add flag controlling sample mask without MSAA

  • asahi: use scalar outputs for rast shaders

  • asahi: move null descriptor routines to common

  • asahi: implement rba2 for uniform texel buffers

  • asahi: implement rba2 for storage texel buffers

  • agx: prepare for lower_wpos_center

  • asahi: extract agx_calculate_vbo_clamp

  • agx: fix indirect CF accounting

  • mesa: fix duplicate initializer

  • nir/lower_wpos_center: clean up

  • nir/builtin_builder: factor out nir_build_texture_query

  • asahi: use nir_build_texture_query

  • gallium: remove ability to probe asahi

  • asahi: add broken bits of unstable Linux UAPI

  • agx: fix 64-bit bcsel ingestion

  • agx: fix fmin/fmax with (-0, 0) pair

  • libagx: fix uint8_t definition

  • libagx: make index buffer fetch robust

  • libagx: generalize query copies

  • asahi: implement rba2 semantics for vbo

  • asahi: fix vbo clamp with stride=0

  • asahi: implement robustness2 for msaa image stores

  • asahi: be more clever about GS side effects

  • nir/lower_robust_access: handle MSAA images

  • nir: add nir_metadata_control_flow

  • treewide: use nir_metadata_control_flow

  • nir: document restriction on load_smem_amd constantness

  • vulkan: reference count vk_descriptor_update_template

  • vulkan: handle push DUT with emulated secondaries

  • vulkan: fix potential UAF with vk_cmd_enqueue_CmdPushDescriptorSetKHR

  • vulkan: handle enqueueing CmdPushConstants2KHR

  • vulkan: handle enqueueing CmdPushDescriptorSet2KHR

  • lvp: use common push constant enqueue

  • lvp: use common push descriptor set enqueue

  • lvp: fix silly casting for sampler desc updates

  • lvp: use common descriptor update templates

  • nir/format_convert: remove unorm bit size assert

  • nir: add nir_def_replace helper

  • treewide: use nir_def_replace sometimes

  • agx: fix insidious ballot optimizer bug

  • agx: add unit test for ballot bug

  • agx: set discard_is_demote

  • nir: add nir_break_if helper

  • treewide: use nir_break_if

  • nir: fix miscompiles with rules with INT32_MIN

  • nir/algebraic: explicitly suffix constants

  • nir/opt_constant_folding: fix array size define

  • zink: move print_pipeline_stats

  • zink: print pipeline stats for compute shader-db

  • util: add dui/uid helpers

  • nir: add nir_alu_instr float controls queries

  • nir/search: use ALU float control helpers

  • nir: use MIN2/MAX2 opcodes for imin/umax folding

  • nir: strengthen fmin/fmax definitions with signed zero

  • glsl/float64: handle signed zero with min/max

  • nir/lower_double_ops: handle signed zero with min/max

  • nir/lower_alu: remove dead #define

  • nir: add lower_fminmax_signed_zero

  • agx: set lower_fminmax_signed_zero

  • agx: do not flush denorms for fp16 fmin/fmax

  • asahi: drop old comment

  • asahi: drop stale comment

  • asahi: make agx_pack opencl compatible

  • asahi: tuck in null query check

  • libagx: specify heap size explicitly

  • asahi,libagx: tessellate on device

  • libagx: add kernel for incrementing CS counter

  • asahi: handle CS pipeline stat with indirect dispatch

  • libagx: handle VS/IA pipeline stats on GPU

  • asahi: eliminate load_num_workgroups from TCS unrolled ID

  • nouveau/drm-shim: set ram_user

  • nvk: add instruction count exec property

  • nir/opt_constant_folding: fix array size define, pt 2

  • zink: remove extraneous \n with shaderdb

  • zink: match shader-db report.py format

Amber (1):

  • tu: Disable depth and stencil tests when attachment state requires it

Amit Pundir (1):

  • android: Fix zink build failure

Amol Surati (1):

  • nine: avoid using post-compacted indices with state expecting pre-compacted ones

Antoine Coutant (1):

  • drisw: fix build without dri3

Antonio Ospite (2):

  • meson: fix deprecation warning in create-android-cross-file.sh

  • android: simplify building libgallium_dri on Android

Arthur Huillet (4):

  • nvk: update 3d classes for conservative raster

  • nvk: implement VK_EXT_conservative_rasterization

  • nvk: import SPH headers files from open-gpu-doc

  • nvk: generate Rust bindings from SPH header files

Asahi Lina (1):

  • asahi: Make asahi_clc build work on x86_64->x86 builds

Bas Nieuwenhuizen (5):

  • radv: Use zerovram for Enshrouded.

  • util/disk_cache: Fix cache marker refresh.

  • util/disk_cache: Delete the old multifile cache if using the default.

  • util/cache_test: Add tests for old cache deletion.

  • relnotes: Add an entry about the new cache default.

Benjamin Lee (1):

  • vk/graphics_state: Add last bits for extraPrimitiveOverestimationSize

Boris Brezillon (81):

  • panvk: Prepare things so we can push sysvals to our push uniform buffer

  • panvk: Put dynamic uniform buffers at the end of the UBO array

  • panvk: Move the dynamic SSBO descriptors to their own UBO

  • panvk: Prepare dynamic buffer descriptors at bind time

  • panvk: Lower sysvals to push uniforms

  • panvk: Kill panvk_sysval_vec4

  • panfrost: do not write outside num_wg_sysval

  • panfrost: Add the BO containing fragment program descriptor to the batch

  • pan/kmod: Fix a syncobj leak in the panthor backend

  • pan/kmod: Make default allocator thread-safe

  • panvk: clang-format fixups

  • panvk: Kill panvk_queue_get_device()

  • panvk: Get rid of panvk_descriptor_state::dirty

  • panvk: Move panvk_cmd_state::batch to panvk_cmd_buffer::cur_batch

  • panvk: Kill unused dynamic state bits

  • panvk: Get rid of special attribute support

  • panvk: Split the graphics and compute state at the cmd_buffer level

  • panvk: Split compute/graphics pipeline objects

  • panvk: Use memory pools to store pipeline shaders/descriptors

  • panvk: Kill the panvk_pipeline_builder object

  • panvk: Transition the graphics pipeline logic to vk_graphics_pipeline_state

  • panvk: Fully transition to vk_viewport_state

  • panvk: Fully transition to vk_rasterization_state

  • panvk: Fully transition to vk_input_assembly_state

  • panvk: Use vk_color_blend_state to fill our blend constant

  • panvk: Fully transition to vk_depth_stencil_state

  • panvk: Fully transition to vk_multisample_state

  • panvk: Set unused attribute buffers descriptors to zero

  • panvk: Rename non_vs_attribs into img_attribs

  • panvk: Prevent re-emission of image attributes used in vertex shaders

  • panvk: Move VS attribute/buffer state to panvk_cmd_graphics_state

  • panvk: Emit VS-accessible image attributes at a fixed offset

  • panvk: Leave holes in the attribute locations used by a shader

  • panvk: Fully transition to vk_vertex_binding_state

  • pan/blend: Move constant inlining out of pan_blend_create_shader()

  • pan/blend: Expose pan_blend_create_shader()

  • panvk: Add a blend library to deal blend shaders/descriptors

  • panvk: Don’t pass the stage to shader_create()

  • panvk: Simplify shader initialization in the pipeline logic

  • panvk: Fix/simplify the shader linking logic

  • panvk: Replace the stages array in panvk_draw_info by vs/fs fields

  • panvk: Move fs_rsd fields to an fs sub-struct

  • panvk: Emit the fragment shader RSD dynamically

  • panvk: Lower global memory IOs

  • panvk: Clean Midgard leftovers in the cmd_close_batch() path

  • panvk: Use vk_pipeline_shader_stage_to_nir()

  • panvk: Kill cmd_get_tiler_context()

  • panvk: Make sure we dump memory mappings before crashing

  • pan/decode: Be robust to NULL texture payload

  • pan/desc: Add missing format in translate_s_format()

  • pan/jc: Drop unused pool argument passed to pan_jc_add_job()

  • panvk: Add a render state to panvk_cmd_graphics_state

  • panvk: Take VK_RENDERING_{RESUM,SUSPEND}ING_BIT flags into account

  • panvk: Force a preload when the render area is not 32x32 aligned

  • panvk: Skip depth/stencil attachments with non-matching aspect mask

  • panvk: Fix dynamic rendering with images containing both depth and stencil

  • panvk: Make sure we don’t lose clear-only operations

  • panvk: Make sure we run the fragment shader if alpha_to_coverage is enabled

  • panvk: Make sure replay of command buffers containing Dispatch calls works

  • panvk: Override the default GetRender[in]AreaGranularityKHR()

  • egl: Use gbm_bo_create_with_modifiers2() when the surface has non-zero flags

  • panvk: Fix formatting around OpaqueCaptureAddress implementation

  • panvk/ci: Flag exact_sampling.*.edge_right test as fails

  • pan/bi: Make sure global loads/stores don’t exceed 16 bytes

  • pan/bi: Fix dynamic indexing of push constants

  • panvk: Fix Cube/2DArray/3D img -> buf copies

  • panvk: Don’t bail out when allocationSize is zero in AllocateMemory()

  • panvk: Prepare for Valhall image views

  • panvk: Prepare for Valhall buffer views

  • panvk: Prepare things for compiling valhall source files

  • panvk: Extend Valhall descriptor set implementation to support Bifrost

  • panvk: Overhaul the Bifrost descriptor set implementation

  • panvk: Refcount private BOs

  • panvk: Store private BOs in lists instead of dynarrays

  • panvk: Prepare panvk_mempool for shared device memory pools

  • panvk: Use memory pools for internal GPU data attached to vulkan objects

  • pan/desc: Extend pan_emit_fbd() to support multilayer rendering

  • pan/desc: Prepare things for fragment job chaining

  • pan/blitter: Let pan_preload_fb() callers queue the jobs to the job chain

  • panvk: Use IDVS jobs when we can

  • panvk: Add support for layered rendering

Caio Oliveira (35):

  • intel/brw: Don’t print IP as part of the dump

  • intel/brw: Hide register pressure information in dumps

  • intel/brw: Use `vNN` instead of `vgrfNN` when printing instructions

  • intel/brw: Fix commas when dumping instructions

  • spirv: Add MESA_SPIRV_DEBUG=values to dump all values

  • intel/brw: Track the number of uses of each def in def_analysis

  • intel/brw: Fix typo in DPAS emission code

  • intel/brw: Add unit tests for scoreboard handling FIXED_GRF with stride

  • intel/brw: Make component_size() consistent between VGRF and FIXED_GRF

  • glsl: Fix warning related to tg4_offsets in release mode

  • intel/brw: Print SWSB information when dumping instructions

  • intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task

  • anv: Use brw_nir_lower_cs_intrinsics for lowering Mesh/Task LocalID

  • intel/brw: Remove unused brw_reg related functions

  • intel/brw: Remove RALLOC helper from fs_reg

  • intel/brw: Remove unused variable from test

  • intel/brw: Move fs_reg data members up to brw_reg

  • intel/brw: Use public inheritance for fs_reg/brw_reg

  • intel/brw: Move most member functions from fs_reg to brw_reg

  • intel/brw: Remove conversion from fs_reg to brw_reg

  • intel/brw: Replace some fs_reg constructors with functions

  • intel/brw: Remove duplicated functions between fs_reg/brw_reg

  • intel/brw: Rename brw_reg() helper to brw_make_reg()

  • intel/brw: Make fs_reg an alias of brw_reg

  • intel/brw: Replace uses of fs_reg with brw_reg

  • intel/brw: Rename fs_reg_* helpers to brw_reg_*

  • intel/brw: Move brw_reg helpers into brw_reg.h

  • intel/brw: Don’t set acc_wr_control for Xe2

  • intel/brw: Use brw_inst_set_group() to set QtrCtrl and NibCtrl

  • intel/brw: Account for reg_unit() in assembler

  • intel/brw: Don’t print extra newlines in assembler

  • intel/brw: Split off assembler logic into library

  • spirv: Don’t warn about FPFastMathMode if not OpenCL

  • intel/brw: Convert missing uses of ralloc to linear in fs_live_variables

  • intel/elk: Convert missing uses of ralloc to linear in fs_live_variables

Chia-I Wu (8):

  • gallium: add pipe_picture_desc::flush_flags

  • frontends/va: track whether there are imported/exported surfaces

  • frontends/va: set PIPE_FLUSH_ASYNC when possible

  • radeonsi: prep for pipe_picture_desc::flush_flags

  • radeonsi: respect pipe_picture_desc::flush_flags

  • radv: check gs_copy_shader directly for executable props

  • radv: make radv_pipeline_has_ngg static

  • drm-shim: intercept access as well

Christian Gmeiner (42):

  • isaspec: Add method to get the displayname of BitSetEnumValue

  • isaspec: Improve ‘meta’ handling

  • etnaviv: isa: Drop capturing of python output

  • etnaviv: isa: Add clang-format special comments

  • etnaviv: isa: Print dst_full for ALU

  • etnaviv: isa: Switch to enum isa_thread

  • etnaviv: isa: Add more flags to etna_inst

  • etnaviv: isa: Rework modeling of left shift for store/load

  • etnaviv: isa: Add name for full writemask

  • mr-label-maker: Add teflon marker

  • etnaviv: isa: Do src swizzle with isaspec

  • clc: Always use spir for 32 bit

  • etnaviv: Zero init all srcs passed to etna_emit_alu(..)

  • ci: uprev mold to 2.32.0

  • gallium: Add vkms entrypoint

  • nak: Move nak_optimize_nir declaration to nak_private.h

  • meson: Update proc-macro2 subproject

  • meson: Update syn subproject

  • meson: Add pest rust dependencies

  • meson: Add roxmltree rust dependency

  • meson: Add indexmap rust dependencies

  • etnaviv: isa: Add meta elements to instructions

  • etnaviv: isa: Generate Rust FFI bindings for asm.h

  • etnaviv: isa: Make header C++ safe

  • etnaviv: isa: Add meson version check

  • etnaviv: isa: Add IsaParser proc_macro_derive

  • etnaviv: isa: Add struct etna_asm_result

  • etnaviv: isa: Make etna_asm_result usable in Rust

  • etnaviv: isa: Add EtnaAsmResultExt trait

  • etnaviv: isa: Add parser module

  • etnaviv: isa: Add C function impl

  • etnaviv: isa: Add cli assembler

  • etnaviv: isa: Extend disasm test

  • ci/etnaviv: Drop shaders@glsl-bug-110796 line

  • etnaviv: isa: Drop 1:1 mapping of opc to bits

  • etnaviv: isa: Add support for extended instructions

  • nak: Update comment about explicit padding

  • etnaviv: isa: Add support for bitset’s displayname

  • etnaviv: isa: Rework branch instruction

  • nak: Set has_imad32 conditionally

  • nak: Move imad late optimization to nir

  • dri: fix driver names

Christopher Michael (3):

  • v3d: Move spec@arb_texture_view@rendering-formats, Crash in broadcom-rpi4-fails

  • v3d: Move spec@!opengl 1.1@getteximage-formats, Fail in broadcom-rpi4-fails

  • broadcom: fix issue of ‘addr’ is used uninitialized

Colin Marc (3):

  • radv/video: don’t truncate frame_num and POC to 32

  • vulkan/video: generate profile_tier_level structure correctly

  • vulkan/video: correctly set sub-layer ordering in H.265 VPS/SPS

Collabora’s Gfx CI Team (6):

  • Uprev Piglit to 7aa7bc1b01d57b4b091c4fc82a94a6ff47f38ebf

  • Uprev Piglit to 8a6ce9c6fc5c8039665655bca4904d5601c6dba0

  • Uprev Piglit to e180f96239edba441f22f58dfc852cafb902844a

  • Uprev Piglit to fdf3fc09deb6beecdf212e65a16c645112540b59

  • Uprev Piglit to 647d0725024f72bc49bbc91c686c5f61168a1fe8

  • Uprev Piglit to 582f5490a124c27c26d3a452fee03a8c85fa9a5c

Cong Liu (1):

  • nir: Fix out-of-bounds access in ntt_emit_store_output()

Connor Abbott (92):

  • ir3: Add scan_clusters.macro to ir3_valid_flags()

  • ir3: Add scan_clusters.macro to is_subgroup_cond_mov_macro()

  • ir3: Validate tied sources better

  • ir3/ra: Don’t demote movmsk instructions to non-shared

  • ir3: Rewrite postsched dependency handling

  • ir3/legalize: Use define for register size

  • ir3: Rewrite regmask implementation

  • ir3/ra: Prepare for shared half-regs

  • ir3/ra: Fix printing shared reg file

  • ir3/ra: Prepare for shared phis

  • ir3: Fix lowering shared parallel copies with immed src

  • ir3/lower_pcopy: Fix immed/const flags for copy from shared

  • ir3: Fix shared parallel copy validation

  • ir3: Don’t use swz with shared registers

  • ir3/lower_copies: Handle HW bug with shared half-floats

  • ir3/lower_copies: Fix “inaccessible” half reg lowering with shared regs

  • ir3/ra: Use ra_reg_get_num() for validating num

  • ir3: Use INVALID_REG in array store

  • ir3: Reset num when creating parallel copies

  • ir3: Validate that shared registers are in-bound

  • ir3: Allow propagation of normal->shared copies

  • ir3: Moves with shared destination are always legal

  • ir3/legalize: Take (ss) into account in WaR hazards

  • ir3/legalize: Remove bad (eq) micro-optimization

  • ir3/legalize: any/all/getone are non-prefetch helper users

  • ir3: Use correct category for OPC_PUSH_CONSTS_LOAD_MACRO

  • ir3: Add support for “scalar ALU”

  • ir3: Implement source restrictions for shared ALU

  • ir3: Validate scalar ALU sources

  • ir3: Immediate source for stc is invalid

  • ir3: Don’t emit single-source collects

  • ir3/cp: Support swapping mad srcs for shared regs

  • ir3/cf: Don’t fold shared conversions

  • ir3: Distinguish lowered shared->normal moves

  • ir3: Add support for ldc.u

  • ir3: Add builder support for shared immediates

  • ir3: Create reduce identity directly

  • ir3: Make type_flags() return a bitmask enum

  • ir3: Support scalar ALU in the builder

  • ir3: Add scalar ALU-specific passes

  • ir3: Get sources before emitting scan_clusters.macro

  • ir3: Rewrite shared reg handling when translating from NIR

  • ir3: Directly use shared registers when possible

  • ir3/nir: Fix imadsh_mix16 definition

  • ir3: Use scalar ALU instructions when possible

  • ir3: Don’t scalarize all SSBO instructions

  • ir3: Don’t manually scalarize SSBO loads

  • freedreno/a7xx: Add AQE-related registers from kgsl

  • freedreno/a7xx: Add A7XX_HLSQ_DP_STR location from kgsl

  • freedreno/crashdec: Initial a7xx support

  • freedreno: Update HLSQ_*_CMD registers for a7xx

  • docs/android: Fix example meson cross file again

  • ir3: Put VS->TCS barrier after preamble

  • ir3/legalize: Insert dummy bary.f after preamble

  • freedreno,ir3: Add has_early_preamble

  • tu: Workaround early preamble HW bug

  • freedreno/a6xx: Workaround early preamble HW bug

  • ir3: Add ir3_info::early_preamble

  • tu: Implement early preamble

  • freedreno/a6xx: Implement early preamble

  • ir3: Enable early preamble

  • tu: Use image aspects for feedback loops

  • tu: Support VK_EXT_attachment_feedback_loop_dynamic_state

  • tu: Use a7xx terminology for flushes

  • freedreno, tu: Use CLEAN events on a7xx

  • tu: Fix unaligned indirect command synchronization

  • tu: Don’t WFI after every dispatch

  • freedreno/a7xx: Fix register file size

  • ir3: Make sure constlen includes stc/ldc.k/ldg.k instructions

  • freedreno: Disable early preamble on a6xx gen4

  • ir3, tu, freedreno: Move early_preamble to ir3_shader

  • tu: Add early preamble statistic

  • ir3: Introduce elect_any_ir3

  • ir3: Use elect_any_ir3 in preambles

  • freedreno: Fix RBBM_NC_MODE_CNTL variants

  • tu: Add support for aligned substreams

  • ir3: Fix UBO size with indirect driver params

  • tu: Make cs writeable for GMEM loads when FDM is enabled

  • tu: Fix fdm_apply_load_coords patchpoint size

  • tu: Support VK_EXT_fragment_density_map on a750

  • tu: Support bufferDeviceAddressCaptureReplay on kgsl

  • freedreno: Fix decoding primitive counter events on a7xx

  • tu: Add VPC hardware workaround for a750

  • ir3: Fix stg/ldg immediate offset on a7xx

  • nir/instr_set: Return the matching instruction

  • nir/instr_set: Don’t remove matching instruction

  • ir3: Split out bindless tex/samp encoding

  • ir3: Don’t consider r63.x as a GPR

  • ir3: Plumb through descriptor prefetch intrinsics

  • ir3: Make preamble rematerialization common code

  • ir3: Expand preamble rematerialization

  • ir3: Add descriptor prefetching optimization on a7xx

Constantine Shablia (5):

  • pan/bi: fix 1D array tex coord lowering

  • pan/bi: clean up tex coord lowering

  • panfrost: report correct MAX_VARYINGS

  • panvk: remove descriptor pool counters

  • panvk: enable KHR and EXT BDA

Corentin Noël (8):

  • zink: Always call deinit_multi_pool_overflow when destroying zink_descriptor_pool_multi

  • ci: Allow to override the virglrenderer render server

  • venus: sync protocol for VK_KHR_maintenance5

  • venus: enable VK_KHR_maintenance5

  • venus/ci: add more recently found flakes

  • wsi: Make sure to return a valid wayland id string

  • venus/ci: Update expectations

  • ci: Make sure to install libraries in the right directory on debian

Craig Stout (8):

  • util: detect_os: add DETECT_OS_FUCHSIA and DETECT_OS_POSIX_LITE

  • util: u_thread: add Fuchsia support

  • util: os_misc: add Fuchsia support

  • util: u_dl: add Fuchsia support

  • util: os_time: add Fuchsia support

  • vulkan/util: add missing dependencies

  • meson: remove unnecessary line continuation

  • vulkan/runtime: add spirv_info_h to vulkan_lite_runtime_header_gen_deps

Daniel Lundqvist (1):

  • radeonsi: Fix unused variable when LLVM is not used for AMD.

Daniel Schürmann (69):

  • aco/ra: fix kill flags after renaming fixed Operands

  • aco/ra: assert that the register file is empty after register allocation completed

  • aco/lower_phis: simplify check for uniform predecessors

  • aco: introduce aco_opcode::p_boolean_phi

  • aco/vn: copy-propagate trivial phis

  • aco/lower_phis: generalize init_state() so that it works with any scalar phis

  • aco/lower_phis: implement SGPR phi lowering

  • aco: use SGPR phi lowering for uniform phis in divergent merge blocks

  • aco: use SGPR phi lowering for all loop header phis

  • aco: use SGPR phi lowering for all scalar phis

  • aco/optimizer: remove p_linear_phi handling from optimizer

  • radv: mark nir_opt_loop() as not idempotent

  • radv: move nir_opt_dead_cf() before nir_opt_loop()

  • panfrost: skip gles-3.0-transform-feedback-uniform-buffer-object on Mali G52 and G57

  • nir/loop_analyze: adjust negative (or huge) iteration count check for bit size

  • nir/opt_if: don’t split ALU of phi into otherwise empty blocks

  • nir/opt_loop: add loop peeling optimization

  • aco/ra: fix handling of killed operands in compact_relocate_vars()

  • aco/ra: Fix array access when finding register for subdword variables

  • aco/ra: refactor get_reg_simple() with increased stride.

  • aco/ra: move can_write_m0() check into get_reg_specified()

  • aco/ra: re-use registers from killed operands

  • aco/ra: change heuristic to first fit

  • aco/ra: use round robin register allocation

  • aco/assembler: fix MTBUF opcode encoding on GFX11

  • aco/assembler: slightly refactor MTBUF assembly for more readability

  • aco/assembler: fix GFX67 MTBUF opcode encoding

  • aco/scheduler: remove unused register_demand parameter

  • aco: move live var information into struct Program

  • aco/reindex_ssa: replace live_var parameter with boolean

  • aco: make aco::monotonic_buffer_resource declaration visible for aco::IDSet

  • aco: use aco::monotonic_allocator for IDSet

  • spirv: make gl_HelperInvocation volatile if demote is being used

  • radv: emit discard as demote by default

  • nir: introduce discard_is_demote compiler option

  • nir/opt_peephole_select: handle nir_terminate{_if}

  • nir: remove nir_intrinsic_discard

  • zink: pass zink_screen to nir_to_spirv().

  • nir/shader_info: remove uses_demote

  • spirv: workaround for tests assuming that OpKill terminates invocations or loops

  • aco/scheduler: fix register_demand validation debug code

  • aco/spill: Unconditionally add 2 SGPRs to live-in demand

  • aco: calculate register demand per instruction as maximum necessary to execute the instruction

  • aco: track and use the live-in register demand per basic block

  • aco: remove get_demand_before()

  • aco/live_var_analysis: slightly refactor handling of additional register demand for Operand copies

  • aco/live_var_analysis: ignore dead phis

  • aco/spill: don’t remove spilled phis

  • aco/ra: use live_in_demand in should_compact_linear_vgprs()

  • aco: add RegisterDemand member to Instruction

  • aco/util: skip empty blocks in IDSet::insert(IDSet)

  • aco/live_var_analysis: refactor using ctx struct

  • aco/live_var_analysis: ignore phi definition and operand demand at predecessors

  • aco/live_var_analysis: inline block->register_demand updates

  • aco/live_var_analysis: remove unused includes

  • aco/live_var_analysis: use separate allocator for temporary live sets

  • aco/ra: remove special-casing of p_logical_end

  • nir: implement loop invariant code motion (LICM) pass

  • radv: use NIR loop invariant code motion pass

  • nir/opt_sink: ignore loops without backedge

  • aco: compute live-in variables in addition to live-out variables

  • aco/ra: use live-in variables directly rather than computing them

  • aco/spill: use live-in variables directly rather than computing them

  • aco/cssa: use live-in variables instead of live-out variables

  • aco/validate: use live-in variables for RA validation

  • aco/print_ir: print live-in instead of live-out variables

  • aco: remove live-out variables from IR

  • aco/spill: Don’t add phi definitions to live-in variables

  • util/disk_cache: enable Mesa-DB disk cache by default

Daniel Stone (27):

  • Revert “ci: disable g52”

  • gbm: Support fixed-rate compression allocation

  • venus/ci: Fix timeout

  • venus/ci: Significantly reduce CTS fraction

  • venus/ci: Temporarily disable jobs

  • dri: Fix BGR format exclusion

  • egl/surfaceless: Enable RGBA configs

  • egl/gbm: Enable RGBA configs

  • egl/dri2: Use createImageFromNames for DRM buffers

  • dri: Remove old createImageWithModifiers

  • dri: Remove createImageFromFds

  • dri: Stop answering DRI_IMAGE_ATTRIB_FORMAT

  • gallium/dri: Delete unused helper function

  • gallium/dri: Drop mesa_format indirection for lookup

  • loader/dri3: Use FourCC for create-image entrypoints

  • egl/x11: Update to createImageFromNames

  • dri: Delete createImageFromName

  • dri: Unify createImage and createImageWithModifiers

  • egl/x11: Remove __DRI_IMAGE_FORMAT remnants

  • loader/dri3: Use FourCC for buffer allocations

  • u_format: Rewrite format table to use YAML

  • format: Generate endian-independent format aliases

  • format: Generate sRGB<->linear conversions from table

  • u_format: Reword introduction

  • build: Check for PyYAML in Meson build

  • dri: Allow INVALID for modifier-less drivers

  • gbm/dri: Remove erroneous assert

Danylo Piliaiev (38):

  • tu: Handle non-overlapping WaW hazard with buffer copy/fill/update

  • tu/a7xx: Don’t set FLUSH_PER_OVERLAP_AND_OVERWRITE for feedback loops

  • tu/a750: Disable HW binning when there is GS

  • freedreno/devices: Add support for Adreno A32 (G3x Gen 2)

  • util/u_trace: Allow mixing of ArgStruct and Arg

  • tu: Add more info to renderpass tracepoint

  • vulkan/wsi: Make current_frame usable in all cases

  • util/u_trace: Pass explicit frame_nr argument to delimit frames

  • tu: Use current_frame from vk device to delimit u_trace frames

  • anv: Use current_frame from vk device to delimit u_trace frames

  • freedreno: Make fd_pps_driver.h usable without including other FD sources

  • turnip/msm: Do rd dump only when there are commands in submission

  • turnip: Implement VK_EXT_depth_clamp_zero_one

  • freedreno/a7xx: Update TPL1_DBG_ECO_CNTL1 to fix UBWC corruption

  • ir3/a7xx: Fix FS consts corruption when other FS has zero constlen

  • tu: Add LRZ disable reason to renderpass tracepoint

  • util/u_trace: Add support for fixed-length string params in tracepoints

  • tu: Add attachments’ UBWC info to renderpass tracepoint

  • freedreno/rddecompiler: Make possible to use original shader

  • freedreno/replay: Fix replaying without SET_IOVA

  • freedreno/ir3: mova has special meaning for (r) flag

  • ir3: Correctly assemble mova1 with (r) on const

  • tu: Fix issues with render_pass tracepoint

  • freedreno: Rename TPL1_DBG_ECO_CNTL1.UBWC_WORKAROUND into TP_UBWC_FLAG_HINT

  • tu: Add enable_tp_ubwc_flag_hint feature to a7xx

  • freedreno/devices: Turn off enable_tp_ubwc_flag_hint for a740 by default

  • freedreno/devices: Fix magic regs for Adreno A32

  • freedreno: Describe LRZ feedback mechanism

  • freedreno/devices: Define and appropriately set has_lrz_feedback

  • tu: Use LRZ feedback in gmem

  • tu: Enable LRZ feedback in sysmem

  • freedreno: Use LRZ feedback in gmem

  • ir3: Print bindless samp/tex ids for tex prefetch

  • ir3/tests: Make possible to specify raw instr value as uint64

  • ir3/tests: Make possible to add generated disasm tests

  • ir3: Fix decoding of stib.b/ldib.b with offset

  • turnip/kgsl: Support external memory via ION/DMABUF buffers

  • tu: Have single Flush/Invalidate memory entrypoints

Dave Airlie (27):

  • radv/video/encode: fix quality params on v2 hw.

  • Revert “zink: use a slab allocator for zink_kopper_present_info”

  • nvk: Only enable WSI modifiers if the extension is supported.

  • draw/texture: handle mip_offset[0] being != 0 for layered textures.

  • nouveau/nvc0: increase overallocation on shader bo to 2K

  • nvidia: fixup classes import and import new classes.

  • nouveau/push: add support for m2mf/i2mf to dumper

  • nouveau/nvc0: add support for using common pushbuf dumper

  • radv/video: fix layered decode h264/5 tests.

  • radv/video: use vcn ip versions for encoder detection.

  • ac/radv/radeonsi: move av1 ctx/probs size/filling to common code.

  • ac/radv/radeon: move film grain init to common code.

  • st/mesa: drop u_simple_shaders.h include where not used.

  • gallivm: create a pass manager wrapper.

  • gallivm: move ppc denorm disable to inline

  • gallivm: split some code out from init module.

  • gallivm: make lp_bld_coro.h c++ include safe.

  • gallivm: export target init code for orc-jit to reuse

  • gallivm: split out generating LLVM Mattrs

  • llvmpipe: Introduce llvmpipe_memory_allocation

  • nvk: use 2k overallocation for shader heap.

  • anv/video: use correct offset for MPR row store scratch buffer.

  • radv/video: advertise mutable/extended for dst video images.

  • draw/orcjit: supply stub function for tcs coro

  • llvmpipe/cs/orcjit: add stub function name for coro

  • gallivm/sample: fix sampling indirect from vertex shaders

  • nvc0: fix null ptr deref on fermi due to debug changes.

David (Ming Qiang) Wu (2):

  • radeonsi/vcn: set accurate size for dec header and index_codec

  • radeonsi/vcn: support DPB_MAX_RES on VCN5

David Heidelberg (58):

  • turnip: rename tu_queue_submit struct to follow ODR

  • ci: fail pipeline for users who got access to restricted traces

  • ci/traces: majanes has no longer access to the restricted traces

  • ci/deqp: correct EGL_EXT_config_select_group detection

  • egl/x11: Move RGBA visuals in the second config selection group

  • mailmap: add Freya Gentz entry

  • etnaviv: migrate from piglit include to generic deqp and toml spec

  • freedreno/ci: move platform to the deqp toml file for a530

  • freedreno/ci: move the disabled jobs from include to the main file

  • freedreno/ci: Switch a306_* to deqp-runner

  • freedreno/ci: do not depend on single job rules for another jobs

  • freedreno/ci: switch a306 to weston

  • freedreno/ci: re-enable a306_piglit

  • ci/panfrost: disable G52 until machines gets fixed

  • ci: drop unused piglit-test and integrate it into piglit-traces-test

  • freedreno/ci: Drop duplicated include and add missing stages

  • freedreno/ci: Implement nightly piglit job for Adreno 630 and 618

  • ci/freedreno: update expectations from the nightly run

  • ci: bump ANGLE

  • ci: Revert “ci: update failures list with angle for jsl, tgl”

  • ci/intel: add new jsl flake

  • ci/panfrost: Revert “ci/panfrost: disable G52 until machines gets fixed”

  • ci/alpine: re-enable Mold linker

  • ci/etnaviv: add flakes from nightly runs

  • winsys/i915: depends on intel_wa.h

  • subprojects: uprev perfetto to v45.0

  • ci/r300: update flake list from nightly reports

  • ci/nouveau: move disabled jobs back from include into main gitlab-ci.yml

  • ci/nouveau: separate HW definition from SW

  • ci/nouveau: adjust and add DEVICE_TYPE

  • ci/freedreno: a3xx will never have Vulkan support

  • docs: correct svga3d redirected URLs

  • ci/radv: dEQP-GLES3.functional.polygon_offset.fixed16_render_with_units passes now

  • ci: re-enable shader-db for nouveau

  • ci: do not build Nine in debian-build-testing

  • ci/piglit: be explicit about what we building

  • ci/lava: enable Piglit OpenCL tests so we can test rusticl on the HW

  • ci/lava: do not build Vulkan for armhf images

  • ci/lava: move wayland-protocols to the main section

  • ci/freedreno: document new failure after piglit update

  • ci/etnaviv: skip Vulkan tests on GC2000

  • ci/etnaviv: remove duplicated line from skips

  • mailmap: update my email

  • ci/arm64: rustify the build

  • ci/lava: add support for RustiCL

  • ci/meson: reuse meson installation

  • ci: move (c)bindgen to own shell script

  • ci/radv: Document recent flake

  • ci/lava: the containers take sometimes more than 60m

  • ci: propagate RUSTICL_ENABLE and DEBUG variables to the DUTs

  • rusticl: add -cl-std only when it’s not defined

  • ci/freedreno: some A306 tests now pass/skip since proper GL detection in Piglit

  • ci: introduce tool for comparing nightly runs

  • util: bump blake3 from 1.3.3 to 1.5.1, improve armv7 and aarch64 performance

  • build: pass licensing information in SPDX form

  • intel/debug: allow silencing CL warnings

  • llvmpipe: Silence “possibly uninitialized value” warning for ssbo_limit (cont)

  • ci/alpine: use llvm variables

David Rosca (39):

  • radv/video: Set correct bit depth and format for 10bit input

  • radv/video: Check encode profiles and bit depth in capabilities query

  • radv/video: Report maxBitrate in encode capabilities

  • radeonsi/vcn: Allocate session buffer in VRAM

  • radeonsi/vcn: Fix 10bit HEVC VPS general_profile_compatibility_flags

  • radeonsi/vcn: Only enable VBAQ with rate control mode

  • frontends/va: Fix AV1 slice_data_offset with multiple slice data buffers

  • Revert “radeonsi/vcn: AV1 skip the redundant bs resize”

  • frontends/va: Only increment slice offset after first slice parameters

  • radeonsi: Update buffer for other planes in si_alloc_resource

  • frontends/va: Store slice types for H264 decode

  • radeonsi/vcn: Ensure DPB has as many buffers as references

  • radeonsi/vcn: Allow duplicate buffers in DPB

  • radeonsi/vcn: Ensure at least one reference for H264 P/B frames

  • frontends/va: Fix leak when destroying VAEncCodedBufferType

  • radeonsi/vcn: Avoid copy when resizing bitstream buffer

  • frontends/va: Send all bitstream buffers to driver at once

  • frontends/va: Fix crash in vaRenderPicture when decoder is NULL

  • radv/video: Add missing VCN 3.0.2 to decoder init switch

  • radeonsi: Make si_compute_clear_image work with 422 subsampled formats

  • gallium/vl: Init shaders on first use

  • frontends/va: Don’t require exact match for packed headers

  • gallium: Add is_video_target_buffer_supported

  • radeonsi: Implement is_video_target_buffer_supported

  • frontends/va: Use is_video_target_buffer_supported for EFC

  • frontends/va: Rework EFC logic

  • frontends/va: Check if target buffer is supported in vlVaEndPicture

  • gallium: Remove PIPE_VIDEO_CAP_EFC_SUPPORTED

  • frontends/va: Simplify AV1 slice parameters handling

  • frontends/va: Move slice_data_offset to context

  • frontends/va: Rename slice_idx to have_slice_params and move to context

  • frontends/va: Support multi elements slice parameter buffers for H264/5

  • gallium: Remove pipe_h264_picture_desc.slice_parameter.slice_count

  • radeonsi/vcn: Limit size to target size in AV1 decode

  • radeonsi: Add debug option to enable low latency encode

  • radeonsi/vcn: Add low latency encode support

  • frontends/va: Support frame rate per temporal layer for AV1

  • radeonsi/vcn: Support 10bit RGB for EFC input

  • radeonsi/vcn: Add decode DPB buffers as CS dependency

Deborah Brouwer (1):

  • ci/lava: Detect a6xx gpu recovery failures

Derek Foreman (13):

  • wsi/wayland: refactor wayland dispatch

  • egl/wayland: Use loader_wayland_dispatch

  • perfetto: Add flows

  • wsi/wayland: Add perfetto flows to image acquisition and presentation

  • wsi/wayland: Add flow id to presentation feedback

  • wsi/wayland: Add timing debugging

  • perfetto: Add simple support for counters

  • wsi/wayland: Add latency information to perfetto profiling

  • perfetto: Add some functions for timestamped events

  • wsi/wayland: Add a perfetto track for image presentation

  • wsi/wayland: Add tracepoint in wsi_wl_swapchain_wait_for_present

  • wsi/wayland: Fix use after free from improperly stored VkAllocationCallbacks

  • wsi/wayland: Use different queue names for different queries

Dmitry Baryshkov (1):

  • freedreno/registers: drop display-related register files

Dmitry Osipenko (1):

  • venus: make cross-device optional

Doug Brown (1):

  • xa: add missing stride setup in renderer_draw_yuv

Dr. David Alan Gilbert (1):

  • treewide: Cleanup unused structs

Dylan Baker (12):

  • meson: use glslang --depfile argument when possible

  • clc: remove check for null pointer that cannot be true in llvm_mod_to_spirv

  • compiler/glcpp: don’t recalculate macro

  • intel/compiler: move predicated_break out of backend loop

  • anv/grl: add some validation that we’re not going to overflow

  • egl/wayland: fix memory leak in error handling case

  • compilers/clc: Add missing break statements.

  • mesa: fix memory leak when using shader cache

  • util/glsl2spirv: fixup the generated depfile when copying sources

  • tgsi_to_nir: free disk cache value if the size is wrong

  • crocus: properly free resources on BO allocation failure

  • crocus: check for depth+stencil before creating resource

Echo J (3):

  • nvk: Add sha1_h as a dependency

  • d3d10umd: Use pipe_resource_usage enum in translate_resource_usage()

  • util: Fix the integer addition in os_time_get_absolute_timeout()

Eli Schwartz (2):

  • meson: create libglsl declared dependency to propagate order-only deps

  • meson: add various generated header dependencies as order-only deps

Emma Anholt (13):

  • nir,panfrost,agx: Fix driver PIXEL_COORD_INTEGER setting and drop workaround.

  • dri: Fix a pasteo in dri2_from_names()

  • dri: Consistently use createImageWithModifiers2()

  • dri: Consistently use createImageFromFds2(), not createImageFromFds()

  • dri: Replace createImageFromDmaBufs() with createImageFromDmaBufs3()

  • dri: Drop old createImageFromRenderbuffer()

  • dri: Consistently use createImageFromDmabufs() not createImageFromFds()

  • dri: Drop createImageFromFds2() in favor of createImageFromDmaBufs()

  • dri: Move EGL image lookup/validate setup to dri_init_screen()

  • mesa: Drop some version checking around ValidateEGLImage

  • dri: Collapse dri2_validate_egl_image() into dri_validate_egl_image()

  • dri: Fold lookup_egl_image_validated into its one caller

  • dri: Drop the old lookupEGLImage wrapper function.

Eric Engestrom (295):

  • VERSION: bump to 24.2

  • docs: reset new_features.txt

  • docs: add release notes for 24.0.6

  • docs: update calendar for 24.0.6

  • docs: add an extra 24.0.x release

  • docs: add sha256sum for 24.0.6

  • docs: update calendar for 24.1.0-rc1

  • ci: fix container rules on release branches and tags

  • panvk/ci: add WSI testing to all the deqp-vk jobs

  • lavapipe/ci: add WSI testing to all the deqp-vk jobs

  • freedreno/ci: add flake

  • lavapipe/ci: add flakes

  • ci: pass MESA_VK_ABORT_ON_DEVICE_LOSS through to the DUT

  • rpi3/ci: drop duplicate comment without any corresponding actual skip line

  • v3dv/ci: skip all the WSI tests, they are way too flaky to be worth it

  • spirv: deduplicate default debug log level

  • v3dv/ci: add rpi5 failure

  • ci: mark microsoft farm as offline

  • meson: simplify `-gsplit-dwarf` compiler argument check

  • egl+glx: fix two #ifdef that should be #if like the rest

  • meson: always set USE_LIBGLVND

  • meson: use bool.to_int() instead of manually converting

  • lavapipe/ci: drop fixed test from failures

  • lavapipe/ci: add the rest of the failures introduced by the 1.3.8.2 uprev

  • lavapipe/ci: skip another test that goes over the timeout

  • meson: move tsan-blacklist.txt to build-support with the other build support files

  • llvmpipe/ci: fix indentation

  • llvmpipe/ci: only run jobs when their corresponding files are changed

  • lavapipe/ci: fix indentation

  • lavapipe/ci: avoid running all lavapipe jobs when llvmpipe ci is changed

  • lavapipe/ci: only run jobs when their corresponding files are changed

  • docs: update calendar for 24.1.0-rc2

  • llvmpipe/ci: trigger jobs on draw & gallivm changes

  • lavapipe/ci: trigger jobs on draw & gallivm changes

  • lavapipe/ci: add flakes seen lately

  • lavapipe/ci: generalize flakes list to all formats for these flaky tests

  • lavapipe/ci: skip ray tracing tests that sometimes time out

  • vc4/ci: add fails seen overnight

  • ci: uprev mold to 2.31.0

  • lavapipe/ci: skip two more timing out ray query tests

  • ci: backport fix for gl_PointSize bug in CTS

  • lavapipe/ci: move a few skips out from under the “llvm jit” comment

  • mr-label-maker: fix yaml syntax

  • docs: add release notes for 24.0.7

  • docs: update calendar for 24.0.7

  • docs: add sha256sum for 24.0.7

  • docs: update calendar for 24.1.0-rc3

  • ci/debian-build-testing: drop extra nesting section

  • ci/shader-db: drop extra nesting section

  • rpi4/ci: use deqp-runner suite for vk job as well

  • rpi5/ci: use deqp-runner suite for vk job

  • microsoft/clc: fix incorrect changes that got through while the Windows CI was down

  • llvmpipe: wrap the push/pull in the ifdef as well

  • radv/ci: add navi21 flakes

  • zink: avoid designated initializers as they are not supported in C++ < 20

  • Revert “ci: fail pipeline for users who got access to restricted traces”

  • radeonsi/ci: document new crash (assert)

  • util/format: add missing null check in util_format_is_srgb()

  • ci: drop default VKD3D_PROTON_RESULTS file name

  • ci: hardcode `-vkd3d` namespace for VKD3D_PROTON_RESULTS

  • amd/ci: track changes to VKD3D_PROTON_RESULTS files

  • mr-label-maker: mark *-vkd3d.txt files as CI results expectations files

  • ci: reuse dead .vkd3d-proton-test to make vkd3d less radv-specific

  • ci: fix section_end in debian-build-testing

  • ci: rename debian version variable job to include the word “version”

  • ci: factor out all the deps to build the debian containers into .debian-container

  • ci: inherit the debian container building infra for test container images

  • ci/b2c: rename B2C_TIMEOUT_FIRST_* to B2C_TIMEOUT_FIRST_CONSOLE_ACTIVITY_*

  • ci/b2c: rename B2C_TIMEOUT_* to B2C_TIMEOUT_CONSOLE_ACTIVITY_*

  • ci/b2c: allow setting timeouts in seconds

  • ci: drop dead VK_CPU option

  • ci/piglit-traces: drop re-definition of VK_DRIVER_FILES

  • ci/init-stage2: set VK_DRIVER_FILES for both xorg and wayland

  • ci/vkd3d: un-hardcode architecture

  • ci/vkd3d: fix version sanity check

  • ci/vkd3d: fail job when failing to get driver version

  • ci/b2c: remove dead rules: that’s always overwritten

  • ci/env: move dead-code-with-comment to the end of the list to make it clearer

  • zink/ci: rename .zink-lvp-venus-rules to .zink-venus-lvp-rules to match the rest of the names

  • README: update links to our own docs

  • docs: update calendar for 24.1.0-rc4

  • mailmap: add entry to unify Roman Stratiienko’s contributions

  • nvk/ci: add nvk job on a GA106 (RTX 3060)

  • zink/ci: add zink+nvk glcts+piglit job on a GA106 (RTX 3060)

  • zink+nvk/ci: skip glx piglit tests as they all fail

  • zink+nvk/ci: skip timing out test

  • zink+nvk/ci: skip more tests that times out

  • zink+nvk/ci: document flakes seen during stress-testing

  • zink+nvk/ci: update expected failures

  • docs: add release notes for 24.0.8

  • docs: update calendar for 24.0.8

  • docs: add sha256sum for 24.0.8

  • docs: add release notes for 24.1.0

  • docs: add sha256sum for 24.1.0

  • docs: update calendar for 24.1.0

  • ci: fix build-kernel.sh -> download-prebuilt-kernel.sh

  • ci: drop dead variables (see previous commit)

  • ci: rename debian/arm*_test to debian/baremetal_arm*_test to be clear about which infra uses that

  • ci: prepare base debian test image for multi-arch

  • ci: prepare GL debian test image for multi-arch

  • ci: prepare VK debian test image for multi-arch

  • ci/image-tags: rename DEBIAN_X86_64_TEST_*_TAG to drop the x86 mention

  • ci: add debian/arm64_test images for gl & vk

  • zink/ci: rename zink-turnip collabora rule to make it unambiguous

  • ci/b2c: add aarch64 tests for gl & vk

  • turnip/ci: add vkcts jobs on the a750

  • turnip+zink/ci: add gl & gles CTS jobs on the a750

  • nvk/ci: adjust the regex for “dut is broken and needs to be rebooted”

  • nvk/ci: mark the job as failing in case of hangs, instead of silently rebooting

  • nvk/ci: add missing .test rules to avoid running nvk tests in post-merge pipeline

  • radv/ci: move amdgpu-specific kernel message warning to src/amd/ci/

  • ci/b2c: make B2C_JOB_WARN_REGEX optional

  • zink+nvk/ci: more KHR-GL46.packed_pixels.varied_rectangle.* flakes, so mark the group as flaky

  • zink+nvk/ci: add more flakes seen in nightly

  • zink+nvk/ci: spec@ext_external_objects@vk-vert-buf-reuse has been fixed

  • mr-label-maker: label src/vulkan/wsi/ as wsi

  • .mailmap: fix email address for @cpmichael

  • v3dv/ci: fix typo in `renderer_check`

  • ci: disable debian-build-testing until it can be fixed

  • vc4/ci: skip VK piglit tests

  • freedreno/a6xx: fix kernel -> compute handling

  • zink+nvk/ci: add flakes seen in latest nightly run

  • docs/calendar: add 24.2 branchpoint and release candidates schedule

  • panfrost/ci: add missing genxml trigger path

  • panfrost: mark tests as fixed

  • etnaviv/ci: skip VK piglit tests

  • radv/ci: document angle regressions from !29436 on stoney

  • zink+nvk/ci: add flakes seen in latest nightly run

  • docs/meson: replace deprecated pkgconfig with pkg-config

  • zink+nvk/ci: add flakes seen in latest nightly run

  • v3dv: add missing bounds check in VK_EXT_4444_formats

  • docs: add release notes for 24.1.1

  • docs: add sha256sum for 24.1.1

  • docs: update calendar for 24.1.1

  • turnip/ci: add a750 flakes seen in the latest nightly

  • radv/ci: fix manual rules

  • radv/ci: move radv manual rules into their own group

  • nvk+zink/ci: add another flake seen in nightly

  • docs: add release notes for 24.0.9

  • docs: update calendar for 24.0.9

  • docs: add sha256sum for 24.0.9

  • venus/ci: add flake that’s been blocking MRs

  • v3d/drm-shim: emulate a rpi4 instead of a rpi3

  • nvk+zink/ci: add another flake seen in nightly

  • radv/ci: document navi31 regression from !29235

  • ci: set a common B2C_JOB_SUCCESS_REGEX with the message that’s printed for all jobs

  • ci/deqp: uprev gl & gles cts

  • radeonsi/ci: mark a bunch of tests as fixed on vangogh

  • radv/ci: drop duplicate navi21-aco flakes line

  • radv/ci: drop duplicate navi31-aco flakes line

  • turnip+zink/ci: mark a dEQP-GLES(2|3).functional.rasterization.(fbo|primitives).line_(strip_|)wide as fixed

  • turnip/ci: add a750 flakes seen in the latest nightly

  • panfrost/ci: remove duplicate path

  • nvk+zink/ci: mark KHR-GL46.sparse_texture2_tests.SparseTexture2* as fixed

  • nvk+zink/ci: add flakes seen in nightly pipeline

  • nvk+zink/ci: consider all the `double` tests in spec@glsl-4.00@execution@built-in-functions to be flaky

  • freedreno/ci: disable mid-testing reboot on a750

  • driconf: drop param for setting default gpu vendor id in DRI_CONF_FORCE_VK_VENDOR()

  • egl: fix teardown when using xcb

  • egl: move android-specific code into an android branch

  • egl: ensure future platforms get their teardown implemented

  • egl/device: drop unnecessary intermediate variable

  • ci: fix meson install script

  • lavapipe/ci: update trace checksum following nir change

  • lavapipe/ci: document regression while it’s being worked on

  • turnip+zink/ci: mark dEQP-GLES3.functional.fbo.depth.depth_test_clamp.* tests as fixed

  • bin/ci: escape literal url in regex

  • glx: fix build -D glx-direct=false

  • nvk+zink/ci: mark spec@ext_image_dma_buf_import@ext_image_dma_buf_import-refcount-multithread as fixed

  • nvk+zink/ci: add flakes seen over the last few nightlies

  • asahi/lib: generate git_sha1.h for agx_device.c

  • ci/vkd3d: deduplicate the diff between the expectation and the results

  • ci/vkd3d: print a message when the expected failures file is missing

  • ci/vkd3d: drop override of job artifacts

  • ci/vkd3d: fix error message printing

  • ci/vkd3d: stop ignoring errors in a block where errors can’t happen

  • ci/vkd3d: don’t ignore errors

  • ci/vkd3d: group version check lines together

  • ci/vkd3d: limit the vulkaninfo capture to the driverInfo line

  • ci/vkd3d: print a real error message when failing to get the list of failing tests

  • ci/vkd3d: rename vkd3d test log file to end in .txt

  • ci/vkd3d: print URL to the vkd3d-proton.log file to make it easier to access

  • ci/vkd3d: put `then` on the same line as the `if` to match the rest of the code style

  • ci/vkd3d: drop the “clear results folder without deleting the folder” logic

  • ci/vkd3d: drop `quiet` wrapper

  • ci/vkd3d: drop redundant “vkd3d-proton execution: SUCCESS”

  • docs: add release notes for 24.1.2

  • docs: add sha256sum for 24.1.2

  • docs: update calendar for 24.1.2

  • venus/ci: fix indentation of list nested in a dict item

  • venus/ci: add manual/nightly venus-lavapipe-full

  • venus/ci: skip timed out test

  • nvk+zink/ci: add flakes seen over the last two nightly runs

  • nvk+zink/ci: catch more `double` flakes

  • venus+zink/ci: drop fraction and add missing timeout on zink-venus-lvp

  • loader: use os_get_option() to allow android to set LIBGL_DRIVERS_PATH, GBM_BACKENDS_PATH, GALLIUM_PIPE_SEARCH_DIR

  • gallium/hud: use os_get_option() to allow android to set GALLIUM_HUD and related vars

  • egl: use os_get_option() to allow android to set EGL_LOG_LEVEL

  • venus/ci: make sure nightly job doesn’t get retried

  • venus/ci: drop fixed test from fails list

  • docs/ci: fix indentation of list nested in a dict item

  • docs/ci: merge test-docs and test-docs-mr

  • docs/ci: auto-run test-docs in fork pipelines

  • docs/ci: drop .no_scheduled_pipelines-rules from test-docs

  • ci: reorder alpine/x86_64_build rules to fix the nightly pipelines

  • drm-shim: stub syncobj_timeline_signal ioctl

  • llvmpipe/ci: add comment for later on weird-looking code

  • llvmpipe/ci: fix indentation of list nested in a dict item

  • llvmpipe/ci: set rusticl variables in deqp-runner instead of passing them down from the job

  • ci: include rusticl in the arm64 build

  • llvmpipe,rusticl/ci: move rusticl files rule out of llvmpipe

  • v3d/ci: add nightly job for rusticl testing

  • panfrost/ci: drop duplicate job rules

  • panfrost/ci: split gl & vk jobs rules

  • radeonsi/ci: mark test as fixed

  • lavapipe/ci: skip timing out test

  • broadcom/ci: disable auto-retry on manual jobs

  • docs/features: mark VK_KHR_maintenance7 as implemented on anv and lvp

  • docs: add release notes for 24.1.3

  • docs: update calendar for 24.1.3

  • docs: add sha256sum for 24.1.3

  • ci_run_n_monitor: add support for new `canceling` job status

  • ci_run_n_monitor: be coherent about using sets for `element in group` checks

  • ci_run_n_monitor: use COMPLETED_STATUSES in more places

  • ci_run_n_monitor: add RUNNING_STATUSES and use it where appropriate

  • bin/ci: allow bugfixes in requirements.txt

  • ci: split .no-auto-retry out of .scheduled_pipeline-rules

  • ci: simplify setting .no-auto-retry now that it isn’t bundled with unrelated rules:

  • v3d/ci: include results of GL full run in expectations

  • v3d/ci: include results of CL run in expectations

  • zink+nvk/ci: ascii-sort fails

  • zink+nvk/ci: document regression from !30033

  • turnip+zink/ci: add two more CS related flakes

  • lvp+zink/ci: document a flake seen in a merge pipeline

  • v3d/ci: add disabled job for GL testing on the RPi5

  • v3d/ci: rename “rusticl on v3d” suite to `v3d-rusticl`

  • v3d/ci: add disabled job for CL testing on the RPi5

  • features.txt: specify that VK_EXT_depth_clamp_zero_one is only supported on v3dv/vc7+

  • features.txt: specify that VK_EXT_depth_clip_enable is only supported on v3dv/vc7+

  • features.txt: specify that GL_ARB_depth_clamp is only supported on v3d/vc7+

  • docs: add release notes for 24.1.4

  • docs: update calendar for 24.1.4

  • docs: add sha256sum for 24.1.4

  • ci: replace gallium-drivers=swrast with gallium-drivers=llvmpipe,softpipe

  • bin/ci_run_n_monitor: explain that the ‘Universal Recycling symbol’ ♲ emoji means these jobs were cancelled

  • bin/ci_run_n_monitor: add text labels next to the emojis

  • bin/ci_run_n_monitor: replace ♲ with 🗙 to represent cancelled jobs

  • meson: fix filename printed when generating devenv files

  • meson/megadriver: fix install message to match the rest of meson

  • meson/megadriver: stop removing the “master” .so file

  • meson/megadriver: replace hardlinks with symlinks

  • ci/vkd3d: fix LD_LIBRARY_PATH

  • v3d/ci: mark spec@amd_performance_monitor@vc4 tests as fixed

  • llvmpipe/ci: mark spec@!opengl 1.1@gl_select tests as fixed

  • Revert “bin/ci_run_n_monitor: explain that the ‘Universal Recycling symbol’ ♲ emoji means these jobs were cancelled”

  • VERSION: bump for 24.2.0-rc1

  • .pick_status.json: Update to 0cc23b652401600e57c278d8f6fe6756b13b9f6a

  • radeonsi/ci: skip timing out test

  • freedreno/ci: double job timeout for a306

  • freedreno/ci: document extra variants of failing tests on a618 and a630

  • anv+zink/ci: mark some tests as fixed

  • anv+zink/ci: document two tests, one failing and one crashing

  • anv+zink/ci: mark a couple of tests as flaky

  • venus/ci: skip timing out test

  • loader: gc loader_get_extensions_name() and __DRI_DRIVER_{GET_,}EXTENSIONS defines

  • .pick_status.json: Update to 3b6867f53a6718de80bbff4acb84ffd5aca8a8c8

  • nak: fix meson typo

  • venus: initialize bitset in CreateDescriptorPool()

  • v3d/ci: mark spec@amd_performance_monitor@vc4 tests as flaky

  • meson: xcb & xcb-randr are needed by the loader whenever x11 is built

  • .pick_status.json: Update to c30e5d44b1027ed03a8fd542829df0055d3e1a96

  • .pick_status.json: Update to 6cd4372460b197fea98d257217328ddc3406e6ad

  • docs: add stub header for u_format_gen.h

  • .pick_status.json: Update to c33d2db06ac0ea4d3d5372caa93bee3bbbe028c7

  • VERSION: bump for 24.2.0-rc2

  • .pick_status.json: Update to ad90bf0500e07b1bc35f87a406f284c0a7fa7049

  • ci/baremetal: fix logic for retrying boot when it failed

  • meson: don’t select the deprecated `swrast` option ourselves

  • meson: improve wording of “incompatible llvm options” error

  • ci: remove llvmpipe in the job that disables llvm

  • .pick_status.json: Update to aa9745427b917bb0613b753ccd59c6c1e6f07584

  • VERSION: bump for 24.2.0-rc3

  • .pick_status.json: Update to 366e7e2ddc7d3b340bbf040eca1d3223219e6122

  • meson,ci: remove dead `kmsro` option in `gallium-drivers`

  • .pick_status.json: Mark 93f9afa1e039cbf681adcc6d170aec987d9f0f65 as denominated

  • .pick_status.json: Mark f427c9fe233e862bfa30d0c7441ce77592ce4654 as denominated

  • .pick_status.json: Update to d58f7a24d1be7b8b50ebdc0c1c3ce26bd65317a5

  • .pick_status.json: Update to d9849ac46623797a9f56fb9d46dc52460ac477de

  • .pick_status.json: Update to ef88af846761ca9e642f7ed46011db7d3d6b61fd

  • VERSION: bump for 24.2.0-rc4

  • .pick_status.json: Update to c90e2bccf756004e48f9e7e71e555db0d03c1b98

  • ci: pass MESA_SPIRV_LOG_LEVEL from job to the test

  • android: fix build in multiple ways

  • .pick_status.json: Update to 214b6c30406f844560bdf35a54ff8a51ee248709

  • .pick_status.json: Update to cc2dbb8ea5329b509d79eedb6c0cbb9a1903b5ad

Eric R. Smith (8):

  • panfrost: add a barrier when launching xfb jobs in CSF

  • get_color_read_type: make sure format/type combo is legal for gles

  • glsl: test both inputs when sorting varyings for xfb

  • glsl: make the xfb varying sort stable

  • panfrost: fix some omissions in valhall flow control

  • panfrost: change default rounding mode for samplers

  • panfrost: fix texture.border_clamp regression for valhall

  • panfrost: use RGB1 component ordering for R5G6B5 pixel formats

Erico Nunes (6):

  • ci: lima farm maintenance

  • lima/ci: update piglit ci expectations

  • Revert “ci: lima farm maintenance”

  • lima: fix surface reload flags assignment

  • mesa/st: don’t set lower_fdot in draw_nir_options

  • dri: fix sun4i-drm driver name

Erik Faye-Lund (106):

  • panfrost: add PAN_MAX_TEXEL_BUFFER_ELEMENTS define

  • panfrost: clamp buffer-size to max-size

  • panfrost: remove nonsensical assert

  • panfrost: do not deref potentially null pointer

  • panfrost: check return-value from u_trim_pipe_prim

  • panfrost: assert that drmSyncobjWait returns 0

  • panfrost: check return-code of drmSyncobjWait

  • panfrost: correct first-tracking for signature

  • panvk: drop needless null-check

  • panvk: do not leak bindings

  • panvk: drop needless null-checks

  • panvk: avoid dereferencing a null-pointer

  • docs/panfrost: compact gpu-table

  • docs/panfrost: move details to separate articles

  • docs/panfrost: link to conformant products

  • panfrost: simplify panfrost_texture_num_elements

  • panfrost: explicitly loop over surfaces

  • panfrost: untangle faces from layers

  • util/format: correct a typo

  • mesa/main: rewrite mipmap generation code

  • mesa/main: remove unused function

  • mesa/main: rework GL_IMAGE_PIXEL_TYPE query

  • mesa/main: clean up _mesa_uncompressed_format_to_type_and_comps

  • mesa/main: clean up switch statement

  • mesa/main: do not return _REV format for uncompressed format

  • mesa/main: prefer non-suffixed enums

  • mesa/main: fixup indent

  • mesa/main: updates for EXT_texture_format_BGRA8888

  • docs: wrap long words instead of overflowing

  • meson: bump test-timeout

  • mesa/main: remove unused function

  • panfrost: lower maxVertexInputStride to match vulkan runtime

  • mesa/main: remove stale prototype

  • mesa/main: remove duplicate error-checks

  • mesa/main: require EXT_texture_integer for GL 3.0

  • mesa/main: do not allow RGBA_INTEGER et al in gles3

  • mesa/main: factor out format/type enum checking

  • mesa/main: use extension-helper

  • mesa/main: tighten rg/half-float interaction

  • mesa/main: use _mesa_is_gles1()-helper

  • mesa/main: remove needless check

  • mesa/main: simplify conditions

  • mesa/main: merge identical checks

  • panvk: move macro-definition to header

  • mailmap: invert tomeu’s mapping

  • mailmap: merge Robert and Bob Beckett into one

  • mailmap: invert my mailmapping

  • mailmap: map collabora.co.uk to collabora.com

  • mailmap: move konstantin to the right sorted position

  • mailmap: use consistent spelling for constantine

  • mailmap: update rohan’s primary email address

  • nir: fix utf-8 encoding-issue

  • Revert “docs: use html_static_path for static files”

  • docs: edgeflag -> edge flag

  • docs: zink -> Zink

  • docs: Anv -> ANV

  • docs: tgsi -> TGSI

  • docs: hw -> HW

  • docs: mooth -> smooth

  • docs: unify spelling of front/back-facing

  • docs: eg. -> e.g.

  • docs: url -> URL

  • docs: nabled -> enabled

  • docs: sommelier -> Sommelier

  • docs: remove apostrophe from uppercased

  • docs: utrace -> trace

  • docs: google -> Google

  • docs: Nvidia -> NVIDIA

  • docs: ssbo/ubo -> SSBO/UBO

  • docs: cpu -> CPU

  • docs: gpu -> GPU

  • docs: renderpass -> render pass

  • docs: spell out “stencil reference”

  • docs: submision -> submission

  • docs: Steamos -> SteamOS

  • docs: colour -> color

  • docs: occured -> occurred

  • docs: precidence -> precedence

  • docs: undifined behaviour -> undefined behavior

  • docs: debian -> Debian

  • docs: zink -> Zink

  • docs: vulkan -> Vulkan

  • docs: attachements -> attachments

  • docs: acress -> across

  • docs: pluggins -> plug-ins

  • docs: pusbuf -> pushbuf

  • docs: metadatas -> metadata

  • docs: use os.pardir

  • docs: allow out-of-tree docs build

  • meson: build html-docs

  • docs: automatically generate depfile

  • meson: error when missing hawkmoth

  • meson: allow specifying html-docs-path

  • ci: build docs using meson

  • panvk: support x11 wsi

  • vulkan/runtime: tne -> the

  • vulkan/runtime: initizlie -> initialize

  • vulkan/runtime: abreviation -> abbreviation

  • vulkan/runtime: multiesample -> multisample

  • vulkan/runtime: implementaiton -> implementation

  • docs: fix bootstrap-extension

  • docs/panfrost: fix numbered list

  • docs/panfrost: fix math-notation

  • docs/panfrost: use math-role more

  • docs/panfrost: use c:func-role for function

  • docs/panfrost: quote identifiers

Esdras Tarsis (1):

  • nvk: Enable 8bit and 16bit access in VK_KHR_workgroup_memory_explicit_layout.

Faith Ekstrand (297):

  • nak: Don’t saturate depth writes

  • nvk: Only clip Z with the guardband

  • nouveau/class_parser.py: Fix the docs for --out-rs

  • nvk: Advertise VK_EXT_pipeline_robustness

  • nouveau/headers: Clean up the meson a bit

  • spirv: Auto-generate spirv_info.h

  • spirv: Update the JSON and headers

  • spirv: Better handle duplicated enums in the JSON parser

  • spirv: Generate a spirv_capabilities struct

  • spirv: Record capabilities rather than ad-hoc bools

  • mesa: Stop pretending to support SPV_AMD_gcn_shader in OpenGL

  • spirv: Move the old AMD extensions out of capabilities

  • spirv: Move the printf enable out of capabilities

  • spirv: Add supported_capabilities to vtn_builder

  • spirv: Use supported_capabilities for various checks

  • spirv: Drop the SubgroupUniformControlFlow check

  • spirv: Add a table of all implemented capabilities

  • spirv: Check capabilities using the supported_capabilities table

  • spirv: Add support for specifying caps through the new struct

  • spirv: Use spirv_capabilities in tests

  • mesa: Flip the script on SPIR-V extension enabling

  • mesa: Use the new spirv_capabilities struct

  • clover: Use the new spirv_capabilities struct

  • rusticl: Use the new spirv_capabilities struct

  • vulkan: Set SPIR-V caps from supported features

  • radv: Use vk_physical_device_get_spirv_capabilities()

  • intel/kernel: Use the new capabilities struct

  • asahi/clc: Use the new spirv_capabilities struct

  • zink: Use the new spirv_capabilities struct

  • anv: Use spirv_capabilities for the float64 shader

  • ir3: Use spirv_capabilities in ir3_cmdline

  • microsoft: Use spirv_capabilities for spirv_to_dxil

  • spirv: Get rid of the old caps struct

  • nvk: Re-emit sample locations when rasterization samples changes

  • nvk/meta: Restore set_sizes[0]

  • nvk: Get rid of sets_dirty

  • nvk: Don’t rely on push_dirty for which push sets exist

  • nouveau/headers: Add a bool for whether or not to dump offsets

  • nvk/upload_queue: Only upload one line of data

  • nvk/upload_queue: Add some useful asserts

  • nvk/upload_queue: Add a _fill method

  • nvk: Use the upload queue for NVK_DEBUG=zero_memory

  • nvk: Improve the GetMemoryFdKHR error

  • nouveau/winsys: Take a reference to BOs found in the cache

  • nouveau/winsys: Make BO_LOCAL and BO_GART separate flags

  • nvk: Allow GART for dma-bufs

  • nil: Use the right PTE kind for Z32 pre-Turing

  • nvk: Set color/Z compression based on nil_image::compressed

  • nil: Default to NV_MMU_PTE_KIND_GENERIC_MEMORY on Turing+

  • nvk: Allow VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT

  • drm-uapi: Sync nouveau_drm.h

  • nouveau/winsys: Add back nouveau_ws_bo_new_tiled()

  • nvk: Support image creation with modifiers

  • nvk: Set tile mode and PTE kind on dedicated dma-buf BOs

  • nvk: Implement DRM format modifier queries

  • nvk: Advertise VK_EXT_queue_family_foreign

  • nvk: Advertise VK_EXT_image_drm_format_modifier

  • vulkan/wsi: Bind memory planes, not YCbCr planes.

  • nvk/wsi: Advertise modifier support

  • zink: Set workarounds.can_do_invalid_linear_modifier for NVK

  • nvk: Fix misc. whitespace and style issues

  • nvk: Go wide for query copies

  • nvk: Store descriptor set addresses in descriptor state

  • nvk: Add static asserts for nvk_buffer_address layout

  • nvk: Store an nvk_buffer_address for each set in the root table.

  • nvk: Advertise 32 descriptor sets

  • nvk: Move and better document set_dynamic_buffer_start

  • nvk: Add an NVK_MAX_SAMPLES #define

  • nvk: Refactor nvk_meta_begin() to use a desc helper

  • nvk/meta: Save and restore set_dynamic_buffer_start

  • nak: Emit !PT for carries on IADD3

  • nak: Add with -0 for fabs()

  • nak: Don’t emit a plop3 for immediate shift sources

  • nak: Encode LDC directly

  • vulkan: Update XML and headers to 1.3.286

  • spirv: Update the JSON and headers

  • nir: Handle cmat types in lower_variable_initializers

  • spirv: Handle constant cooperative matrices in OpCompositeExtract

  • spirv: Assert that non-vector composites have the right length

  • spirv: Implement SPV_EXT_replicated_composites

  • nvk: Advertise VK_EXT_shader_replicated_composites

  • anv: Advertise VK_EXT_shader_replicated_composites

  • hasvk: Advertise VK_EXT_shader_replicated_composites

  • radv: Advertise VK_EXT_shader_replicated_composites

  • turnip: Advertise VK_EXT_shader_replicated_composites

  • lavapipe: Advertise VK_EXT_shader_replicated_composites

  • dozen: Advertise VK_EXT_shader_replicated_composites

  • nir/print: Improve divergence information

  • nak: Fix NAK_DEBUG=serial for warp barriers

  • nak: Only convert the written portion of the buffer in NirInstrPrinter

  • nak: Fix BasicBlock::phi*() for OpAnnotate

  • nak: BMov is always variable-latency

  • nak: Only copy-prop neg into iadd2/3 if no carry is written

  • nak: Get rid of OpINeg

  • nak: Expose a BasicBlock::map_instrs() helper

  • nak: Add some helpers for uniform instructions and registers

  • nak: Add OpR2UR

  • nak: Clean up bindless cbuf handles

  • nak/ra: Move an assert

  • nak: Make SSARef::file() return Option<RegFile>

  • nak: Drop BasicBlock::new()

  • nak: Add a concept of uniform blocks

  • nak/to_cssa: Resolve phi register file mismatches

  • nak/ra: Spill UGPRs and UPreds

  • nak/ra: Never move uniform regs in non-uniform blocks

  • nak: Support uniform regs in lower_copy_swap()

  • nak/sm70: Defer ALU src processing until encode_alu()

  • nak/sm70: Rework ALU source encode helpers

  • nak/sm70: Add support for encoding uniform ALU ops

  • nak/sm70: Fix encoding of fadd/fsetp and friends with UGPRs

  • nak/sm70: Implement a bunch of uniform ops on SM75+

  • nak/legalize: Fold immediate sources before instructions

  • nak/legalize: Drop some pointless plop3 logic

  • nak/legalize: Be more precise about shfl and out

  • nak/legalize: Fix imad and ffma legalization on SM50

  • nak/legalize: Patch a RegFile through to copy helpers

  • nak/legalize: Handle uniform sources in warp instructions

  • nak/legalize: Ensure all SSA values for a given ref are in the same file

  • nak/legalize: Copy uniform vectors in non-uniform control-flow

  • nak/legalize: Uniform instructions can’t have cbuf sources

  • nak/legalize: Explicitly ignore OpPhiSrcs and OpPhiDsts

  • nak/calc_instr_deps: Rename a couple variables

  • nak/calc_instr_deps: Rewrite calc_delays() again

  • nak/calc_instr_deps: Add latencies for uniform instructions

  • nak: Add a opt_uniform_instrs() pass

  • nak/copy_prop: Rewrap a couple comments

  • nak/copy_prop: Don’t propagate UBOs into uniform instructions

  • nak/lower_cf: Parent scopes are never NULL

  • nak/lower_cf: Track block divergence

  • nak: Convert to LCSSA before divergence analysis

  • nak/lower_cf: Flag phis as convergent when possible

  • nak/from_nir: Clean up phi annotations

  • nak: Add a UniformBuilder

  • nak/from_nir: Emit uniform instructions when !divergent

  • nak/sm70: Properly encode bindless cbufs

  • nak/dce: Account for bindless CBuf handles

  • nak/calc_instr_deps: Account for bindless CBufs

  • nak/bitset: Add an iterator

  • nak/ra: Handle bindless CBufs

  • nak/ra: Pull searching for unused/unpinned regs into a helper

  • nak/ra: Rename PinnedRegAllocator to VecRegAllocator

  • nak/ra: Add a concept of pinned registers to RegAllocator

  • nak: Add OpPin and OpUnpin

  • nak/legalize: Allow pinned uniform vectors in non-uniform blocks

  • nak/legalize: Bindless cbufs must be pinned in non-uniform blocks

  • nak/copy_prop: Don’t propagate bindless cbufs into non-uniform blocks

  • nir: Add some new _nv intrinsics

  • nvk,nak: Switch to nir_intrinsic_ldc_nv

  • nak: Implement r2ur_nv

  • nak: Implement [un]pin_cx_handle_nv

  • nir: Add nir_foreach_block_in_cf_node_safe() iterators

  • nak: Lower non-uniform ldcx_nv to global loads

  • nak: Implement nir_intrinsic_ldcx_nv

  • nvk: Split SSBO and UBO address formats

  • nvk: Split write_[dynamic_]buffer_desc into UBO and SSBO variants

  • nvk: Align buffer descriptors

  • nvk: Rename nvk_cmd_buffr_get_cbuf_descriptor()

  • nvk: Make nvk_min_cbuf_alignment() inline

  • nvk/lower_descriptors: Add a descriptor_type_is_ubo/ssbo() helper

  • nvk: Move the zero offset optimization to load_descriptor_for_idx_intrin()

  • nvk: Allow the cbuf optimization for VK_DESCRIPTOR_TYPE_MUTABLE_EXT

  • nvk/descriptor_set_layout: Record which dynamic buffers are UBOs

  • nvk: Use bindless cbufs on Turing+

  • nvk: Be much more conservative about rebinding cbufs

  • nvk: Use cbuf loads for variable pointers dynamic SSBO descriptors

  • nvk: s/draw_idx/draw_index/g

  • nvk: Pass the base workgroup and global size to flush_compute_state()

  • nvk: Use helper macros for accessing root descriptors

  • nvk: Pass the queue to draw/dispatch_state_init()

  • nvk: Use inline constant buffer updates for CB0

  • nvk: Only write draw parameters to cb0 when they change

  • nvk: Refactor build_cbuf_map()

  • nak,nir: Drop r2ur_nv in favor of as_uniform

  • nouveau: Fix a race in nouveau_ws_bo_destroy()

  • nvk: Use NVK_VK_GRAPHICS_STAGE_BITS in dirty_cbufs_for_descriprots()

  • nvk: Dirty cbufs in CmdPushDescriptorSetWithTemplate2KHR

  • intel/blorp: Set nir_shader::options up-front before building

  • util/format_pack: Fix packing of signed 1010102 SSCALED formats

  • util/format_pack: Also use iround for SCALED formats

  • util/format_pack: Clamp SNORM values to [-1, 1] when unpacking

  • util/format: Round to nearest even when converting to R11G11B10F

  • util/format: Handle denorms when converting to R11G11B10F

  • nir/format_convert: Smash NaN to 0 in pack_r9g9b9e5()

  • nir/format_convert: Use fmin/fmax to clamp R9G9B9E5 data

  • nir: Add a nir_intrinsic_use for unit tests

  • nir: Move most of nir_format_convert to a C file

  • nir: Support 0 and 32 bits in some format conversion helpers

  • util: Make format_srgb.h C++ safe

  • nir: Add a format pack helper and tests

  • nir: Add a format unpack helper and tests

  • nir/format_convert: Assert that UNORM formats are <= 16 bits

  • ci: Update trace SHAs

  • vulkan/meta: Use demote instead of discard

  • nvk: Fix whitespace issues around conservative rasterization

  • nvk: Re-order conservative rasterization checks

  • nvk: Don’t emit conservative rasterization before Maxwell B

  • nvk: Silently fail to enumerate if not on nouveau

  • util/cnd_monotonic: Move the guts to a c file

  • util/cnd_monotonic: Use a void * on Windows

  • vulkan/wsi/wayland: Use mtx_t and u_cnd_monotonic

  • vulkan/wsi/x11: Use c11/threads for thread spawning

  • vulkan/wsi/x11: Use mtx_t and u_cnd_monotonic

  • vulkan/wsi/display: Use mtx_t and u_cnd_monotonic

  • vulkan/wsi/queue: Use mtx_t and u_cnd_monotonic

  • vulkan/wsi: Delete wsi_init_pthread_cond_monotonic

  • vulkan: Use u_cnd_monotonic for vk_sync_timeline

  • nvk: Why are nvk_image.c/h writeable?

  • nvk: Bump the sparse alignment requirement on buffers to 64K

  • nvk: Align sparse-bound images to the sparse binding size

  • nvk: Fetch debug flags from the physical device

  • nvk: Initialize the debug flags in nvk_instance

  • nvk: Add the start of a KMD abstraction

  • nvk/nvkmd: Implement dev and pdev for nouveau

  • nvk: Use the NVKMD interface for device enumeration

  • nvk/nvkmd: Add memory and virtual address interfaces

  • nvk/nvkmd: Implement the mem and va interfaces for nouveau

  • nvk: Add static wrappers for image/buffer binding

  • nvk: Use nvkmd_mem for nvk_device_memory

  • nvk: Use nvkmd_mem for nvk_image::linear_tiled_shadow_mem

  • nvk: Use nvkmd_mem for nvk_cmd_pool

  • nvk: Use nvkmd_mem for nvk_descriptor_pool

  • nvk: Use nvkmd_mem in nvk_upload_queue

  • nvk: Use nvkmd_mem for descriptor tables

  • nvk: Use nvkmd_mem for shader and event heaps

  • nvk: Use nvkmd_mem for query pools

  • nvk: Use an nvkmd_mem for the SLM area

  • nvk: Drop extra_bos from nvk_queue_submit_simple()

  • nvk: Use nvkmd_mem for the nvk_queue_state::push

  • nvk: Use nvkmd_mem for the zero page, VAB, and CB0

  • nvk/nvkmd: Add a context interface

  • nvk/nvkmd: Implement nvkmd_ctx for nouveau

  • nvk: Convert the upload queue to nvkmd_ctx

  • nvk: Use an nvkmd_ctx for sparse binding

  • nvk: Use nvkmd_ctx for queue submit

  • nvk: Remove the last vestiges of nouveau/winsys from core NVK

  • nouveau/mme: Don’t dereference an empty vector

  • nouveau/mme: Don’t leak data_bo

  • nouveau/mme: Use fixed BO addresses in the MME test

  • nvk: Move Heaps and BO binding into nvkmd

  • nvk: Move debug flags into nvk_debug.h

  • nvk/nvkmd: Plumb parent pointers through everywhere

  • nvk/nvkmd: Re-implement NVK_DEBUG=vm

  • nvk: Do mem maps directly in nvkmd on nouveau

  • nvk/nvkmd: Add real mem<->bo flag translation

  • nvk/nvkmd: Flip the script on NO_SHARED

  • nvk: Drop nvk_buffer::is_local

  • nvk/nvkmd: Rework memory placement flags

  • nvk/nvkmd: Be more specific about memory alignments

  • nvk/nvkmd: Be a lot more pedantic about VA alignments

  • nvk: Put CB0 in VRAM

  • nvk: Put descriptors in VRAM

  • nouveau/push: Cache the last header DW to avoid read-back

  • nak/sph: Stop storing the shader model in ShaderProgramHeader

  • nak: Move encode_sm* to sm*.rs

  • nak/sm50: Get rid of the hand-rolled align_up/down() helpers

  • nak: Plumb a ShaderModel trait through everywhere

  • nak/ra: Move the NAK_DEBUG=spill logic into RA

  • nak: Move RegFile::num_regs() into ShaderModel

  • nak: Move Instr::can_be_uniform() into ShaderModel

  • nak: Move instruction encoding into ShaderModel

  • nak/sm70: Move instruction encoding into a trait

  • nak/sm70: Re-organize the code a bit

  • nak/legalize: Move a bunch of helpers to a trait

  • nak/legalize: Handle OpBreak and OpBSSy specially

  • nak/legalize: Handle RA instructions up-front

  • nak/sm70: Move legalization into SM70Op

  • nak/sm50: Move instruction encoding into a trait

  • nak/sm50: Move legalization into SM50Op

  • nak: Add a legalize() method to ShaderModel

  • nak/sm50: Re-order all the ops

  • nak/sm50: Fix immediates for IMnMx

  • zink/kopper: Set VK_COMPOSITE_ALPHA_OPAQUE_BIT when PresentOpaque is set

  • nak: gather instr count explicitly

  • nvk/nvkmd: nouveau uses the OS page size

  • nvk: Drop the sparse alignment back down to 4096

  • nvk: Use the page size queried from NVKMD

  • nak/nir: Use an indirect load for sample locations

  • nak/copy_prop: Propagate OpSel with a selector of SrcRef::Zero

  • nak/copy_prop: Ignore the top 16 bits of OpPrmt::sel

  • nak: Don’t print the destination of OpIpa twice

  • nir,nak: Add a nir_op_prmt_nv

  • nak/nir: Use prmt in texture lowering

  • nak/nir: Use prmt for barycentric offset lowering

  • nak/nir: Make interpolate_at_sample more efficient

  • nak: Add some helpers for working with OpPrmt selectors

  • nak: Optimize nested OpPrmt

  • nak: Add a pass macro for more consistent debug printing

  • nak: Run copy-prop again after opt_prmt and opt_lop

  • nvk: Fix indirect cbuf binds pre-Turing

  • nvk: Don’t advertise sparse residency on Maxwell A

  • nvk: Reject sparse images on Maxwell A and earlier

  • nak/spill_values: Don’t assume no trivial phis

  • meson/megadriver: Don’t invoke the megadriver script with no drivers

  • nak: Sample locations are byte-aligned

  • nvk: Require color or depth/stencil attachment support for input attachments

  • nvk: Support STORAGE_READ_WITHOUT_FORMAT on buffers

  • zink: Align descriptor buffers to descriptorBufferOffsetAlignment

Francisco Jerez (33):

  • intel/brw/xe2+: Keep PS sample mask in the f1.0 register whether or not kill is used.

  • intel/brw: Don’t emit Z coordinate interpolation if CPS isn’t in use.

  • intel/brw/xe2+: Fix indirect extended descriptor setup for scratch space.

  • iris: Allocate fixed amount of space for blend state.

  • blorp: Allocate fixed amount of space for blend state.

  • intel/brw/xe2+: Don’t use SEL peephole on 64-bit moves.

  • intel/brw/xe2+: Fix 64-bit subgroup scan intrinsics not to rely on SEL instructions.

  • intel/brw/xe2+: Lower 64-bit SHUFFLE and CLUSTER_BROADCAST.

  • intel/xe2+: Enable native 64-bit integer arithmetic.

  • nir: Add option to lower 64-bit uadd_sat.

  • intel/brw/xe2+: Lower 64-bit integer uadd_sat.

  • intel/brw/xe2+: Round up spill/unspill data size to nearest reg_size multiple.

  • intel/xe2+: Enable native 64-bit integer arithmetic.

  • iris,anv/xe2+: Enable the DX10/OGL border mode for YCrCb as per Wa_14014226147.

  • iris,anv/xe2+: Set tessellation redistribution regions per patch to recommended values.

  • iris,anv/xe2+: Use pipelined variant of 3DSTATE_DRAWING_RECTANGLE.

  • intel/brw/xe2+: Use active-thread-only barriers available since Xe2+.

  • iris/xe2+: Fix format of scratch space surface address in various 3DSTATE packets.

  • anv/xe2+: Fix format of scratch space surface address in various 3DSTATE packets.

  • intel/fs/gfx20+: Fix surface state address on extended descriptors for NIR scratch intrinsics.

  • intel/fs/xe2+: Ask driver for PS payload registers based on barycentric load intrinsics in use.

  • iris/gfx11+: Request PS payload fields for ALU-based interpolation via 3DSTATE_PS_EXTRA.

  • anv/gfx11+: Request PS payload fields for ALU-based interpolation via 3DSTATE_PS_EXTRA.

  • intel/fs/xe2+: Don’t lower barycentric load offsets to fixed-point format on Xe2+.

  • intel/fs/xe2+: Add ALU-based implementation of barycentric interpolation at a per-channel offset.

  • intel/fs/xe2+: Add ALU-based implementation of barycentric interpolation at a per-channel sample.

  • intel/dev: Add GRF size information to the intel_device_info struct.

  • anv/xe2+: Align push constant ranges to GRF boundaries.

  • intel/brw: Implement null push constant workaround.

  • intel/dev: Add devinfo flag for TBIMR push constant workaround.

  • anv/gfx12.5: Pass non-empty push constant data to PS stage for TBIMR workaround.

  • iris/gfx12.5: Pass non-empty push constant data to PS stage for TBIMR workaround.

  • iris: Pin pixel hashing table BO from iris_batch submission instead of from iris_state.

Friedrich Vock (7):

  • aco/tests: Insert p_logical_start/end in reduce_temp tests

  • aco/spill: Insert p_start_linear_vgpr right after p_logical_end

  • radv: Use max_se instead of num_se where appropriate

  • radeonsi: Use max_se instead of num_se where appropriate

  • radv/rt: Fix memory leak when compiling libraries

  • aco/spill: Don’t spill phis with all-undef operands

  • aco: Limit rt stages to 128 vgprs

GKraats (3):

  • i915g: fix generation of large mipmaps

  • i915g: fix mipmap-layout for npots

  • i915g: fix max_lod at mipmap-sampling

Ganesh Belgur Ramachandra (4):

  • radeonsi: add GL_EXT_texture_filter_minmax extension

  • radeonsi: add GL_ARB_texture_filter_minmax extension

  • radeonsi: fix epitch on chips without image opcodes (e.g. gfx940)

  • amd/common: skip lane size determination for chips without image opcodes (e.g. gfx940)

Georg Lehmann (88):

  • aco/tests: don’t use undef for descriptors

  • aco/tests/post_ra: fix various validation errors

  • aco/lower_to_hw: fix v_cvt_pk_u16_u32 instruction format

  • aco/lower_to_hw: fix 16bit p_insert on gfx8

  • aco/tests: validate before and after post-ra tests

  • spirv: preserve signed zero in modf

  • aco/lower_to_hw: don’t use regClass to identify subdword reductions

  • aco: add a subdword lowering pass

  • aco: add tests for lower_subdword

  • aco/ra: remove gfx6/7 subdword paths

  • aco/lower_to_hw: remove gfx6/7 subdword paths

  • ac/nir: explicitly use pack_half_2x16_rtz

  • radv, radeonsi: don’t use D16 for f2f16_rtz

  • radv: always run nir_opt_16bit_tex_image

  • nir/opt_16bit_tex_image: pass options to opt_16bit_dest

  • nir/opt_16bit_tex_image: optimize packed conversions too

  • aco/gfx11+: use v_cvt_pk_u8_f32 for 8bit constant copies

  • aco/gfx10: use v_add_u16 with literal for constant copies

  • aco/tests: simplify small constant copy test

  • aco/gfx11+: optimize v_fma_mix throughput

  • zink: use bitcasts instead of pack/unpack double opcodes

  • aco/gfx11: use v_swap_b16

  • aco/optimizer: remove ineffective vcc opt

  • aco/optimizer: remove ineffective undef opt

  • aco: remove perfwarn

  • aco: don’t pass program to emit_bpermute

  • aco/lower_to_hw: add copy_constant_sgpr

  • aco: small constant copy optimizations

  • aco/lower_to_hw: use copy_constant_sgpr for masks

  • aco/lower_to_hw: optimize split 64bit constant copies

  • aco/optimizer: use p_create_vector to create mask when a copy can’t be used

  • nir: remove unpack_half_flush_to_zero

  • nir/opt_uniform_atomics: handle inverse_ballot when detecting single lane ifs

  • aco: optimize branching sequence with p_create_vector exec producer

  • nir: sink/move inverse_ballot like moves

  • ac: set has_pack_32_4x8

  • nir: lower pack_uvec4_to_uint to pack_32_4x8 if supported

  • nir/opt_algebraic: alternative 8bit pack_[us]norm_4x8 lowering

  • aco: rework how affinities for acc operands are determined

  • aco: add affinities for possible sopk optimizations

  • aco/gfx11+: fix inline constants for v_pk_fmac_f16

  • aco: move literal unswizzle opt to RA

  • aco/ra: use a switch to check vop2acc instruction support

  • aco: move s_add_u32 -> s_addk_i32 optimization fully to ra

  • amd/common: set COMPUTE_STATIC_THREAD_MGMT_SE2-3 correctly on gfx10-11

  • aco: add more anonymous namespaces

  • aco: make local functions static in files without anonymous namespace

  • radv: inline partial push constant loads

  • nir: add ford, funord, fneo, fequ, fltu, fgeu

  • aco: implement ford, funord, fneo, fequ, fltu, fgeu

  • ac/llvm: implement ford, funord, fneo, fequ, fltu, fgeu

  • ac/nir: enable ford, funord, fneo, fequ, fltu, fgeu

  • nir/opt_algebraic: look through fabs/fneg when matching fmulz/ffmaz

  • nir/optimize cmp(a, -0.0)

  • nir/opt_algebraic: optimize cmp(fneg(a), #b) and feq with fabs

  • nir/opt_algebraic: add various unordered/ordered patterns from aco

  • aco: remove ordered/unordered optimizations

  • aco/ir: remove unused vopc helpers

  • iris/ci: update trace checksums

  • aco/ra: fix affinity for s_addk

  • aco: fix s_delay_alu with salu and trans dependency

  • aco,nir: add dpp16_shift_amd intrinsic

  • radv/nir: add a pass to optimize shuffle/booleans dependent only on tid/consts

  • radv: use radv_nir_opt_tid_function for shuffles

  • radv: use radv_nir_opt_tid_function to create inverse_ballot

  • aco/gfx12: use trans s_delay_alu for pseudo scalar

  • aco/gfx12: don’t allow vgpr operands for pseudo scalar

  • aco/gfx11.5: select s_cvt_[ui]32_f32

  • aco/gfx11.5: select s_(ceil|floor|trunc|rndne)

  • aco: add aco_opcode::p_s_cvt_f16_f32_rtne

  • aco/gfx11.5: select SALU float conversions

  • aco/gfx11.5: fix s_fmac acc to definition

  • aco/gfx11.5: select SOP2 float instructions

  • aco/gfx11.5: select SOPC float instructions

  • aco/gfx11.5: select SALU fsat

  • aco/gfx11.5: select SALU fsign

  • aco/gfx11.5+: allow sgpr dst for trans ops and use pseudo scalar ops on gfx12

  • aco/gfx11.5: select SALU fneg/fabs

  • aco/gfx11.5: select SALU fquantize2f16

  • aco: micro optimize VALU fquantize2f16

  • aco: handle clustered uniform reductions correctly

  • nir: constant fold inverse_ballot

  • aco: remove optimize_cmp_subgroup_invocation

  • spirv: ignore more function param decorations

  • aco/optimizer: update temp_rc when converting to uniform bool alu

  • aco/gfx11+: don’t use VOP3 v_swap_b16

  • nir/lower_int64: replace uadd_sat with ior for find_lsb64 and ufind_msb64

  • aco/gfx10+: set lateKill for sgprs used by wave64 VALU writing a mask

Gert Wollny (4):

  • zink/kopper: Wait for last QueuePresentKHR to finish before acquiring for readback

  • mesa/st: don’t use base shader serialization when uniforms are not packed

  • r600/sfn: Set bit size for newly created store intrinsic

  • zink: limit minSampleShading to a maximum value of 1.0

Guilherme Gallo (3):

  • ci: Add S3 id_token for all jobs

  • ci: Use id_tokens for JWT auth

  • ci/lava: Fix cmdline for UART/fastboot devices

Hans-Kristian Arntzen (5):

  • vulkan: Update XML and headers to 1.3.285.

  • ac/surface: Add surface flags to prefer 4K and 64K alignment.

  • radv: Implement VK_MESA_image_alignment_control

  • wsi/common: Do not update present mode with MESA_VK_WSI_PRESENT_MODE.

  • wsi/x11: Bump maximum number of outstanding COMPLETE events.

Heinrich Fink (1):

  • zink: remove workaround of FB modifiers forcing present state

Iago Toral Quiroga (53):

  • v3dv: fix VK_KHR_vertex_attribute_divisor

  • v3d,v3dv: stop hard-coding max attrib divisor

  • broadcom/compiler: assert on array overflow

  • v3d: fix array_len when precompiling outputs for shader-db

  • broadcom/compiler: fix num_textures for precompiled shaders

  • broadcom/compiler: don’t read excess channels on image loads

  • broadcom/compiler: simplify v3d_vir_emit_tex

  • broadcom/cle: fix up shader record for V3D 7.1.10 / 2712D0

  • v3d: support 2712D0

  • v3dv: support 2712D0

  • broadcom/compiler: make add_node return the node index

  • broadcom/compiler: don’t assign payload registers to spilling setup temps

  • broadcom/compiler: apply payload conflict to spill setup before RA

  • broadcom/compiler: check if vertex shader writes point size

  • v3dv: only flag ‘shader writes point size’ if the shader actually writes it

  • v3dv: emit a default point size when drawing points

  • v3dv: drop unused stride field from v3dv_pipeline_vertex_binding

  • v3dv: fix incorrect index buffer size

  • v3dv: use pSizes parameter in vkCmdBindVertexBuffers2

  • v3dv: implement vkCmdBindIndexBuffer2KHR

  • v3dv: handle VkBufferUsageFlags2CreateInfoKHR

  • v3dv: handle VkPipelineCreateFlags2CreateInfoKHR

  • v3dv: lower maxVertexInputBindingStride to match vulkan runtime

  • v3dv: shader modules are deprecated with VK_KHR_maintenance5

  • v3dv: implement vkGetImageSubresourceLayout2KHR

  • v3dv: refactor create_image

  • v3dv: add a get_image_subresource_layout helper

  • v3dv: implement vkGetDeviceImageSubresourceLayoutKHR

  • v3dv: implement vkGetRenderingAreaGranularityKHR

  • v3dv: fix pipeline leaks when meta pipeline cache is disabled

  • v3dv: fix a few asserts that check layerCount instead of array_layers

  • v3dv: allow VK_REMAINING_ARRAY_LAYERS in VkImageSubresourceLayers

  • v3dv: remove blit shader restriction on depth/stencil not being linear

  • v3dv: disable some TLB paths for cases of linear depth/stencil stores

  • v3dv: support VK_FORMAT_A1B5G5R5_UNORM_PACK16_KHR

  • v3dv: add more checks for device loss

  • v3dv: fix handling of pipeline flags when pipeline init fails

  • v3dv: expose VK_KHR_maintenance5

  • broadcom/compiler: initialize payload_conflict for all initial nodes

  • v3dv: don’t call wsi_device_init too early

  • broadcom/compiler: don’t spill in between multop and umul24

  • broadcom/compiler: fix per-quad spilling

  • broadcom/compiler: validate rtop + thrsw hazard

  • broadcom/compiler: drop multop if we dce umul24

  • broadcom/compiler: add missing signal compatibilities for V3D 7.x

  • broadcom/compiler: add new float32 unpack modifiers in V3D 7.x

  • broadcom/compiler: disallow copy propagation of FMOV exclusive modifiers

  • broadcom/compiler: implement nir_op_fsat

  • v3d: don’t lower fsat on V3D 7.x

  • v3dv: make nir helpers receive nir compiler options from caller

  • v3dv: don’t lower fsat on V3D 7.x

  • v3d: skip tlb loads when emitting clears with a draw call

  • v3d: rename job->clear to job->clear_tlb

Ian Romanick (33):

  • intel/brw: Fix optimize_extract_to_float for i2f of unsigned extract

  • intel/brw: Avoid optimize_extract_to_float when it will just be undone later

  • intel/elk: Fix optimize_extract_to_float for i2f of unsigned extract

  • nir/algebraic: Optimize some extract_* expressions

  • spirv: Use fp16 fp_fast_math settings when lowering fp16 asin and acos

  • intel/brw: Remove dsign optimization

  • intel/elk: Remove dsign optimization

  • intel/brw: Use fs_inst::resize_sources in brw_fs_opt_algebraic

  • intel/brw: Add support for fcsel opcodes

  • intel/brw: Handle fsign optimization in a NIR algebraic pass

  • intel/brw: Update CSEL source type validation

  • intel/brw: Combine constants and constant propagation for CSEL

  • intel/brw: Algebraic optimizations for CSEL

  • intel/brw: Implement more strictly correct fsign lowering

  • intel/brw: Use range analysis to optimize fsign

  • nir/algebraic: Add nir_lower_int64_options::nir_lower_iadd3_64

  • nir/search: Fix is_16_bits for vectors

  • nir/search: Refactor is_16_bits

  • nir/algebraic: More patterns to generate iadd3

  • nir/algebraic: intel/fs: Optimize some patterns before lowering 64-bit integers

  • intel/brw: Temporarily disable result=float16 matrix configs

  • intel/brw: Major rework of lower_cmat_load_store

  • intel/brw/xe2+: Catch invalid uses of writes_accumulator earlier

  • intel/brw/xe2+: Adjust size_read() for DPAS

  • intel/brw/xe2+: Scale size_written by reg_unit for DPAS

  • intel/brw/xe2+: Adjust DPAS lowering to DP4A to accommodate larger GRF and SIMD16

  • intel/brw/xe2+: Allow vec16 for cooperative matrix

  • nir: dpas_intel second source can have different number of components

  • intel/brw/xe2+: Add LNL cooperative matrix configurations

  • intel/tools: Advertise I915_PARAM_HAS_EXEC_TIMELINE_FENCES

  • intel/brw: Test corner case CSE of ADD3 instructions

  • intel/brw: Don’t propagate saturate to an instruction that writes flags

  • intel/elk: Don’t propagate saturate to an instruction that writes flags

Icenowy Zheng (7):

  • llvmpipe: add shader cache support for ORCJIT implementation

  • gallivm: orcjit: use a mutex to protect symbol looking up

  • util: detect LoongArch architecture

  • gallivm: add LoongArch support to the mattrs setting code

  • llvmpipe: add LoongArch support in ORCJIT

  • gallivm: orcjit: keep the ownership of tm for LPJit

  • gallivm: orcjit: use atexit to release LPJit singleton at exit

Italo Nicola (1):

  • nir: add {load,store}_global_etna intrinsics

Iván Briano (21):

  • compiler: reorder FLOAT_CONTROLS enums

  • nir: track some float controls bits per instruction

  • spirv: gather some float controls bits per instruction

  • nir: check inf/nan/sz preserve per-instruction

  • nir/algebraic: support float controls conditions per instruction

  • nir/algebraic: move float control conditions to be per instruction

  • vtn: support float controls2

  • anv: enable VK_KHR_shader_float_controls2

  • anv: check requirements for VK_IMAGE_USAGE_FRAGMENT_SHADING_RATE

  • anv: fix casting to graphics_pipeline_base

  • anv: consolidate DestroyPipeline for graphics and graphics_lib

  • intel/brw: fix subgroup size of geometry stages for lnl+

  • anv: check cmd_buffer is on a transfer queue more properly

  • intel/brw: add fetch_viewport_index function

  • intel/brw: always read LAYER/VIEWPORT from the FS payload

  • vulkan/runtime: pColorAttachmentInputIndices is allowed to be NULL

  • vulkan/properties: handle LayeredApiPropertiesListKHR

  • anv: enable VK_KHR_maintenance7

  • anv: get scratch surface from the correct pool

  • anv: set MOCS for protected memory when needed

  • intel/rt: fix terminateOnFirstHit handling

JCWasmx86 (1):

  • meson: Fix invalid kwarg name

Jeremy Gebben (1):

  • radv: Return hang status from radv_check_gpu_hangs()

Jesse Natalie (14):

  • nir_opt_algebraic: Add a couple optimizations for lowered unpack(pack())

  • wgl: Delete pixelformat support query

  • wgl: Fix flag check for GDI compat

  • nir_range_analysis: Use fmin/fmax to fix NAN handling

  • d3d12: Use GetResourceAllocationInfo instead of GetCopyableFootprints for residency sizes

  • nir: Remove assert-only variable by inlining its single use

  • zink: Add ASSERTED to assert-only local variable

  • mesa: Add ASSERTED to assert-only local variable

  • subprojects: Use depth=1 in the git wrap files

  • blake3: fix Windows ARM64 build and detect ARM64EC as ARM64

  • ci/windows: Disable zlib in LLVM

  • ci/windows: Specify numpy < 2.0 to prevent breaking changes

  • microsoft/clc: Split struct copies before vars_to_ssa in pre-inline optimizations

  • meson: Add an error message for llvmpipe without llvm draw support

Jessica Clarke (3):

  • Revert “meson: Do not require libdrm for DRI2 on hurd”

  • Revert “meson: fix with_dri2 definition for GNU Hurd”

  • meson: egl: Build egl_dri2 driver even for plain DRI

Jianxun Zhang (43):

  • intel/isl: Allow multi-sample on depth aux usage (xe2)

  • isl: Add a heading 4KB to MCS surface (xe2)

  • isl: Add AUX MCS encoding into aux modes (xe2)

  • blorp: Scaledown rectangle of MSAA fast clear (xe2)

  • blorp: Fix offset when ambiguating MCS buffer (xe2)

  • isl: Clone from isl_gfx12.* files (xe2)

  • isl: Update isl_gfx20 code (xe2)

  • isl: Add isl_gfx20 into build (xe2)

  • isl: Add dispatching in isl.c (xe2)

  • isl: Implement a part of WA_22018390030 (xe2)

  • isl: Remove code for Xe2 from isl_gfx12.c

  • isl: Update render CMF mapping (xe2)

  • isl: Don’t set clear values or their address (xe2)

  • blorp: Get fast clear rectangle of non-MSAA surfaces (xe2)

  • blorp: Pass down fast clear color value (xe2)

  • intel/genxml,blorp,common: Update 3DSTATE_PS command (xe2)

  • iris: Update aux state for color fast clears (xe2)

  • iris: Limit FCV_CCS_E to platforms that enable it

  • anv: Don’t enable compression with modifiers (xe2)

  • iris: Add more restrictions on compression (Xe2)

  • anv: Don’t enable compression on external bos (xe2)

  • iris: Disable PAT-based compression on depth surfaces (xe2)

  • anv: Disable PAT-based compression on depth images (xe2)

  • iris: Update synchronization of fast clear (xe2)

  • iris: Workaround: Don’t allocate compressed bo from cache (xe2)

  • isl: Remove restriction of CCS_E support on formats (xe2)

  • blorp: Don’t convert ccs_e formats for copy (xe2)

  • isl: Initialize the last usage in isl_encode_aux_mode[] (xe2)

  • anv: Update synchronization of fast clear (xe2)

  • iris: Disable predraw resolve (xe2)

  • blorp: Ensure MSAA fast clear in correct modes (xe2)

  • intel/dev: Select a compressed PAT entry (xe2)

  • isl: Add some formats not covered in CMF table (xe2)

  • anv: Disable tracking fast clear and aux state (xe2)

  • anv: Fix Vulkan CTS failure related to MCS (xe2)

  • anv: Support arbitrary fast-clear value on all layouts (xe2)

  • anv: Disable tracking of clear color on color attachment

  • intel/common: Ensure SIMD16 for fast-clear kernel (xe2)

  • intel/common: Remove blank lines in intel_set_ps_dispatch_state() (xe2)

  • anv: Fix assertion failures on BMG (xe2)

  • iris: Fix an assertion failure with compressed format

  • anv: Disable compression on legacy modifiers (xe2)

  • anv: Disable legacy CCS setup in binding (xe2)

Job Noorman (33):

  • ir3: simplify cat5 parsing

  • ir3: add encoding for isam.v

  • ir3: use isam.v for multi-component SSBO loads

  • ir3: add encoding of ldib/stib offsets

  • ir3: lower SSBO access imm offsets

  • nir/opt_offsets: add callback for max base offset

  • nir/opt_offsets: add option to allow offset wrapping

  • nir/opt_offsets: add load/store_ssbo_ir3

  • ir3: use nir_opt_offsets for SSBO accesses

  • ir3: optimize SSBO offset shifts for nir_opt_offsets

  • ir3: remove spilled splits in shared RA

  • ir3: set wrmask for spilled splits in shared RA

  • ir3: print sharedness/halfness of merge set regs

  • ir3: print intervals when dumping merge sets

  • ir3: print dst_offset of spill.macro

  • ir3: debug print limit pressure and post-spill max pressure

  • ir3: set current instruction before all validation asserts

  • ir3: fix crash in try_evict_regs with src reg

  • ir3: fix handling of early clobbers in calc_min_limit_pressure

  • ir3: set offset on splits created while spilling

  • ir3: correctly set wrmask for reload.macro

  • ir3: don’t remove intervals for non-killed tex prefetch sources

  • ir3: don’t remove collects early while spilling

  • ir3: expose instruction indexing helper for merge sets

  • ir3: make indexing instructions optional in ir3_merge_regs

  • ir3: index instructions before fixing up merge sets after spilling

  • ir3: move liveness recalculation inside ir3_ra_shared

  • ir3: restore interval_offset after liveness recalculation in shared RA

  • ir3: add ir3_cursor/ir3_builder helpers

  • ir3: refactor ir3_spill.c to use the ir3_cursor/ir3_builder API

  • ir3: only add live-in phis for top-level intervals while spilling

  • ir3: print rounding mode for cov

  • ir3: set rounding mode for all floating point conversions

Jordan Justen (33):

  • blorp: Update programming for XY_FAST_COLOR_BLT on xe2

  • intel/genxml: Add XY_FAST_COLOR_BLT for xe2

  • intel/genxml: Update 3DSTATE_BTD for xe2

  • intel/dev: Allow setting FORCE_PROBE for intel PCI IDs

  • intel/dev: Support INTEL_FORCE_PROBE env-var

  • docs: Document INTEL_FORCE_PROBE env-var

  • intel/dev: Add LNL device info

  • pci_ids/intel: Add LNL PCI IDs (with FORCE_PROBE set)

  • anv/grl: Set INTEL_FORCE_PROBE=* when running intel_clc

  • intel/brw: Simplify enabling brw_fs_test_dispatch_packing

  • intel/brw: Allow xe2 in brw_stage_has_packed_dispatch()

  • intel/brw: Fix SSBO/shared load offset register size for Xe2

  • anv/grl: Build for xe2

  • Revert “anv: Disable Ray Tracing on xe2 until our compiler supports Xe2 RT”

  • intel/dev/mesa_defs.json: Update LNL WA entries

  • intel/dev: Add INTEL_PLATFORM_BMG enum, BMG WA info

  • intel/dev: Add BMG device info

  • intel/dev: Add BMG PCI IDs (with FORCE_PROBE set)

  • intel/dev: Silence INTEL_FORCE_PROBE warning for intel_clc

  • intel/dev: If building the driver, always allow getting device info

  • Revert “anv/grl: Set INTEL_FORCE_PROBE=* when running intel_clc”

  • intel/compiler: Don’t set size written in brw_lower_logical_sends.cpp

  • intel/tools: Fix intel_dev_info --hwconfig switch

  • isl: Move isl_get_render_compression_format in isl_genX_helpers.h

  • isl: Implement isl_get_render_compression_format for xe2

  • intel/brw: Retype some regs to BRW_TYPE_UD for Xe2 indirect accesses

  • intel/perf/xe: Fix free pointer location in xe_add_config()

  • intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE

  • anv/generated_indirect_draws: Adjust xe2 simd32 sends_count_expectation

  • intel/dev: Disable LNL PCI IDs on Mesa 24.2 (require INTEL_FORCE_PROBE)

  • intel/brw/validate: Simplify grf span validation check by not using a mask

  • intel/brw/validate: Update dst grf crossing check for Xe2

  • intel/brw/validate: Convert access mask to be grf based

Jordan Petridis (1):

  • Revert “ci: mark microsoft farm as offline”

Jose Maria Casanova Crespo (9):

  • v3d: fix CLE MMU errors avoiding using last bytes of CL BOs.

  • v3dv: fix CLE MMU errors avoiding using last bytes of CL BOs.

  • v3d: Increase alignment to 16k on CL BO on RPi5

  • v3dv: Increase alignment to 16k on CL BO on RPi5

  • v3dv: V3D_CL_MAX_INSTR_SIZE bytes in last CL instruction not needed

  • v3dv/ci: Add more dEQP-VK subgroups that are currently skipped

  • v3dv: Emit stencil draw clear if needed for GFXH-1461

  • v3dv: really fix CLE MMU errors on 7.1HW Rpi5

  • v3d: really fix CLE MMU errors on 7.1HW Rpi5

Josh Simmons (3):

  • radv: Fix crash when using SQTT and NO_COMPUTE

  • radv: Add `RADV_PROFILE_PSTATE` envvar

  • radv: Fix shader mask for SQ_WGP SPM counters

José Expósito (2):

  • meson: Update proc_macro2 meson.build patch

  • llvmpipe: Init eglQueryDmaBufModifiersEXT num_modifiers

José Roberto de Souza (87):

  • intel/perf: Nuke platform_supported

  • intel/perf: Remove i915_drm.h include from gen_perf.py

  • intel/perf: Fix the error check of i915_add_config()

  • intel/perf: Change oa_format to uint64_t

  • intel/perf: Store pointer to intel_device_info in intel_perf_config

  • intel/perf: Add intel_perf_free()

  • intel/perf: Add intel_perf_free_context()

  • intel/ds: Free perf config and context

  • intel/ds: Nuke ralloc_ctx and ralloc_cfg

  • anv: Free intel_perf_config when destroying physical device

  • hasvk: Free intel_perf_config when destroying physical device

  • iris: Free intel_perf_config and intel_perf_context

  • crocus: Free intel_perf_config and intel_perf_context

  • intel/perf: Add and use a function to return platform OA format

  • intel/perf: Add function to open perf stream

  • intel/perf: Fix return of read_oa_samples_until()

  • anv: Nuke perf_query_pass from anv_execbuf

  • intel/perf: Replace I915_OA_FORMAT_* usage by platform check

  • intel/perf: Move code that will be shared by both KMDs

  • intel/perf: Move i915 specific code from common code

  • intel/perf: Move i915 specific code to load configurations to i915 file

  • intel/perf: Allocate sseu in heap memory

  • intel/perf: Replace drm_i915_perf_record_header by intel_perf_record_header

  • intel/perf: Add a macro with header + sample length

  • intel/perf: Add intel_perf_stream_read_samples()

  • intel/dev: Add LNL stepping mapping

  • intel/dev: Add BMG stepping mapping

  • intel: Move slm functions from brw_compiler.h to intel_compute_slm.c/h

  • intel/common: Implement Xe2 SLM encode

  • intel/common: Implement preferred SLM encode

  • intel/dev: Use topology variables to calculate strides in Xe KMD

  • intel/dev: Add function to get the number of EUs per subslice

  • intel: Set preferred SLM allocation size >= than SLM size for Xe2

  • anv: Set maxComputeSharedMemorySize value for Xe2 platforms

  • intel: Compute the optimal preferred SLM size per subslice

  • anv: Initialize variable to fix static analyzer warning

  • intel/genxml/gfx20: Sync POSTSYNC_DATA struct with spec

  • anv/xe2: Enable compute walker and BTD thread preemption

  • anv/xe2: Add STATE_COMPUTE_MODE individual masks

  • anv: Remove block promoting non CPU mapped bos to coherent

  • intel/isl: Set dummy_aux_address to implement Wa_14019708328

  • anv: Implement Wa_14019708328

  • iris: Implement Wa_14019708328

  • anv: Implement Wa_14019857787

  • iris: Implement Wa_14019857787

  • intel/dev: Add compressed PAT entry

  • anv: Add support for compressed images allocation in Xe2

  • anv: Give apps the choice of compressed or uncompressed but cpu visible images

  • iris: Add support for compressed images allocation in Xe2

  • anv: Fix assert in xe_gem_create()

  • intel/perf: Change order of if blocks

  • intel/perf: Add assert to check if allocated enough query fields

  • intel/dev: Add engine_class_supported_count to intel_device_info

  • intel/perf: Add LNL OA XML

  • intel/perf: Add INTEL_PERF_QUERY_FIELD_TYPE_SRM_OA_PEC

  • intel/perf: Extend intel_perf_query_result_read_gt_frequency() to gfx 20

  • intel: Sync xe_drm.h

  • intel/perf: Implement function that returns OA format for Xe KMD

  • intel/perf: Add function to check if OA/perf is supported by Xe KMD

  • intel/perf: Replace i915_perf_version and i915_query_supported by a feature bitmask

  • intel/perf: Refactor and add Xe KMD support to add and remove configs

  • intel/perf: Add Xe KMD perf stream open function

  • intel/perf: Refactor and add Xe KMD support to enable and disable perf stream

  • intel/perf: Refactor and add Xe KMD support to change stream metrics id

  • tool/pps: Add Xe KMD support

  • intel/perf: Remove i915_drm.h includes from common code

  • intel/perf: Implement Xe KMD perf stream read

  • anv: Implement Xe KMD query pools

  • intel/perf: Enable perf on Xe KMD

  • intel/perf: Implement intel_perf_query_result_accumulate() for gfx 20+

  • intel/perf: Add support for LNL OA sample format size

  • intel/perf: Return LNL OA sample format

  • intel/perf: Do not add INTEL_PERF_QUERY_FIELD_TYPE_SRM_OA_PEC

  • intel/perf: Adjust EU count for Xe2+

  • intel/dev: Replace intel_device_info::apply_hwconfig by a gfx version check

  • intel: Rename XE_PERF to XE_OBSERVATION

  • anv: Fix return of PAT index for compressed bos for discrete GPUs

  • intel/dev: Drop DG1 PAT entries

  • intel/dev: Add documentation about intel_device_info_pat_entry::mmap

  • intel/dev: Drop coherency from intel_device_info_pat_entry

  • intel/dev: Add comment documenting the PAT entries

  • intel/dev: Use GPU WB PAT for Xe2 writecombining

  • intel/dev: Drop writeback_incoherent from Xe2

  • isl: Fix Xe2 protected mask

  • anv: Propagate protected information to blorp_batch_isl_copy_usage()

  • intel: Sync xe_drm.h

  • intel/dev: Support new topology type with SIMD16 EUs

Juan A. Suarez Romero (57):

  • vc4/ci: update results

  • vc4/v3d/ci: update expected list

  • vc4: set src type on storing sample mask

  • broadcom/compiler: remove unused parameters in vpm read

  • broadcom/compiler: do not run lowering I/O for FS

  • v3d/vc4/ci: set full renderer version check

  • nir,v3d: rename tlb_color_v3d intrinsic

  • vc4: use tlb_color_brcm intrinsic

  • .gitignore: add .cache folder

  • vc4: use IO semantics for location

  • v3d: use BITSET for the masks

  • v3d: remove handled cases for devices <= 42

  • ci: define SNMP base interface on runner

  • v3d: use screen name in disk cache

  • v3d,v3dv: add compatibility revision in GPU name

  • broadcom/ci: update expected results

  • v3dv/ci: add expected failure

  • v3dv/ci: fix spurious line in expected

  • v3dv/ci: add new timeouts

  • dri: cast constant to uint for bitshift

  • util: do not access member of a NULL structure

  • util: use unsigned types when performing bitshift

  • vulkan: do not access member of a NULL structure

  • nir: fix overflow when negating maxint in constant expressions

  • nir: use unsigned types when performing bitshifting

  • glsl: fix downcasting addresses to wrong object types

  • egl: do not access member of a NULL structure

  • mesa: use unsigned types when performing bitshifting

  • mesa: do not pass NULL pointer to function not expecting NULLs

  • ci: disable Igalia farm

  • broadcom/compiler: use unsigned types when performing bitshifting

  • v3dv: do not access member of a NULL structure

  • v3dv: do not pass NULL pointer to function not expecting NULLs

  • v3dv: restrict to channels when encoding border color

  • v3dv: fix misalignment in descriptor layout structure

  • v3d: do not access member of a NULL structure

  • v3d: do not pass NULL pointer to function not expecting NULLs

  • vc4: use unsigned types when performing bitshifting

  • vc4: do not access member of a NULL structure

  • vc4: do not pass NULL pointer to function not expecting NULLs

  • vc4: do not create 0-bytes variable length arrays

  • vc4: fix out-of-bounds access to array

  • Revert “ci: disable Igalia farm”

  • v3d: use original enabled_mask on setting vertex buffers

  • broadcom/ci: read 32-bit kernel from arm32 path

  • broadcom/ci: remove arch from hardware name

  • vc4/ci: run tests in 64-bits

  • broadcom/ci: run some GL tests in arm32 arch

  • broadcom/qpu: clean all versions not supported

  • broadcom: follow version naming convention

  • broadcom/ci: add more jobs to test with rpi5

  • broadcom/ci: update traces for rpi4

  • v3d/ci: update expected list

  • v3dv: advertise VK_EXT_depth_clamp_zero_one

  • v3d: expose ARB_depth_clamp in V3D 7.x

  • v3dv: free temp image created when copying with blit

  • v3dv: don’t leak cache key

Julian Orth (1):

  • egl/wayland: ignore unsupported driver configs

Juston Li (8):

  • venus: refactor out image requirements helpers

  • venus: extend image cache to vkGetDeviceImageMemoryRequirements

  • sync protocol for VkRingPriorityInfoMESA

  • venus: forward nice priority when creating ring

  • zink: disable cpu_storage for PIPE_USAGE_STREAM

  • venus: add missing sTypes for vk_set_physical_device_properties_struct

  • venus: sync protocol for conditionally ignored dyn arrays

  • anv/android: set ANV_BO_ALLOC_EXTERNAL for imported AHW

Karmjit Mahil (6):

  • ir3: Don’t set saturation on `flat.b`

  • zink: Add missing currentExtent special value handling

  • turnip: Remove workaround for CTS bug zero-sized inline uniform block

  • mailmap: Add Karmjit Mahil

  • freedreno/isa: Fix isaspec map for a3xx-ld

  • tu: Set `TU_ACCESS_CCHE_READ` for transfer ops with read access

Karol Herbst (159):

  • nir: add SYSTEM_VALUE_BASE_WORKGROUP_ID

  • nir/divergence_analysis: handle load_base_global_invocation_id

  • intel/compiler: lower workgroup id to index only for mesh shaders

  • v3d: call nir_lower_compute_system_values to get rid of base intrinsics

  • lavapipe: lower base_workgroup_id to zero

  • mesa/st: lower base invoc and workgroup id

  • nir: remove global_invocation_id_zero_base

  • nir: remove workgroup_id_zero_base

  • nir: document base_global_invocation_id and base_workgroup_id

  • core/kernel: skip validating unique kernel signatures

  • rusticl/program: Arc the stored KernelInfo

  • rust/program: remove Program::kernels

  • nouveau: fix potential double-free in nouveau_drm_screen_create

  • nir: fix nir_shader_get_function_for_name for functions without names.

  • rusticl: use stream uploader for cb0 if preferred

  • rusticl/kernel: properly handle grid and offsets being usize

  • rusticl: lower huge grids

  • rusticl: add RUSTICL_MAX_WORK_GROUPS

  • rusticl/event: use Weak refs for dependencies

  • rusticl/icd: remove CLObject

  • rusticl/spirv: enable more caps

  • Revert “rusticl/event: use Weak refs for dependencies”

  • event: break long dependency chains on drop

  • rusticl/device: add DeviceCaps and move timestamp stuff into it

  • rusticl/device/caps: move enough for has_images

  • rusticl/device: properly handle devices with no support for images

  • rusticl/mesa/context: flush context before destruction

  • rusticl: merge rusticl_nir and rusticl_mesa_bindings_inline_wrapper targets

  • rusticl: move mesa_version_string out of the inline wrapper

  • rusticl: bump bindgen req to 0.65

  • rusticl: bump meson req to 1.4

  • rusticl: make use of new `output_inline_wrapper` meson.rust.bindgen feature

  • nir/lower_cl_images: set binding also for samplers

  • nouveau: import nvif/ioctl.h file from libdrm_nouveau

  • gallium/vl: stub vl_video_buffer_create_as_resource

  • gallium/vl: remove stubs which are defined in mesa_util

  • meson: centralize galliumvl_stub handling

  • rusticl: link against libgalliumvl_stub

  • wgl: link against libgalliumvl_stub

  • gallium/drivers: do not link against libgalliumvl directly

  • rusticl/event: fix deadlock when calling clGetEventProfilingInfo inside callbacks

  • iris: fix PIPE_RESOURCE_PARAM_STRIDE for buffers

  • rusticl/icd: make sure returned function pointers are of the right type

  • rusticl/kernel/launch: fix mapping usize types to GPU pointer sizes

  • rusticl/kernel/launch: remove useless upload of the input

  • rusticl/kernel: move most of the code in launch inside the closure

  • rusticl/kernel/launch: move allocation of resources vec

  • rusticl/kernel/launch: rework how the printf buffer is allocated

  • rusticl/kernel/launch: get rid of Arc clones for global resources

  • rusticl/kernel/launch: add helper to bind global buffers

  • broadcom/compiler: handle load_workgroup_size

  • v3d: add support for load_workgroup_size

  • rusticl/spirv: do not pass a NULL pointer to slice::from_raw_parts

  • rusticl/memory: copies might overlap for host ptrs

  • gallium: reduce pipe_resource.usage to 4 bits

  • gallium: properly type pipe_resource.usage with the enum

  • gallium: properly type fields of pipe_resource.usage

  • nir_lower_mem_access_bit_sizes: support unaligned store_scratch

  • nir: add global_atomic_2x32 variants to nir_get_io_offset_src_number

  • broadcom/compiler: support global load/store intrinsics

  • broadcom/compiler: use nir_lower_mem_access_bit_sizes for memory lowering

  • broadcom/compiler: convert 2x32 global operations to scalar variants

  • broadcom/compiler: only handle load_uniform explicitly in v3d_nir_lower_load_store_bitsize

  • broadcom/compiler: rework scratch lowering

  • rusticl/meson: add build root dir to the include dirs of rusticl_c

  • rusticl: depend on the spirv_info target

  • util/u_printf: properly handle %%

  • rusticl/memory: assume minimum image_height of 1

  • rusticl/memory: fix clFillImage for buffer images

  • rusticl: add new CL_INVALID_BUFFER_SIZE condition for clCreateBuffer

  • rusticl: add bsymbolic to linker flags

  • rusticl/icd: rename all entry points to the actual correct name

  • radeonsi: set bo_size for user memory allocations

  • rusticl/queue: gracefully stop the worker thread

  • rusticl/queue: run rustfmt

  • nir/lower_alu: support 8 and 16 bit bit_count

  • nir/opt_sink: add load_kernel_input

  • gallium: add PIPE_CAP_TEXTURE_SAMPLER_INDEPENDENT

  • rusticl/device: require PIPE_CAP_TEXTURE_SAMPLER_INDEPENDENT for image support

  • rusticl/mesa/context: handle clear_buffer not set by driver

  • rusticl/mesa/screen: handle get_timestamp not set by driver

  • rusticl/kernel/launch: fix global work offsets for 32 bit archs again

  • broadcom/compiler: add generated v3d_nir_lower_algebraic

  • broadcom/compiler: handle fp16 conversion ops

  • broadcom/compiler: fix iu2f32 for 8 and 16 bit inputs

  • broadcom/compiler: try handling 8/16 bit alu operations

  • broadcom/compiler: handle up to vec16 load_uniforms

  • broadcom/compiler: abort on unknown intrinsics

  • broadcom/compiler: implement load_kernel_input

  • broadcom/compiler: call nir_lower_64bit_phis

  • broadcom/compiler: handle variable shared memory

  • v3d: implement gallium APIs for OpenCL support

  • v3d: treat SHADER_KERNEL as SHADER_COMPUTE

  • v3d: lower CL alus

  • v3d: lower 64 bit ALUs

  • v3d: support variable shared memory

  • v3d: fix MAX_GLOBAL_SIZE and MAX_MEM_ALLOC_SIZE

  • v3d: never replace a mapped bo

  • rusticl: enable v3d

  • nir/schedule: add write dep also for shared_atomic

  • meson: rename with_gallium_opencl to with_gallium_clover

  • rusticl/program: move binary parsing into its own function

  • rusticl/program: make binary API not crash on errors

  • rusticl/program: use blob.h to parse binaries

  • rusticl/program: update binary format

  • rusticl/program: use default in more places

  • Revert “rusticl/queue: run rustfmt”

  • Revert “rusticl/queue: gracefully stop the worker thread”

  • rusticl/buffer: harden bound checks against overflows

  • rusticl/context: move SVM pointer tracking into own type

  • rusticl/ptr: add a few APIs to TrackedPointers

  • rusticl/memory: complete rework on how mapping is implemented

  • rusticl: remove unused interfaces to simplify code

  • rusticl/mesa: remove ResourceType::Cb0

  • rusticl/memory: optimize sw_copy when the row_pitch matches the height

  • rusticl/mesa: make PipeResource repr(transparent)

  • v3d: support unnormalized coords

  • rusticl/spirv: support more caps

  • rusticl/device: fix image_3d_write_supported for embedded

  • rusticl/device: turn image_3d_write_supported into a cap

  • rusticl/device: fix advertisement of 3d write images support

  • rusticl: require PIPE_CAP_IMAGE_STORE_FORMATTED for image support.

  • rusticl/event: make set_status handle error status properly

  • rusticl/queue: do not overwrite event error states

  • rusticl/queue: properly check all dependencies for an error status

  • rusticl/event: properly implement CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST

  • rusticl/queue: properly implement in-order queue error checking

  • rusticl/event: return execution errors when doing a blocking enqueue

  • rusticl/mesa: handle failures with u_upload_data

  • rusticl/mesa: set take_ownership to true in set_constant_buffer_stream

  • rusticl/event: fix outdated comment in call

  • rusticl/queue: format file

  • zink: fix OpenCL read_write images

  • rusticl: support read_write images

  • spirv: generate info for FunctionParameterAttribute

  • spirv: initial parsing of function parameter decorations

  • spirv: handle function parameters passed by value

  • nak: allow clippy::not_unsafe_ptr_arg_deref lints

  • clc: force linking of spirvs with mismatching pointer types in signatures

  • rusticl: fix clippy lint having bounds defined in multiple places

  • rusticl/program: protect against 0 length in slice::from_raw_parts

  • rusticl/api: protect against 0 length in slice::from_raw_parts

  • rusticl/spirv: protect against 0 length in slice::from_raw_parts

  • nouveau: handle realloc failure inside cli_kref_set

  • mesa: check for enabled extensions for *UID enums

  • nouveau/winsys: fix handling of NV_DEVICE_TYPE_IGP

  • nouveau: use nv_device_info and fill in PCI and type information

  • nouveau: add nv_device_uuid

  • nouveau: implement driver_uuid and device_uuid

  • nvk: use nv_device_uuid

  • zink: lower 64 bit find_lsb, ufind_msb and bit_count

  • zink: lower 8/16 bit alu ops vk spirv doesn’t allow

  • rusticl/kernel: properly respect device thread limits per dimension

  • rusticl/memory: Fix memory unmaps after rework

  • rusticl/image: take pitches into account when allocating memory for maps

  • rusticl/image: properly sync mappings content for 1Dbuffer images

  • rusticl/queue: add clSetCommandQueueProperty

  • util/u_printf: do not double print format string with unused arguments

  • rusticl/memory: fix sampler argument size check

Kenneth Graunke (63):

  • isl: Set MOCS to uncached for Gfx12.0 blitter sources/destinations

  • intel/brw: Delete gfx10 table for align1 3src type encoding

  • intel/brw: Drop NF type support

  • intel/brw: Rework BRW_REGISTER_TYPE’s representation semantics

  • intel/brw: Stop using long BRW_REGISTER_TYPE enum names

  • intel/brw: Reindent after shortening BRW_REGISTER_TYPE_* to BRW_TYPE_*

  • intel/brw: Use newer brw_type_is_* shorter names

  • intel/brw: Replace brw_reg_type_from_bit_size by brw_type_with_size

  • intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_*

  • intel/brw: Combine a1/a16 3src type encoding functions

  • intel/brw: Combine a1/a16 3src type decoding functions

  • intel/brw: Rename brw_reg_type_to_hw_type to brw_type_encode

  • intel/brw: Don’t use inst return value when it isn’t needed

  • intel/brw: Make a helper for finding the largest of two types

  • intel/brw: Add builder helpers for math functions

  • intel/brw: Add builder helpers that allocate temporary destinations

  • intel/brw: Use new builder helpers that allocate a VGRF destination

  • intel/brw: Print W/UW immediates correctly

  • intel/brw: Do not create empty basic blocks when removing instructions

  • intel/brw: Support CSE on more ops

  • intel/brw: Don’t include unnecessary undefined values in texture results

  • intel/brw: Add a new VEC() helper.

  • intel/brw: Use VEC for load_const

  • intel/brw: Use VEC for gl_FragCoord

  • intel/brw: Use VEC for TCS/TES/GS input/output loads

  • intel/brw: Use VEC for FS outputs

  • intel/brw: Use VEC for output stores

  • intel/brw: Use VEC for NIR vec*() sources

  • intel/brw: Use VEC for emit_unzip()

  • intel/nir: Set src_type on TCS quads workaround store_output

  • intel/brw: Blockify convergent load_shared on Gfx11-12 as well

  • intel/brw: Recreate GS output registers after EmitVertex

  • intel/brw: Skip fs_nir_setup_outputs for compute shaders

  • intel/brw: Handle scratch address swizzling of constants

  • intel/brw: Add a idom_tree::dominates(a, b) helper.

  • intel/brw: Make brw_reg::bits publicly accessible from fs_reg

  • intel/brw: Update instructions_match() to compare more fields

  • intel/brw: Drop compiler parameter from try_constant_propagate()

  • intel/brw: Drop BRW_OPCODE_IF from try_constant_propagate

  • intel/brw: Refactor try_constant_propagate()

  • intel/brw: Refactor code to commute immediates into legal positions

  • intel/brw: Delete SAD2 and SADA2 opcodes

  • intel/brw: Make VEC() perform a single write to its destination.

  • intel/brw: Make gl_SubgroupInvocation lane index loading SSA

  • intel/brw: Skip LOAD_PAYLOADs after every texture instruction if possible

  • intel/brw: Add a new def analysis pass

  • intel/brw: Print defs in dump_instructions

  • intel/brw: Write a new global CSE pass that works on defs

  • intel/brw: Switch to the new defs-based global CSE pass

  • intel/brw: Delete old local common subexpression elimination pass

  • intel/brw: Introduce a new SSA-based copy propagation pass

  • intel/brw: Use the defs-based copy propagation along with the old one

  • intel/brw: Make opt_copy_propagation_defs clean up its own trash

  • intel/brw: Build the scratch header on the fly for pre-LSC systems

  • intel/brw: Skip discarding the interference graph

  • intel/brw: Delay liveness calculations in saturate propagation

  • intel/brw: Make an alu2 builder helper

  • intel/brw: Make bld.ADD(x, 0) emit no instructions and return x directly

  • intel/brw: Support CSE of ADD3

  • intel/brw: Add a lower_csel pass and allow building it for all types

  • intel/nir: Don’t needlessly split u2f16 for nir_type_uint32

  • intel/brw: Don’t mix types for unary extended math instructions

  • intel/brw: Disallow scalar byte to float conversions on DG2+

Kevin Chuang (6):

  • anv: Properly fetch partial results in vkGetQueryPoolResults

  • anv: Properly handle cases for different query types in copy_query_results_with_shader

  • intel/genxml: add task/mesh shader statistics registers

  • anv: Update pipeline statistics mask for task/mesh shader invocations

  • anv: implement mesh shader queries

  • anv: toggle meshShaderQueries based on whether we support mesh_shader or not

Khem Raj (1):

  • amd: Include missing llvm IR header Module.h

Konstantin (4):

  • docs: Add documentation about debugging GPU hangs on RADV

  • ac/debug,radv: Read UMR wave dumps into memory before parsing

  • radv: Use a struct for the trace_bo layout

  • radv: Trace indirect dispatch sizes

Konstantin Seurer (59):

  • radv: Handle all dependencies of CmdWaitEvents2

  • nir/print: Do not access invalid indices of load_uniform

  • radv: Fix radv_shader_arena_block list corruption

  • radv: Remove arenas from capture_replay_arena_vas

  • radv: Zero initialize capture replay group handles

  • radv/ci: Add back pipeline library flakes

  • radv/ci: Document recent flakes

  • gitlab: Reference hang debugging documentation

  • radv: Remove radv_cmd_dirty_dynamic_bits

  • llvmpipe: Use a second LLVMContext for compiling sample functions

  • radv: Add locking to radv_replay_shader_arena_block

  • radv: Replace is_rt_shader with RADV_SHADER_TYPE_RT_PROLOG

  • radv: Remove uses_dynamic_rt_callable_stack

  • radv/rt: Track ray_launch_id reads

  • radv/rt: Track ray_launch_size reads

  • radv/rt: Remove load_rt_dynamic_callable_stack_base_amd

  • radv: Return a block from radv_replay_shader_arena_block

  • ac/llvm: Fix DENORM_FLUSH_TO_ZERO with exact instructions

  • ac/llvm: Enable helper invocations for vote_all/any

  • radv/ci: Bring back vkcts-navi21-llvm-valve

  • khronos-update: Add ANDROID guards to vk_android_native_buffer.h

  • zink: Always include renderdoc_app.h

  • zink: Blit using one triangle for nearest filtering

  • llvmpipe: Lock shader access to sample_functions

  • llvmpipe: Stop using a sample_functions pointer as cache key

  • llvmpipe: Only evict cache entries if a fence is available

  • lavapipe: Always call finish_fence after lvp_execute_cmd_buffer

  • radv: Clean up pipeline barrier handling

  • radv: Remove dead access bits

  • radv/meta: Use READ access for dst_access_flush

  • radv/rra: Detect BVHs with back edges

  • radv/rra: Move some code into handle_accel_struct_write

  • radv/rra: Fix disabling the ray history

  • radv/rra: Fix reporting the isec invocations

  • radv/rra: Bump rt_driver_interface_version to 8.0

  • radv/rra: Reduce the memory requirement of copy_after_build

  • radv/rra: Rework calculating the ray history size

  • radv/rra: Enable RADV_RRA_TRACE_COPY_AFTER_BUILD by default

  • util: Add a helper for querying sparse tile sizes

  • lavapipe: Do not allocate 0 sized buffers for descriptor sets

  • gallium: Add a memory range parameter to resource_bind_backing

  • llvmpipe: Use an anonymous file for memory allocations

  • lavapipe: Implement sparse buffers and images

  • lavapipe: Implement shaderResourceResidency

  • venus: Refactor hiding sparse features and properties

  • venus: Disable sparse binding on lavapipe

  • vulkan: Handle group stages in vk_.*_access2_for_pipeline_stage_flags2

  • vulkan: Add vk_expand_(dst|src)_access_flags2

  • radv: Use vk_expand_(src|dst)_access_flags2

  • radv: Remove no-op access flag handling

  • radv: Remove handling for expanded access flags

  • radv: Remove write access handling from radv_dst_access_flush

  • radv: Handle AS access bits like shader storage access bits

  • radv: Refactor radv_(dst|src)_access_flush

  • radv: Fix smooth lines with dynamic polygon mode and topology

  • radv: Always use dynamic line smoothing

  • nir: Stop using “capture : true” for nir_opt_algebraic

  • nir: Add FLOAT_CONTROLS_.*_PRESERVE

  • aco: print s_delay_alu INSTSKIP>3 correctly

Leo Liu (4):

  • ac/surface: add GFX12 256B tile mode for video

  • ac/surface/tests: add the test for ADDR3_256B_2D

  • radeon/vcn: use pipe video buffers for dpb

  • radeon/vcn: enable dpb to use pipe video buffer with swizzle mode

Lionel Landwerlin (125):

  • anv: disable dual source blending state if not used in shader

  • anv: reuse embedded samplers across shaders

  • anv: simplify multisampling check

  • anv: fixup indentation

  • anv: factor out wm_prog_data get in runtime flush

  • intel/brw: fixup wm_prog_data_barycentric_modes()

  • intel/fs: decouple alphaToCoverage from per sample dispatch

  • intel/brw: add min_sample_shading value in wm_prog_data

  • anv: track sample shading enable & min sample shading

  • anv: add dirty tracking of fs_msaa_flags in runtime

  • anv: move 3DSTATE_WM::BarycentricInterpolationMode programming to runtime

  • anv: move more PS_EXTRA programming to runtime

  • anv: move 3DSTATE_PS to partial packing

  • anv: move 3DSTATE_MULTISAMPLE to partial emission

  • anv: remove fs_msaa_flags from the graphics pipeline

  • anv: enable EDS3 AlphaToCoverageEnable & RasterizationSamples

  • anv: fixup alloc failure handling in reserved_array_pool

  • anv: fix leak of custom border colors

  • anv: avoid requirement to put flush_data as first field

  • anv: move device initialization as the last step of vkCreateDevice

  • anv: move empty_vs_input to physical device

  • anv: VK_EXT_legacy_vertex_attributes

  • docs: update anv features

  • anv: fix ycbcr plane indexing with indirect descriptors

  • intel/hang_replay: use newer API of i915 execbuffer

  • intel/hang_replay: use hw image param

  • intel/tools: add README file

  • brw: add more condition for reducing sampler simdness

  • intel: move debug identifier out of libintel_dev

  • brw: drop dependency on libintel_common

  • anv: fix push constant subgroup_id location

  • nir/divergence: add missing load_printf_buffer_address

  • nir: add a base offset for printf indexing

  • nir: add ptr_bit_size parameter to nir_lower_printf

  • nir: add a low level printf emission helper

  • intel/nir: remove unused prototypes

  • intel/nir: add reloc delta to load_reloc_const_intel intrinsic

  • intel/compiler: store u_printf_info in prog_data

  • intel/nir: add printf lowering

  • anv: add debug shader printf support

  • intel/clc: enable printfs support

  • anv: shader printf example

  • anv: switch to vk_device::mem_cache field for default cache

  • anv: use weak_ref mode for global pipeline caches

  • anv: fix shader identifier handling

  • intel/brw: ensure find_live_channel doesn’t access arch register without sync

  • anv: fix utrace compute walker timestamp captures

  • anv: fix timestamp copies from secondary buffers

  • anv: move last compute command pointers to the state structure

  • u_trace: extend tracepoint end_of_pipe bit into flags

  • anv: optimize POSTSYNC_DATA rewrites in timestamp emissions

  • intel: fix HW generated local-id with indirect compute walker

  • brw: use a single virtual opcode to read ARF registers

  • brw: limit dependencies on SR register

  • brw: better model READ_ARF_REG opcode

  • anv: fix Gfx9 fast clears on srgb formats

  • anv: rewrite Wa_18019816803 tracking to be more like state

  • anv: factor out some more gpu_memcpy setup

  • anv: fix pipeline flag fields

  • anv: expose VK_MESA_image_alignment_control

  • anv: support setting CFE_STATE::StackIDControl per application

  • anv: limit aux invalidations to primary command buffers

  • anv: ensure completion of surface state copies before secondaries

  • anv: simplify TRTT initialization

  • anv: reuse setup_execbuf_fence_params for utrace submissions

  • anv: rework utrace submission

  • anv: move trtt submissions over to the anv_async_submit

  • anv: use reserved array pool for legacy custom border colors

  • anv: make device initialization more asynchronous

  • mi-builder: rename relocated api

  • mi-builder: c++ warning fix

  • mi-builder: make instruction pointer manipulation more obvious

  • mi-builder: add missing write completion check

  • mi-builder: add relocated register/memory writes

  • mi-builder: add a write check parameter

  • anv: centralize mi_builder setup

  • anv: use the new relocated write mi-builder api

  • anv: move more MI_SDI to mi_builder

  • anv: use default mocs for memory bits only touched by CS

  • anv: set query mi-builder mocs only once

  • anv: use new mi-builder write check API to avoid stalls

  • genxml: add MI_MEM_FENCE for Gfx20

  • mi-builder: add read/write memory fencing support on Gfx20+

  • intel/fs: fix lower_simd_width for MOV_INDIRECT

  • anv: add custom mi write fences

  • anv: emit conditional after gfx state flushing

  • anv: factor out STATE_BASE_ADDRESS filling to helper function

  • anv: predicate emission of STATE_BASE_ADDRESS

  • anv: reuse device local variable

  • anv: avoid initializing TRTT stuff without sparseBinding

  • anv: fix vkCmdWaitEvents2 handling

  • anv: don’t apply descriptor array bound checking

  • brw: add missing break

  • brw: factor out source extraction for rematerialization

  • brw: improve rematerialization of surface/sampler handles

  • brw: bound the amount of rematerialized NIR instructions

  • brw: remove rematerialization assert

  • brw: remove some brackets

  • brw: enable rematerialization of non 32bit uniforms

  • brw: always use new registers for load address increments

  • brw: annotate send instructions with surface handles generated with exec_all

  • brw: avoid Wa_1407528679 in uniform cases

  • brw: blockify load_global_const_block_intel

  • brw: enable A64 loads source rematerialization

  • anv: limit vertex fetch invalidation on indirect read

  • anv: add a protected scratch pool

  • anv: prepare 2 variants of all shader instructions

  • anv: allocate compute scratch using the right scratch pool

  • anv: emit the right shader instruction for protected mode

  • anv: workaround flaky xfb query results on Gfx11

  • anv: fix u_trace on < Gfx12.0

  • intel/ds: remove duplicate arguments

  • hasvk: move cmd_emit_timestamp initialization to genX

  • hasvk: pass anv_address to predicate helper

  • brw: fix uniform rebuild of sources

  • anv: get rid of the second dynamic state heap

  • isl: account for protection in base usage checks

  • anv: properly flag image/imageviews for ISL protection

  • anv: propagate protected information for blorp operations

  • anv: fix check on pipeline mode to track buffer writes

  • vulkan/runtime: allow null/empty debug names

  • anv: reuse object string for RMV token

  • anv: add missing MEDIA_STATE_FLUSH for internal shaders

  • anv/blorp: force CC_VIEWPORT reallocation when programming 3DSTATE_VIEWPORT_STATE_POINTERS_CC

  • brw/rt: fix ray_object_(direction|origin) for closest-hit shaders

Louis-Francis Ratté-Boulianne (20):

  • dri_interface: add interface for EGL_EXT_surface_compression

  • gallium: add interface for fixed-rate surface/texture compression

  • egl/wayland: factor out common part of DRI image creation

  • egl: wire up EGL_EXT_surface_compression extension

  • st/dri2: add support for fixed-rate compression interface

  • egl/dri2: add support for EGL_EXT_surface_compression

  • mapi: add EXT_texture_storage_compression extension

  • mesa/st: add compression parameter to st_texture_create

  • mesa: implement EXT_texture_storage_compression extension

  • mesa: implement EXT_EGL_image_storage_compression extension

  • panfrost: Add AFRC overlay in v10 xml specification

  • panfrost: add device querying for AFRC support

  • panfrost: add utils for AFRC fixed-rate support

  • panfrost: encode component order as an inverted swizzle (v10)

  • panfrost: add support for AFRC textures

  • panfrost: add support for AFRC render targets

  • panfrost: add support for AFRC modifiers

  • panfrost: add translation between modifier and compression rates

  • panfrost: add support for fixed-rate compression

  • panfrost: add PAN_AFRC_RATE env var to force a compression rate

Luc Ma (4):

  • loader: silence implicit-load zink error by the loader

  • gallium: properly propagate the usage of resource

  • gallium: inline trivial needs_pack()

  • meson: Build pipe-loader when build-tests is true

Lucas Fryzek (7):

  • llvmpipe: query winsys support for dmabuf mapping

  • u_gralloc/fallback: Set fd from handle directly

  • egl/x11/sw: Implement swapbuffers with damage

  • vulkan/wsi: Update sw x11 wsi to only copy damage regions

  • egl/x11/sw: Implement shm support

  • egl/x11: Remove force software check for exporting SBWD

  • lp: only map dt buffer on import from dmabuf

Lucas Stach (2):

  • etnaviv: drm: don’t skip flush when there are active PMRs

  • etnaviv: always flush pending queries on get_query_result

M Henning (2):

  • nir: Handle texop_*_nv in nir_tex_instr_is_query

  • nak: Add minimum bindgen requirement

Maaz Mombasawala (2):

  • svga: Retry DRM_VMW_SYNCCPU ioctl on failure.

  • svga: Replace shared surface flag and simplify surface creation

Marcin Ślusarz (2):

  • intel/genxml/xe2: update MESH_CONTROL

  • anv,intel/compiler/xe2: fill MESH_CONTROL.VPandRTAIndexAutostripEnable

Marek Olšák (174):

  • ac/gpu_info: set tcc_rb_non_coherent only if number of TCCs != number of RBs

  • ac/surface: disable DCC for 3D textures on gfx9 to improve performance

  • ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

  • radeonsi: don’t invalidate L2 for internal compute without DCC stores

  • radeonsi: fix KHR-GL46.texture_lod_bias.texture_lod_bias_all on gfx10-11

  • radeonsi: validate IO semantics in scan_io_usage

  • radeonsi: add workarounds for DCC MSAA for gfx9-10

  • radeonsi: enable DCC for MSAA on gfx10-10.3

  • radeonsi: check for FMASK correctly in gfx10_get_bin_sizes

  • amd/ci: 17 piglit failures are fixed for raven

  • nir: add ACCESS_CP_GE_COHERENT_AMD

  • nir: add nir_atomic_op_ordered_add_gfx12_amd

  • nir: add streamout intrinsics for AMD GFX12

  • nir: add sleep intrinsics for AMD

  • nir: validate src_type of store_output intrinsics, require bit_size >= 16

  • nir: add more build helpers

  • nir: add shader_info::use_aco_amd

  • nir/lower_tex: support FMASK loads with a 16-bit sample index

  • nir/lower_image: support FMASK loads with a 16-bit sample index

  • drm-uapi: update amdgpu_drm.h and drm_fourcc.h for gfx12

  • amd: import gfx12 addrlib

  • amd: add gfx12 register definitions

  • amd: add gfx12 register definitions into the register header generator

  • amd: add initial common code for gfx12

  • ac/nir: update ac_nir_lower_resinfo for gfx12

  • ac/nir,llvm: add GS VGPR changes for gfx12

  • ac/llvm: use new s_wait instructions and split the existing ones for gfx12

  • ac/llvm: add new cache flags for gfx12

  • ac/llvm: add CS SGPR changes for gfx12

  • ac/llvm: update inline assembly for buffer_load_format_xyzw with TFE for gfx12

  • ac/nir: add ac_nir_sleep and handle the intrinsics

  • ac/nir: add gfx12 streamout NIR code

  • ac/llvm: handle nir_atomic_op_ordered_add_gfx12_amd

  • ac/llvm: implement nir_intrinsic_ordered_xfb_counter_add_gfx12_amd

  • ac/llvm: add a workaround for nir_intrinsic_load_constant for LLVM on gfx12

  • ac/surface: add gfx12

  • ac/surface/tests: add gfx12 tests

  • radeonsi: add gfx12

  • util: shift the mask in BITSET_TEST_RANGE_INSIDE_WORD to be relative to b

  • ac/llvm: improve/simplify/fix load_ssbo

  • radeonsi: serialize shader disassembly string to fix asm dumps for ACO

  • radeonsi: fix the size of the query result SSBO

  • radeonsi: validate the buffer range in si_set_shader_buffer

  • radeonsi: remove GDS tests

  • radeonsi: set flags directly instead of having needs_db_flush

  • radeonsi/gfx11: use a lighter workaround for Navi31 dEQP failures

  • radeonsi: get NIR options from si_screen instead of calling get_compiler_options

  • radeonsi: minor simplifications of clear/copy_buffer shaders

  • radeonsi: simplify the complex clear/copy_buffer shader

  • radeonsi: use set_work_size for all internal compute dispatches

  • radeonsi: replace the clear_12bytes_buffer shader with the DMA compute shader

  • radeonsi: remove slow code from si_msaa_resolve_blit_via_CB

  • radeonsi/ci: fix caselists for vk-gl-cts/main

  • radeonsi/ci: update failures for all generations

  • radeonsi/ci: remove some gfx11 flakes

  • radeonsi: constify struct pipe_vertex_buffer *

  • nir/algebraic: eliminate pack+unpack and unpack+pack pairs

  • ac: move radv_mem_vectorize_callback to common code

  • ac/llvm: global stores should have no holes in the writemask

  • radeonsi: call nir_lower_int64 later to fix ACO failure with Tomb Raider

  • radeonsi: vectorize load/stores and shrink stores

  • amd: update addrlib

  • amd: add more gfx11 APUs

  • amd: enable 32B minimum DCC block size for gfx1151

  • ac/llvm: fix incorrect parameter type in llvm.amdgcn.s.nop

  • radeonsi: vectorize loads/store after ABI lowering and optimizations

  • radeonsi/gfx12: fix the alpha ref value

  • radeonsi/gfx12: fix incorrect condition for when to do clear_buffer via compute

  • radeonsi/gfx12: disable CU1 instead of CU0 for GS due to SQTT

  • radeonsi/gfx12: fix a regression in si_set_mutable_tex_desc_fields

  • radeonsi/gfx12: fix depth bounds register values

  • radeonsi/gfx12: fix a regression in si_init_depth_surface

  • radeonsi: don’t lower UBO/SSBOs to descriptors if they are already lowered

  • radeonsi: lower NIR resource srcs to descriptors last

  • ac/descriptors: fix gfx12 regressions

  • ac/nir/lower_ngg: use global_atomic_amd to fix gfx12 streamout

  • ac/nir/lower_ngg: use voffset in global_atomic_add for xfb

  • ac: add gfx12 DCC shared code

  • radeonsi/gfx12: fix GPU deadlocks due to query result incoherency

  • radeonsi: assume si_set_ring_buffer is only used by gfx6-10.3

  • radeonsi: remove cp_to_L2 and L2_to_cp, inline the values

  • radeonsi: remove RADEON_FLAG_READ_ONLY

  • radeonsi: allow RADEON_HEAP_BIT_GL2_BYPASS for VRAM

  • radeonsi: remove leftover comment of non-existent RADEON_FLAG_MALL_NOALLOC

  • radeonsi/gfx12: add DCC

  • ac/surface: pass the correct addrlib handle to Addr3GetPossibleSwizzleModes

  • amd: update addrlib

  • ac/nir/lower_ngg: don’t use gfx12 xfb defs outside their basic block on gfx11

  • radeonsi/gfx12: fix stencil corruption

  • gallium/u_blitter: add option to override fragment shader for util_blitter_blit

  • radeonsi: don’t declare 3D coordinates in the compute blit if they aren’t needed

  • radeonsi: use better workgroup sizes for compute blits to improve perf

  • radeonsi: ignore PIPE_SWIZZLE_1 for 40% VGPR usage reduction for compute blits

  • radeonsi: remove fp16_rtz from the compute blit

  • radeonsi: use MIMG D16 (16-bit data) for image instructions in compute blits

  • radeonsi: optimize unaligned compute blits

  • radeonsi: fix sample0_only for the compute blit

  • radeonsi: reject unsupported parameters as the first thing in the compute blit

  • radeonsi: don’t use si_can_use_compute_blit in the compute blit

  • radeonsi: don’t fail due to DCC when using the compute blit on compute queues

  • radeonsi/gfx11: enable MSAA image stores in the compute blit

  • radeonsi: document better how X/Y flipping in the compute blit works

  • radeonsi: cosmetic and robustness changes for the compute blit

  • radeonsi: extend the compute blit to do image clears as well

  • radeonsi: switch compute image clears to the compute blit shader

  • radeonsi: rename si_compute_blit “testing” parameter to “fail_if_slow”

  • radeonsi: rename si_compute_copy_image -> si_compute_copy_image_old

  • radeonsi: add a new version of si_compute_copy_image using the compute blit

  • radeonsi: switch the old compute image copy to the new one using the blit

  • radeonsi: remove the old si_compute_copy_image

  • radeonsi: convert the compute blit shader hash table to u64 keys

  • radeonsi: split xy_clamp_to_edge to separate X and Y flags for the compute blit

  • radeonsi: restructure (rewrite) the compute blit shader

  • radeonsi: add flags parameter into si_compute_blit to replace fail_if_slow

  • radeonsi: change the compute blit to clear/blit multiple pixels per lane

  • radeonsi: extend NIR compute helpers to allow returning 16-bit results

  • radeonsi: use MIMG A16 (16-bit image coordinates) in compute blits

  • radeonsi: print the compute shader blit key for AMD_DEBUG

  • radeonsi: use shader_info::use_aco_amd to determine whether to use ACO

  • radeonsi: add use_aco into CS blit shader key

  • radeonsi: clear color buffers via compute for special tiling cases

  • radeonsi: add a custom MSAA resolving pixel shader

  • radeonsi: add fail_if_slow parameter into si_msaa_resolve_blit_via_CB

  • radeonsi: add a new blit microbenchmark

  • radeonsi: add decision code to select when to use CB_RESOLVE for performance

  • radeonsi: add decision code to select when to use compute blit for performance

  • ac/nir: import the MSAA resolving pixel shader from radeonsi

  • ac/nir: import the universal compute clear/blit shader

  • ac/nir: import the dispatch logic for the universal compute clear/blit shader

  • Revert “radeonsi: fix initialization of occlusion query buffers for disabled RBs”

  • radeonsi/ci: update gfx10.3 failures

  • nir/lower_io_to_scalar: add new_component temporary variable

  • nir/lower_io_to_scalar: don’t create output stores that have no effect

  • nir: add nir_opt_vectorize_io, vectorizing lowered IO

  • glsl/linker: vectorize lowered IO

  • nir: add a NIR option flag nir_io_prefer_scalar_fs_inputs

  • ac/nir/cdna: allow 16-bit coordinates

  • ac/nir/cdna: ignore image_descriptor intrinsics

  • ac/nir/cdna: don’t use image_descriptor intrinsics if the src is a descriptor

  • mesa: switch remaining shader functions from SHA1 to BLAKE3

  • radeonsi: replace shader SHA1 hashes with BLAKE3

  • radeonsi: don’t use CP DMA on GFX940

  • nir: rename ordered_xfb_counter_add_gfx12_amd -> ordered_add_loop_gfx12_amd

  • ac/nir: remove sleeps from gfx12 streamout code

  • ac/llvm: remove s_nop from ordered_add_loop_gfx12_amd

  • ac/llvm: fix inline assembly register constraints for ordered_add_loop_gfx12_amd

  • ac/llvm: add s_nops before the ordered add loop and s_wait_alu workaround

  • radeonsi: implement nir_intrinsic_load_ssbo_address

  • radeonsi: expose internal buffer bindings to compute shaders

  • radeonsi/gfx12: always set BO metadata, not just during export

  • radeonsi/gfx12: fix compute register settings for global_atomic_ordered_add

  • ac/surface: finish display DCC for gfx11.5

  • ac/surface: finish display DCC for gfx12

  • radeonsi: add fail_if_slow parameter into compute_clear/copy_buffer

  • radeonsi: use a hash_table and define a shader key for the DMA compute shader

  • radeonsi: add dwords_per_thread parameter into si_compute_clear_copy_buffer

  • radeonsi: clear buffers with a 12B clear value by clearing 4 dwords per thread

  • radeonsi: rewrite the clear/copy_buffer microbenchmark

  • radeonsi/ci: update gfx11 failures

  • radeonsi: replace si_shader::scratch_bo with scratch_va, don’t set it on gfx11+

  • radeonsi: don’t update compute scratch if the compute shader doesn’t use it

  • ac: add radeon_info::has_scratch_base_registers

  • radeonsi: lock a mutex when updating scratch_va for compute shaders

  • util: make util_idalloc_exists private

  • util: don’t use variable names that can appear in args of idalloc foreach macros

  • util: add util_idalloc_sparse, solving the excessive virtual memory usage

  • mesa: switch ID allocation to util_idalloc_sparse to reduce virtual memory usage

  • nir/opt_algebraic: use fmulz for fpow lowering to fix incorrect rendering

  • radeonsi/gfx12: fix a GPU hang due to an invalid packet with window rectangles

  • radeonsi: ensure TC_L2_dirty is set if we don’t sync after internal SSBO blits

  • radeonsi: fix buffer coherency issues on gfx6-8,12 due to missing PFP->ME sync

  • radeonsi/gfx12: fix register programming to fix GPU hangs

  • radeonsi/gfx12: fix VS output corruption with streamout

  • ac/surface/gfx12: turn off HiZ for pre-production samples

Mark Burton (1):

  • gallivm: Fix compilation errors when using LLVM 13.

Mark Collins (21):

  • vdrm: Add fixed VA parameter for mapping memory

  • tu: Handle VkDeviceMemory BO unmapping in VkUnmapMemory

  • tu: Implement VK_EXT_map_memory_placed

  • docs/features: Add VK_EXT_map_memory_placed

  • tu/shader: Allow LRZ when write pos with explicit early frag test

  • tu/lrz: Emit GRAS_LRZ_CNTL2 on A7XX

  • tu/lrz: Use actual CHIP rather than hardcoding A6XX

  • fd/a7xx: Initialize magic register 8C34 to 0

  • fd/a7xx: Initialize magic register 8008 to 0

  • tu: Allow LRZ on A7XX

  • tu/lrz: Add structure for LRZ FC layout

  • tu: Update LRZ FC allocation for A7XX layout

  • tu: Update LRZ FC dirty clear for A7XX

  • tu: Specify LRZ FC depth clear value on A7XX

  • tu: Enable LRZ fast-clear for A7XX

  • fd/a7xx: Document `LRZ_FLIP_BUFFER` event

  • docs/freedreno: Add documentation on A7XX LRZ

  • tu: Emit GRAS_LRZ_DEPTH_BUFFER_INFO correctly

  • tu/kgsl: Spin until KGSL reports queue timestamp during profiling

  • tu/kgsl: Fix profiling buffer GPU IOVA

  • fd/meson: Only build ‘ds’ when system has DRM

Martin Krastev (2):

  • svga: convert license block to SPDX

  • svga: update timespan in copyright message

Martin Roukala (né Peres) (9):

  • ci/b2c: Reduce the length of the kernel cmdline

  • nvk+zink/ci: rename the ga106 jobs to be more in line with RADV

  • nvk+zink/ci/ga106: make the expectations codename-specific

  • nvk+zink/ci: document more flakes in the ga106

  • turnip/ci: document a missing flake from the a750_vk job

  • turnip/ci: bump the a750_vk timeout

  • turnip+zink/ci: add more flakes to the expectations

  • radv+zink/ci: document recent flakes

  • radv/ci: add a bunch of flakes

Mary Guillemard (86):

  • nak: Pass has_mod to all forms of src2 requiring it

  • panvk: Ensure we lower load_base_workgroup_id to 0

  • panfrost: Skip new failure from VKCTS 1.3.8.x

  • nvk, nak: Wire up conservative rasterization underestimate

  • docs/features: Add EXT_conservative_rasterization for NVK

  • agx: speed-up dce

  • panvk: Only clear UBOs descriptors when set isn’t present

  • nouveau: nvidia_header: Add AMPERE_B class generation

  • nak: Set SPH version to 4 on SM75+

  • nak: Migrate sph.rs to use SPH headers definition

  • bi: Reformat code

  • midgard: Reformat code

  • bi: Alloc replacement array once in opt_cse

  • pan/lib, panvk: Ensure data_size is on 64 bits

  • panvk: Fix shader destruction when vk_shader_module_to_nir fails

  • panvk: Remove panvk_lower_blend

  • panvk: Remove dynarray from panvk_shader

  • panvk: Keep panvk_shader alive in panvk_pipeline_shader

  • panvk: Upload shader in panvk_shader

  • panvk: Upload copy tables in panvk_shader

  • panvk: Upload render state in panvk_shader

  • panvk: Move the linking bits to panvk_shader

  • panvk: Kill panvk_pipeline_shader and use panvk_shader directly

  • panvk: Link shaders at draw time

  • panvk: Move compile logic out of shader_create

  • panvk: Move NIR lower logic out of shader_create

  • panvk: Move preprocess logic out of shader_create

  • panvk: Implement vk_shader

  • panvk: Remove panvk_pipeline

  • pan/va: Ensure no clash with other defs in disassembler

  • bi: Make disassembler take a const void*

  • midgard: Make disassembler take a const void*

  • bi: Move bi_disasm definitions to their own header

  • panfrost: Add pan_shader_disassemble

  • panvk: Implement executable IR reporting

  • panvk: Advertise VK_KHR_pipeline_executable_properties

  • panvk: Generate proper device and driver UUIDs

  • panvk: Advertise VK_EXT_pipeline_creation_cache_control and VK_EXT_pipeline_creation_feedback

  • panvk: Advertise VK_EXT_shader_module_identifier

  • panvk: Advertise VK_KHR_pipeline_library and VK_EXT_graphics_pipeline_library

  • panvk: Enable pipeline library in CI for Mali-G52

  • docs: Update features.txt to add panvk for BDA extensions

  • panvk: Advertise VK_KHR_device_group and VK_KHR_device_group_creation

  • panvk: Reorder extensions by name

  • panvk: Advertise VK_KHR_maintenance3

  • panvk: Add missing null check in DestroyCommandPool

  • panvk: Add missing clean up in blend_shader_cache_init

  • panvk: Make mempool detect NULL BOs

  • panvk: Check for maxBufferSize in panvk_CreateBuffer

  • panvk: Make panvk_kmod_zalloc use correct allocation scope on non-transient

  • panvk: Ensure to unref transient bo in reset for mempools

  • panvk: Fix device mempool leaks

  • panvk: Add more allocation checks in create_device

  • panvk: Implement CmdDispatchBase

  • panvk: Enable device_init, null_handle and object_management in CI for Mali-G52

  • panvk: Advertise shaderModuleIdentifier feature

  • panvk: Report correct min value for discreteQueuePriorities

  • panvk: Enable dEQP-VK.info tests in CI for Mali-G52

  • panvk: Clamp viewport scissor to valid range

  • panvk: Enable offscreen_viewport tests in CI for Mali-G52

  • panvk: Skip dispatch on empty workgroup

  • panvk: Report proper workgroup invocation and size

  • panvk: Enable compute pipeline in CI for Mali-G52

  • panvk: Advertise VK_EXT_private_data

  • panvk: Do not emit blend shader when color_mask is 0

  • panvk: Run nir_lower_io_to_vector for fragment shader

  • panvk: Enable glsl.440.linkage in CI for Mali-G52

  • panvk: Implement and advertise anisotropy support

  • panvk: Advertise VK_KHR_sampler_mirror_clamp_to_edge

  • panvk: Enable texture filtering in CI for Mali-G52

  • pan/kmod: Avoid deadlock on VA allocation failure on panthor

  • panfrost: Handle context_init errors correctly

  • panfrost: Handle gracefully resource BO alloc failures

  • ci/panfrost: Update t760 fails

  • rusticl: Add panthor when panfrost is present in RUSTICL_ENABLE

  • bi: Clean up mem_access_size_align_cb

  • bi: Enable lower_pack_64_4x16

  • bi: Lower pack_32_4x8_split and pack_32_2x16_split in algebraic

  • bi: Enable lower_pack pass in compiler

  • bi: Implement basic 8-bit vec support

  • panfrost: Rewrite set_global_binding to make resources truly global

  • panfrost: Do not recreate bo if already mapped

  • panfrost: Increase address space to 48-bit

  • panfrost: Fetch available system memory

  • panvk: Fix image support in vertex jobs

  • panvk: Pass attrib_buf_idx_offset to desc_copy_info

MastaG (1):

  • gallivm: Call StringMapIterator from llvm:: scope

Matt Coster (1):

  • docs: List VK_EXT_debug_utils

Matt Turner (8):

  • intel: Build float64 shader only for Vulkan

  • intel/clc: Free parsed_spirv_data

  • intel/clc: Free disk_cache

  • intel/brw: Use REG_CLASS_COUNT

  • intel/elk: Use REG_CLASS_COUNT

  • docs: Drop references to LIBGL_DRIVERS_PATH

  • util: Add ATTRIBUTE_OPTIMIZE(flags)

  • util: Force emission of stack frame in stack unit test

Mauro Rossi (1):

  • intel/common: fix building error in intel_common.c

Maíra Canal (7):

  • v3dv: Use errno when logging an error to stderr

  • drm-uapi: Update v3d_drm.h

  • broadcom/common: Add maximum number of perf counters to v3d_device_info

  • v3dv: Use DRM_IOCTL_V3D_GET_COUNTER to get perfcnt information

  • v3d: Use DRM_IOCTL_V3D_GET_COUNTER to get perfcnt information

  • broadcom/simulator: Add DRM_V3D_PARAM_MAX_PERF_COUNTERS parameter support

  • broadcom/simulator: Add DRM_IOCTL_V3D_GET_COUNTER to simulator

Michel Dänzer (4):

  • wsi/wayland: Dispatch event queue in wsi_wl_swapchain_queue_present

  • wsi: Call drmSyncobjQuery only once for all images

  • egl/dri: Use packed pipe_format

  • dri: Go back to hard-coded list of RGBA formats

Mike Blumenkrantz (162):

  • glthread: check for invalid primitive modes in DrawElementsBaseVertex

  • zink: reconstruct features pnext after determining extension support

  • zink: prune zink_shader::programs under lock

  • zink: fully wait on all program fences during ctx destroy

  • kopper: fix bufferage/swapinterval handling for non-window swapchains

  • zink: slightly better swapinterval failure handling

  • kopper: don’t set drawable buffer age

  • zink: handle swapchain currentExtent special value

  • zink: clean up accidental debug print

  • dri: rename ‘implicit’ param from earlier series

  • tu: support VK_EXT_legacy_vertex_attributes

  • llvmpipe: add KHR-Single-GL45.arrays_of_arrays_gl.AtomicUsage skip

  • ci: disable lavapipe-vk-asan job

  • lavapipe: VK_EXT_legacy_vertex_attributes

  • zink: clamp buffer_indices_hashlist resets to used region

  • zink: delete GS conditional in update_so_info

  • zink: use zink_shader_key_optimal unions for pipeline state asserts

  • zink: use info.fs.uses_sample_qualifier instead of manual scan

  • zink: simplify confusing return in rewrite_tex_dest

  • zink: simplify flagging legacy shadow samplers

  • zink: rename zink_shader variable in create functions

  • zink: break out shadow sampler scanning

  • zink: always block the precompile threads when pruning shaders

  • zink: more effectively synchronize separate shader program precompiles

  • zink: use zink_shader type directly in zink_create_gfx_shader_state()

  • zink: split shader create into 2-stage functions

  • zink: reorder precompile_separate_shader_job() in file

  • zink: split generated tcs creation into 2-stage functions

  • zink: move gfx shader init to thread

  • zink: reorder some code in zink_create_gfx_program()

  • zink: reorder fencing in zink_create_gfx_program()

  • zink: split gfx program creation into 2-stage functions

  • zink: precompile_job() -> gfx_program_precompile_job()

  • zink: move blocking gfx program init functions to thread

  • ci: disable g52

  • egl/x11: disable dri3 with LIBGL_KOPPER_DRI2=1 as expected

  • zink: add a batch ref for committed sparse resources

  • u_blitter: stop leaking saved blitter states on no-op blits

  • freedreno/replay: use inttypes format string for 64bit

  • zink: delete unused zink_batch struct member

  • zink: move in_rp to zink_context

  • zink: move ref_lock from zink_batch to zink_batch_state

  • zink: move has_work from zink_batch to zink_batch_state

  • zink: rename last_was_compute -> last_work_was_compute

  • zink: move last_work_was_compute from zink_batch to zink_context

  • zink: move work_count from zink_batch to zink_context

  • zink: move swapchain from zink_batch to zink_context

  • zink: rename zink_batch::state -> zink_batch::bs

  • zink: delete all zink_batch uses from zink_query.c

  • zink: remove zink_batch usage from zink_clear.c

  • zink: remove all uses of zink_batch from zink_batch.c

  • zink: remove all zink_batch usage from zink_resource.h

  • zink: remove all zink_batch usage from zink_draw.cpp

  • zink: remove all zink_batch usage from zink_render_pass.c

  • zink: remove all zink_batch usage from zink_context.c

  • zink: delete zink_batch

  • zink: zink_batch_state::has_barriers -> has_reordered_work

  • zink: reset all the has_work flags in the same place

  • zink: check all has_work flags for flushes

  • zink: rely on zink_get_cmdbuf() to set has_work flags

  • zink: flag has_work in a few more places

  • zink: stop flagging has_work on batch tracking

  • zink: don’t submit main cmdbuf if has_work is not set

  • frontends/dri: only release pipe when screen init fails

  • frontends/dri: always init opencl_func_mutex in InitScreen hooks

  • zink: use u_minify for sparse calcs

  • zink: always commit full miptails

  • zink: refcount miptails

  • zink: clean up semaphore arrays on batch state destroy

  • zink: add a batch array for tracked semaphores

  • zink: stop leaking sparse semaphores

  • zink: rework sparse semaphore waits

  • ci: bump VVL to snapshot-2024wk19

  • zink: hook up VK_EXT_legacy_vertex_attributes

  • zink: set all spirv caps for the vvl vtn pass

  • ci: bump VVL to v1.3.285

  • zink: make unassigned io variables unreachable

  • zink: minor tweaks to shader io assignment

  • zink: outdent assign_producer_var_io()

  • zink: outdent assign_consumer_var_io()

  • zink: pass a struct through io assignment functions

  • zink: track masks of io locations used during linking

  • zink: unify io assignment

  • zink: move ‘reserved’ into io assign struct

  • zink: split slot map between regular varyings and patch

  • zink: ci updates

  • egl/dri2: fix error returns on dri2_initialize_x11_dri3 fail

  • nir/lower_aaline: fix for scalarized outputs

  • nir/linking: fix nir_assign_io_var_locations for scalarized dual blend

  • lavapipe: split out DGC into separate file

  • lavapipe: plumb print_cmds through NV DGC

  • lavapipe: lvp_indirect_command_layout -> lvp_indirect_command_layout_nv

  • zink: remove dgc debug mode

  • zink: add atomic image ops to the ms deleting pass

  • build/amd: add amd-use-llvm build option

  • ir3: flag progress from nir_lower_io_to_scalar

  • ir3: assert that no further optimizations can be done if !progress

  • gallium: add drawid_offset to draw_mesh_tasks interface

  • gallium: stop dropping drawid_offset param with util_draw_indirect

  • vulkan: Update XML and headers to 1.3.287

  • zink: add HKP to tiler mode switch

  • lavapipe: fix mesh+task binding with shader objects

  • mesa/st: fix zombie shader handling for non-current programs

  • zink: null check pipe loader config before use

  • zink: split out msaa replication

  • zink: implement msaa replication with dynamic rendering

  • radeonsi: enable compute pbo blits

  • ci: kill filament trace globally

  • zink: add a driver workaround to disable 2D_VIEW_COMPATIBLE+sparse

  • zink: free sparse page for miptail on uncommit

  • zink: remove adreno from broken_cache_semantics driver workaround

  • egl: deduplicate MESA_image_dma_buf_export enablement

  • egl: only enable MESA_image_dma_buf_export with PIPE_CAP_DMABUF

  • lavapipe: maint7

  • st/pbo: fix MESA_COMPUTE_PBO=spec crash on shutdown

  • st/pbo_compute: special case stencil extraction from Z24S8

  • mesa/st: use compute pbo download for readpixels

  • ci: bump vvl to v1.3.289

  • zink: add an a750 skip

  • zink: enable compute pbos for turnip

  • aux/tc: update docs to indicate replaced buffers have multiple pipe_resources

  • zink: don’t lower fpow

  • zink: propagate valid buffer range to real buffer when mapping staging

  • zink: track the “real” buffer range from replacement buffers

  • zink: modify some buffer mapping behavior for buffer replacement srcs

  • zink: move all driverID checks to a helper function

  • zink: hook up maintenance7

  • zink: use maint7 to capture venus driver and more accurately use workarounds

  • mesa/st: load state params for feedback draws with allow_st_finalize_nir_twice

  • egl/x11/sw: fix partial image uploads

  • egl/x11/sw: plug in swap_buffers_with_damage handling

  • winsys/radeon: take the full winsys struct in radeon_get_drm_value()

  • winsys/radeon: wrap fd access with util function

  • winsys/radeon: switch to rendernode when card node doesn’t work

  • winsys/radeon: revert recent changes

  • glx: directly link to gallium

  • egl: link with libgallium directly

  • gbm: link directly with libgallium

  • loader: delete loader_open_driver()

  • loader/dri3: check xfixes version in loader_dri3_open()

  • loader/dri3: avoid killing the xcb connection if dri3 not found

  • loader/glx: move multibuffers check to loader

  • egl: use loader’s multibuffer check to deduplicate lots of code

  • vl/dri3: use loader’s dri3 init code and delete everything else

  • zink: move image aoa access to nir pass

  • zink: use PIPE_CAP_NIR_SAMPLERS_AS_DEREF

  • gallium: install gallium-$version.so to libdir

  • ci: prune dri from LD_LIBRARY_PATH

  • dril: rework config creation

  • llvmpipe: handle vma allocation failure

  • llvmpipe: only use vma allocations on linux

  • dri: fix kmsro define

  • Revert “vl/dri3: use loader’s dri3 init code and delete everything else”

  • glx: include src/gallium for apple

  • dri: link with libloader

  • kopper: check swapchain size after possible loader image resize

  • pipe-loader: fix driconf memory management

  • dril: always take the egl init path

  • egl: fix zink init

  • dri: fix kms_swrast screen fail

  • egl/wayland: bail on zink init in non-sw mode if extension check fails

  • zink: fix partial update handling

Mike Lothian (2):

  • radeonsi,aco: Run ac_nir_lower_global_access pass

  • ac/llvm: Remove global access ops handling

Mingcong Bai (2):

  • meson: set default drivers for ppc, ppc64

  • meson: set default Vulkan drivers for ppc, ppc64

Mohamed Ahmed (4):

  • nil: Add a nil_image::compressed bit

  • nil: Add some helpers for DRM format modifiers

  • nil: Support creating images with DRM modifiers

  • nvk: enable rendering to DRM_FORMAT_MOD_LINEAR images

Mykhailo Skorokhodov (2):

  • egl/wayland: Fix sRGB format look up for config

  • ci/lima: expect fail of window_8888_colorspace_srgb on wayland

Nanley Chery (29):

  • intel/isl: Add and use _isl_surf_info_supports_ccs

  • intel/isl: Reduce halign for disabled CCS on XeHP

  • intel/isl: Update quote for XeHP’s CCS halign rule

  • intel/isl: Allow sampling from 3D HIZ_CCS_WT

  • intel/blorp: Factor bpb into the fast-clear rect

  • intel/blorp: Allow gfx12 fast-clears without CCS surf

  • intel/isl: Add and use ISL_DRM_CC_PLANE_PITCH_B

  • anv: Refactor modifier plane layout queries

  • intel/aux_map: Add and use INTEL_AUX_MAP_MAIN_PITCH_SCALEDOWN

  • intel/aux_map: Add and use INTEL_AUX_MAP_META_ALIGNMENT_B

  • intel/aux_map: Add and use INTEL_AUX_MAP_MAIN_SIZE_SCALEDOWN

  • intel/isl: Add and use ISL_MAIN_TO_CCS_SIZE_RATIO_XE

  • intel/isl: Add and use multi-engine surf usage bits

  • iris: Simplify bo import in memobj_create_from_handle

  • intel/isl: Assert alignments of surface addresses

  • anv: Rely on the primary surf usage to disable aux

  • anv,hasvk: Drop anv_get_isl_format_with_usage

  • anv: Support multiple aspects in anv_formats_ccs_e_compatible

  • anv: Rely more on ISL_SURF_USAGE_DISABLE_AUX_BIT

  • anv: Restrict CCS ISL surface creation to gfx9-11

  • iris: Add and use comp_ctrl_surf_offset on gfx12

  • intel/isl: Drop support for the gfx12 CCS ISL surf

  • intel/isl: Add and use isl_drm_modifier_needs_display_layout

  • iris,anv: Disable gfx12.0 fast-clears with unaligned pitch

  • intel/isl: Consolidate some tiling checks for CCS

  • intel/isl: Require display flag for 512B pitch alignment

  • intel/isl: Pad the pitch on gfx12.0 for fast-clears

  • anv+zink/ci: Change sparse test result from crash to fail

  • intel/isl: Enable Tile4 for CPB surfaces

Natanael Copa (1):

  • nir/opt_varyings: reduce stack usage

Neha Bhende (2):

  • svga: Retrieve stride info from hwtnl->cmd.vdecl for swtnl draws

  • dri: fix macro name check to detect svga driver

Oskar Viljasaar (8):

  • vulkan/properties: support Android in the property generator

  • v3dv: constify arguments of vendor/device id getters

  • v3dv: Use common runtime vk_properties

  • vulkan/properties: Document RENAMED_PROPERTIES in the property generator

  • anv: Move completely over to common runtime GetPhysicalDeviceProperties2

  • hasvk: switch to use runtime physical device properties infrastructure

  • vulkan: add a property struct setter function

  • venus: Use common physical device properties

Patrick Lerda (8):

  • gallium/auxiliary/vl: fix typo which negatively impacts the src_stride initialization

  • clover: fix pipe_box update regression

  • clover: fix memory leak related to optimize

  • r600: fix vertex state update clover regression

  • mesa/main: fix stack overflow related to the new mipmap code

  • radeonsi: fix assert triggered on gfx6 after the tessellation update

  • clover: fix meson opencl-spirv option

  • st/pbo_compute: fix async->nir memory leak

Paulo Zanoni (31):

  • isl: add ISL_TILING_64_XE2 to isl_tiling_to_name()

  • anv/sparse: add the MSAA block shape tables

  • anv/sparse: we can’t do multi-sampled depth/stencil sparse images

  • anv/sparse: properly reject sample counts we don’t support

  • anv/sparse: reject all sample flags that non-sparse doesn’t support

  • anv/sparse: fix block_size_B when the image is multi-sampled

  • anv/sparse: exclude Xe2’s Tile64’s non-standard block shapes

  • anv/sparse: flush the tile cache when resolving sparse images

  • anv/sparse: enable MSAA for Sparse when applicable

  • anv: check for VK_RENDERING_SUSPENDING_BIT once at CmdEndRendering

  • anv+zink/ci: add failures related to multi-sampled sparse binding

  • anv/sparse: assert a format can’t be standard and non-standard

  • anv/xe: fix declaration of memory flags for integrated non-LLC platforms

  • anv/sparse: reject 1D sparse residency images

  • anv/sparse: fix the image property sizes for multi-sampled images

  • anv/sparse: fix reporting of VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT

  • intel/isl: pass struct isl_tile_info to choose_image_alignment_el()

  • anv/sparse: dump info about opaque binds when DEBUG_SPARSE

  • anv/sparse: fix TR-TT page table bo size and flags

  • anv/sparse: remove obsolete linear tiling code path

  • anv/sparse: unify and rework tile size calculation

  • anv/sparse: use ANV_SPARSE_BLOCK_SIZE instead of tile_size when possible

  • anv: properly store the engine_class_supported_count values

  • anv: LNL+ doesn’t need the special flush for sparse

  • anv: reimplement the anv_fake_nonlocal_memory workaround

  • iris: fix iris_xe_wait_exec_queue_idle() on release builds

  • anv/trtt: fix the process of picking device->trtt.queue

  • anv/xe: try harder when the vm_bind ioctl fails

  • anv: don’t expose the compressed memory types when DEBUG_NO_CCS

  • anv: disable CCS for Source2 games on Xe2

  • intel: fix compute SLM sizes on Xe2 and newer

Pavel Ondračka (13):

  • r300: replace constant size field with usemask

  • r300: move dead constants pass earlier for vertex shaders

  • r300: switch to a new constant remap table format

  • r300: compact scalar uniforms into empty slots

  • r300: better packing for immediates

  • r300/ci: fails list update

  • r300: fix cycles counting for KIL

  • r300: fix writemask rewrite when converting to omod

  • r300: fix for output modifier and DDX/DDX

  • r300: fix RC_OMOD_DIV_2 modifier

  • r300: missing whitespace in shader stats

  • r300: vectorization tweaks for R300/R400

  • r300: bias presubtract fix

Philipp Zabel (7):

  • etnaviv: drm: Suppress get-param error message for non-existent core

  • etnaviv: drm: Stop after model query failure

  • etnaviv: Pass npu to etna_screen_create in a separate parameter

  • etnaviv: Add a separate NPU pipe

  • etnaviv: Allow collecting both GPU and NPU specs

  • etnaviv/nn: Pipe through input/accumulation buffer depth from hwdb

  • etnaviv: update headers from rnndb

Pierre-Eric Pelloux-Prayer (34):

  • radeonsi/sqtt: use ac_sqtt_get_shader_mask for spm counters

  • radeonsi/sqtt: cleanup si_sqtt_add_code_object a bit

  • radeonsi/sqtt: support sqtt buffer auto-resizing

  • radeonsi: add new si_shader_binary_upload_at method

  • radeonsi/sqtt: use si_shader_binary_upload_at to reupload shaders

  • radeonsi: allocate sqtt and spm buffers in GTT

  • radeonsi: add testmemperf mem bandwidth test

  • radv/sqtt: use radeon_check_space before emit_spm_*

  • radeonsi: use the common SQTT implementation

  • radeonsi/sqtt: add AMD_THREAD_TRACE_INSTRUCTION_TIMING

  • ac/sqtt: make VA helpers static

  • ac/llvm: implement WA in nir to llvm

  • radeonsi: store the total binary size in si_shader

  • radeonsi: handle DBG(TEX) after tc_compatible_htile is set

  • radeonsi/tests: don’t match gfx10_3 baseline for gfx10 family

  • radeonsi/tests: add a shortcut to re-run only failing tests

  • ac/surface: reject modifiers with retile_dcc and bpe != 32

  • radeonsi: add gfx11 workaround for upgraded_depth

  • ac/nir: don’t use the compute blit for PIPE_FORMAT_R5G6B5_UNORM

  • radeonsi/tests: update tests baseline

  • radeonsi/tests: clarify the output when results changes

  • radeonsi: fix buffer_size in si_compute_shorten_ubyte_buffer

  • Revert “ac, radeonsi: remove has_syncobj, has_fence_to_handle”

  • ac/info: remove has_syncobj

  • winsys/radeon: fill lds properties

  • radeonsi: fix crash in si_update_tess_io_layout_state for gfx8 and earlier

  • radeonsi/tests: correctly parse the family name

  • radeonsi: fix ac_create_shadowing_ib_preamble parameter

  • radeonsi, radv: bump libdrm_amdgpu version requirement

  • ci: bump Fedora and Android libdrm2 to 2.4.122

  • radeonsi: fix si_get_dmabuf_modifier_planes for gfx12

  • frontends/dri: add error logs to dri2_create_image_from_fd

  • amd: use a valid size for ac_pm4_state allocation

  • egl,gbm,glx: fix log message spam

Qiang Yu (8):

  • glsl: respect GL_EXT_shader_image_load_formatted when image is embedded in a struct

  • radeonsi: add missing nir_intrinsic_bindless_image_descriptor_amd

  • nir: fix lower array to vec metadata preserve

  • nir: fix clip cull distance lowering metadata preserve

  • nir: add filter parameter to nir_lower_array_deref_of_vec

  • nir: nir_vectorize_tess_levels support indirect access

  • nir: consider more deref types when fixup deref

  • glsl: fix indirect tess factor access for compact_arrays=false drivers

Rebecca Mckeever (9):

  • panvk: Add jm and bifrost dirs

  • panvk: Add push_uniform/constant helpers

  • panvk: Make helper functions panvk_cmd_buffer agnostic

  • panvk: Move panvk_descriptor_state to bifrost subdir

  • panvk: Move vkCmdDraw* functions to their own file

  • panvk: Move vkCmdDispatch* functions to their own file

  • panvk: Move vkCmd*Event functions to their own file

  • panvk: Add Valhall DescriptorSetLayout implementation

  • panvk: Add Valhall Descriptor{Set,Pool} implementations

Renato Pereyra (2):

  • anv: Attempt to compile all pipelines even after errors

  • intel/perf: Move sysmacros.h include from header to implementation

Rhys Perry (95):

  • aco/tests: add tests for hidden breaks/continues

  • aco/tests: add tests for divergent merge phi with undef

  • nir/dead_cf: stop reindexing blocks for each non-block cf node

  • aco/stats: fix s_waitcnt parsing

  • aco/stats: don’t use VS counter pre-GFX10

  • aco/waitcnt: fix DS/VMEM ordered writes when mixed

  • aco: make wait_imm indexable

  • aco/waitcnt: add target_info

  • aco/waitcnt: refactor for indexable wait_imm

  • aco/stats: refactor for indexable wait_imm

  • aco: add wait_imm::unpack and wait_imm::max

  • radv: keep track of unaligned dynamic vertex access

  • aco: form hard clauses in VS prologs

  • aco: copy VS prolog constants after loads

  • aco: support VS prologs with unaligned access

  • aco/util: improve small_vec assertion

  • radv: advertise VK_EXT_legacy_vertex_attributes

  • aco: don’t count certain pseudo towards VMEM_STORE_CLAUSE_MAX_GRAB_DIST

  • aco/tests: support GFX12

  • aco: add SFPU/ValuPseudoScalarTrans instr class

  • aco: add GFX11.5+ opcodes

  • aco: support GFX12 in assembler

  • aco/tests: add GFX12 assembler tests

  • aco: don’t change prefetch mode on GFX11.5+

  • aco/gfx12: disable s_cmpk optimization

  • aco: add GFX12 wait counters

  • aco/waitcnt: support GFX12 in waitcnt pass

  • aco/stats: support GFX12 in collect_preasm_stats()

  • aco: update VS prolog waitcnt for GFX12

  • aco/lower_phis: create loop header phis for non-boolean loop exit phis

  • aco: create lcssa phis for continue_or_break loops when necessary

  • aco: use scalar phi lowering for lcssa workaround

  • aco: remove nir_to_aco

  • aco/lower_phis: don’t create boolean loop header phis in some situations

  • radv: malloc graphics pipeline stages

  • aco: support GFX12 in insert_NOPs

  • aco/gfx12: implement subgroup shader clock

  • aco/gfx12: implement workgroup barrier

  • aco/gfx12: sign-extend s_getpc_b64

  • aco/gfx12: don’t create v_fmac_legacy_f32

  • aco/gfx12: use ttmp9/ttmp7 for workgroup id

  • radv/gfx12: don’t add workgroup id shader args

  • aco/gfx12: remove MIMG vector affinity

  • aco/gfx12: decrease max_nsa_vgprs for VSAMPLE

  • aco/gfx12: disallow SCC and most constants for BUF SOFFSET

  • aco: fix fddx/y with uniform inf/nan input

  • meson: remove --depfile for aco_tests

  • ac/llvm: implement load_subgroup_id

  • aco/gfx12: implement load_subgroup_id

  • ac/nir: skip subgroup_id/local_invocation_index lowering for gfx12

  • aco/gfx12: fix s_wait_event immediate

  • aco: don’t combine vgpr into writelane src0

  • aco: implement nir_atomic_op_ordered_add_gfx12_amd

  • aco: implement nir_intrinsic_nop_amd and nir_intrinsic_sleep_amd

  • ac/nir: support lowering of sub-dword push constants

  • radv: lower sub-dword push constants

  • ac/llvm: remove support for sub-dword push constants

  • aco: remove support for sub-dword push constants

  • aco/gfx6: set glc for buffer_store_byte/short

  • aco: inline store_vmem_mubuf/emit_single_mubuf_store

  • aco: use ac_hw_cache_flags

  • aco: use GFX12 scope/temporal-hint

  • ac: stop using radeon_info for ac_get_hw_cache_flags

  • aco: use ac_get_hw_cache_flags()

  • aco: remove some missing label resets

  • nir/opt_loop: rematerialize derefs instead of creating phis

  • nir/opt_loop: fix formatting

  • aco: insert s_nop before discard early exit sendmsg(dealloc_vgpr)

  • radv: lower push constants in NIR

  • ac/llvm: remove push constants

  • aco: remove push constants

  • aco/insert_exec_mask: ensure top mask is not a temporary at loop exits

  • vtn: ensure TCS control barriers have a large enough memory scope

  • aco: use 1.5x vgprs for gfx1151 and gfx12

  • aco: skip continue_or_break LCSSA phis when not needed

  • aco: use s_pack_ll_b32_b16 for pack_32_2x16_split

  • aco: combine extracts into s_pack_ll_b32_b16

  • aco: use s_pack_*_b32_b16 more in p_insert/p_extract lowering

  • aco: turn split(vec()) into p_parallelcopy instead of p_create_vector

  • aco: add missing isConstant()/isTemp() checks

  • aco: fix follow_operand with combined label_extract and label_split

  • aco: use alignment information in visit_load_constant()

  • aco: fix wmma raw hazard

  • aco: replace constant v_bfrev_b32 with v_mov_b32 to create vopd

  • aco/gfx11: don’t use v_bfrev_b32 with wave64

  • glsl: always lower non-TCS outputs to temporaries

  • gallium: remove PIPE_CAP_SHADER_CAN_READ_OUTPUTS

  • nir/linking_helpers: remove special case for read mesh outputs

  • nir/linking_helpers: remove varying accesses in nir_remove_unused_io_vars

  • nir/linking_helpers: remove nested IF

  • radv: remove unnecessary nir_remove_unused_varyings cleanup passes

  • aco/gfx11.5: workaround export priority issue

  • aco: fix validation of v_s_ opcodes

  • docs: update ACO_DEBUG documentation for scheduler options

  • docs: update ACO_DEBUG documentation for perfwarn

Rob Clark (63):

  • tu: Add missing error path cleanup

  • tu: Fix a6xx lineWidthGranularity

  • freedreno/ir3: Skip DAG validation on release builds

  • llvmpipe: Fix build error with clang-18

  • freedreno/ci: Switch a618_piglit to deqp-runner

  • vulkan/android: Add helper to probe AHB support

  • vulkan: Don’t request Ycbcr conversion for rgb

  • vulkan: Add helper to resolve Android external format

  • tu: Skip YUV conversion for RGB formats

  • tu: Support VkExternalFormatANDROID

  • freedreno/ci: Remove some skips

  • freedreno/ci: Remove some obsolete skips

  • freedreno/ci: Refactor out common a6xx skips list

  • freedreno/ci: Skip unsupported legacy gl stuff

  • freedreno/ci: Skip max-texture-size

  • freedreno/ci: Add a common skips file to a618_piglit.

  • freedreno/ci: Skip built-in-functions VS/GS tests

  • freedreno/ci: Skip some slow tests

  • freedreno/ci: Increase a630/a618 piglit fraction

  • freedreno/ir3: Fix ldg/stg offset

  • egl/android: Fix sRGB visuals

  • docs/features: Add missing AHB for tu

  • tu: Don’t advertise AHB handle time on non-android

  • freedreno: Namespace DEFINE_CAST()

  • virgl: Update headers

  • loader: Add better support for virtgpu nctx driver loading

  • freedreno/loader: Switch over to probe_nctx

  • vulkan/android: Fix suggestedYcbcrModel with !mapper4

  • tu: Fix imageview + ahb

  • vulkan/android: Fix YcbcrRange for !mapper4

  • ir3: Add some more missing progress accumulation

  • gallium/tc: Add optional buffer replacement limit

  • freedreno: Use buffer replacement limit

  • gallium/tc: Allow replacement if replacing valid_range

  • freedreno/drm: Add rd dumper support

  • st/mesa/pbo: Set src type on image_store

  • freedreno: Handle non-null cb with null buffer

  • u_blitter+d3d12: Move stencil fallback clear to caller

  • freedreno/a6xx: Implement S8 support

  • freedreno: Implement stencil blit fallback

  • freedreno: Use LINEAR for staging resources

  • freedreno/a6xx: Drop 16b packed image formats

  • freedreno/bc: Rework flush order

  • freedreno/a6xx: Tweak blitter traces

  • freedreno/a6xx: Skip blitter for L/A conversions

  • freedreno/a6xx: Add more format swizzles

  • freedreno/a6xx: Allow blit based transfers

  • freedreno: Enable the X1-85

  • tu: Fix issues with 16k (or larger) page sizes

  • freedreno/drm/virtio: Fix issues with 16k (or larger) page sizes

  • freedreno/a6xx: Implement reg stomper support

  • freedreno/a7xx: Fix GRAS_UNKNOWN_80F4 writes

  • freedreno/cffdec: Fix a7xx CP_EVENT_WRITE decoding

  • tu/drm/virtio: Add missing a7xx case

  • freedreno/drm: Handle a7xx case

  • freedreno: Move GENX/CALLX magic to common

  • freedreno: Extract out common UBWC helper

  • freedreno: Extract out shared LRZFC layout helpers

  • freedreno/a6xx: Allocate lrzfc when needed for direction tracking

  • freedreno/a6xx: Refactor CP_EVENT_WRITE emit

  • freedreno/a6xx: Rework CCU_CNTL emit for a7xx

  • freedreno/a6xx: Initial a7xx support

  • gallium: Add option to not add version to libgallium filename

Robert Mader (3):

  • egl: Implement EGL_EXT_config_select_group

  • egl: Implement EGL_MESA_x11_native_visual_id

  • egl/x11: Allow all RGB visuals to match 32-bit RGBA EGLConfigs

Rohan Garg (21):

  • anv: formatting fix when printing pipe controls

  • anv: allocate space for generated indirect draw id’s using the temporary allocation helper

  • intel/brw: update Xe2 max SIMD message sizes

  • Revert “iris: slow clear higher miplevels on single sampled 8bpp resources that have TILE64”

  • intel/eu/xe2+: Fix src1 length bits of SEND instruction with UGM target.

  • intel/brw: Advertise fp64 atomic add’s when we have 64 bit float support and a LSC

  • intel/brw: We no longer have atomic fmin/fmax ops for fp64 in xe2

  • intel/genxml: add the new state byte stride instruction

  • intel/genxml: update 3DSTATE_CPSIZE_CONTROL_BUFFER for xe2+

  • isl: enable compression for CPS buffers on xe2+

  • intel/genxml: update CFE_STATE for LNL

  • intel/genxml: Update XY_BLOCK_COPY_BLT

  • intel/genxml: update MI_SEMAPHORE_WAIT for Xe2

  • intel/genxml: Update STATE_COMPUTE_MODE for Xe2

  • anv: 3D stencil surfaces have fewer layers for higher miplevels

  • isl: disable CCS for 3D depth/stencil surfaces when WA is applicable

  • isl: Enable volumetric STC_CCS,HiZ+CCS on gfx12.0

  • intel/genxml: Add RESOURCE_BARRIER for xe2

  • intel/compiler: fix shuffle generation on LNL

  • anv: flag WSI images as scanout images for ISL

  • anv: reuse existing macro to query for flushes

Roland Scheidegger (1):

  • lavapipe: add option to enable snorm blending

Romain Naour (1):

  • glxext: don’t try zink if not enabled in mesa

Roman Stratiienko (11):

  • vulkan/android: Add basic u_gralloc support

  • vulkan/android: Add common vkGetSwapchainGrallocUsage{2}ANDROID

  • vulkan/android: Add android buffer classification to vk_image

  • vulkan/android: Add common helpers for the ANB extension

  • vulkan/android: Add common helpers for the AHB extension

  • vulkan/android: Add common vkGetAndroidHardwareBufferPropertiesANDROID

  • turnip/android: Migrate to common ANB code

  • v3dv/android: Migrate ANB and AHB to use common helpers

  • u_gralloc/fallback: Extract modifier from QCOM native_handle

  • turnip/android: Use DETECT_OS_ANDROID in tu_device

  • turnip/android: Use DETECT_OS_ANDROID in freedreno_rd_output

Romaric Jodin (1):

  • intel/brw: allocate large table in the heap instead of the stack

Ruijing Dong (14):

  • radeonsi/vcn: add vcn5 encoding interface change

  • radeonsi/vcn: add vcn5.0 for h264 enc only

  • radeonsi/vcn: add hevc support for vcn5

  • radeonsi/vcn: enable decoding in vcn5.

  • radeonsi/vcn: correct tile_size_bytes_minus1

  • radeonsi/vcn: add cdef modes for vcn5 encoding

  • radeonsi/vcn: apply cdef mode to vcn5

  • radeonsi/vcn: share functions between vcn4/vcn5

  • frontends/va: parsing uniform_tile_spacing flag

  • radeonsi/vcn: add header files for vcn5 av1 tile

  • radeonsi/vcn: enable av1 encoding in vcn5

  • radeonsi/vcn: enable roi feature for vcn5

  • radeonsi/vcn: remove tile_config_flag

  • radeonsi/vcn: update vcn4 tile processing logic

Ryan Neph (7):

  • venus: reclaim signal semaphore feedback resources for wasteful clients

  • venus: sync headers for VK_EXT_external_memory_acquire_unmodified

  • venus: enable VK_EXT_external_memory_acquire_unmodified

  • venus: factor image memory barrier fixes to common implementation

  • venus: refactor image memory barrier fix storage and conventions

  • venus: skip barrier fixes as early as possible

  • venus: chain VkExternalMemoryAcquireUnmodifiedEXT for wsi ownership transfers

Rémi Bernon (2):

  • zink: Add VKAPI_PTR specifier to zink_stub_function_not_loaded.

  • zink: Add VKAPI_PTR specifier to generated stub functions.

Sagar Ghuge (8):

  • intel/compiler: Fix destination type for CMP/CMPN

  • intel/disasm: Fix cache load/store disassembly for URB messages

  • iris: Load 32-bit MMIO PREDICATE register from buffer

  • intel/compiler: No need to re-type the destination register

  • intel/fs: Adjust destination register size for untyped atomic on Xe2+

  • intel/fs: Adjust destination register size for global atomic on Xe2+

  • intel/compiler: Don’t use half float param for sample_b

  • intel/compiler: Add indirect mov lowering pass

Samuel Pitoiset (399):

  • radv: fix image format properties with fragment shading rate usage

  • docs: Add an alternative way to debug GPU hangs with RADV

  • radv/rt: add radv_ray_tracing_state_key

  • radv/rt: pass radv_ray_tracing_state_key to radv_rt_pipeline_compile()

  • radv/rt: rework the helper that hashes a ray tracing pipeline

  • radv/ci: add more flakes

  • radv: simplify DB_Z_INFO.NUM_SAMPLES with null ds target on GFX11

  • radv: remove bogus VkShaderCreateInfoEXT::flags being 0 assert for compute

  • radv: simplify radv_emit_primitive_restart_enable()

  • radv: inline radv_get_pa_su_sc_mode_cntl() in radv_emit_culling()

  • radv: remove useless DB_Z_INFO.NUM_SAMPLES when emitting the MSAA state

  • radv: pre-compute VGT_TF_PARAM.DISTRIBUTION_MODE

  • radv: use the bound GS copy shader when emitting shader objects

  • radv: add GS copy shader BO to the cmdbuf BO list at bind time

  • radv: add RT prolog BO to the cmdbuf BO list at bind time

  • radv: add shaders BO to the cmdbuf BO list at bind time

  • radv: emit compute pipelines directly from the cmdbuf

  • radv: precompute compute/task shader register values

  • radv: clear unwritten color attachments for monolithic PS earlier

  • radv: compact SPI_SHADER_COL_FORMAT as late as possible

  • radv: rename col_format_non_compacted to spi_shader_col_format

  • radv: store cb_shader_mask for fragment shaders and epilogs

  • radv: add a new dirty state for emitting the color output state

  • radv/ci: document a recent regression on GFX6-8

  • radv: split cmdbuf dirty flags into dirty/dirty_dynamic

  • radv: precompute existing legacy GS register values later

  • radv: precompute fragment shader register values

  • radv: precompute mesh shader register values

  • radv: precompute legacy GS register values

  • radv: precompute vertex shader register values

  • radv: precompute DB_SHADER_CONTROL for fragment shaders later

  • vulkan: Update XML and headers to 1.3.284

  • aco: add support for remapping color attachments

  • radv: implement VK_KHR_dynamic_rendering_local_read

  • radv: advertise VK_KHR_dynamic_rendering_local_read

  • radv: add a new mechanism for tracking registers per cmdbuf

  • radv: move common registers between VS/GS and NGG

  • radv: precompute NGG register values

  • radv: remove unused parameter to radv_pipeline_emit_pm4()

  • radv: stop recomputing the last VGT API stage when emitting graphics shaders

  • radv: do not emit non-context registers to radv_pipeline::ctx_cs

  • radv: track and bind more VRS states from the graphics pipeline

  • radeonsi: remove the _unused parameter in all radeon_xxx macros

  • radv: remove gfx10_emit_ge_pc_alloc()

  • radv: do not emit VGT_GS_OUT_PRIM_TYPE to ctx_cs on GFX11

  • radv: simplify radv_emit_hw_ngg() slightly

  • radv: simplify radv_emit_hw_vs() slightly

  • radv: simplify radv_emit_hw_gs() slightly

  • radv: fix the late scissor workaround for GFX9 since a recent refactoring

  • radv: make radv_conv_gl_prim_to_gs_out() a non-static function

  • radv: emit graphics pipelines directly from the cmdbuf

  • radv: add graphics shaders context registers that need to be tracked

  • radv: add more radeon_opt_set_xxx variants

  • radv: track all graphics shaders context registers

  • radv: simplify radv_emit_ps_inputs() slightly

  • radv: stop using radv_physical_device for radeon helpers

  • radv: introduce radeon_set_reg_seq()

  • radv: remove redundant radeon_set_perfctr_reg() helper

  • radv: rename radeon perfctr uconfig helpers

  • radv: add a helper to configure ring buffer descriptors

  • radv: only enable VK_MESA_image_alignment_control on GFX9-11.5

  • radv: reject unsupported buffer formats earlier

  • ac,radv,radeonsi: add a helper to translate buffer numformat

  • ac,radv,radeonsi: add a helper to translate buffer dataformat

  • radv: simplify radv_emit_default_sample_locations()

  • radv: pass radv_physical_device to radv_emit_default_sample_locations()

  • radv: use float instead of double for viewport zscale/ztranslate

  • radv: add more helpers to emit viewports

  • radeonsi: refactor si_translate_border_color()

  • ac,radv,radeonsi: introduce a helper to build a sampler descriptor

  • radv: stop checking the output value of radv_translate_tex_numformat

  • radv: use PIPE_FORMAT in radv_translate_tex_numformat()

  • ac,radv,radeonsi: add a function to translate tex numformat

  • radv: use PIPE_FORMAT in radv_translate_colorswap()

  • ac,radv,radeonsi: add a function to translate colorswap

  • radv: use PIPE_FORMAT in radv_translate_dbformat()

  • ac,radv,radeonsi: add a function to translate db format

  • ac,radv,radeonsi: add a function to get the color format endian swap

  • radv: allow 3d views with VK_IMAGE_CREATE_2D_VIEW_COMPATIBLE_BIT_EXT

  • radv: simplify creating gfx10 texture descriptors for sliced 3d/2d view of 3d

  • radv: remove redundant check for VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 on GFX6-10

  • radv: stop checking the return value of ac_get_cb_number_type()

  • radv: add radv_is_colorbuffer_format_blendable()

  • amd/common: move some format related helpers to ac_formats.c

  • ac,radv,radeonsi: add helper to know if a format is supported by CB

  • ac,radv,radeonsi: add helper to know if a format is supported by DB

  • ac,radv,radeonsi: introduce a helper to build a FMASK descriptor

  • radeonsi: reject some texture formats but only on GFX8/CARRIZO

  • radv: mark some formats as unsupported on GFX8/CARRIZO

  • radv: set image view descriptors as buffer for non-graphics GPU

  • ac,radv,radeonsi: add a helper to get the tile mode index

  • radv: tidy up swizzle in radv_set_mutable_tex_desc_fields()

  • radv: tidy up custom pitch for gfx10.3 in radv_set_mutable_tex_desc_fields()

  • radv: tidy up meta_va in radv_set_mutable_tex_desc_fields()

  • radv: set ITERATE_256 for GFX10+ in radv_set_mutable_tex_desc_fields()

  • radv: stop clearing unnecessary bitfields in radv_set_mutable_tex_desc_fields()

  • ac,radv,radeonsi: add a helper to set mutable tex desc fields

  • ac,radv,radeonsi: add a function for getting border color swizzle

  • radv: only set ALPHA_IS_ON_MSB if the image has DCC on GFX6-9

  • radv: adjust determining if alpha should be on MSB

  • ac,radv,radeonsi: add function to determine if alpha should be on MSB

  • ac,radv,radeonsi: add a common helper for translating swizzle

  • amd/common: only pass gfx_level to ac_get_gfx10_format_table()

  • amd/common: add new helpers to build buffer descriptors

  • radv: use new common helpers for building buffer descriptor

  • aco: use new common helpers for building buffer descriptors

  • radv: remove unused radv_translate_buffer_dataformat()

  • radv: remove useless check about FIXED formats

  • radv: simplify radv_is_vertex_buffer_format_supported()

  • radv: replace vk_to_non_srgb_format() by vk_format_no_srgb()

  • radv: fix setting a custom pitch for CB on GFX10_3+

  • radv: replace db_{z,stencil}_{read,write}_base by db_{depth,stencil}_base

  • radv: tidy up radv_initialise_ds_surface()

  • radv: separate non-mutable vs mutable fields for ds surface

  • amd/common: add a function to initialize ds surface

  • radv: use the common helper for initializing DS surfaces

  • radeonsi: use the common helper for initializing DS surfaces

  • radv: only enable DB_STENCIL_INFO.ITERATE_FLUSH when necessary

  • radv: do not check image usage for ITERATE256 with TC-compat HTILE

  • ac,radv,radeonsi: add function to get the number of ZPLANES

  • ac,radv,radeonsi: a function that sets mutable DS surface fields

  • ac,radv,radeonsi: move ZRANGE_PRECISION to mutable DS fields

  • amd/common: add ac_gpu_info::has_etc_support

  • radv: use PIPE_FORMAT in radv_translate_tex_dataformat()

  • ac,radv,radeonsi: add a function to translate texture data format

  • amd/common: add a helper to set the third word of buffer descriptor

  • ac,radv: add a stride parameter to ac_build_attr_ring_descriptor()

  • radeonsi: use common build buffer descriptor helpers

  • radv: align DCC control settings to RadeonSI for GFX8

  • amd/common: add a function to initialize cb surface

  • radv: use the common helper for initializing CB surfaces

  • radeonsi: use the common helper for initializing CB surfaces

  • ac,radv,radeonsi: a function that sets mutable CB surface fields

  • radv: apply the workaround for no PS inputs and LDS on GFX11 only

  • radv: apply the SQ_THREAD_TRACE_WPTR workaround on GFX11 only

  • radv: fix flushing DB meta cache on GFX11.5

  • radv: only emit streamout enable for legacy streamout

  • amd/common: define SDMA v7.0 for GFX12

  • ac,radv,radeonsi: add ac_gpu_info::has_tc_compatible_htile

  • radv: update NUM_THREAD_FULL bitfields

  • radv: enable GS_FAST_LAUNCH_2 by default on GFX12

  • radv: do not emit non-existent registers on GFX12

  • radv: update configuring sample locations on GFX12

  • radv: update configuring viewport/scissor on GFX12

  • radv: update configuring PS states on GFX12

  • radv: update configuring NGG states on GFX12

  • radv: update configuring VGT states on GFX12

  • radv: update configuring DB states on GFX12

  • radv: update configuring rasterization states on GFX12

  • radv: update configuring some CB states on GFX12

  • radv: update configuring occlusion query state on GFX12

  • radv: update configuring MSAA state on GFX12

  • radv: update configuring GFX preamble on GFX12

  • radv: update configuring tess rings on GFX12

  • radv: update binning settings on GFX12

  • radv: update emitting discard rectangles on GFX12

  • radv: update shader input arguments for GS stage on GFX12

  • aco: adjust loading local invocation ID for GS on GFX12

  • radv: do not emulate clear state for shadowed regs on GFX12

  • radv: update cache flush emission on GFX12

  • radv: update emitting stipple line on GFX12

  • radv: disallow merging multiple draws into one wave on GFX12

  • radv: emit SQ_NON_EVENT packets after drawing with streamout on GFX12

  • radv: update configuring the number of patch control points on GFX12

  • radv: update configuring VGT_SHADER_STAGES_EN on GFX12

  • radv: enable GE_CNTL.DIS_PG_SIZE_ADJUST_FOR_STRIP on GFX12

  • radv: update NUM_THREAD_FULL bitfields on GFX12

  • radv: update global graphics shader pointers on GFX12

  • radv: update SDMA resource type on GFX12

  • radv: update VS input VGPRs on GFX12

  • radv: do not enable HTILE for depth/stencil storage images

  • radv: allow STORAGE for depth formats

  • radv: fix configuring the number of patch control points on GFX6

  • radv: configure DB_Z_INFO.NUM_SAMPLES on GFX12

  • radv: configure DB_RENDER_CONTROL to zero on GFX12

  • radv: do not enable MEM_ORDERED on GFX12

  • radv: update configuring the attribute ring on GFX12

  • radv: do not flush L2 metadata on GFX12

  • radv: mark all images coherent with TC L2 on GFX12

  • radv: update configuring SPI_SHADER_PGM_LO_LS on GFX12

  • radv: update configuring SPI_SHADER_PGM_LO_ES on GFX12

  • radv: update configuring SPI_SHADER_PGM_RSRC4_{HS,GS,PS} on GFX12

  • radv: update configuring GE_CNTL.PRIM_GRP_SIZE_GFX11 on GFX12

  • radv: update configuring SPI_PS_IN_CONTROL on GFX12

  • radv: configure PA_SC_HISZ_CONTROL on GFX12

  • radv: configure SPI_SHADER_GS_OUT_CONFIG_PS on GFX12

  • radv: update configuring GS_VGPR_COMP_CNT on GFX12

  • radv: do not set DX10_CLAMP on GFX12

  • radv: fix VRS subpass attachments with mipmaps

  • ac,radeonsi: set COLOR_SW_MODE for mutable CB surfaces on GFX12

  • radv: configure PA_SC_SAMPLE_PROPERTIES on GFX12

  • radv: update number of input VGPRs for VS on GFX12

  • radv: update configuring color buffers on GFX12

  • radv: update configuring depth stencil buffers on GFX12

  • radv: update configuring PA_SC_WINDOW_SCISSOR on GFX12

  • radv: do not emit SPI_SHADER_PGM_RSRC3_GS on GFX12

  • radv: fix configuring NGG registers on GFX12

  • radv: do not set VGT_PRIMITIVEID_EN.PRIMITIVEID_EN on GFX12

  • radv: cleanup radv_precompute_registers_hw_{ngg,fs}

  • radv: assert that GDS/GDS OA buffers can’t be created on GFX12

  • radv: only set valid bitfields for CB/DS surfaces address

  • radv: only emit VGT_GS_MAX_PRIMS_PER_SUBGROUP on GFX9

  • radv: only emit SQ_PERFCOUNTER_MASK on GFX7-9

  • radv: do not set VGT_SHADER_STAGES_EN.DYNAMIC_HS on GFX9

  • radv: only emit SPI_SHADER_PGM_SRC3_GS on GFX7+

  • radv: only emit CB_COLOR0_DCC_CONTROL on GFX8

  • radv: use pipe_format when building image view descriptors

  • ac,radv,radeonsi: add a function to build texture descriptors

  • amd/common: add MIN_LOD for texture descriptors on GFX12

  • Revert “radv/ci: Bring back vkcts-navi21-llvm-valve”

  • radv: update configuring depth clamp enable on GFX12

  • radv: update configuring COVERAGE_TO_SHADER_SELECT on GFX12

  • radv: fix emitting VGT_PRIMITIVEID_RESET in the GFX preamble on GFX12

  • radv: only set valid bitfields for CB/DS surfaces address on GFX12

  • radv: add a helper to get image VA

  • ac,radeonsi: import PM4 state from RadeonSI

  • ac,radeonsi: add a function to initialize compute preambles

  • radv: initialize compute preambles with the common helper

  • radv: fix creating unlinked shaders with ESO when nextStage is 0

  • radv: pass a radv_shader to radv_get_compute_pipeline_metadata()

  • radv: don’t assume that TC_ACTION_ENA invalidates L1 cache on gfx9

  • ac,radv: add a helper for SQTT control register

  • ac,radv,radeonsi: add more parameters to ac_sqtt

  • amd: allow to emit privileged config registers in PM4

  • amd: mark more registers that need RESET_FILTER_CAM in PM4

  • amd: add a common implementation for SQTT using PM4

  • radv: emit more consecutive registers for SQTT on GFX8-9

  • radv: use the common SQTT implementation

  • radv: update VGT_TESS_DISTRIBUTION.ACCUM_ISOLINE value

  • radv: do not set registers set by CLEAR_STATE in the preamble on GFX10-11.5

  • radv: emit SPI_GS_THROTTLE_CNTL1 when the attr ring is emitted

  • radv: fix incorrect buffer_list advance for multi-planar descriptors

  • radv: use BDA in the DGC prepare shader

  • radv: remove dynamic uniform/storage buffers support with DGC

  • radv: do not use nir_pkt3() when the packet len is constant with DGC

  • radv: add new macros for emitting packets in DGC

  • radv: remove redundant nir_builder param in some DGC helpers

  • radv: add a helper to load the pipeline VA for DGC

  • radv: store a pointer to the logical device in dgc_cmdbuf

  • radv: allow VK_NV_device_generated_commands_{compute} with LLVM

  • radv: always save/restore all shader objects for internal operations

  • radv: update configuring WALK_ALIGN8_PRIM_FITS_ST on GFX12

  • ac/surface: add NBC view support on GFX12

  • radv: declare a new user SGPR for the streamout state buffer on GFX12

  • radv/nir: lower nir_intrinsic_load_xfb_state_address_gfx12_amd

  • radv: implement streamout on GFX12

  • radv: force using indirect descriptor sets for indirect compute pipelines

  • radv: emit indirect sets for indirect compute pipelines with DGC

  • radv: fix emitting indirect descriptor sets in the DGC prepare shader

  • radv: cleanup getting AC_UD_TASK_RING_ENTRY for mesh shader

  • radv: use radv_shader_info::user_data_0 for task shaders

  • radv: remove dead mesh shader code for indirect draws

  • radv: remove useless masking in radv_cs_emit_indirect_mesh_draw_packet()

  • radv: remove useless draw_id to radv_emit_userdata_task()

  • radv: add the DGC preprocess BO to the cmdbuf BO list

  • radv/amdgpu: allow cs_execute_ib() to pass a VA instead of a BO

  • radv/amdgpu: use the non-IB path for dumping CS with external IBs

  • ac/parse_ib: dump PKT3_DISPATCH_{TASKMESH_GFX,TASKMESH_DIRECT_ACE}

  • radv/amdgpu: fix chaining CS with external IBs on compute queue

  • radv: add a helper to execute a DGC IB

  • radv: add support for computing the DGC ACE IB size

  • radv: prepare for DISPATCH_TASKMESH_GFX emission in the DGC shader

  • radv: prepare for DISPATCH_TASKMESH_DIRECT_ACE emission in the DGC shader

  • radv: refactor some DGC helpers in preparation for the ACE IB

  • radv: add a helper to pad DGC IB

  • radv: add support for preparing the ACE IB in DGC

  • radv: add support for executing the DGC ACE IB

  • radv: fix incorrect cache flushes before decompressing DCC on compute

  • radv: improve clarity of DGC offset computations

  • radv: pre-compute the base upload offset in radv_prepare_dgc()

  • radv: add a helper that determines if DGC uses task shaders

  • radv: split allocating and emitting push constants with DGC

  • radv: rework emitting push constants with DGC

  • radv: reserve space for push constants in the DGC ACE IB

  • radv: adjust the base upload offset when DGC uses task shaders

  • radv: emit push constant for task shaders with DGC

  • radv: disable conditional rendering with DGC and task shaders

  • radv: fix a synchronization issue with non-preprocessed DGC with task shader

  • radv: enable task shaders support with NV DGC

  • radv: suspend user conditional rendering when DGC has task shaders

  • radv: rename radv_get_user_sgpr() to radv_get_user_sgpr_info()

  • radv: add radv_get_user_sgpr{_loc}() helpers

  • radv: use radv_get_user_sgpr_loc() for the GS copy shader too

  • radv: remove unused parameter to dgc_emit_draw_mesh_tasks_ace()

  • radv: do not emit compute userdata for empty dispatches

  • radv: cleanup using vtx_base_sgpr for userdata with DGC

  • radv: use radv_dgc_with_task_shader() more

  • radv: move radv_CmdPreprocessGeneratedCommandsNV() to radv_cmd_buffer.c

  • radv: use the graphics pipeline from the DGC info

  • radv: use radv_get_user_sgpr() more in DGC

  • vulkan: Update XML and headers to 1.3.289

  • radv: advertise VK_KHR_maintenance7

  • ci: bump vkd3d-proton to 3d46c082906c77544385d10801e4c0184f0385d9

  • radv: remove unused parameter to radv_pipeline_import_retained_shaders()

  • radv: simplify importing libraries with retained shaders

  • radv: remove unused get_vs_output_info() function

  • radv: remove unnecessary radv_pipeline_has_ngg() function

  • radv: move radv_hash_shaders() to radv_graphics_pipeline.c

  • radv: simplify determining when the rasterization primitive is unknown

  • radv: simplify determining when a VS prolog is needed

  • radv: stop passing a pipeline to some graphics related helpers

  • radv: rework generating all graphics state for compiling pipelines

  • radv: remove radv_descriptor_set_layout::shader_stages

  • radv: use blake3 for hashing descriptor set layouts

  • radv: use blake3 for hashing pipeline layouts

  • radv: disable VK_EXT_sampler_filter_minmax on TAHITI and VERDE

  • ac,radeonsi: add ac_is_reduction_mode_supported()

  • radv: use ac_is_reduction_mode_supported()

  • radv: fix marking RADV_DYNAMIC_COLOR_ATTACHMENT_MAP as dirty

  • nir/gather_info: handle uses_fbfetch_output for sparse image loads

  • nir/gather_info: handle uses_fbfetch_output for texture operations

  • radv: destroy the perf counter BO in radv_device_finish_perf_counter()

  • radv: add radv_device_init_perf_counter()

  • radv: add helpers for init/deinit device memory cache

  • radv: add helpers for init/deinit RGP

  • radv: simplify keeping shader info for GPU hangs debugging

  • radv: add radv_device_init_trap_handler()

  • radv: add helpers for init/deinit device fault detection

  • radv: add radv_device_init_rmv()

  • radv: regroup all tools initialization in one helper

  • radv: use zero allocation for the device queues

  • radv/meta: remove non-valuable comments

  • radv/meta: remove unnecessary blit2d_dst_temps struct

  • radv/meta: remove redundant check for hw resolve pipelines

  • radv/meta: remove unused number of rectangles for internal operations

  • radv/meta: remove useless checks for NULL handles before destroying

  • radv/meta: add a helper to create compute pipeline

  • radv/meta: add a helper to create pipeline layout

  • radv/meta: add a helper to create descriptor set layout

  • zink/ci: skip arb_shader_image_load_store also on NAVI31/VANGOGH

  • zink/ci: remove redundant arb_shader_image_load_store skips on POLARIS10

  • radv: do not expose ImageFloat32AtomicMinMax on GFX11_5

  • radv: fix programming DB_RENDER_CONTROL for NULL depth/stencil on GFX11_5

  • radv: expose BufferFloat32AtomicMinMax on GFX11_5

  • radv: disable SPM trace on GFX11_5

  • ac/rgp: assume GFX11_5 use the same SQTT/RGP versions as GFX11

  • radv: allow to capture with RGP on GFX11_5

  • radv/meta: fix potential race condition when creating the copy VRS pipeline

  • radv/meta: rework creating the VRS copy HTILE pipeline

  • radv/meta: remove the depth resummarize operation

  • radv/meta: avoid potential NULL deref with the gfx depth decompress pipeline

  • radv/meta: move locking around the gfx depth decompress pipeline

  • radv/meta: remove unused parameter to radv_get_depth_pipeline()

  • radv/meta: rework creating the gfx depth decompress pipeline

  • radv/meta: create the compute depth decompress pipeline on-demand

  • radv/meta: cleanup creating the compute depth decompress pipeline

  • radv/meta: separate creating the fill/copy pipelines

  • radv/meta: create the fill/copy pipelines on-demand

  • radv/meta: cleanup radv_device_init_meta_blit_{color,depth,stencil}()

  • radv/meta: move the locking around creating blit pipelines

  • radv/meta: cleanup meta_emit_blit()

  • radv/meta: rework creating blit pipelines

  • radv/meta: create fmask expand layouts regardless on-demand

  • radv/meta: rework creating FMASK expand pipelines

  • radv/meta: create fmask copy layouts regardless on-demand

  • radv/meta: rework creating copy expand pipelines

  • radv/meta: fix potential race condition when creating DCC retile pipelines

  • radv/meta: fix potential memleak when creating DCC retile pipelines

  • radv/meta: rework creating DCC retile pipelines

  • radv/meta: remove useless memset when destroying DCC retile state

  • radv/meta: rework creating GFX depth/stencil resolve pipelines

  • radv/meta: rework creating GFX color resolve pipelines

  • radv/meta: rework creating compute color resolve pipelines

  • radv/meta: rework creating compute depth/stencil resolve pipelines

  • radv/meta: cleanup creating HW resolve pipelines

  • radv/meta: rework creating HW resolve pipelines

  • radv/meta: rework creating DCC decompress compute pipelines

  • radv/meta: rework creating clear HTILE mask pipeline

  • radv/meta: create clear HTILE mask pipeline on-demand when needed

  • radv/meta: create DCC comp-to-single pipelines on-demand when needed

  • radv/meta: add a helper to create itob pipelines

  • radv/meta: create itob pipelines on-demand when needed

  • radv/meta: add a helper to create btoi pipelines

  • radv/meta: create btoi pipelines on-demand when needed

  • radv/meta: add a helper to create btoi r32g32b32 pipeline

  • radv/meta: create btoi r32g32b32 pipeline on-demand when needed

  • radv/meta: update the helper that creates itoi pipelines

  • radv/meta: create itoi pipelines on-demand when needed

  • radv/meta: add a helper to create itoi r32g32b32 pipeline

  • radv/meta: create itoi r32g32b32 pipelines on-demand when needed

  • radv/meta: update the helper that creates clear pipelines

  • radv/meta: create clear pipelines on-demand when needed

  • radv/meta: add a helper to create clear r32g32b32 pipeline

  • radv/meta: create clear r32g32b32 pipelines on-demand when needed

  • radv: fix shaders cache corruption with indirect pipeline binds

  • radv/meta: stop checking that creating NIR shaders failed

  • radv/meta: remove unnecessary goto

  • radv/meta: stop creating similar pipeline layouts for depth decompress

  • radv/meta: create the layouts for blit pipelines on-demand

  • radv/meta: create the layouts for FS resolve pipelines on-demand

  • radv/meta: create the layouts for depth decompress on-demand

  • radv/meta: create the layouts for FMASK copy on-demand

  • radv/meta: create the layouts for FMASK expand on-demand

  • radv/meta: create the layouts for compute resolve on-demand

  • radv/meta: create the layouts for DCC comp-to-single clear on-demand

  • radv/meta: rework getting clear color pipelines

  • radv/meta: create the layout for clear color on-demand

  • radv/meta: rework getting depth stencil clear pipelines

  • radv/meta: create the layout for clear depth/stencil on-demand

Saroj Kumar (2):

  • mesa: Add functions to print blake3

  • mesa: replace shader_info::source_sha1

Sathishkumar S (3):

  • util/format: add planar3 y8_u8_v8_440 pipe format

  • frontends/va,gallium/vl: add support for yuv440 format

  • radeonsi/vcn: enable yuv440 jpeg decode

Sebastian Wick (1):

  • vulkan/wsi/wayland: refactor wsi_wl_swapchain_wait_for_present

Sergi Blanch Torne (21):

  • mr-label-maker: specialize CI labels

  • ci: kernel stored in a different s3 bucket

  • ci: identify and label S3 buckets

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci: fix stress counter in run’n’monitor

  • ci: disable Collabora’s farm due to maintenance

  • Uprev Piglit to cf8daaf5ba90fc9b8a0e144355026e2a14c79944

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci: disable Collabora’s farm due to runners maintenance

  • Revert “ci: disable Collabora’s farm due to runners maintenance”

  • ci: continue stress run’n’monitor

  • ci: Fix parse GitLab pipeline url

  • ci: run_n_monitor, collect and summarize

  • ci: disable Collabora’s farm due to maintenance

  • Revert “ci: disable Collabora’s farm due to maintenance”

  • ci: run_n_monitor, arguments review and unicode

  • ci: run_n_monitor, pretty duration with padding

  • ci: run_n_monitor, listing job names with a padding

  • ci: run_n_monitor, sort by name when listing jobs

  • ci: fix run_n_monitor single execution

Sil Vilerino (5):

  • d3d12: Fix static analysis issues due to bad parenthesis closing

  • nir: Mark variable as ASSERTED to fix unused variable warning treated as error

  • d3d12: Video Encode - Fix inputs for older OS support query cap

  • d3d12: Add missing case for CQP in d3d12_video_encoder_disable_rc_qualitylevels

  • Revert “d3d12: Video Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported” This reverts commit d6bb4ddc638f3ee37fbbe066c631dad80aaeb2d3. Fixes: d6bb4ddc638 (“d3d12: Video Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported”)

Simon Ser (1):

  • glapi: fix param type in TexGenxOES

Sushma Venkatesh Reddy (4):

  • drm-uapi: Sync i915_drm.h with a78313bb206e

  • anv/drirc: add option to provide low latency hint

  • anv: Fix I915_PARAM_HAS_CONTEXT_FREQ_HINT check

  • intel/clflush: Utilize clflushopt in intel_invalidate_range

Sviatoslav Peleshko (5):

  • anv: Fix descriptor sampler offsets assignment

  • anv,driconf: Add fake non device local memory WA for Total War: Warhammer 3

  • intel/brw: Actually retype integer sources of sampler message payload

  • intel/elk: Actually retype integer sources of sampler message payload

  • mesa: Fix PopAttrib not restoring states that changed on deeper stack level

Tapani Pälli (14):

  • iris: change stream uploader default size to 2MB

  • anv: skip gfx push constants alloc optimization on gfx9/11

  • iris: ForceZeroRTAIndexEnable if last geom stage does not write layer id

  • vulkan/runtime: add a subpass bit for legacy dithering

  • anv: VK_EXT_legacy_dithering support

  • docs/features: add VK_EXT_legacy_dithering

  • ci: update failures list with angle for jsl, tgl

  • anv/android: enable emulated astc for applications

  • anv: implement WA 14018283232

  • mesa: remove some conditions in mipmap code

  • isl: fix condition for enabling sampler route to lsc

  • isl/iris/anv: provide drirc toggle intel_sampler_route_to_lsc

  • anv: move some pc was to batch_emit_pipe_control_write

  • anv: fix a cmd_buffer reference in simple shader

Tatsuyuki Ishi (5):

  • radv: Remove radv_queue::device again

  • vk_entrypoints_gen: Add missing ATTR_WEAK for instance and physdev entrypoints

  • vk_entrypoints_gen: Rework ATTR_WEAK to unify Unix and MinGW

  • vk_entrypoints_gen: Apply hidden visibility to generated symbols

  • vk_cmd_queue_gen: Exclude CmdDispatchGraphAMDX

Thomas H.P. Andersen (2):

  • nvk: advertise EXT_depth_range_unrestricted

  • nvk/upload_queue: fix the _fill method

Tim Huang (2):

  • amd: add GFX v11.5.2 support

  • amd/vpelib: support VPE IP v6.1.3

Timothy Arceri (36):

  • glsl: wrap nir_opt_loop in NIR_PASS()

  • glsl: use hash table when serializing resource data

  • glsl: move geom input array sizing to nir linker

  • lima: drop unrequired opt from standalone compiler

  • glsl: remove unused detect_recursion_linked()

  • lima: remove the standalone compiler

  • glsl: add support for glsl es 310/320 to standalone compiler

  • nir: clarify and update loop conditional instruction

  • nir: more aggressively remove in loop during partial unroll

  • nir: support more loop unrolling for logical operators

  • nir: add merge loop terminators optimisation

  • nir: add test for opt_loop_merge_terminators

  • nir: correctly track current loop in nir_opt_loop()

  • nir: test opt_loop_merge_terminators() skips unhandled loops

  • nir: add additional opt_loop_merge() test of deref handling

  • glsl: drop dump-builder support from standalone compiler

  • glsl: remove Par-linking from the standalone linker

  • glsl: remove do_function_inlining()

  • glsl: make glsl_to_nir() more generic

  • glsl: remove unused symbol table functionality

  • glsl: remove out of date TODO

  • glsl: move call to create explicit ifc layout out of glsl_to_nir

  • glsl: drop glsl ir optimisation from the standalone compiler

  • glsl: make warning tests pass linking

  • glsl/mesa: remove UniformHash field

  • glsl/standalone: init EmptyUniformLocations

  • glsl/tests: fix test_gl_lower_mediump

  • mesa: remove _mesa_get_log_file() wrapper

  • util/mesa: move mesa/main log code to util

  • mesa: add unreachable to _mesa_shader_stage_to_subroutine_prefix()

  • glsl: set how_declared to hidden for compiler temps

  • glsl: fix cross validate globals

  • glsl: remove out of date comment

  • nir: set disallow_undef_to_nan for legacy ARB asm programs

  • glsl: fix glsl to nir support for lower precision builtins

  • glsl: always copy bindless sampler packing constructors to a temp

Timur Kristóf (25):

  • ac/nir/esgs: Slightly refactor emitting IO loads and stores.

  • ac/nir/tess: Slightly refactor emitting LS outputs.

  • ac/nir: Add helper macros for emitting IO code.

  • ac/nir/esgs: Implement packed 16-bit ES->GS I/O using helper macros.

  • ac/nir/tess: Implement packed 16-bit LS->HS I/O using helper macros.

  • ac/nir/tess: Implement packed 16-bit HS->TES I/O using helper macros.

  • aco: Add missing nir_builder include.

  • ac/nir: Move some helpers to new file.

  • ac/nir: Add helper for pre-rasterization output info.

  • ac/nir/ngg: Use new pre-rasterization output info helper.

  • ac/nir/legacy: Use new pre-rasterization output info helper.

  • nir: Add nir_opt_load_store_update_alignments.

  • radv: Add TES num_linked_patch_inputs.

  • radv: Add shader stats for inputs and outputs.

  • radv: Fix TCS -> TES I/O linking typo of VARYING_SLOT vs. BIT.

  • nir/opt_varyings: Print FS VEC4 type when debugging relocate_slot.

  • nir/opt_varyings: Don’t promote flat inputs when moving post-dominator.

  • ac/nir/tess: Adjust TCS->TES output mapping for linked shaders.

  • radv: Properly link TCS->TES IO again.

  • nir/lower_io: Add option to implement mediump as 32-bit.

  • radv: Ignore mediump IO flag.

  • ac/nir/tess: Only write tess factors that the TES reads.

  • ac/nir/tess: Fix per-patch output LDS mapping.

  • ac/nir/tess: Fix per-patch output VRAM mapping.

  • radv: Use number of TES inputs for TCS-TES linking.

Tomeu Vizoso (2):

  • etnaviv/nn: Make parallel jobs disabled by default

  • etnaviv: handle missing alu conversion opcodes

Turo Lamminen (1):

  • radv: Optimize memcpy in write_image_descriptor

Tvrtko Ursulin (1):

  • intel/hang_replay: fix batch address

Valentine Burley (40):

  • docs: Update VK_EXT_legacy_vertex_attributes entries

  • tu: Add missing VK_EXT_legacy_vertex_attributes feature

  • tu: Change commas to semicolons in VK_EXT_map_memory_placed features

  • drm-shim: Stub syncobj reset ioctl

  • tu: Expose VK_EXT_nested_command_buffer

  • freedreno/devices: Fix indentation for Adreno A32

  • freedreno/ci: Update expectations

  • wsi: Guard DRM-dependent function implementations with HAVE_LIBDRM

  • tu: Add support for VkBindMemoryStatusKHR

  • tu: Add support for NULL index buffer

  • tu: Add support for version 2 of all descriptor binding commands

  • tu: Advertise VK_KHR_maintenance6

  • tu: Move event related code to tu_event.cc/h

  • tu: Handle all dependencies of CmdWaitEvents2

  • mr-label-maker: Update nouveau directories

  • mr-label-maker: Separate freedreno and turnip labels

  • tu: Handle the new sync2 flags

  • tu: Remove declaration of unused update_stencil_mask function

  • tu: Switch to vk_ycbcr_conversion

  • tu: Use vk_sampler

  • tu: Use device->vk.enabled_features instead of iterating twice

  • tu: Move sampler related code to tu_sampler.cc/h

  • tu: Drop tu_init_sampler helper function

  • tu: Advertise VK_KHR_shader_float_controls2

  • tu: Use the common version of vkGetBufferMemoryRequirements2

  • tu: Move buffer related code to tu_buffer.cc/h

  • tu: Use the common version of vkQueueBindSparse

  • tu: Use vk_buffer_view

  • tu: Drop tu_buffer_view_init helper function

  • tu: Move buffer view related code to tu_buffer_view.cc/h

  • tu: Rename tu_query.cc/h to tu_query_pool.cc/h

  • tu: Use the common versions of vkBegin/EndQuery()

  • tu: Use vk_query_pool

  • tu: Don’t disable 2 10-bit formats

  • freedreno,tu,ir3: Move threadsize_base and max_waves to fd_dev_info

  • freedreno/ci: Use the common a6xx-skips on a750

  • tu: Enable VK_KHR_shader_subgroup_uniform_control_flow

  • tu/kgsl: Remove unused variable

  • vulkan/wsi: Refactor can_present_on_device

  • tu: Always report that we can present on kgsl

Vignesh Raman (3):

  • virtio/ci: separate hidden jobs to -inc.yml files

  • ci: add farm variable for devices in collabora farm

  • ci/lava: add farm in structured log files

Vinson Lee (2):

  • panvk: Remove duplicate variable src_idx

  • panvk: Fix assert

Vlad Schiller (2):

  • pvr: Handle VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO

  • pvr: Handle VK_STRUCTURE_TYPE_IMAGE_FORMAT_LIST_CREATE_INFO

WANG Xuerui (2):

  • meson: Force use of LLVM ORCJIT for hosts without MCJIT support

  • meson: Additionally probe -mtls-dialect=desc for TLSDESC support

Weifeng Liu (1):

  • anv/android: Query gralloc for tiling mode

X512 (2):

  • egl/haiku: fix double free of BBitmap

  • egl/haiku: fix synchronization problems, add missing header

Yiwei Zhang (31):

  • venus: avoid client allocators for ring internals

  • venus: silence a stack array false alarm

  • venus: workaround excessive dma-buf import failure on turnip

  • venus: fix to destroy all pipeline handles on early error paths

  • meson: disallow Venus debug + LTO build via GCC

  • turnip: msm: clean up iova on error path

  • turnip: msm: fix racy gem close for re-imported dma-buf

  • venus: drop the workaround for excessive dma-buf import oom on turnip

  • turnip: virtio: fix error path in virtio_bo_init

  • turnip: virtio: fix iova leak upon found already imported dmabuf

  • turnip: virtio: fix racy gem close for re-imported dma-buf

  • vulkan: cast to avoid -Wswitch for Android struct beyond VkStructureType

  • venus: directly use vk drm and pci props in renderer info

  • venus: move custom props fill from GPDP2 to props init

  • venus: move props sanitization to a separate helper

  • venus: define VN_SET_VK_PROPS(_EXT) to simplify vk props init

  • vulkan: drop redundant core props query and copy helpers

  • venus: drop internal memory pools

  • venus: allow non-wsi image alias path to passthrough upon bind memory

  • ci/venus: skip a timeout test

  • anv: use os_get_option instead of getenv

  • venus: defer qfb buffer init upon query being used

  • venus: refactor vn_android_image_from_anb

  • venus: refactor to add vn_android_image_from_anb_internal

  • venus: support VK_ANDROID_NATIVE_BUFFER_SPEC_VERSION 8

  • vulkan: properly ignore unsupported feature structs

  • venus: tentative fix for test flakiness from invalid ring wait

  • venus: simplify cached mem type emulation

  • venus: clarify wsi image ownership

  • venus: fix a race condition between gem close and gem handle tracking

  • Revert “meson: disallow Venus debug + LTO build via GCC”

Yogesh Mohan Marimuthu (4):

  • radeonsi: remove si_query_hw_ops table and call func directly

  • radeonsi: use results_end instead of unprepared to init query buffer

  • radeonsi: rename query_hw_ops to hw_query_ops to match sw

  • radeonsi: add more comments in si_query.c

Yonggang Luo (2):

  • util: Rename DETECT_OS_UNIX to DETECT_OS_POSIX

  • gallivm: add lp_context_ref for combine usage of LLVMContextSetOpaquePointers

Yukari Chiba (7):

  • llvmpipe: add gallivm_add_global_mapping

  • llvmpipe: make unnamed global have internal linkage

  • util: detect RISC-V architecture

  • gallivm: add riscv support to the mattrs setting code

  • llvmpipe: add function name to gallivm_jit_function

  • llvmpipe/tests: add a new test for multiple symbols for orc jit testing

  • llvmpipe: add an implementation with llvm orcjit

Yusuf Khan (7):

  • nouveau: Fix crash when destination or source screen fences are null

  • nouveau/headers: Make nvk_cl**** turn to nv_push_cl****

  • nvk: remove NVK_MME_COPY_QUERIES

  • zink/query: begin time elapsed queries even if we aren’t in a rp

  • nvc0/vbo: wrap draw_vbo for multidraw performance

  • nv50/vbo: wrap draw_vbo to avoid overhead from multidraw

  • aux/draw: Use the draw info we get passed in instead of our own

Zach Battleman (2):

  • intel/brw: update comment to accurately reflect intended behavior

  • intel/brw: update Wa_1805992985 to use workarounds mechanism

Zack Middleton (2):

  • gles1: fix GL_OES_vertex_array_object

  • gles1: fix glBufferSubData()

Zan Dobersek (14):

  • fdperf: use snprintf instead of asprintf

  • fdperf: select_counter() should work with a countable value

  • fdperf: prettify logic around the reserved CP counter

  • fdperf: improve reads of counter values

  • fdperf: simplify counter value output

  • freedreno: add a7xx perfcounter support

  • tu: fix ZPASS_DONE interference between occlusion queries and autotuner

  • tu: avoid memory polling in occlusion query endings using ZPASS_DONE

  • tu: use either the 16-bit or 32-bit descriptor

  • ir3_nir_opt_preamble: handle 8-bit preamble loads and stores

  • ir3: rework TYPE_S8 as TYPE_U8_32

  • tu: support KHR_8bit_storage

  • tu: add format feature flag checks for VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT

  • freedreno/drm: add mesautil dependency

bbhtt (1):

  • nvk: Clean up unused header from libdrm_nouveau

chiachih (9):

  • amd/vpelib: Resolve mismatch with shader

  • amd/vpelib: Remove linear_0_125 TF

  • amd/vpelib: Remove gamma cached table

  • amd/vpelib: Remove support for non-linear FP16

  • amd/vpelib: adding blend gamma bypass

  • amd/vpelib: Remove checks for pitch alignment

  • amd/vpelib: Fix Color Adjustment Failing Test Cases

  • amd/vpelib: Fix blndgam bypass flag assignment

  • amd/vpelib: Bypass de/regam on HLG

msizanoen (1):

  • egl/wayland: Fix direct scanout with EGL_EXT_present_opaque

nyanmisaka (1):

  • frontends/va: add support for A2RGB10/X2RGB10/A2BGR10/X2BGR10

tarsin (4):

  • turnip: Change tu_image to use common initialization helpers

  • turnip: Convert tu_device_memory to use vk_device_memory

  • turnip: Split tu_image_init to use layout setting logic separately

  • turnip: Support AHardwareBuffer