Mesa 21.3.0 Release Notes / 2021-11-17

Mesa 21.3.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 21.3.1.

Mesa 21.3.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
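
Because the reported version depends on the driver and the context type, an application may want to check the GL_VERSION string at run time rather than assume 4.6. The following is a minimal sketch of that check, assuming the string has already been obtained from glGetString(GL_VERSION) in a live context. The helper names parse_gl_version and meets_requirement are hypothetical, and the sample string is only illustrative of what a driver might report.

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) from a GL_VERSION-style string.

    Desktop GL strings begin with "<major>.<minor>", while OpenGL ES
    strings begin with "OpenGL ES <major>.<minor>", so the first
    "digits.digits" pair in the string is taken as the version.
    """
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"no version number in {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_requirement(version_string, required):
    """True if the reported version is at least `required` (major, minor)."""
    return parse_gl_version(version_string) >= required


# Illustrative sample only; a real string comes from glGetString(GL_VERSION).
sample = "4.5 (Compatibility Profile) Mesa 21.3.0"
print(parse_gl_version(sample), meets_requirement(sample, (4, 6)))
```

Comparing tuples keeps the check correct across major versions, which a plain string comparison would not.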

Mesa 21.3.0 implements the Vulkan 1.2 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
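
The apiVersion field is a packed 32-bit integer, so it has to be decoded before it can be compared against 1.2. The sketch below mirrors the bit layout used by the Vulkan VK_API_VERSION_MAJOR, VK_API_VERSION_MINOR and VK_API_VERSION_PATCH macros (7 bits of major, 10 bits of minor, 12 bits of patch). The function names are hypothetical and the value is assumed to have been read from VkPhysicalDeviceProperties elsewhere.

```python
def make_api_version(major, minor, patch):
    """Pack a version the way Vulkan does (variant bits left at zero)."""
    return (major << 22) | (minor << 12) | patch


def decode_api_version(api_version):
    """Split a packed Vulkan apiVersion into (major, minor, patch)."""
    major = (api_version >> 22) & 0x7F    # bits 22..28
    minor = (api_version >> 12) & 0x3FF   # bits 12..21
    patch = api_version & 0xFFF           # bits 0..11
    return major, minor, patch


def supports_vulkan_1_2(api_version):
    """True if the driver reports at least Vulkan 1.2."""
    major, minor, _ = decode_api_version(api_version)
    return (major, minor) >= (1, 2)
```

A driver that reports a value decoding to (1, 1, x) would fail supports_vulkan_1_2 even though this Mesa release implements the 1.2 API.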

SHA256 checksum

a2753c09deef0ba14d35ae8a2ceff3fe5cd13698928c7bb62c2ec8736eb09ce1  mesa-21.3.0.tar.xz
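
One way to check a downloaded tarball against the checksum above is sketched here using Python's standard hashlib module. The helper names sha256_of_file and verify are hypothetical, and the file is assumed to be in the current directory under the name given in the checksum line.

```python
import hashlib

# Digest copied from the checksum line above.
EXPECTED = "a2753c09deef0ba14d35ae8a2ceff3fe5cd13698928c7bb62c2ec8736eb09ce1"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED):
    """True if the file's digest matches the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Calling verify("mesa-21.3.0.tar.xz") returns True only when the digest matches. Reading in chunks avoids loading the whole archive into memory.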

New features

  • VK_EXT_color_write_enable on lavapipe

  • GL_ARB_texture_filter_anisotropic in llvmpipe

  • Anisotropic texture filtering in lavapipe

  • VK_EXT_shader_atomic_float2 on Intel and RADV.

  • VK_EXT_vertex_input_dynamic_state on RADV.

  • VK_KHR_timeline_semaphore on lavapipe

  • VK_EXT_external_memory_host on lavapipe

  • GL_AMD_pinned_memory on llvmpipe

  • GL 4.5 compatibility on llvmpipe

  • VK_EXT_primitive_topology_list_restart on RADV and lavapipe.

  • ES 3.2 on zink

  • VK_KHR_depth_stencil_resolve on lavapipe

  • VK_KHR_shader_integer_dot_product on RADV.

  • OpenGL FP16 support on llvmpipe

  • VK_KHR_shader_float16_int8 on lavapipe

  • VK_KHR_shader_subgroup_extended_types on lavapipe

  • VK_KHR_spirv_1_4 on lavapipe

  • Experimental raytracing support on RADV

  • VK_KHR_synchronization2 on Intel

  • NGG shader based culling is now enabled by default on GFX10.3 on RADV.

  • VK_KHR_maintenance4 on RADV

  • VK_KHR_format_feature_flags2 on RADV.

  • EGL_EXT_present_opaque on wayland

Bug fixes

  • RADV/ACO: Monster Hunter Rise Demo renders wrong results

  • radv: Odd lack of implicit host memory invalidation

  • Regression/Bisected: Crash in Godot games when steam overlay enabled

  • RADV: IsHelperInvocationEXT query is not considered volatile in ACO

  • ANV: error submitting with the same semaphore for wait and signal - regression?

  • [TGL+] anv: some dEQP-VK.drm_format_modifiers.* fails

  • Mesa 21.3rc3 - compile failure

  • iris: subslice assertion failure on some hardware during initialization

  • Final Fantasy V (Old version): Random text characters are not displayed

  • Diagonal rendering artifacts in Tomb Raider

  • dota2 black squares corruption

  • [hsw][bisected][regression] dEQP-VK.reconvergence.*nesting* failures

  • anv: dEQP-VK.wsi.wayland.<various> failures

  • radv_android.c: build errors due to commit 49c3a88

  • dEQP-EGL.functional.sharing.gles2.multithread.* regression with Iris

  • [radeonsi] Euro Truck Simulator 2: broken minimap

  • [regression][bisected] Launching Valheim OpenGL game leads to GPU Hang

  • Android Meson build regression: hardware/system information apps crash on Raspberry Pi 4

  • radv: format properties are broken with modifiers

  • anv: dEQP-VK.graphicsfuzz.cov-multiple-one-iteration-loops-global-counter-write-matrices fails

  • iris: CCS modifier tests failing with suballocation

  • [RADV] The game “World War Z: Aftermath” (Vulkan API) should use the RADV_DEBUG=invariantgeom parameter

  • RADV: Resident Evil Village needs invariantgeom when NGG culling is enabled

  • radv: VK_EXT_vertex_input_dynamic_state

  • anv: dynamic state emission is busted

  • radeonsi: out of bounds access/compiler warning

  • RADV: Rendering issues in Resident Evil 2 with NGGC

  • GPU Hang/reset/forced reboot - latest mesa - mesa-demos/gloss

  • crocus: Incorrect stride when used through prime

  • radv: Vulkan games and demo apps are broken since “use DCC compatible with image stores for < 4K resolutions”

  • anv: descriptorBindingUniformBufferUpdateAfterBind feature is not supported

  • Cheza board reboots into another image on retry

  • freedreno: several regressions in org.skia.skqp.SkQPRunner

  • android: radv_android.c building errors after commits 9fc16b6 and 48cae11

  • iris: Implement memory sub-allocation

  • Assault Android Cactus (STEAM AppID 250110) - Black triangles on Main menu character

  • World War Z - Renders red if FSR is enabled

  • Significant performance drop on Radeon HD 8400

  • turnip/a650: most VK_EXT_filter_cubic tests in dEQP-VK.texture.filtering.* fail

  • Ender Lilies: Turnip: Fails to render in-game

  • [nir][radv] Out of range shift when compiling Resident Evil Village shaders

  • GL_EXT_disjoint_timer_query glGetInteger64v GL_TIMESTAMP failing with GL_INVALID_ENUM

  • Valgrind errors in VBO display list code since vertex store rework

  • Issue with Turnip compilation on Oneplus 8

  • freedreno: primtype_mask

  • [radv] bufferImageGranularity is 64

  • ../mesa-9999/src/amd/llvm/ac_llvm_helper.cpp:63:14: error: ‘class llvm::AttributeList’ has no member named ‘hasAttribute’; did you mean ‘getAttributes’?

  • GPU Reset POLARIS with Unigine Heaven and X4

  • RADV: consistent crash in Splitgate

  • llvmpipe doesn’t compile a shader with an inner scope in a for loop

  • llvmpipe doesn’t compile the increment of a for loop

  • Mesa 21.2.1 implementation error: unexpected state[0] in make_state_flags()

  • freedreno: regression in org.skia.skqp.SkQPRunner#gles_localmatriximagefilter

  • [Radeonsi] VA-API Encoding no longer works on AMD PITCAIRN

  • turnip: Geometry flickering in Genshin Impact after 83e9a7fbcf53b90d0de66985dbbf91986fc7b05d

  • i915g: Need to link fail on non-unrolled loops

  • spirv2dxil.c:128:22: error: passing argument 7 of ‘spirv_to_dxil’ from incompatible pointer type [-Werror=incompatible-pointer-types]

  • OSMesa problem resizing

  • iris: Perform busy tracking for resources without GEM_BUSY/GEM_WAIT

  • [RADV] The game “Aliens: Fireteam Elite” start crashing after commit 2e56e2342094e8ec90afa5265b1c43503f662939

  • radeonsi: Smart Access Memory not being enabled by default?

  • Memory leak: si_get_shader_binary_size is missing a call to ac_rtld_close

  • dEQP-GLES3.stress.draw.unaligned_data.random.4 segfault

  • gl_DrawID is incorrect for glMultiDrawElementsBaseVertex/glMultiDrawElementsIndirect

  • iris: Scanout buffers now mapped WB cause glitches on screen

  • turnip: dEQP-VK.spirv_assembly.instruction.graphics.spirv_ids_abuse.lots_ids_* fails

  • i915g: nir_to_tgsi: Error : CONST[0]: The same register declared more than once

  • i915: GPU hang when doing FB fetch and gl_FragDepth write in one shader

  • ../mesa-9999/src/amd/compiler/aco_instruction_selection.cpp:10009:30: error: ‘exchange’ is not a member of ‘std’

  • radv: disable DCC for displayable images with storage on navi12/14

  • RADV: Menu static/artifacts in Doom Eternal

  • Crash happens when testing GL_PIXEL_PACK_BUFFER

  • Possible miscompilation of an integer division with vulkan

  • panfrost G31 - Cathedral crash- opengl 2.1 game (I guess)

  • freedreno C++14 build error

  • panfrost / armv7 - crash with mesa newer than 21.0.3

  • iris: recursive mutex acquire when re-using BO with aux map

  • llvmpipe doesn’t compile a valid shader with a useless switch

  • i915g: dEQP-GLES2.functional.fbo.completeness.renderable.texture.color0.rgb10_a2 failure

  • i915g: polygon offset CTS failures

  • GetFragDataLocation(prog, “gl_FragColor”) generates INVALID_OPERATION, but specs don’t say it should

  • anv: VK_EXT_memory_budget doesn’t know about device local memory

  • turnip: dEQP-VK.api.version_check.entry_points regression

  • Possible miscompilation of a comparison with unsigned zero

  • i915g: FXT1 support

  • dEQP-VK.wsi.android.swapchain.create#image_swapchain_create_info crash on Android R

  • Nine Regression with util: Switch the non-block formats to unpacking rgba rows instead of rects.

  • Add an Intel NDK Android build job

  • android: anv building error after commit e08370d

  • panfrost G31 Unreal Tournament - various glitches (apitrace)

  • Miscompilation of a switch case

  • ci/virgl: “dEQP error: waiting got error - 16, slow gpu or hang?” flakes

  • [radeonsi][regression] CPU is being used ~10 times more than usual after c5478f9067f.

  • i915g: cos/sin accuracy

  • glGetTexImage with PBO is not accelerated on Gallium

  • radeonsi: bad performance on PBO packs

  • [kbl] GPU hang launching UE4Editor (unreal engine)

  • turnip: A few dEQP-VK.pipeline.framebuffer_attachment.* tests failing due to “FINISHME: unaligned store of msaa attachment”

  • ci: new freedreno trace job running for lavapipe

  • i915g: Emit TXP

  • The image is distorted when rendering on the iGPU (Intel GPU) and outputting via the dGPU (AMD GPU)

  • Radeon 5700XT: Small render glitches around “heat balls” in dhewm3 (Doom 3)

  • lima: regression in plbu scissors cmd

  • freedreno: regression in org.skia.skqp.SkQPRunner#gles_multipicturedraw_*_tiled

  • Incorrect rendering

  • intel/isl: Wrong surface format name in batch

  • Unused graph areas created for device and format in VK_LAYER_MESA_overlay

  • [RADV] FSR in Resident Evil: Village looks very pixelated on Polaris

  • iris: regression in yuzu

  • 21.2.0rc1 Build Failure - GCC6.3

  • Crash in update_buffers after closing KDE “splash screen” downloader

  • Firefox (wayland) crash in wayland_platform

  • radeonsi: persistent, read-only buffer maps are slow to read

  • substance painter flickering with jagged texture and masks shown black

  • radv: FP16 mode in FidelityFX FSR doesn’t look right

  • Regression, ACO: DOOM Eternal hangs with ACO

  • Regression in Turnip with KGSL and Zink running opengl in proot

  • [bsw][i965][bisected][regression] waffle crashing after patch

  • Validation crash on wlroots after wl_shm appeared

  • [RADV] Blocky corruption in Scarlet Nexus and vkd3d-proton 2.4

Changes

Adam Jackson (18):

  • glx/drisw: Nerf PutImage when loaderPrivate == NULL

  • mesa: (correctly) flush more in _mesa_make_current

  • egl/dri2: Stop disabling pbuffer support on msaa configs

  • dri: Reformat DRI context attribute #defines

  • glx: Fix and simplify the share context compatibility check

  • glx: Store the context vtable on the glx screen

  • glx/dri2: Require the driver to support v4 of __DRI_DRI2

  • glx/drisw: Remove some misplaced error checks

  • glx/dri: Collect the GLX context attributes in a struct

  • glx: Simplify context API profile computation

  • glx: Remove some unused declarations from glxclient.h

  • glx: Move __glFreeAttributeState next to its one caller

  • glx: Clarify a debug message

  • glx: Don’t strip off window/pixmap support from float fbconfigs

  • wsi/x11: Fix a misunderstanding about how xcb_get_geometry works

  • wsi/x11: Fetch and discard the SYNC extension info

  • dri: Remove the allow_fp16_configs option, always allow them

  • egl/dri: Enable FP16 for EGL_EXT_platform_device

Adrian Bunk (1):

  • util/format: NEON is not available with the soft-float ABI

Alejandro Piñeiro (12):

  • broadcom: don’t define internal BPP values twice

  • vulkan: add vk_spec_info_to_nir_spirv util method

  • spirv: set medium precision with RelaxedPrecision decorator

  • broadcom/qpu: update/remove comments

  • broadcom/qpu: add new lookup opcode description helper

  • broadcom/qpu: use and expand version info at opcode description

  • broadcom/compiler: remove commented out vir_LOAD_IMM methods

  • broadcom/compiler: remove qpu_acc helper

  • broadcom/common: remove unused debug helper

  • v3d/v3dv: add unlikely for any V3D_DEBUG check

  • v3dv: use NULL for vk_error on initialization failures

  • v3dv/pipeline: don’t clone the nir shader at pipeline_state_create_binning

Alyssa Rosenzweig (243):

  • panfrost: Add perf_debug macros

  • panfrost: Warn on software conditional rendering

  • panfrost: Warn on going out of AFBC

  • panfrost: Log reasons for flushes

  • panfrost: Warn on get_fresh_batch_for_fbo

  • panfrost: Warn on get_fresh_batch

  • panfrost: Warn on transitions to linear

  • pan/bi: Copy liveness routines back

  • pan/bi: Copy back add_successor

  • pan/bi: Copy back bi_foreach_successor

  • pan/bi: Copy block bi_block

  • pan/bi: Clean up useless casts

  • pan/bi: Clean up liveness freeing

  • pan/bi: Shrink live array to 8-bits

  • meson: Build panfrost with tools=panfrost

  • panfrost: Remove unnecessary bifrost_compiler deps

  • panfrost: Only build libpanfrost with GL/VK

  • pan/bi: Add explicit cast for lod_or_mode

  • pan/bi: Remove duplicate NIR compiler options

  • pan/bi: Mark mod to string as maybe unused

  • panfrost,panvk: Remove broken v4 spilling code

  • targets/graw-xlib: Add missing dep_x11

  • pan/mdg: Garbage collect silly quirk

  • panfrost: Move context initialization to the vtable

  • panfrost: Make sampler view creation private

  • panfrost: Move sysval analysis out of per-gen

  • panfrost: Compile pan_cmdstream per-gen

  • panfrost: Statically determine uses_clamp

  • panfrost: Don’t make get_index_buffer_bounded per-gen

  • panfrost: Match sampler “nearest” names

  • panfrost: Share sampler code across archs

  • panfrost: Share blend code across architectures

  • panfrost: #ifdef pan_merge_empty_fs

  • panfrost: #ifdef fragment RSD packing

  • panfrost: Add a concatenation macro for genxml

  • panfrost: Use PAN_ARCH for the rest of pan_cmdstream

  • panfrost: Move init_batch to GenXML vtbl

  • panfrost: Make panfrost_batch_get_bifrost_tiler per-gen

  • panvk: Fix sampler filter modes on Bifrost

  • asahi: Identify texture address field

  • asahi: Fix sampler filtering flag

  • asahi: Identify texture dimension field

  • asahi: Set texture dimension field

  • asahi: Calculate cube map stride

  • asahi: Calculate resource offsets for cube maps

  • asahi: Implement cube map tiling transfers

  • asahi: Use agx_rsrc_offset for linear transfer_map

  • asahi: Allow tiled cube maps

  • asahi: Simplify can_tile type signature

  • asahi: Require tiling for cube maps

  • asahi: Assert texture layer is nonzero

  • agx: Don’t set helper invocation kill bit

  • agx: Fix mismatched units in load_ubo

  • agx: Dump register file when failing to allocate

  • agx: Use consistent ncomps

  • agx: Plug memory leak in register allocator

  • asahi: Enable instancing

  • agx: Drop dated /* TODO: RA */

  • agx: Handle load_instance_id

  • agx: Add agx_ushr helper

  • agx: Add udiv-by-constant routine

  • agx: Include divisors in the vertex shader key

  • agx: Implement instanced arrays

  • agx: Define p_extract for type converts

  • asahi: Pass instance_divisor to the compiler

  • agx: Add agx_format_shift routine

  • agx: Shift vertex buffer stride in the compiler

  • asahi: Add integers to agx_vertex_formats

  • asahi: Generalize src_offset for non-4byte formats

  • pan/va: Add initial ISA.xml for Valhall

  • pan/va: Add ISA.xml parser and support code

  • pan/va: Assert no instructions are duplicated

  • pan/va: Add Valhall assembler

  • pan/va: Check for FAU conflicts in the assembler

  • pan/va: Add disassembler generator

  • pan/va: Add dis/assembler test cases

  • pan/va: Add negative test cases for the assembler

  • pan/va: Add assembler test harness

  • pan/va: Add disassembler test harness

  • pan/va: Integrate the tests into meson test

  • pan/bi: Remove unused pointer from bi_instr

  • pan/bi: Remove unused option

  • pan/bi: Parse file names in standalone compiler

  • pan/bi: Zero initialize shader_info

  • pan/bi: Do more mesa/st stuff in standalone compiler

  • pan/bi: Add quirks for Mali G78

  • pan/bi: Only call clause code on Bifrost

  • pan/bi: Output binaries from standalone compiler

  • pan/bi: Add helpers for unit testing

  • pan/bi: Add instruction equality helper

  • pan/bi: Add instruction unit test macro

  • pan/bi: Remove redundant check in clamp fusing

  • pan/bi: Constify BIR manipulation

  • pan/bi: DCE after bifrost_nir_lower_algebraic_late

  • pan/bi: Add discard flag to bi_index

  • pan/bi: Remove unused BIR_FAU_HI

  • pan/bi: Model *ADD_IMM instructions in IR

  • pan/bi: Model RSCALE for Valhall

  • pan/bi: Model Valhall special values as FAU

  • pan/bi: Fix typo in FAU enum

  • pan/bi: Rename NOP.i32 to NOP

  • pan/bi: Rename CLPER_V7 back to CLPER

  • pan/bi: Add strip_index helper

  • pan/bi: Add helper to swizzle a constant

  • pan/bi: Use bi_apply_swizzle in constant folding

  • pan/bi: Refactor constant folding for testability

  • pan/bi: Add constant folding unit test

  • pan/bi: Fix UBO push with nir_opt_shrink_vectors

  • pan/bi: Garbage collect stuff in bi_layout.c

  • pan/bi: Add branch_offset immediate

  • pan/bi: Clean up and export bi_reconverge_branches

  • pan/bi: Clarify the logic of bi_reconverge_branches

  • pan/bi: Align staging registers on Valhall

  • pan/va: Allow floating-point swizzles on ATEST

  • gallium/tests: Fix warning calculating absdiff

  • pan/bi: Inline away bi_must_last

  • pan/bi: Remove dated ASSERTED properties

  • pan/bi: Expose unit tested scheduler predicates

  • pan/bi: Add BIT_ASSERT helper for unit testing

  • pan/bi: Teach meson about scheduler predicate test

  • pan/bi: Teach meson about Bifrost packing test

  • pan/bi: Teach meson about format pack tests

  • glsl/standalone: Lower COMPUTE shader precision

  • pan/bi: Restrict swizzles on same cycle temporaries

  • pan/bi: Test restrictions on same-cycle temporaries

  • pan/bi: Remove incorrect errata workaround

  • pan/bi: Use getopt for bifrost_compiler

  • pan/bi: Lower fragment output with <4 components

  • pan/bi: Add bi_entry_block helper

  • pan/bi: Handle asymmetric staging in bi_count_read_registers

  • pan/bi: Stub 64-bit in count_write_registers

  • pan/bi: Validate the live set starts empty

  • nir/lower_mediump_io: Don’t remap base unless needed

  • nir/lower_mediump: Fix metadata in all passes

  • pan/bi: Make bi_opt_push_ubo optional

  • pan/bi: Add a noopt debug option

  • panfrost: Add LINEAR debug option

  • panfrost: Remove unused #defines

  • panfrost: Use _PU for non-dithered formats

  • panfrost: Add blend helper packing the equation

  • panfrost: Fix is_opaque when blend_enable=false

  • panfrost: Simplify blend_factor_constant_mask

  • panfrost: Add basic fixed-function blending tests

  • panfrost: Leverage Bifrost’s 2*src blend factor

  • panfrost: Test src*dst + dst*src blending

  • pan/va: Document IEEE 754 conformance of clamps

  • pan/bi: Constant fold texturing lowerings

  • pan/bi: Unit test new constant folding patterns

  • pan/bi: Simplify bi_compose_clamp

  • pan/bi: Fuse abs/neg more on Valhall

  • pan/bi: Add shader equality helper for unit tests

  • pan/bi: Use FABSNEG pseudo ops for modifier prop

  • pan/bi: Add optimizer unit tests

  • pan/bi: Use FCLAMP pseudo op for clamp prop

  • pan/bi: Add fclamp unit tests

  • pan/bi: Fuse DISCARD with conditions

  • pan/bi: Unit test DISCARD+FCMP fusing

  • docs/panfrost: Update llvm option

  • drm-shim: Support kernels with >4k pages

  • panfrost: Fix leak of render node fd

  • panfrost: Rewrite the clear colour packing code

  • panvk: Use pan_pack_color

  • panfrost: Mark R5G6B5 as blendable

  • panfrost: Unit test clear colour packing

  • panfrost: Add dither state to the clear colour tests

  • panfrost: Handle non-dithered clear colours

  • panfrost: Add unit tests for non-dithered clears

  • panfrost: Disable shader-assisted indirect draws

  • pan/bi: Set eldest_colour dependency for ST_TILE

  • pan/bi: Don’t set td in blend shaders

  • pan/bi: Correct the sr_count on +ST_TILE

  • pan/bi: Extract load_sample_id to a helper

  • pan/bi: Set the sample ID for blend shader LD_TILE

  • panfrost: Evaluate blend shaders per-sample

  • pan/bi: Use ST_TILE for multisampled blend output

  • pan/bi: Use CLPER_V6 on Mali G31

  • panfrost: Remove unneeded quirks from T760

  • panfrost: Fix UNORM 10 sizes

  • panfrost: Use blendable check for tib read check

  • panfrost: Delete unpacks for blendable formats

  • pan/mdg: Insert moves before writeout when needed

  • pan/lower_framebuffer: Don’t replicate so much

  • pan/lower_framebuffer: Use fmul_imm

  • pan/lower_framebuffer: Unify UNORM handling

  • pan/lower_framebuffer: Don’t treat UNORM 4 special

  • pan/lower_framebuffer: Don’t open-code pad_vec4

  • pan/lower_framebuffer: Don’t open-code pan_unpacked_type_for_format

  • pan/mdg: Handle swapped 565 and 1010102 unorm

  • panfrost: Zero initialize blend_shaders

  • panfrost: Port v5 blend shader issue to blitter

  • panfrost: Fix NULL dereference in allowlist code

  • panfrost: Rip out primconvert code

  • panfrost/ci: Switch to suite support

  • panfrost/ci: Don’t skip matrix inverse tests

  • panfrost: Protect the variants array with a lock

  • panfrost: Remove null check in batch_cleanup

  • panfrost: Simplify get_fresh_batch_for_fbo

  • panfrost: Don’t use ralloc for resources

  • panfrost: Move bo->label assignment into the lock

  • panfrost: Remove get_fresh_batch

  • panfrost: Inline add_fbo_bos

  • panfrost: Switch resources from an array to a set

  • panfrost: Cache number of users of a resource

  • panfrost: Maintain a bitmap of active batches

  • panfrost: Add foreach_batch iterator

  • panfrost: Prefer batch->resources to rsrc->users

  • panfrost: Remove rsrc->track.users

  • panfrost: Remove writer = NULL assignments

  • panfrost: Replace writers pointer with hash table

  • panfrost: Take a ctx when submitting/destroying

  • panfrost: Raise maximum texture size

  • panfrost: Remove CACHE_LINE_SIZE #define

  • panfrost: Remove stale TODOs and XXXs

  • panfrost: Remove unused functions

  • pan/bi: Simplify condition

  • pan/bi: Assert l != NULL in bi_ra

  • pan/bi: Remove unused clause_start field

  • pan/bi: Fix format specifiers in disassembler

  • docs/panfrost: Remove obsolete note on Android.mk

  • docs/panfrost: We’re conformant now!

  • docs/panfrost: Add web chat link

  • panfrost: Fix incorrect test condition

  • panfrost: Add ASTC stretch factor enums

  • panfrost: Assert ASTC/AFBC are not used on v4

  • panfrost: Use ASTC 2D enums

  • panfrost: Encode 3D ASTC dimensions

  • panfrost: Move special_varying to compiler definitions

  • panfrost: Fix off-by-one in varying count assert

  • panfrost: Introduce PAN_MAX_VARYINGS define

  • panfrost: Don’t set CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER

  • panfrost: Fix PAN_MESA_DEBUG=sync with INTEL_blackhole_render

  • nir: Add Mali-specific derivative opcodes

  • pan/bi: Optimize abs(derivative)

  • panfrost: Don’t allow rendering/texturing 48-bit

  • panfrost: Detect implementations support AFBC

  • panfrost,panvk: Use dev->has_afbc instead of quirks

  • panfrost: Fix gl_FragColor lowering

  • panfrost: Workaround ISSUE_TSIX_2033

  • panfrost: Add internal afbc_formats

  • panfrost: Decompress for incompatible AFBC formats

  • panfrost: Enable AFBC on v7

  • mesa: Require MRT support for GL3/ES3

  • nir/lower_pntc_ytransform: Support PointCoordIsSysval

Andreas Baierl (5):

  • lima: CI: Enable GL_R8 and GL_RG8 texture formats

  • lima: Expose GL_EXT_clip_control

  • lima: Remove depth near/far workaround

  • lima: Fix glFrontFace handling

  • lima/parser: add shader disassembly to dump

Andreas Bergmeier (1):

  • v3dv: implement VK_EXT_physical_device_drm

Antonio Caggiano (3):

  • ci/freedreno: Test with non-redistributable traces

  • freedreno/ci: Add a manual job for tracking performance

  • pps: Restore documentation

Anuj Phogat (1):

  • intel/dg2: Add L3 configuration

Arvind Yadav (1):

  • radeonsi: remove the use of PKT3_CONTEXT_REG_RMW

Axel Davy (1):

  • util: Fix translate from block compressed to rgba

Bas Nieuwenhuizen (72):

  • zink: set dedicated allocation when needed

  • util/fossilize_db: Update parsed_offset correctly.

  • util/fossilize_db: Reset file position to parsed_offset on cache_offset read failure.

  • util/fossilize_db: Flush files after header write.

  • util/fossilize_db: Be conservative about header length check for locking.

  • util/fossilize_db: Only allocate entries after full read.

  • util/fossilize_db: Use uint64_t for file size.

  • util/fossilize_db: Unlock the cache file if the entry already exists.

  • util/fossilize_db: Add extra flock mutex.

  • radv: Use correct signedness in misalign test.

  • radv: Allocate space for inline push constants.

  • nir/lower_scratch: Ensure we don’t lower vars with unsupported usage.

  • nir/inline_functions: Handle halting functions.

  • radv: Check format before calling depth_only/stencil_only.

  • util/fossilize_db: Don’t corrupt keys during entry read.

  • nir: Avoid visiting instructions multiple times in nir_instr_free_and_dce.

  • radv: Expose a bufferImageGranularity of 1.

  • radv: Fix CPU AABB build.

  • radv: Fix arrayOfPointers for instances in accel struct build.

  • radv: Add accel struct build support for the object-to-world matrix.

  • radv: Add more acceleration structure formats.

  • radv: Add optimized CPU BVH builds.

  • radv: Add bvh node definitions to a header.

  • radv: Modify load_sbt_amd intrinsic to get the descriptor.

  • aco: Implement call scope.

  • radv: Refactor some nir_channels usage to use nir_channel.

  • radv: Do more meta shader lowering.

  • radv: Implement NULL accel struct descriptor write.

  • nir: Add AMD rt intrinsics.

  • radv: Add support for ray launch size.

  • aco: Add support for ray launch size.

  • nir: Support ray launch size in divergence analysis.

  • radv: Support nir_intrinsic_load_global_constant.

  • radv: Add RT cache flushes.

  • radv: Add pipeline type.

  • radv: Add group info to pipeline.

  • radv: Add raytracing pipeline properties.

  • radv: Make some pipeline functions non-static.

  • radv: Add scaffolding for RT pipeline compilation incl libraries.

  • radv: Add main loop variables.

  • radv: Add helper to inline shaders into the main shader.

  • radv: Add helper to parse raytracing stages.

  • radv: Add pass to lower anyhit shader into an intersection shader.

  • radv: Add ray traversal loop.

  • radv: Combine all the parts together with a main loop for an RT pipeline.

  • radv: Add support for setting a dynamic stack size.

  • radv: Add caching for RT pipelines.

  • radv: Experimentally enable RT extensions.

  • radv: Add DMA buffer update function for internal use.

  • radv: Add an internal indirect dispatch command.

  • radv: Add an indirect dispatch struct to the header.

  • radv: Add copy/serialization info to accel struct headers.

  • radv: Add acceleration structure queries.

  • radv: Add GPU copy/serialization/deserialization shader.

  • radv: Add CPU copying of acceleration structures.

  • radv: Add GPU copying of acceleration structures.

  • radv: Add CPU serialization of acceleration structures.

  • radv: Add GPU serialization of acceleration structures.

  • radv: Fix Android build for common functions.

  • radv: Don’t invalidate VCACHE after clear_htile_mask.

  • radv: Add VK_FORMAT_R16G16B16A16_UNORM for accel. structures.

  • radv: Handle copying zero queries.

  • amd/common: Add fallback for misreported clocks for RGP.

  • radv: Document cache coherency rules.

  • radv: Add hooks after in-renderpass meta operations.

  • radv: Try to do a better job of dealing with L2 coherent images.

  • radv: Fix modifier property query.

  • radv: Add bufferDeviceAddressMultiDevice support.

  • radv: Disable coherent L2 optimization on cards with noncoherent L2.

  • meson: Check arguments before adding.

  • util: Add support for clang::fallthrough.

  • radv: Fix memory corruption loading RT pipeline cache entries.

Boris Brezillon (137):

  • panfrost: Fix pan_blitter_emit_bifrost_blend()

  • panfrost: Add explicit padding to pan_blend_shader_key

  • pan/gen_pack: Generalize the PREFIX() trick

  • panvk: Add missing midgard_pack dependency

  • pan/gen_pack: Add pan_size() and pan_align() macros

  • panfrost: Move the polygon list init logic to pan_cmdstream.c

  • pan/gen_macros: Move the TEXTURE definition to gen_macros.h

  • pan/gen_macros: Map {TEXTURE,SAMPLER} to the arch-specific descriptor

  • pan/gen_macros: Include midgard_pack.h from gen_macros.h

  • panfrost: Stop including midgard_pack.h directly

  • panfrost: s/[idep_]midgard_pack/[idep_]pan_packers/

  • panfrost: Get rid of the mali_xxx enum redefinitions

  • panfrost: Add generic mappings for the gen-specific tiler descriptor macros

  • pan/gen_pack: Add parens around packed1/2 vars in pan_merge()

  • panfrost: Get rid of all _packed structs in pan_context.h

  • panfrost: Move panfrost_modifier_to_layout() to pan_texture.c

  • panfrost: Only emit special attribute buffer entries on pre-v6 hardware

  • panvk: Prepare per-gen split

  • panfrost: Prepare indirect dispatch helpers to per-gen XML

  • panfrost: Prepare indirect draw helpers to per-gen XML

  • panfrost: Fix pan_blit_ctx_init() when start > end

  • panfrost: Make pan_blit() return the tiler job pointer

  • panfrost: v7 does not support RGB32_UNORM textures

  • panvk: Make the per-arch static lib depend on panvk_entrypoints.h

  • panvk: Fix panvk_copy_fb_desc()

  • panvk: Don’t use pan_is_bifrost()

  • panvk: Fix blend descriptor emission

  • panvk: Only advertise MSAA-4

  • panvk: We don’t support linear filtering on integer formats

  • panvk: Don’t advertise min/max filter

  • panvk: Fix chan_size calculation in panvk_emit_blend()

  • panvk: Narrow the allow-forward-pixel-kill condition

  • panvk: Clamp blend constants before copying them to the cmdbuf state

  • panvk: Don’t allocate an array of blend constants

  • panvk: Close the panfrost device in the panvk_physical_device_init() error path

  • panvk: Reset panvk_pool->transient_bo in panvk_pool_reset()

  • panvk: Fix a BO leak in panvk_pool_alloc_backing()

  • panvk: Initialize clear values to zero when load_op != OP_CLEAR

  • panvk: Don’t take a BO reference when binding memory to an image

  • panvk: Only set PAN_DBG_TRACE if PANVK_DEBUG_TRACE is set

  • panvk: Disable the BO cache

  • panfrost: Patch Z32_S8X24 format when creating a sampler view

  • panfrost: Fix the Z32_S8X24 and X32_S8X24 definitions

  • panfrost: RGB10_A2_SNORM is not a valid texture format on v6+

  • panfrost: Drop the R and T flags on SCALED formats

  • panfrost: RGB332_UNORM is not a valid texture format on v6+

  • panfrost: Prepare blitter helpers to per-gen XML

  • panfrost: Prepare blend helpers to per-gen XML

  • panfrost: Prepare pan_cs helpers to per-gen XML

  • panfrost: Move panfrost_major_version() to gen_macros.h

  • panfrost: Prepare pandecode to per-gen XML

  • panfrost: Prepare scoreboard helpers to per-gen XML

  • panfrost: Prepare pan_encoder.h to per-gen XML

  • panfrost: Prepare texture helpers to per-gen XML

  • panfrost: Prepare shader helpers to per-gen XML

  • panfrost: Fix indirect draws when vertex or instance count is 0

  • panfrost: Fix collision in the indirect draw shader table

  • panfrost/ci: Skip the indirect_draw+XFB tests

  • pan/bi: Relax check on 8bit swizzles

  • pan/bi: Allow passing RT conversion descriptors to fragment shaders

  • pan/blit: Fix a NULL dereference in the preload path

  • pan/blit: Extend pan_preload_fb() to return emitted jobs

  • panvk: Initialize the blend shader logic

  • panvk: Preload FB attachments when required

  • panvk: Merge identical BO entries before submitting a job

  • panvk: Move copy stubs to a separate file

  • panvk: Move blit/resolve stubs to a separate file

  • panvk: Get rid of panvk_emit_fragment_job()

  • panvk: Don’t use the subpass to calculate the FB descriptor size

  • panvk: Don’t check the bind_point in panvk_cmd_prepare_fragment_job()

  • panvk: Make panvk_cmd_alloc_tls_desc() more generic

  • panvk: Add a panvk_cmd_prepare_tiler_context() helper

  • panvk: Stop dereferencing the subpass in panvk_cmd_close_batch()

  • panvk: Issue a fragment job if at least one target is cleared

  • panvk: Implement vkCmdClear{DepthStencil,Color}Image()

  • panvk: Implement vkCmdCopyImage()

  • panvk: Implement vkCmdCopyBufferToImage()

  • panvk: Implement vkCmdCopyImageToBuffer()

  • panvk: Implement vkCmdCopyBuffer()

  • panvk: Implement vkCmdFillBuffer()

  • panvk: Implement vkCmdUpdateBuffer()

  • pan/decode: Fix DCD size in Pre frame decoding

  • pan/blit: Let the caller offset the start/end coords passed to the blitter

  • pan/blit: Fix 3D blitting

  • panvk: Implement vkCmdBlitImage()

  • panvk: Always allocate at least one BLEND descriptor for fragment shaders

  • panvk: Fix the static scissor/viewport case

  • panvk: Fix TLS initialization for multi-draw batches

  • panvk: Extend panvk_cmd_close_batch() to handle current_batch == NULL

  • panvk: Make panvk_cmd_open_batch() return the new batch

  • panvk: Use the local batch variable when we have one

  • panvk: Don’t invalidate the vertex attributes when binding a new pipeline

  • panvk: Fix the pipeline binding logic

  • panvk: Fix panvk_pipeline_builder_upload_sysval()

  • panvk: Fix multisample image copies

  • panvk: Avoid allocating sysvals UBOs when the pipeline has one

  • panvk: Handle input varyings without previous writes

  • panvk: Fix an overflow on cmdbuf->state.clear

  • panvk: Don’t expect subpasses to use all RTs

  • panvk: Only prepare texture descriptors when the image is sampled

  • panvk: Fix 1DArray image to buffer copy

  • panvk: Fix size overflow in GetBufferMemoryRequirements()

  • panvk: Fix stencil clear assignment in panvk_cmd_fb_info_set_subpass()

  • panvk: Handle VK_REMAINING_{MIP_LEVELS,ARRAY_LAYERS} when creating image views

  • panvk: Split var copies before lowering them

  • panvk/ci: Trigger bifrost jobs on vulkan changes

  • pan/bi: Fix 1DArray image coordinate retrieval

  • pan/lower_fb: Support SNORM8 unpacking

  • pan/lower_fb: Re-order components when dealing with raw formats

  • pan/lower_fb: Add support for B10G10R10A2_UINT variants

  • pan/lower_fb: Add support for rgb10a2 _SINT variants

  • panfrost: Use an identity swizzle for RAW formats

  • panfrost: Add a common genxml file so we can share a few definitions

  • panfrost: Split command stream descriptor definitions per-gen

  • panfrost: Move genxml related files to a subdir

  • nir: Make sure src->num_components < dst->num_components in nir_ssa_for_src()

  • nir/lower_blend: Pad src to a 4-component vector

  • nir/lower_blend: Don’t lower RTs whose format is set to NONE

  • nir/lower_blend: Make sure we’re not passed scaled formats

  • nir/lower_blend: Shrink blended result if needed

  • pan/blend: Allow passing blend constants through a sysval

  • panvk: Fill the blend constants sysval

  • panvk: Lower blend operations when needed

  • panvk/ci: Enable blend tests

  • panvk: Fix allocation of BOs bigger than the slab size

  • panvk: Don’t use panfrost_get_default_swizzle() on v7+

  • panvk: Fix wls_size retrieval

  • panvk: Pass the render target index to panvk_meta_clear_attachment()

  • panvk: Allow clear_attachment of RTs > 0

  • panvk: Support clearing ZS attachments

  • nir: Add a nir_sysvals_to_varyings() helper

  • spirv: Let spirv_to_nir() users turn sysvals into input varyings

  • spirv: Always declare FragCoord as a sysval

  • spirv: Declare PointCoord as a sysval

  • vulkan: Fix weak symbol emulation when compiling with MSVC

  • vulkan: Set unused entrypoints to vk_entrypoint_stub when compiling with MSVC

  • vulkan: Fix entrypoint generation when compiling for x86 with MSVC

Boyuan Zhang (5):

  • radeon/vcn: initialize num_temporal_layers for hevc

  • radeon/vcn: track width and height of the last frame

  • radeon/vcn: check frame size change for vp9 header flags

  • radeon/vcn: set min value for num_temporal_layers

  • frontends/va: add num_temporal_layers check

Caio Marcelo de Oliveira Filho (27):

  • vulkan/util: Add and use vk_multialloc_zalloc variants

  • anv: Zero initialize pipeline structs

  • spirv: Implement SPV_EXT_shader_atomic_float16_add

  • vulkan: Update XML and headers to 1.2.185

  • anv: Advertise support for VK_EXT_shader_atomic_float2

  • nir/dead_cf: Do not remove loops with loads that can’t be reordered

  • nir: Update documentation for location to mention Task/Mesh

  • nir: Add a way to identify per-primitive variables

  • nir: Add per-primitive I/O intrinsics

  • compiler: Add new non-Multiview Task/Mesh builtins

  • compiler: Add Task/Mesh to shader_info

  • nir/lower_io: Identify Mesh output as arrayed

  • nir/divergence_analysis: Handle Task/Mesh shaders

  • nir: Don’t lower Task/Mesh I/O to temporaries

  • nir: Allow Task/Mesh to lower compute system values

  • spirv: Implement non-Multiview parts of SPV_NV_mesh_shader

  • anv: Simplify subgroup_size_type rules for compute shaders

  • anv: Refactor subgroup_size_type rules into a single function

  • spirv: Identify non-temporal memory access

  • nir/lower_io_to_vector: Allow Task/Mesh to load from outputs

  • intel: Add and use max_constant_urb_size_kb

  • iris: Document push constants allocation

  • anv: Validate vertex related states only when VS is present

  • anv: Move together primitive pipeline emit calls

  • anv: Identify code paths specific to graphics primitive pipeline

  • intel/compiler: Convert test_eu_compact to use gtest

  • intel/compiler: Remove unused `ret` declaration

Caio Oliveira (1):

  • util/ra: Fix deserialization of register sets

Carsten Haitzler (1):

  • panfrost: tidy up GPU naming to be in line with official names

Charlie Turner (5):

  • ci: Build libdrm earlier for x86_test-vk

  • ci: Fix syntax error in radv fails files

  • ci: Support per-driver skip lists.

  • radv/ci: Remove duplication in dEQP skip lists.

  • radv/ci: Fix the GPU_VERSION for polaris10

Charmaine Lee (2):

  • aux/draw: Check for preferred IR to take nir-to-tgsi path in draw module

  • svga: fix render target views leak

Chia-I Wu (43):

  • venus: refactor vn_EndCommandBuffer

  • egl/surfaceless: try kms_swrast before swrast

  • meson: allow egl_native_platform to be specified

  • vulkan/wsi: replace prime_blit_buffer by a bool

  • venus: clean up vn_AllocateMemory

  • venus: suballocate memory in more cases

  • venus: log more WSI messages

  • vulkan/wsi/x11: do not inherit last_present_mode

  • venus: print warnings when stuck in busy waits

  • iris, crocus: add idep_genxml to per_hw target dependencies

  • venus: update venus-protocol headers

  • venus: break up vn_device.h

  • venus: break up vn_device.c

  • venus: free queues after vkDestroyDevice is emitted

  • venus: use uint32_t in vn_ring_submit

  • venus: minor cleanup to physical device init loop

  • venus: pre-initialize device groups

  • venus: fix device group enumeration with unsupported devices

  • venus: group physical device fields with a struct

  • venus: no supported device is not an error

  • venus: initialize physical devices once

  • venus: reorder version fields in vn_instance

  • venus: init roundtrip fields in vn_instance later

  • venus: add vn_renderer_submit_simple_sync

  • venus: support reply shmem without ring

  • venus: init experimental features before the ring

  • venus: add and use VN_CS_ENCODER_INITIALIZER

  • venus: rework vn_instance_submission

  • venus: make ring buffer size configurable

  • venus: update venus-protocol headers

  • venus: raise the ring buffer size to 64KB

  • venus: refactor vn_instance_enumerate_physical_devices

  • venus: separate physical device init and filter

  • venus: copy VkPhysicalDeviceImageDrmFormatModifierInfoEXT

  • venus: add vn_refcount

  • venus: convert bo and shmem to use vn_refcount

  • venus: add a helper to destroy vn_descriptor_set

  • venus: add vn_refcount to vn_descriptor_set_layout

  • venus: keep layouts of descriptor sets alive

  • radv: plug leaks in radv_device_init_accel_struct_build_state

  • vulkan/wsi/wayland: fix an invalid u_vector_init call

  • util/vector: make util_vector_init harder to misuse

  • venus: add atrace support

Christian Gmeiner (46):

  • etnaviv: export supported prim types

  • etnaviv: remove primconvert

  • ci: include etnaviv support in ARMHF container.

  • ci: update kernel

  • ci/bare-metal: add telnet based serial

  • ci/bare-metal: add support for eth008 power relay

  • ci/bare-metal: add etnaviv

  • lima: fix leak of the screen hash table

  • util/tests: rename bitset test names

  • util/bitset: add bitwise AND, OR and NOT

  • util/tests: add bitwise AND, OR and NOT tests

  • util/bitset: add right shift

  • util/tests: add bitset SHR tests

  • util/bitset: add left shift

  • util/tests: add bitset SHL tests

  • util/bitset: s/BITSET_SET_RANGE/BITSET_SET_RANGE_INSIDE_WORD

  • util/bitset: add BITSET_SET_RANGE(..)

  • util/tests: add set bit range test

  • freedreno/isa: add leading zeros

  • freedreno/isa: simplify custom_target

  • freedreno/isa: add next_instruction(..)

  • freedreno/isa: add defines for fprintf(..) usage

  • freedreno/isa: store max size for needed bitset

  • freedreno/isa: generate ir3-isa.h

  • freedreno/isa: generate isaspec-decode.h

  • freedreno/isa: add bitmask_t to encode.py

  • freedreno/isa: add bitmask to/from uint64_t helper

  • freedreno/isa: add BITMASK_WORDS define

  • freedreno/isa: add store_instruction(..)

  • freedreno/isa: generate macros used for printf(..)

  • freedreno/isa: add split_bits(..) methods

  • freedreno/isa: decode: switch bitmask_t to BITSET_WORD’s

  • freedreno/isa: encode: switch bitmask_t to BITSET_WORD’s

  • freedreno/isa: update documentation

  • freedreno/isa: add shebang and make executable

  • freedreno/isa: move isaspec to a new home

  • compiler/isaspec: add print(..) helper

  • compiler/isaspec: keep track of written data

  • compiler/isaspec: add alignment support

  • etnaviv: use better name for fd hash table

  • etnaviv: fix leak of the screen hash table

  • etnaviv: fix indentation

  • etnaviv: move drm version readout to drm layer

  • etnaviv: allow screen creation with NULL renderonly object

  • etnaviv: extend screen_create(..) with gpu_fd

  • etnaviv: add etna_lookup_or_create_screen(..)

Clayton Craft (1):

  • anv: don’t advertise vk conformance on GPUs that aren’t conformant

Connor Abbott (81):

  • tu: Triage some CTS failures

  • ir3: Preserve gl_ViewportIndex in the binning shader

  • tu: Use NIR for clear/blit shaders

  • ir3: Delete old packed struct encoding

  • tu: Handle multisample vkCmdCopyColorImage()

  • tu: Make tile stores use a dedicated CS

  • tu: Implement non-aligned multisample GMEM STORE_OP_STORE

  • freedreno: Rename and document tess primid-related sysvals

  • tu, freedreno/a6xx, ir3: Rewrite tess PrimID handling

  • tu, freedreno/a6xx: Fix setting PC_XS_OUT_CNTL::PRIMITVE_ID

  • ir3: Document RA-related register flags better

  • tu: Read some input attachments directly

  • freedreno/a6xx: Add new register fields

  • freedreno, tu: Stop asking for foveation quality

  • freedreno, tu: Set GRAS_LRZ_PS_INPUT_CNTL::SAMPLEID

  • freedreno/a6xx: Document GRAS_SC_CNTL::SINGLE_PRIM_MODE

  • tu: Fix feedback loops in sysmem mode

  • tu: Fix xfb when there is a hole at the end

  • freedreno: Decode a650+ CP_START_BIN/CP_END_BIN packets

  • tu: Fix logic errors with subpass implicit dependencies

  • tu: Consider depth/stencil for implicit dependencies

  • ir3: Add pass to remove unreachable blocks

  • ir3/ra: Remove logical_unreachable

  • ir3: Copy-propagate single-source phis

  • ir3: Print physical successors/predecessors

  • ir3/print: Use mesa_stream_log_printf for (kill)

  • ir3/merge_regs: Set wrmask for pcopy destinations

  • ir3/ra: Reinitialize interval when inserting

  • ir3/ra: Fix available bitset for live-through collect srcs

  • ir3/ra: Handle huge merge sets

  • ir3/ra: Make ir3_reg_interval_remove_all() useful for spilling

  • ir3: Add loop depth to ir3_block

  • ir3: Add ra_foreach_src_n/ra_foreach_dst_n

  • ir3: Fix RA debug printing

  • ir3: Properly validate pcopy reg sizes

  • ir3: Fix compress_regs_left accounting for half-regs

  • ir3: Initial support for spilling non-shared registers

  • ir3: Fix getting stp/ldp components in ir3_info

  • ir3, turnip, freedreno: Report stp/ldp in shader stats

  • freedreno/ci: Add spillall tests

  • tu: Properly handle waiting on an earlier pipeline stage

  • tu: Add a650-specific CCU flush workaround

  • tu: Remove some stale bypass xfails

  • ir3: Remove ir3_instr::name

  • ir3: Make instruction IP 32 bits

  • ir3: Make ir3_register::name 32-bits

  • ir3/ra: Fix type mismatch when comparing intervals

  • lima: Add a NIR load duplicating pass

  • lima/gpir: Rewrite register allocation for value registers

  • freedreno/computerator: Add support for pvtmem

  • ir3/lower_pcopy: Use right flags for src const/immed

  • ir3/lower_pcopy: Set entry->done in the swap loop

  • tu: Fix VS primid with tess + GS

  • freedreno/a6xx: Fix VS primid with tess + GS.

  • ir3: Add bar to beginning of HS with tess_use_shared

  • freedreno, turnip: Disable 8bpp UBWC on a650

  • ir3: Make trig replacement expression exact

  • freedreno/a6xx: Name TPL1_DBG_ECO_CNTL

  • freedreno, turnip: Set TPL1_DBG_ECO_CNTL better

  • ir3: Use source in ir3_output_conv_src_type()

  • tu/clear_blit: Constify some image views

  • tu: Implement VK_KHR_imageless_framebuffer

  • ir3/lower_subgroups: Support 16-bit READ_* sources

  • ir3: Skip src size validation for cat1

  • tu: Expose VK_KHR_shader_subgroup_extended_types

  • ir3: Initialize local size earlier

  • ir3/ra: Don’t reset round-robin start for each block

  • ir3/ra: Use killed sources in register eviction

  • ir3/cp: Add missing const promotion check

  • ir3/cp: Fix inlining 32->16 const into meta instructions

  • nir/lower_ubo_vec4: Fix align_mul=8 special case

  • ir3: Fix printing branch type

  • ir3: Make ir3_create_collect() take a block

  • ir3: Always create barycentrics in the input block

  • ir3: Remove separate regmask.h

  • ir3: Handle special regs in regmask

  • ir3/legalize: handle WAR for special regs

  • ir3: Fix check for immediate range

  • ir3: Fix handling cat6 immediates

  • ir3: Fold ldc src immediates

  • ir3/spill: Mark root as non-spillable after inserting

Corentin Noël (8):

  • ci: actually run piglit tests with virgl

  • ci: Re-enable piglit trace for virgl

  • ci: Disable llvmpipe optimizations when running virgl CI

  • ci: Increase the default Rust toolchain version

  • ci: Increase crosvm version

  • ci: Use crosvm to run dEQP tests for virgl

  • glx: Prevent crashes when an extension isn’t found

  • virgl: Set GL_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION to 1

Daniel Schürmann (54):

  • aco/optimizer: ensure to not erase high bits when propagating packed constants

  • aco/ra: don’t allocate vector space for MIMG NSA operands

  • aco: include <cstddef> in aco_util.h

  • nir/lower_alu_to_scalar: don’t skip gaps in write_mask

  • nir/opt_shrink_vectors: don’t shrink vectors used by intrinsics

  • nir: consider write_mask in nir_ssa_def_components_read()

  • nir/opt_shrink_vectors: reverse iteration order

  • nir/shrink_vectors: shrink ALU properly

  • nir/shrink_vectors: shrink vecN properly

  • nir: return false for loops in contains_other_jump()

  • aco/print_ir: fix printing of VOPC_SDWA definitions

  • aco: use VOPC_SDWA on GFX9+

  • aco: add instr_is_16bit() helper function

  • aco/ra: refactor subdword definition info

  • aco/ra: refactor subdword operand stride

  • aco/validate: simplify get_subdword_bytes_written()

  • aco/opcodes: remove definition_size[]

  • aco: add more validation rules for SDWA operands

  • nir/loop_analyze: consider instruction cost of nir_op_flrp

  • nir/opt_algebraic: optimize flrp(fadd, fadd, x) only if fadd are used_once

  • radv: call nir_lower_flrp() after the first radv_optimize_nir()

  • aco: remove redundant s_and exec after nir_op_inot

  • aco: only apply extract if not used more than 4 times

  • aco: refactor nir_op_imul selection

  • aco/optimizer: combine v_mul_lo_u16 + v_add_u16 -> v_mad_u16

  • aco/optimizer: fuse v_mul_f64 + v_add_f64 -> v_fma_f64

  • aco/optimizer: combine v_pk_mul_u16 + v_pk_add_u16 -> v_pk_mad_u16

  • aco: fix init_any_pred_defined() for loop header phis

  • aco: refactor lower_phis()

  • aco/lower_bool_phis: avoid creating trivial phis

  • aco/lower_phis: propagate constants before emitting merge code

  • aco/lower_phis: optimize loop exit phis

  • aco: fix p_insert lowering with 16bit sources

  • aco: rewrite SDWA selector

  • aco: remove explicit dst_preserve flag

  • aco/print_ir: always print SDWA dst & src selections

  • aco: preserve subdword RC when lowering p_insert/p_extract

  • aco/ra: Fix potential out-of-bounds array accesses.

  • aco/ra: don’t copy linear VGPRs within CF in get_reg_create_vector()

  • aco: stop scheduling if clause-forming fails

  • aco: make clause-forming depend on the number of moved instructions

  • aco: try forming clauses even if reg_pressure exceeds

  • aco: clang-format

  • aco/ra: fix intersects()

  • aco/ra: refactor affinities into assignment struct

  • aco/ra: remove some redundant code

  • aco/ra: split register assignment for phis into separate function

  • aco/ra: try more aggressively to assign phi defs the same register

  • aco/ra: for phis try to find an operand-matching register earlier

  • aco/ra: don’t set affinities for ssa-repair phis

  • aco/ra: create affinities between nested phis

  • aco/ra: create nested affinities for loop header phis

  • aco/ra: don’t rewrite affinities for phi operands after register assignment

  • driconf: set vk_x11_strict_image_count for Wolfenstein: Youngblood

Daniel Stone (7):

  • vulkan/wsi/wayland: Cosmetic alignment fix

  • vulkan/wsi/wayland: Initialise wl_shm pointer in VkImage

  • egl/wayland: Error on invalid native window

  • egl/wayland: Allow EGLSurface to outlive wl_egl_window

  • CI: Disable LAVA devices

  • Revert “CI: Disable LAVA devices”

  • fdno/resource: Rewrite layout selection for allocation

Danylo Piliaiev (39):

  • freedreno: fix wrong tile alignment for 3 CCU gpu

  • tu: handle half-reg fs outputs

  • tu: delay decision of forcing sysmem due to subpass self-dependencies

  • turnip: reduce maxComputeWorkGroupSize

  • tu: disable gmem in primary cmdbuffer if secondary has it disabled

  • tu: add “flushall” and “syncdraw” debug options

  • freedreno/decode: print estimated crash location without colored output

  • tu: declare VK_EXT_extended_dynamic_state2 but leave it disabled

  • tu: implement dynamic depth bias enable

  • tu: implement dynamic primitive restart enable

  • tu: implement dynamic rasterizer discard enable

  • tu: enable VK_EXT_extended_dynamic_state2

  • turnip: provide dummy CmdSetLogicOpEXT and CmdSetPatchControlPointsEXT

  • freedreno: rename Z_TEST_ENABLE->Z_READ_ENABLE, Z_ENABLE->Z_TEST_ENABLE

  • turnip: apply workaround for depth bounds test without depth test

  • ir3: prohibit folding of half->full conversion into mul.s24/u24

  • ir3/a6xx,freedreno: account for resinfo return size dependency on IBO_0_FMT

  • turnip: consider shader’s immediates size for sub-stream allocation

  • turnip: re-emit vertex params after they are invalidated

  • util/u_trace: make u_trace usable for other than gallium drivers

  • util/u_trace: auto-generation of serialization funcs for tracepoints

  • turnip: implement basic perfetto support

  • u_trace: helpers for tracing tiling GPUs and re-usable VK cmdbuffers

  • turnip/perfetto: reusable command buffers support

  • u_trace: pass command stream through tracing functions

  • turnip: support tracing of gmem/sysmem load/store/clears

  • turnip/kgsl: fix compilation after perfetto introduction

  • turnip: consider multiview_mask when clearing depth-stencil attachment

  • turnip: Move to common DEFINE_HANDLE_CASTS casting macro

  • turnip: clamp per-tile scissors to max viewport size in binning pass

  • turnip: fix vbs emission when there are holes in bindings

  • ir3: remove obsolete assert for intrinsic_store_output in tess

  • turnip: do nothing on dispatch with zero total workgroups

  • ir3: support source modes for resinfo.b

  • ir3/freedreno: handle non-uniform resinfo

  • ir3/freedreno: handle non-uniform a1en instructions

  • turnip: fix streamout buffer offset calculations

  • ir3/ra: Check register file upper bound when updating preferred_reg

  • tu: fix rast state allocation size on a6xx gen4

Dave Airlie (134):

  • lvp: fixup multi draw memcpys

  • lavapipe: fix multi-draw regression in shader parameters test

  • lavapipe: fix indexed multi draw draw_id increment

  • draw: handle resetting draw_id between instances.

  • softpipe/aniso: move DDQ calculation to after scaling.

  • wl/shm: don’t fetch formats if not requested.

  • clover/il: return IL only for spirv and correct length

  • gallivm: add anisotropic filter weight table.

  • draw: add shader access to aniso filter table.

  • llvmpipe: add filter table shader accessor

  • gallivm: add support for anisotropic sampling.

  • llvmpipe: add support for max aniso query.

  • draw: add sampler max_aniso query.

  • llvmpipe: enable GL_ARB_texture_filter_anisotropic

  • llvmpipe/virgl/ci: update traces for aniso

  • docs: update anisotropic info for softpipe/llvmpipe/lavapipe

  • crocus/gen4-5: fix ff gs emit on VS vue map change.

  • llvmpipe/linear: fix ppc64/s390 build

  • llvmpipe: add some extra linear rast checks.

  • llvmpipe: add support for time elapsed queries.

  • llvmpipe: rework query fence signalling for get_query_result_resource

  • gallivm/img: use uint for image coord builder.

  • draw/llvmpipe: multiply polygon offset units by 2

  • teximage: return correct desktop GL error for compressedteximage

  • crocus/gen4: restrict memcpy mapping to gen5

  • intel/fs: restrict max push length on older GPUs to a smaller amount

  • intel/decode: add gfx4 constant buffer decode

  • intel/decode: add gfx4 vertex shader decode

  • crocus/gen45: fix mapping compressed textures

  • intel/genxml: fix raster operation field in blt genxml

  • crocus: add support for set alpha to one with blt.

  • virgl: disable anisotropic filtering.

  • virgl: add support for anisotropic texture filtering

  • ci: bump to latest virglrenderer for anisotropic support

  • clover/llvm: turn off optional CL 3 features.

  • nir/libclc: handle null callee name when lowering

  • vtn: add support for atomic flag test/set/clear

  • nir: add 32-bit bool of fisfinite

  • nir: add fisnormal lowering

  • gallivm: handle fisfinite/fisnormal

  • clover: fix api zero sized enqueue

  • clover: return CL_INVALID_PLATFORM properly.

  • clover: add kernel attributes support for SPIR-V

  • clover: fix compilation with clang + llvm 12.

  • clover/nir: don’t convert to NIR on library link

  • clover: only return CLC version as 1.2 (even for 3.0)

  • llvmpipe: add support for user memory pointers

  • lavapipe: add host ptr support.

  • docs: add llvmpipe host memory extensions

  • crocus/blt: add pitch/offset checks to fix blt corruption

  • crocus: align staging resource pitch on gen4/5 to allow BLT usage.

  • intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5

  • draw: handle primitive ID for quads/quad strips.

  • draw/gs: add clipvertex support for compatibility

  • draw/tess: add clipvertex support for compatibility

  • draw: add vertex color clamping to gs/tes

  • llvmpipe: enable GL compatibility profiles

  • gallivm: don’t lower local invocation index in frontend

  • llvmpipe/cl: limit kernel input size.

  • gallivm: fix idiv/irem for 8/16/64-bit and 32-bit INT_MIN/-1

  • gallivm: fix non-32 bit popcounts.

  • llvmpipe: init renderer string once to avoid races.

  • vulkan/wsi/sw: wait for image fence before submitting to queue

  • crocus: copy views before adjusting

  • crocus: drop u_primconvert header.

  • crocus: add missing line smooth bits.

  • crocus: add missing fs dirty on reduced prim change.

  • vulkan/wsi: add support for detecting mit-shm pixmaps.

  • vulkan/wsi/sw: add support for using host_ptr for shm pixmaps.

  • vulkan/wsi/sw: add mit-shm support for pixmap allocation

  • meson: fix regression finding shm dep

  • llvmpipe/fs: fix multisample depth/stencil fs writes.

  • llvmpipe: consolidate scissor plane code between line/tri

  • llvmpipe/scissor: rewrite scissor planes interaction.

  • llvmpipe: adjust scissor planes for multisample.

  • gallium: add a sample0 only option to blitter.

  • u_blitter: add support for sample0 only resolves.

  • lavapipe: VK_KHR_depth_stencil_resolve support

  • crocus/gen7: add missing IVB/GT2 geom shader workaround.

  • intel/decode/gfx6: add support for gfx6 CC/VIEWPORT pointers.

  • gallivm/ssbo: fix up dynamic indexed ssbo load/stores/atomics

  • gallivm/ssbo: cast ssbo index to int type.

  • lavapipe: enable dynamic index ubo/ssbo

  • llvmpipe/cs: rework thread pool to avoid mtx locking

  • gallivm/coro: use a phi instead of alloca

  • llvmpipe: shorten hold time on the screen mutex

  • llvmpipe/cs: rework coroutine context handling (v2)

  • gallivm: add initial support for 16-bit float builder.

  • gallivm/nir: handle conversion to 16-bit texel fetch

  • gallivm/nir: fix f2b32

  • gallivm/nir: handle non-32bit mask scatter stores

  • gallivm: add 16-bit sin/cos via llvm intrinsic

  • llvmpipe: lower_flrp16

  • gallivm/nir: handle 16-bit exp/log using intrinsics.

  • gallivm/nir: call pow with correct flt builder

  • gallivm/nir: pass the correct float builder to ddx/y

  • gallivm: increase tgsi nesting call stack size

  • gallivm: use llvm intrinsics for 16-bit round/trunc/roundeven

  • llvmpipe: enable FP16 and update CL + traces piglit results.

  • lavapipe: enable KHR_shader_float16_int8

  • gallivm/nir: handle subgroup reduction across all types

  • lavapipe: enable KHR_shader_subgroup_extended_types

  • docs: update docs for new llvmpipe/lavapipe features

  • lavapipe: enable KHR_spirv_1_4

  • lavapipe: fix vertex attributes/descriptor binding

  • lavapipe: don’t access pColorBlendState when not legal

  • gallium/format: move two vertex formats into the proper place.

  • lavapipe/ci: drop some fails I fixed recently

  • lavapipe: move to 1.2 features/properties structs.

  • gallivm/nir: fix subgroup invocation read.

  • lavapipe: enable vulkan 1.2 support.

  • lavapipe: move to new shared features/properties

  • lavapipe: cleanup image create function.

  • lavapipe: fixup image binding flags.

  • llvmpipe: overhaul fs/cs variant keys to be simpler.

  • gallivm: use pmulhrsw to make aos sampling more accurate.

  • crocus/gen6: don’t reemit the svbi when debugging

  • crocus/query: don’t loop on ready status after gpu hang.

  • gallivm/format: clamp SINT conversion rather than truncate.

  • llvmpipe/cs: change submission pattern for threadpool

  • llvmpipe: fix 4-bit output scaling.

  • lvp/fence: quick fix to previous commit.

  • device_select: close dri3 fd after using it.

  • wsi/x11: cleanup properly after mit shm paths are used.

  • Revert “lvp/fence: quick fix to previous commit.”

  • lavapipe: fix fence handling around wsi submission

  • crocus: Honor scanout requirement from DRI

  • crocus/gen5: reemit shaders on gen5 after new program cache bo.

  • crocus/gen5: add dirty flags for urb fences.

  • llvmpipe: fix userptr for texture resources.

  • lavapipe: drop EXT_acquire_xlib_display

  • vulkan/wsi: set correct bits for host allocations/exports for images.

  • llvmpipe: disable 64-bit integer textures.

  • llvmpipe: fix compressed image sizes.

Derek Foreman (2):

  • egl/wayland: Support RGBA ordered formats

  • egl/wayland: Properly clear stale buffers on resize

Dmitry Baryshkov (1):

  • freedreno/regs: add bit to control continuous clock with 7nm PHYs

Dylan Baker (19):

  • VERSION: bump version for 21.3 development cycle

  • docs/relnotes/new_features: empty for next release cycle

  • docs: update calendar for 21.2.0-rc1

  • docs: mark mesa 21.0 as done

  • freedreno/ir3: Add build id to the disassembler test

  • docs: add release notes for 21.2.0

  • docs: update calendar for 21.2.0-rc2

  • docs: update calendar for 21.2.0-rc3

  • docs: update calendar and link releases notes for 21.2.0

  • docs: Add calendar entries for 21.2 release.

  • bin/gen_release_notes: Add basic tests for parsing issues

  • bin/gen_release_notes: Don’t consider issues for other projects

  • bin/gen_release_notes: Fix commits with multiple Closes:

  • docs: add release notes for 21.2.2

  • docs/relnotes/21.2.2: Add SHA256 sum

  • docs: update calendar and link releases notes for 21.2.2

  • docs: add release notes for 21.2.3

  • docs: Add SHA256 sum for mesa 21.2.3

  • docs: update calendar and link releases notes for 21.2.3

Ed Baker (1):

  • frontends/va: Fix test_va_api VAAPIDisplayAttribs tests

Ed Martin (1):

  • winsys/radeonsi: Set vce_encode = true when VCE found

Eduardo Lima Mitev (1):

  • turnip: Add support for VK_VALVE_mutable_descriptor_type

Ella-0 (13):

  • v3dv: Add is_unorm, is_snorm and is_float format functions

  • v3dv: Implement VK_EXT_custom_border_color

  • v3dv: implement VK_EXT_color_write_enable

  • v3dv: Implement VK_EXT_pipeline_creation_cache_control

  • v3dv: Implement VK_EXT_provoking_vertex

  • v3dv: Implement VK_EXT_pipeline_creation_feedback

  • v3d/compiler: Handle point_coord_upper_left

  • v3d: Don’t handle PIPE_SPRITE_COORD_UPPER_LEFT twice

  • v3dv: Expose correct point size granularity

  • v3dv: Implement VK_EXT_vertex_attribute_divisor

  • ci/v3dv: Update fails with multiview failing with points

  • v3d: add R10G10B10X2_UNORM to format table

  • v3dv: enable VK_KHR_surface_protected_capabilities

Emma Anholt (233):

  • nir: Validate after deserialization.

  • nir_to_tgsi: Fix image declarations.

  • gallium/ttn: Add a debug flag for dumping the shaders.

  • freedreno/ir3: Reduce choose_instr_dec() and _inc() overhead.

  • gallium/ureg: Sort the output decls.

  • freedreno: Lock access to msm_pipe for RB object suballocation.

  • ci/freedreno: Enable the MSAA deqp tests.

  • gallivm: Default brilinear filtering to off.

  • gallivm: Always take the per-pixel LOD path for cubemaps.

  • i915g: Add support for shader-db.

  • nir_to_tgsi: Pack our tex coords into vec4 nir_tex_src_backend[12].

  • nir_to_tgsi: Add support for TXP.

  • nir_to_tgsi: Add support for HW atomics.

  • nir_to_tgsi: Declare buffers for all of num_ssbos.

  • nir_to_tgsi: Add support for nir_intrinsic_load_sample_pos.

  • turnip: Fix assertions on checking mutable combined samplers support.

  • gallium/dri2: Make dri_init_options just init DRI options.

  • gallium/driconf: Allow the driver to parse the driconf options.

  • ci: Stop disabling filter hacks for llvmpipe.

  • ci/i915: Update deqp expectations for another test passing.

  • ci: Uprev deqp-runner and use “suite” support to merge softpipe runs.

  • ci/llvmpipe: Use the deqp-runner suite support to consolidate jobs.

  • ci/i915g: Merge the two dEQP runs together.

  • ci: Save dEQP results on all tests.

  • ci/virgl: Use deqp-runner suite support to reduce CI job count.

  • ci/zink: Use deqp-runner suite support to reduce the CI job count.

  • ci: Update piglit to 4545a28cd8fea03fbab0e5f90bfbd812c32f3be1

  • ci/freedreno: Clear out TF API errors xfails.

  • freedreno/a5xx: Disable TF when pausing or transitioning to non-TF.

  • freedreno/a5xx: Don’t try to emit FS images in binning command streams.

  • ci/freedreno: Mark border_color as passing on a5xx.

  • ci/a5xx: Skip some piglit stress tests that destabilize CI.

  • ci/freedreno: Organize, fill out, and document our VK xfails.

  • ci/freedreno: Generalize the spirv_ids_abuse skips.

  • ci/freedreno: Clean up and fill out the tess timeout annotations.

  • ci/freedreno: Skip the slow dEQP-VK.ubo.random.all_shared_buffer.48 in CI.

  • ci/freedreno: Add jobs to manually do a full VK on freedreno.

  • i915g: Use the devmaster quadratic approximation for sin/cos.

  • i915g: Reapply clang-format.

  • nir: Move phi src setup to a helper.

  • i915g: Make the 1D workaround keep TXP’s .w channel in the right spot.

  • i915g: Add support for blitting compressed textures.

  • i915g: Add missing support for sRGB S3TC.

  • i915g: Fix up the format mapping for DXT1_*RGB

  • i915g: Add support for FXT1.

  • i915g: Fix 3D texture layouts for width != height.

  • i915g: Implement cube/3d texture_subdata() as a series of per-layer maps.

  • ci/turnip: Add a new flake from running more of the CTS.

  • ci/freedreno: Move freedreno’s deqp testing to suite support.

  • freedreno/a6xx: Apply the cube image size lowering to GL, too.

  • freedreno/ir3: Only lower cube image sizes once.

  • freedreno/ir3: Use the resinfo path for ssbo sizes on GL, too.

  • freedreno/ir3: Move a6xx’s get_ssbo_size shl to NIR.

  • freedreno/a6xx: Skip setting up image dims constants.

  • freedreno/a5xx: Use ST4_ constants for SSBO/image state types.

  • freedreno/a5xx: Reduce packet emits for SSBO state.

  • ci/freedreno: Mark a new flaky SSBO length test.

  • ci/freedreno: Flake the rest of the pbuffer/window dEQP-EGL tests.

  • i915g: Fix polygon offset by telling draw the Z format.

  • i915g: Correct PIPE_SHADER_CAP_MAX_TEMPS.

  • i915g: Reduce ARB_fp max tex indirections to match i915c.

  • i915g: Clear some xfails that are now skips.

  • i915g: Add comments explaining various xfails.

  • i915g: clang-format fixup.

  • freedreno/ir3: Apply the a6xx samgq workaround to TES/TCS/GS as well.

  • freedreno/ir3: Align driver param upload size/offset for indirect uploads.

  • freedreno/a6xx: Sync TFB BO access against prior TFB writes.

  • ci/lavapipe: Add a fractional run with ASan

  • ci/llvmpipe: Add a fractional ASan run.

  • nir: Set .driver_location for GLSL UBO/SSBOs when we lower to block indices.

  • nir/nir_lower_uniforms_to_ubo: Set the explicit stride of the UBO 0 uniform.

  • nir_to_tgsi: Use explicit sizes of NIR variables for UBO declarations.

  • ci/freedreno: Annotate a bunch of piglit fails/crashes.

  • ci/freedreno: Add a bunch of recent a530 and a630 flakes.

  • ci/v3dv: generalize the buffer_access.through_pointers flakes.

  • ci/freedreno: Fix xfail update for arb_draw_indirect.

  • freedreno/ir3: Don’t use isam for coherent image loads on a6xx.

  • freedreno/ir3: Clarify what’s going on in a4xx SSBO atomics.

  • freedreno/ir3: Refactor a3xx ibo/ssbo load/store instruction XML.

  • freedreno/ir3: Add encode/decode support for a5xx’s LDIB.

  • freedreno/ir3: Use LDIB for coherent image loads on a5xx.

  • osmesa: Add a unit test for resizing buffers.

  • cso: Revert using FS sampler count for other stages at context unbind.

  • mesa/st: Add an assertion for finalize_nir versus PIPE_CAP_TEXCOORD.

  • i915g: Simplify the process of texcoord mapping to TGSI semantics.

  • i915g: Expose PIPE_CAP_TGSI_TEXCOORD.

  • i915g: Add finalize_nir.

  • mesa/st: Add an optional GLSL link fail msg to finalize_nir.

  • i915g: Reject non-unrolled loops or non-flattened IFs at link time.

  • ci/iris: Mark create_context-no_error as failing.

  • ci/iris: Unmark dma_buf_import_export tests as failing.

  • ci/iris: Consistently use .test-manual-mr for our unstable hardware.

  • ci/iris: Switch GL/GLES testing to suites.

  • freedreno/a6xx: Emit a WFI after event writes flushing CCU.

  • ci/freedreno: Fix typo in glx-tfp flake annotation.

  • ci/freedreno: Mark a630 basic-glsl-misc-fs as flaky.

  • ci/freedreno: Skip slow SizedDeclarationsPrimitive in CI.

  • llvmpipe: Free CS shader images on context destroy.

  • llvmpipe: Fix leak of CS local memory with 0 threads.

  • llvmpipe: memcpy user_buffers at set_constant_buffer time.

  • nir_to_tgsi: Fix indirect addressing of atomic counters.

  • nir_to_tgsi: Don’t forget to add sampler views with our samplers.

  • nir_to_tgsi: Add support for memory_barrier_tcs_patch.

  • nir_to_tgsi: Clean up some unnecessary pointers-to-uregs.

  • nir_to_tgsi: Switch ssa_temp[] to be a ureg_src.

  • nir_to_tgsi: Allow SSA defs to include swizzles, abs, and neg.

  • mesa: Move the advanced blend bitmask to shader_info.

  • nir: Add a nir_instr_free() to replace ralloc_free(instr).

  • nir: Pull the instr list free function out to a helper.

  • nir/from_ssa: Use nir_instr_free() to free instrs instead of ralloc.

  • nir: Consistently pass the shader to the shader arg of instr creation.

  • nir: Consistently pass the instr to nir_src_copy().

  • nir: Add all allocated instructions to a GC list.

  • nir/lower_phis_to_scalar: Use nir_instr_free() to free instrs.

  • nir/tests: Fix transmuting an SSA dest to be non-SSA

  • nir: Switch from ralloc to malloc for NIR instructions.

  • nir: Drop the unused instr arg for src/dest copy functions.

  • ci/freedreno: Drop minetest from a3xx trace testing.

  • freedreno: Precompute resource pointer hash values.

  • freedreno: Use TC’s flag for whether get_query is in the driver thread.

  • freedreno: Move the batch cache to the context.

  • freedreno: Remove the submit lock locking.

  • freedreno: Use a BO bitset for faster checks for resource referenced.

  • freedreno: Remove dead fd_batch_reset().

  • ci/i915g: Clarify failure happening in fbo-fragcoord2.

  • mesa/st: Allow loops in GLSL when NIR is enabled, even if the HW can’t.

  • freedreno: Fix autotune regression since batch-cache rework.

  • freedreno: Assert to check for the previous regression.

  • ci/freedreno: Add some cubearray piglit flakes on a630 I noticed.

  • ci/baremetal: Retry if our network device spontaneously fails.

  • ci/freedreno: Update restricted trace sha1s.

  • nir_to_tgsi: Remove the abs on fcsel’s bool src.

  • freedreno/a5xx+: Rename GRAS_CNTL/RB_RENDER_CONTROL0 IJ_LINEAR_* bits.

  • freedreno/a5xx+: Set the IJ_LINEAR_* request bits if we need the regs.

  • tu: Move core features definitions to a helper function.

  • tu: Deduplicate extension/core feature flags.

  • tu: Add GetPhysicalDeviceFeatures2() support for more VK 1.2 core features.

  • tu: Move VK 1.1 core properties to a helper function and use macros for exts.

  • tu: Support VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_PROPERTIES.

  • turnip: Move physical device 1.2 properties to a helper function.

  • mesa: Throw an error for compressed glGenerateMipmap on GLES2 contexts.

  • mesa: Prioritize checking for GLES2’s uniform transpose error.

  • mesa: Fix missing CopyTexImage formats for OES_required_internalformat.

  • ci/vc4,i915g: Add links to VK-GL-CTS issues for some of our xfails.

  • vulkan: Add helpers for filling exts for core features and properties.

  • vulkan: Support PHYSICAL_DEVICE_1_n_ features/properties in the helpers.

  • turnip: Use the shared now-in-core feature/prop extension helper functions.

  • anv: Use the shared now-in-core feature/prop extension helper functions.

  • radv: Use the shared now-in-core feature/prop extension helper functions.

  • vulkan: Update the XML and headers to 1.2.193

  • turnip: Set the VK_DRIVER_ID to our new enum.

  • turnip: Swizzle in 0, 1 for D24S8 STENCIL_ASPECT sampling.

  • turnip: Disable VK_EXT_display_control.

  • i915g: Improve debug output for the fresh-batch overflow case.

  • i915g: Remove dead VBUF_USE_POOL code.

  • i915g: Unifdef VBUF_MAP_BUFFER.

  • i915g: Use the non-vbuf code path by default to fix index overflows.

  • ci/freedreno: Disable flaky a530 for now.

  • gallium/dri: Make YUV formats we’re going to emulate external-only.

  • turnip: Match the blob’s format for vendorID and deviceID.

  • turnip: Expose a device name similar to the blob.

  • freedreno/rnndec: Fix use of undefined value_orig in the !ti case.

  • freedreno/rnndec: Avoid making 0-length variable length arrays.

  • freedreno/afuc: Avoid ubsan warns about shifting to the top bit of ‘int’

  • freedreno: Fix UBSan failures in cffdec’s (uint8_t)x << 24

  • freedreno: Reuse u_math.h instead of open coding ALIGN/ARRAY_SIZE.

  • freedreno: Reuse u_math.h instead of open coding uif().

  • freedreno: Move afuc tests to meson unit tests.

  • freedreno: Move crashdec/cffdec tests to be meson unit tests.

  • freedreno: Move the headergen2 test to be meson unit tests.

  • panfrost: Disable flaky piglit job for now.

  • ci/freedreno: Restart the run if cheza spontaneously reboots.

  • freedreno/tools: Fix build failure when cffdump isn’t built but tests are.

  • freedreno/a6xx: Move the format table to common code.

  • freedreno/a6xx: Add int/scaled/snorm vertex formats to match turnip.

  • freedreno/a6xx: disable vertex fetch support flag for b8g8r8a8_srgb.

  • freedreno/a6xx: Add support for EXT_texture_sRGB_R8/RG8.

  • freedreno/a6xx: Drop texturing support from other scaled formats.

  • freedreno/a6xx: Add some more 16-bit rgb/rgba swaps to our format tables.

  • freedreno/a6xx+: Add support for the R8G8_R8B8 and G8R8_B8R8 formats.

  • util/format: Add an RGB planar format for YV12, like we have for NV12.

  • freedreno/a6xx: Put R8_G8_B8_420_UNORM in the format table.

  • freedreno/a6xx: Use fd6_pipe2tex() for the 2D src format.

  • freedreno/a6xx: Make the format table const.

  • freedreno/a6xx: Rewrite the format table format/swap helpers.

  • freedreno/a6xx: Add support for A/XRGB1555 formats.

  • freedreno/a6xx: Enable UBWC for RGBA5551 (and 1555) textures.

  • turnip: Give D32_SFLOAT_S8_UINT a native format.

  • turnip: Switch tu_format internals to using pipe_format more.

  • turnip: Do format lookups from the fd6 format table and cross-check.

  • turnip: Replace our format table with fd6_format_table.

  • i915g: Check for the scanout-layout conditions before setting level info.

  • mesa/st: Don’t bump locations of patch vars for !PIPE_CAP_TEXCOORD.

  • nir_to_tgsi: Include txf_ms’s sample index.

  • nir_to_tgsi: Add support for load_output/load_per_vertex_output.

  • gallium/ureg: Sort the input decls, too.

  • nir_to_tgsi: Add support for declaring image arrays.

  • nir_to_tgsi: Add support for load_barycentric_sample.

  • nir_to_tgsi: Add support for nir_intrinsic_load_barycentric_at_sample.

  • nir_to_tgsi: Turn GS PRIMID into an input instead of a sysval.

  • nir-to-tgsi: Avoid emitting TXL just for lod 0 on non-vertex shaders.

  • nir_to_tgsi: Sort FS output declarations to avoid virglrenderer bugs.

  • nir_to_tgsi: Add a workaround for virgl UBO array dynamic indexing.

  • nir_to_tgsi: Force the TXQ LOD argument to be scalar.

  • virgl: Add support for NIR shaders when VIRGL_DEBUG=nir.

  • turnip: Plug the vendor/device ID into the pipeline cache fields, too.

  • turnip: Fix allocation failure handling around device->name.

  • turnip: Free disk cache on pdev init failure.

  • ci/freedreno: Move the other a530 test jobs to test-manual-mr.

  • ci/freedreno: try to fix the a630 cubearray flake’s regex.

  • ci/freedreno: Disable the minetest trace due to flaky shader code.

  • ci: Update deqp to vulkan-cts-1.2.7.1.

  • ci: Update piglit to 7d7dd2688c214e1b3c00f37226500cbec4a58efb.

  • radeonsi: Fix leak of screen->perfcounters.

  • Revert “ci: Add osmesa to Windows GitLab CI”

  • ci/deqp-runner: Drop SUMMARY_LIMIT env var.

  • ci/deqp-runner: Simplify the --jobs argument setup.

  • ci/deqp-runner: Use new deqp-runner’s built-in renderer/version checks.

  • ci/deqp-runner: Drop silly CSV env vars.

  • ci/deqp-runner: Move remaining asan runs to --env LD_PRELOAD=

  • ci/deqp-runner: Drop LD_LIBRARY_PATH=/usr/local for libkms workaround.

  • ci/deqp-runner: Don’t start GPU hang detection for making junit results.

  • ci/deqp-runner: Move more non-suite logic under the non-suite ‘if’.

  • ci/piglit-runner: Fix funny indentation of the piglit-runner command.

  • ci/deqp-runner: Rename the deqp-drivername-*.txt files to drivername-*.txt

  • ci/piglit-runner: Merge piglit-driver-*.txt files into driver-*.txt.

  • ci: Enable testing radeonsi’s libva using libva-util unit tests.

  • freedreno: Fix gmem invalidating the depth or stencil of packed d/s.

  • freedreno/a6xx: Fix partial z/s clears with sysmem.

  • freedreno/a6xx: Don’t try to generate mipmaps for SNORM with our blitter.

  • freedreno/ir3: Fix off-by-one in prefetch safety assert.

  • freedreno/a6xx: Emit a null descriptor for unoccupied IBO slots.

  • mesa/st: Disable NV_copy_depth_to_color on non-doubles-capable HW.

Emmanuel Gil Peyrot (3):

  • radv: Support device initialization without LLVM dependencies

  • radv: Support shader compilation without LLVM dependencies

  • radv: Allow building when LLVM isn’t enabled

Enrico Galli (11):

  • microsoft/spirv_to_dxil: Adding continue opt pass to fix DXIL loop gen

  • nir_lower_readonly_images_to_tex: Fix typo on image arrays

  • microsoft/compiler: Add support for arrays to image_store

  • microsoft/compiler: Correctly flag when using raw buffers

  • microsoft/spirv_to_dxil: Enable support for shared memory

  • microsoft/compiler: Add support for local_invocation_index

  • spirv_to_dxil: Convert out parameters to a single object

  • nir: Add CAN_REORDER to load_ubo_dxil

  • spirv_to_dxil: Add support for nir_intrinsic_load_num_workgroups

  • spirv_to_dxil: Add support for non-zero vertex and instance indices

  • nir_to_dxil: Add tagging raw SRVs in shader flags

Eric Engestrom (45):

  • docs: add release notes for 21.1.5

  • docs: update calendar and link releases notes for 21.1.5

  • docs: drop duplicate `21.1` branch name from release calendar

  • docs: add release notes for 21.1.6

  • docs: update calendar and link releases notes for 21.1.6

  • pick-ui: drop assert that optional argument is passed

  • pick-ui: show nomination type in the UI

  • pick-ui: show commit date

  • docs: add release notes for 21.1.7

  • docs: update calendar and link releases notes for 21.1.7

  • python: explicitly require python3

  • gitlab-ci: stop installing python-is-python3 package

  • python: drop python2 support

  • Revert “python: Explicitly add the ‘L’ suffix on Python 3”

  • isl: drop comment about “python 2 vs 3” as it doesn’t apply anymore

  • isl: drop left-over comment

  • glsl/tests: remove some dead code

  • python: drop explicit output_encoding='utf-8' in mako templates

  • docs: add release notes for 21.1.8

  • docs: update calendar and link releases notes for 21.1.8

  • docs: add plan for 21.3.x release cycle

  • docs: shorten “last release” note to fit on the website without horizontal scrolling

  • bin/khronos-update.py: update the branch name (s/master/main/)

  • bin/khronos-update.py: add upstream for vulkan_directfb.h & vulkan_screen.h

  • gitlab: convert old REVIEWERS into GitLab’s CODEOWNERS

  • CODEOWNERS: add SWR maintainers

  • CODEOWNERS: add intel group

  • CODEOWNERS: add android build system

  • CODEOWNERS: add @alyssa for Asahi and Panfrost

  • CODEOWNERS: add @bbrezillon for src/panfrost/vulkan/

  • CODEOWNERS: add @jenatali for Microsoft & D3D12

  • egl: sync eglext.h & egl.xml from Khronos

  • egl: implement EGL_EXT_present_opaque on wayland

  • VERSION: bump for 21.3.0-rc1

  • .pick_status.json: Update to 86b3d8c66ce17ddcaefa5bdea68882cc03a57f15

  • .pick_status.json: Mark 7a2e40df5e8490de739c66865f90fa6804e41f6d as denominated

  • VERSION: bump for 21.3.0-rc2

  • .pick_status.json: Update to 4856586ac605e89ee6c128b1a190f000311b49ba

  • VERSION: bump for 21.3.0-rc3

  • .pick_status.json: Update to c356f3cfce9459dc1341b6a2a0fd5336a9bdcc3c

  • VERSION: bump for 21.3.0-rc4

  • .pick_status.json: Update to 549924d53e359c04d7c14b12990178c86d3aad2d

  • meson: drop duplicate addition of surfaceless & drm to the list of platforms

  • VERSION: bump for 21.3.0-rc5

  • .pick_status.json: Update to ba6d389fa7a0ac512cb9d4cdd21efde990f041b1

Erico Nunes (2):

  • lima: avoid crash with negative viewport values

  • ci: enable CI for lima again

Erik Faye-Lund (52):

  • dxil: Set coord_components on the txf in lower_int_sampler

  • lavapipe: do not assert on more than 32 samplers

  • lavapipe: do not mark unsupported tests as crashing

  • gallivm: let nir_lower_tex handle projectors

  • gallivm: make rho-approximation opt-in instead of opt-out

  • gallivm: remove pointless no_filter_hacks flag

  • d3d12: split up root parameter update and set

  • microsoft/compiler: fix psv-output calculation

  • microsoft/compiler: harmonize num_psv_inputs with outputs

  • gallivm: use lp_build_log2_safe for pow

  • lavapipe: remove stale xfails

  • lavapipe: remove duplicate xfail with typo

  • lavapipe: lower mipmapPrecisionBits to 4

  • gallivm: remove code to force nearest s/t interpolation

  • llvmpipe: take intersection with bbox for non-legacy points

  • st/mesa: correct point_tri_clip for gles2

  • gallivm: fix texture-mapping with 16-bit result

  • draw: fix stippling of fractional lines

  • gallium/nir/tgsi: fixup indentation

  • gallium/nir/tgsi: initialize file_max for inputs

  • draw: improve numerical stability in clipper

  • llvmpipe: use preferred attribute interpolation for wide lines

  • llvmpipe: clamp z to 0..1 range when using polygon offset

  • llvmpipe: split coefficient calculation and store

  • llvmpipe: improve polygon-offset precision

  • lavapipe: fix reported subpixel precision for lines

  • draw/llvmpipe: correct exponent calculation for negative z

  • gallium/tgsi: remove unused helper

  • gallium/tgsi: rip out cylindrical wrap from ureg

  • gallium/tgsi: rip out cylindrical wrap support

  • softpipe: rip out cylindrical wrap support

  • llvmpipe: rip out cylindrical wrap support

  • microsoft/compiler: remove needless error-returns

  • microsoft/compiler: return errors from get_n_src

  • microsoft/compiler: trivial fixes to error-handling

  • Revert “zink: always init bordercolor value for sampler”

  • zink: do not warn about rare features until used

  • zink: initialize pQueueFamilyIndices

  • zink: avoid overflow when calculating size

  • zink: do not try to dereference null-key

  • zink: do not dereference null-pointer

  • zink: pctx can’t be null here

  • zink: return false on failure

  • zink: remove needless NULL-check

  • zink: avoid memcmping null pointers

  • zink: avoid checking if src is const twice

  • zink: give each major intrinsic its own emit function

  • zink: remove needless scope

  • zink: remove incorrect ASSERTED macro

  • zink: clean up const-value handling for get_ssbo_size

  • zink: reduce scope of version-struct hack

  • zink: avoid generating nonsensical code

Esme Xuan Lim (1):

  • docs/panfrost: Fix link to use rst syntax

Felix DeGrood (2):

  • iris: add tile cache flush to iris_copy_region

  • anv: dirty only state impacted by blorp_exec

Filip Gawin (18):

  • docs: make most important part of bugs.rst easier to find

  • radeonsi: improve rounding of zmin

  • radv: improve rounding of zmin

  • nir: fix shadowed variable in nir_lower_bit_size.c

  • nir: fix ifind_msb_rev by using appropriate type

  • meson: add crocus to default group of drivers for x86/x86_64

  • nouveau: fix forward declaration of struct

  • nouveau: use bool literals instead of integers

  • glsl: use bool literals instead of integers

  • r300: fix usage of COVERED_PTR_MASKING_ENABLE for r500

  • r300: make global variables const (if possible)

  • r300: assert that array in translate_vertex_program is initialized

  • aco: cleanup assignment of unique_ptrs

  • r300: implement forgotten tgsi’s cases of textures

  • r300: fix UB caused by 1 << 31 and 2 << 30

  • r300: avoid searching for temp variable twice

  • nir: avoid reading uninitialized memory when using nir_dest_copy

  • r300: fixes for UB caused by left shifts

Francisco Jerez (12):

  • iris: Add read-only domain for VF cache.

  • iris: Annotate all BO uses through VF cache domain.

  • iris: Insert buffer-local memory barriers for VF reads.

  • iris: Add separate dirty bit for VBO flushes.

  • iris: Insert buffer-local memory barriers for indirect draw parameters.

  • iris: Add read-write domain for data cache.

  • iris: Use DATA domain barrier for shader images instead of OTHER domain.

  • iris: Insert buffer-local memory barriers for SSBO reads and writes.

  • iris: Insert buffer-local memory barriers for UBO reads.

  • iris: Use separate dirty bits for UBO and SSBO flushes.

  • iris: Track dirty UBOs per-stage for more targeted flushing.

  • iris: Make sure a bound resource is flushed after iris_dirty_for_history.

Georg Lehmann (3):

  • radv: Use c_msvc_compat_args.

  • aco: Use cpp_msvc_compat_args.

  • radv: Remove dead min waves code.

Gert Wollny (3):

  • mesa: Add support for EXT_clear_texture

  • mesa: Add EXT_texture_mirror_clamp_to_edge to extension table

  • mesa: signal driver when buffer is bound to different texture format

Greg V (1):

  • util: make util_get_process_exec_path work on FreeBSD w/o procfs

Guilherme Gallo (9):

  • gitlab-ci: enable testing on Intel Whiskey Lake (experimental)

  • gitlab-ci: enable testing on Intel Comet Lake (experimental)

  • gitlab-ci: Fix trace expectations for iris devices

  • gitlab-ci: Fix octopus device type and tag

  • gitlab-ci: Add sleep for every `scheduler.jobs.logs` call

  • gitlab-ci: Implement a simple timeout detection for LAVA jobs

  • gitlab-ci: refactor timeout constants and tweak timeout values

  • ci: Uprev deqp-runner to 0.9.0

  • ci: Update linux kernel to v5.15

Gurchetan Singh (3):

  • drm-uapi: virtgpu_drm.h: context init feature

  • virgl/drm: query for context init ioctl and supported capset ids

  • virgl/drm: explicit context initialization

Hoe Hao Cheng (2):

  • zink: make codegen compatible with python 3.5

  • zink/codegen: do not enable extensions based on vulkan version

Hyunjun Ko (4):

  • tu: allow dynamic primitive topology with tessellation

  • freedreno/a5xx,a6xx: rename MSAA_ENABLE to LINE_MODE in GRAS_SU_CNTL

  • turnip: enable VK_EXT_line_rasterization

  • turnip: enable strictLines

Iago Toral Quiroga (40):

  • ci: disable Broadcom CI

  • v3dv: remove more dead clearing code

  • v3dv: refactor meta copy/clear code

  • v3dv: remove unused layer field from struct rcl_clear_info

  • v3dv: improve TLB layered image clears

  • v3dv: allow limiting amount of tile state allocated

  • v3dv: don’t overallocate tile state for meta TLB operations

  • v3dv: don’t emit frame setup more than once for multilayered framebuffers

  • v3dv: fix I/O lowering for GS

  • v3dv: drop unused parameters

  • v3dv: store multiview info in our render pass data

  • v3dv: move all our NIR pre-processing to preprocess_nir

  • v3dv: inject a custom passthrough geometry shader for multiview pipelines

  • broadcom/compiler: implement nir_intrinsic_load_view_index

  • v3dv: broadcast multiview draw commands

  • v3dv: don’t merge subpasses with different view masks

  • v3dv: use correct number of layers for multiview

  • v3dv: skip processing tiles for layers that are not in the view mask

  • v3dv: track first and last subpass that use a view index

  • v3dv: fix query error handling

  • v3dv: implement interaction of queries with multiview

  • v3dv: expose VK_KHR_multiview

  • v3dv: fill in drmFormatModifierTilingFeatures

  • v3dv: handle IMAGE_DRM_FORMAT_MODIFIER_EXPLICIT_CREATE_INFO_EXT

  • docs: flag VK_KHR_multiview as implemented for v3dv

  • broadcom/compiler: add a vir_get_cond helper

  • broadcom/compiler: Flags are per-thread state in V3D 4.2+

  • broadcom/compiler: make spills of conditional writes also conditional

  • broadcom/compiler: rewrite partial update liveness tracking

  • v3d,v3dv: add options to force 32-bit or 16-bit TMU precision

  • v3dv: don’t try to access pColorBlendState if rasterization is disabled

  • v3dv: add API entry points for sampler Ycbcr conversions

  • vulkan: allow creating color views from depth/stencil images

  • v3dv: make v3dv_image derive from vk_image

  • v3dv: use subresource helpers in more places

  • v3dv: make v3dv_image_view derive from vk_image_view

  • v3dv: honor VkPhysicalDeviceFeatures2 in pNext chain of VkDeviceCreateInfo

  • broadcom/compiler: don’t enable early fragment tests if shader writes Z

  • v3dv: start using Broadcom’s device identifiers

  • broadcom/compiler: fix assert that current instruction must be in current block

Ian Romanick (65):

  • nir/gcm: Clear out pass_flags before starting

  • util/queue: Don’t crash in util_queue_destroy when init failed

  • iris: Add a comment for iris_uncompiled_shader::nir

  • iris: Fix return type of iris_compile_*

  • iris: Unify iris_delete_[shader stage]_state functions

  • iris: Unify iris_create_[shader stage]_state functions

  • iris: Merge iris_create_[shader stage]_state funcs into iris_create_shader_state

  • iris: Ref count the uncompiled shaders

  • iris: Extract allocation bits from iris_upload_shader to iris_create_shader_variant

  • iris: Allocate shader variant in caller of iris_upload_shader

  • iris: Add the variant to the list as early as possible

  • iris: Don’t pass the shader key to iris_compile_[shader stage]

  • iris: add sync_compile option

  • iris: Enable threaded shader compilation

  • iris: Split iris_upload_shader in two

  • intel/compiler: Add id parameter to shader_debug_log callback

  • intel/compiler: Add id parameter to shader_perf_log callback

  • mesa: Fix tiny race condition in _mesa_debug_get_id

  • util: Add and use functions to calculate min and max int for a size

  • isl: Use CLAMP macro instead of MIN of MAX

  • nir/opcodes: Use u_intN_(min|max)

  • Revert “nir/algebraic: Convert some f2u to f2i”

  • intel/fs: sel.cond writes the flags on Gfx4 and Gfx5

  • gallium: Remove “optimize” parameter from pipe_screen::finalize_nir

  • intel/compiler: Document and assert some aspects of 8-bit integer lowering

  • nir/algebraic: Optimize some extract forms resulting from 8-bit lowering

  • intel/fs: Allow copy propagation between MOVs of mixed sizes

  • intel/fs: Emit better code for u2u of extract

  • nir/algebraic: Remove spurious conversions from inside logic ops

  • nir: intel/compiler: Add and use nir_op_pack_32_4x8_split

  • intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms

  • util/xmlconfig: Make unit tests more resilient against user env settings

  • util/xmlconfig: Test values set via the environment

  • nir/lower_bit_size: Support add_sat and sub_sat

  • nir/opcodes: Add integer dot-product opcodes

  • nir/algebraic: Basic patterns for dot_4x8

  • intel/compiler: Basic support for DP4A instruction

  • nir/algebraic: Add lowering for dot_4x8 instructions

  • nir/algebraic: Add some extract optimizations

  • spirv: Update headers and metadata from latest Khronos commit

  • spirv: Add support for SPV_KHR_integer_dot_product

  • intel/fs: Refactor some cmod propagation tests

  • intel/fs: Remove redundant inst->opcode checks in cmod prop

  • intel/fs: Add many cmod propagation tests involving MOV instructions

  • intel/fs: Fix a cmod prop bug when the source type of a mov doesn’t match the dest type of scan_inst

  • intel/compiler: Move type_is_unsigned_int to brw_reg_type.h

  • intel/fs: cmod propagate from MOV with any condition

  • intel/fs: Remove condition-based restriction for cmod propagation to saturated operations

  • intel/fs: Remove after parameter from test_saturate_prop

  • intel/fs: Remove type-based restriction for cmod propagation to saturated operations

  • anv: Enable KHR_shader_integer_dot_product

  • nir/lower_gs_intrinsics: Return progress if append_set_vertex_and_primitive_count makes progress

  • nir/lower_gs_intrinsics: Make nir_lower_gs_intrinsics be idempotent

  • iris: crocus: Use shader_info::is_arb_asm flag

  • iris: Calculate uses_atomic_load_store after all lowering

  • nir/edgeflags: Add a flag to indicate the edge flag input is needed

  • iris: Eliminate iris_uncompiled_shader::needs_edge_flag

  • iris: Move iris_set_max_shader_compiler_threads and iris_is_parallel_shader_compilation_finished

  • iris: Add finalize_nir

  • spirv: Silence unused parameter warnings in vtn_alu.c

  • spirv: Minor cleanup in SpvOpFOrdNotEqual

  • spirv: SpvOpFUnordNotEqual doesn’t need special treatment

  • spirv: Generate shorter code for SpvOpFUnord comparisons

  • nir/algebraic: Small optimizations for SpvOpFOrdNotEqual and SpvOpFUnordEqual

  • nir/loop_unroll: Always unroll loops that iterate at most once

Icecream95 (26):

  • pan/decode: Avoid undefined behaviour on shift in bits()

  • pan/gen_pack: Use 1U for unpacking log2 to avoid undefined behaviour

  • pan/bi: Print the clause of branch targets

  • pan/bi: Use padding bytes for checking whether to stop disassembly

  • pan/bi: Fix infinite loop parsing arguments for bifrost_compiler

  • pan/mdg: Analyze helper termination after scheduling

  • pan/bi: Use the computed scale for fexp NaN propagation

  • panfrost: Call primconvert and u_transfer_helper destroy functions

  • pan/bi,pan/mdg: Fix memory leak of hash tables

  • panfrost: Fix memory leaks for compute state

  • panfrost: Free TGSI tokens

  • panfrost: Free NIR when deleting shader state

  • pan/mdg: Reduce size of tex_opcode_props

  • panfrost: Fill tiler job padding again

  • panfrost: Add nocache debug flag for disabling the BO cache

  • panfrost: Only allow colour blit shaders to be killed

  • panfrost: drm-shim support

  • pan/bi: Extend bi_add_nop_for_atest for tilebuffer loads

  • lima: Enable PIPE_CAP_VERTEX_COLOR_UNCLAMPED

  • lima: Fix crashes for GPUs with more than four cores

  • lima: Improve error messages for unsupported GP operations

  • lima: Add a noop drm-shim

  • pan/bi: Don’t set dependencies for +BLEND in blend shaders

  • pan/mdg: Remove use of global variables in disassembler

  • panfrost: Add ASTC 3D texture format entries

  • pan/mdg: Use the correct swizzle for condition moves

Ilia Mirkin (7):

  • st/mesa: fix pbo download store image type

  • mesa: don’t return errors for gl_* GetFragData* queries

  • mesa: rgb10_a2 is never color-renderable in gles2

  • glsl: fix explicit-location ifc matching in presence of array types

  • freedreno: use OUT_WFI for emit_marker

  • a4xx: add some better documentation for compute registers

  • a4xx/computerator: add initial backend

Italo Nicola (6):

  • ci: skip minio login if PIGLIT_REPLAY_UPLOAD_TO_MINIO is not set

  • virgl/ci: switch glmark2 traces from .rdc to .trace

  • virgl/ci: stop overriding GL version when running traces

  • virgl/ci: enable some traces that were previously crashing

  • main: don’t always clamp pixels read from snorm buffers

  • panfrost: fix null deref when no color buffer is attached

Iván Briano (8):

  • anv: Don’t advertise unsupported shader stages

  • anv: fix some multisample lines_wide CTS tests

  • anv: Unbreak wide lines on HSW/BDW

  • anv: fix feature/property/sizes reported for fragment shading rate

  • anv: Allow unused VkSpecializationMapEntries

  • anv: Don’t copy the lineStipple values if lineStipple is not enabled

  • vulkan: fix handling of aliases in enum members

  • vulkan: Generate defines for aliases of promoted enums

James Park (1):

  • aco: Work around MSVC restrict in c99_compat.h

Jan Beich (1):

  • meson: disable -Werror=thread-safety on FreeBSD

Faith Ekstrand (192):

  • intel/dev: Handle CHV CS thread weirdness in get_device_info_from_fd

  • intel/dev: Put the device name in intel_device_info

  • intel/dev: Handle BSW naming issues

  • intel/dev: Add a max_cs_workgroup_threads field

  • intel/dev: Drop a bogus assert

  • nir: Better document the Boissinot algorithm in nir_from_ssa()

  • iris: Re-emit MEDIA_VFE_STATE for variable group size shaders

  • anv: Handle errors properly in anv_i915_query

  • intel: Pull anv_i915_query into common code

  • anv: Use intel_i915_query_alloc for memory regions

  • iris: Use intel_i915_query for meminfo

  • intel/dev: Use intel_i915_query_alloc in query_topology

  • intel/perf: Use intel_i915_query_flags instead of hand-rolling it

  • intel/eu: Start validating LSC message descriptors

  • anv: Assume syncobj support

  • anv: Drop unused sync_file and BO semaphore code

  • anv: Stop reference counting semaphores

  • glsl/nir: Use nir_ssa_undef() from nir_builder

  • nir: Set IMAGE_DIM and IMAGE_ARRAY on deref intrinsics

  • nir: Set src_components = -1 for image intrinsic deref sources

  • nir: Add a format field to _deref image intrinsics

  • nir/lower_subgroups: Handle down-casts in uint_to_ballot_type

  • nir/lower_image: Handle index and bindless image_size

  • nir/lower_tex: Add a lower_txs_cube_array option

  • radv,radeonsi: Do cube size divide-by-6 lowering in NIR

  • turnip: Replace tu_lower_image_size with nir_lower_image

  • intel/eu: Don’t validate LSC transpose on ops that don’t have it

  • ttn: Don’t handle texop_txf_ms_mcs

  • amd: Don’t handle nir_tex_src_ms_mcs

  • panfrost: Don’t handle nir_texop_txf_ms_mcs

  • nir: Suffix all the MCS texture stuff _intel

  • docs,nir: Document NIR texture instructions

  • intel/blorp: Use nir_texop_txl

  • nir/lower_tex: Rework invalid implicit LOD lowering

  • nir: Validate newly documented texture restrictions

  • anv/android: Rework our handling of AHardwareBuffer imports

  • nir: Removing uses of SSA defs destroys SSA liveness

  • nouveau: Use nir_lower_tex for projectors

  • anv/blorp: Drop some can_ycbcr checks

  • anv/blorp: Use the isl_surf for computing level_width/height in anv_image_ccs_op

  • anv: Rename anv_get_format_plane to anv_get_format_aspect

  • anv: Rework depth/stencil early return in anv_get_format_plane

  • anv: Add a get_format_plane helper and use it in image setup

  • anv: Use anv_get_format_plane in anv_get_image_format_features

  • anv: Use anv_get_format_plane for color image view setup

  • anv: Stop assuming planes are in aspect-bit-order

  • anv/image: Rework YCbCr image aspects

  • anv: Rework our aspect/plane helpers

  • anv: Make anv_image_aspect_to_plane take an anv_image*

  • intel/eu: Set scope to TILE for TGM flushes

  • meson/intel: Don’t build genxml tests on Android

  • meson: Intel drivers don’t require expat on Android

  • meson/glsl: Only run GLSL tests if can_run_host_binaries()

  • intel/vec4: Don’t override emit_urb_write_opcode for SNB GS

  • intel/perf: Use a char array for OA perf query data

  • anv/android: Pass the correct pointer type to vk_errorf

  • anv/android: Drop unused device variables

  • ci: Build ANV on Android

  • include/drm-uapi: Bump headers

  • anv: Use I915_MMAP_OFFSET_FIXED for LMEM platforms

  • iris: SMEM buffers on discrete platforms are coherent

  • iris: Use a tiny table to map mmap modes to offsets

  • iris: Add an assert to iris_bo_gem_mmap_legacy()

  • iris: Add a new IRIS_MMAP_NONE map type

  • iris: Use I915_MMAP_OFFSET_FIXED for LMEM platforms

  • anv: Use I915_USERPTR_PROBE when available

  • intel/isl: Explicitly set offset_B = 0 in get_uncomp_surf for arrays

  • intel/isl: Add units to view dimensions in isl_surf_get_uncompressed_surf

  • intel/isl: Better document isl_tiling_get_intratile_offset_*

  • intel/isl: Add a missing assert in isl_tiling_get_intratile_offset_sa

  • intel/isl: Use uint64_t for computed byte offsets

  • anv/image: Use planes[i]->primary_surface.isl.format in check_drm_format_mod

  • anv: Delete anv_image::format

  • vulkan: Add a vk_image struct

  • anv: Make anv_image derive from vk_image

  • anv,vulkan: Move anv_image_expand_aspects to common code

  • anv,vulkan: Move VkImageSubresource* helpers from ANV

  • vulkan: Refactor and better document vk_image_expand_aspect_mask

  • radv: Add asserts to vk_format_depth/stencil_only

  • vulkan,radv: Move vk_format_depth/stencil_only to common code

  • vulkan: Add a vk_image_view struct

  • anv: Make anv_image_view derive from vk_image_view

  • anv,vulkan: Move ANV image layout helpers to common code

  • anv,vulkan: Move drm_format_mod to vk_image

  • anv,vulkan: Add a vk_image::wsi_legacy_scanout bit

  • anv: Move compute_heap_size lower in the file

  • anv: Rework init_meminfo

  • anv: compute available memory in anv_init_meminfo

  • anv: Set CONTEXT_PARAM_RECOVERABLE to false

  • intel/compiler: Add unified barrier support for CS

  • intel/isl: Add more parameters to isl_tiling_get_info

  • isl/docs/tiling: Add Tile4 docs

  • intel/fs: Add support for atomic_fadd

  • anv: Advertise support for shaderBufferFloat32AtomicAdd

  • nir: Properly clean up nir_src/dest indirects

  • nir: Stop sweeping indirects

  • spirv: Handle the SubgroupSize execution mode

  • intel/fs: Handle required subgroup sizes specified in the SPIR-V

  • iris: Handle states=NULL in iris_bind_sampler_states

  • iris: Return 1 for PIPE_COMPUTE_CAP_IMAGES_SUPPORTED

  • panvk: Use vk_queue

  • panvk: Use vk_command_buffer

  • vulkan: Add the pCreateInfo to vk_queue_init()

  • anv: Drop anv_queue::flags

  • radv: Drop radv_queue::flags/queue_family_index/queue_idx

  • lavapipe: Drop lvp_queue::flags

  • turnip: Drop tu_queue::flags/queue_family_index/queue_idx

  • v3dv: Drop v3dv_queue::flags

  • panvk: Drop panvk_queue::flags/queue_family_index

  • vulkan/device: Add a common GetDeviceQueue2 implementation

  • vulkan/device: Add a common DeviceWaitIdle implementation

  • anv: Switch to common GetDeviceQueues2 and DeviceWaitIdle

  • radv: Switch to common GetDeviceQueues2 and DeviceWaitIdle

  • turnip: Switch to common GetDeviceQueues2 and DeviceWaitIdle

  • v3dv: Use the common GetDeviceQueue implementation

  • lavapipe: Simplify DeviceWaitIdle

  • lavapipe: Switch to common GetDeviceQueue and DeviceWaitIdle

  • panvk: Switch to common GetDeviceQueue and DeviceWaitIdle

  • intel/fs: Rework fence handling in brw_fs_nir.cpp

  • intel/fs: Ignore SLM fences if shared is unused

  • intel/fs: Add the URB fence message

  • intel/fs: Emit URB fences when we have LSC

  • vulkan/shader_module: Fix the lifetime of temporary shader modules

  • v3dv: Use VK_DEFINE_*HANDLE_CASTS instead of rolling our own

  • st/texture: Dedent surface setup in CompressedTexSubImage

  • st/texture: Fall back to single-slice uploads in st_CompressedTexSubImage

  • Move a bunch of the CLC stuff from src/microsoft to common code

  • compiler/clc: Clean ups

  • compiler/clc: grab opencl-c.h from the system path by default

  • anv,iris,genxml: Use NumberOfBarriers on XeHP

  • vulkan/physical_device_features: Drop some unnecessary dependencies

  • vulkan/physical_device_features: Stop generating a header

  • radv: Use VK_DEFINE_*HANDLE_CASTS instead of rolling our own

  • vulkan: Update the XML and headers to 1.2.195

  • anv: Add an anv_image_get_memory_requirements helper

  • intel/isl: Add a max_buffer_size limit to isl_device

  • intel/isl: Simplify isl_format_supports_filtering

  • intel/isl: Stop claiming ASTC works on Cherry View

  • anv: Ask ISL about ASTC support

  • intel/isl: ASTC support was removed on Gfx12.5

  • genxml: Drop bit 27 from RENDER_SURFACE_STATE::Surface Format

  • nir/algebraic: Lower fisfinite

  • nir/algebraic: Add some boolean optimizations

  • nir/algebraic: Add some opts for comparisons of comparisons

  • vulkan: Drop vk_object_base_reset

  • vulkan: Track which objects are client-visible

  • vulkan/log: Assert if the driver logs a client-invisible object

  • vulkan/log: Log to instance messages during instance construction

  • anv: drop a misplaced and wrong comment

  • anv: Stop printing descriptor pool allocation failures

  • anv: s/vk_error/anv_error/g

  • vulkan/log: Handle logging to a physical device

  • vulkan/log: Add common vk_error and vk_errorf helpers

  • anv: Drop unused logging helpers

  • anv/queue: Plumb the queue through all the queue_submit calls

  • anv: Use the common vk_error and vk_errorf helpers

  • radv: Stop printing descriptor pool allocation failures

  • radv: Switch to the new common vk_error helpers

  • lavapipe: Switch to the new vk_error helpers

  • panvk: Switch to the new vk_error helpers

  • v3dv: Switch to the new vk_error helpers

  • turnip: Plumb non-startup errors through the new vk_error helpers

  • vulkan/log: Drop _impl from the log helper names

  • vulkan/instance: Use vk_error in vk_instance_init

  • vulkan/device: Use vk_error

  • vulkan/device: Use vk_errorf to report missing features

  • Revert “mesa: use simple_mtx_t for TexMutex”

  • nir/lower_discard_or_demote: Fix metadata

  • vulkan: Generate flag #defines based on bitwidth

  • vulkan: Generate #defines with every bit in a given bitfield

  • anv: Use the common wrapper for GetPhysicalDeviceFormatProperties

  • anv: Flip around the way we reason about storage image lowering

  • meson: Add and use an idep for Vulkan WSI

  • vulkan/wsi: Add a dispatch table for WSI entrypoints

  • vulkan/wsi: Add common wrappers for most entrypoints

  • anv: Use the common WSI wrappers

  • radv: Use the common WSI wrappers

  • turnip: Use the common WSI wrappers

  • v3dv: Use the common WSI wrappers

  • panvk: Use the common WSI wrappers

  • lavapipe: Use the common WSI wrappers

  • venus: Use the common WSI wrappers

  • vulkan/wsi/common: Delete the wrapper entrypoints

  • vulkan/wsi/x11: Delete the wrapper entrypoints

  • vulkan/wsi/wayland: Delete the wrapper entrypoints

  • vulkan/wsi/display: Delete the wrapper entrypoints

  • vulkan/log: Tweak our handling of a couple error enums

  • i965: Emit a NULL surface for buffer textures with no buffer

  • lavapipe: Don’t wrap errors returned from vk_device_init in vk_error

  • anv: Fix FlushMappedMemoryRanges for odd mmap offsets

  • anv: Also disallow CCS_E for multi-LOD images

  • vulkan/util: Include stdlib.h

Jeremy Newton (1):

  • Fix building AMD MM/GL with EL7

Jesse Natalie (62):

  • mesa/main: Check for fbo attachments when importing EGL images to textures

  • microsoft/compiler: Implement texture loads from UAVs

  • microsoft/clc: Add a test for compiling a kernel with a read-write image

  • gallium/dri: Move driConf -> st option processing to aux/util

  • xmlconfig: Use static inline for regex fallback to prevent -O0 issues

  • wgl: Parse driconf options

  • wgl: Add a driver name for driconf

  • u_driconf: Use a macro to avoid repeating option names

  • CI: Update Windows quick_gl baseline for mysterious new passes

  • spirv2dxil: Fix build after spirv_to_dxil signature change

  • ci/windows: Build spirv-to-dxil

  • llvmpipe: Don’t wait for already-terminated threads on Windows

  • mapi: Fix shared-glapi build with MSVC

  • wgl: Fix unit test when using shared glapi

  • static-glapi: Fix MSVC preprocessor definitions

  • wgl: Don’t use BUILD_GL32 for wgl frontend

  • wgl: Move opengl32.def to target instead of frontend

  • wgl: Move wgl* non-extension definitions to libgl-gdi

  • wgl: Make overridden entrypoints local to stw_ext_context

  • wgl: Refactor drivers to a libgallium_wgl.dll

  • docs: Update Windows llvmpipe doc for driver split

  • gl.h: Remove dllimport

  • wgl: Create contexts and DHGLRCs separately

  • wgl: Pass share context as pointer instead of DHGLRC

  • wgl: Make contexts current with pointer instead of DHGLRC

  • wgl: Allow creating framebuffers that aren’t in the global window list

  • wgl: Make contexts current with framebuffers instead of HDCs

  • wgl: Split DrvReleaseContext to support unbind via pointer

  • wgl: Add iPixelFormat to stw_pixelformat_info

  • wgl: Un-inline helpers which use stw_own_mutex

  • wgl: Add an explicit iPixelFormat for context creation

  • wgl: Use HWND instead of HDC as primary framebuffer handle

  • wgl: Add a stw_dev getter

  • wgl: Swap buffers via pointer instead of HDC

  • wgl: Add stw_* DLL exports for EGL support

  • meson: Include EGL after gallium

  • meson, egl: Support building for the Windows platform

  • egl: Add wgl/gallium dependencies for Windows platform

  • egl: Use the .def file for Windows

  • egl: Don’t try to dereference native displays unless there’s a detectable platform

  • egl: Detect Windows platform using GDI

  • egl: Add a basic Windows driver

  • symbols-check: Fix symbol demangling for Windows

  • egl: Update Windows .def to include missing exports

  • meson: Set /Zc:__cplusplus for MSVC

  • CI/windows: Build shared-glapi, EGL, gles2

  • microsoft/clc: Rename compiler DLL to clon12compiler

  • microsoft/clc: Clean up clc_context

  • microsoft/clc: Stop heap-allocating tiny fixed-size transparent structs

  • microsoft/clc: Split clc_object and rename entrypoints

  • microsoft/clc: Support SPIR intermediates in the compilation APIs

  • microsoft/clc: Parse SPIR-V specialization consts into metadata

  • microsoft/clc: Support passing specialization consts to spirv_to_nir

  • microsoft/clc: Add API to independently specialize SPIR-V

  • microsoft/clc: Add a test for specializing via SPIRV-Tools

  • clover: std::result_of is deprecated in c++17 and removed in c++20

  • clover: Delete unused ‘e’ exception reference vars

  • clover: Rename module -> binary, because C++20 makes module a keyword

  • compiler/clc: Null extensions should mean all supported, not all

  • compiler/clc: Preserve OCL kernel arg type metadata on LLVM13

  • util/hash_table: Clear special 0/1 entries for u64 hash table too

  • d3d12: Fix Linux fence wait return value

Jonathan Marek (1):

  • freedreno/registers: add a6xx media formats

Jordan Justen (51):

  • nir: Add nir_lower_image() to lower cube image sizes

  • intel/compiler: Rename brw_nir_lower_image_load_store to brw_nir_lower_storage_image

  • intel/compiler: Lower cube image sizes using nir_lower_image()

  • intel/compiler: Remove cube array size lowering in compiler backend

  • meson: Search for python3 before python for bin/meson_get_version.py

  • meson: Check that bin/meson_get_version.py ran without an error

  • intel/pci-ids: Re-enable DG1 and add SG1

  • intel/compiler: Regroup TCS barrier code paths

  • intel/compiler: Add unified barrier support for TCS

  • iris: Disable the Y-tiled modifiers on XeHP+

  • intel: Move subslice_total into devinfo

  • intel/devinfo: Add devinfo->max_scratch_ids

  • intel/dev: Add is_dg2 to devinfo

  • intel/isl: Enable MOCS 61 for external surfaces on TGL

  • intel/dev: Add display_ver and set adl-p to 13

  • iris: Disable I915_FORMAT_MOD_Y_TILED_GEN12* on adl-p/display 13

  • Revert “iris: Disable I915_FORMAT_MOD_Y_TILED_GEN12* on adl-p/display 13”

  • Revert “intel/dev: Add display_ver and set adl-p to 13”

  • intel/dev: Add display_ver and set adl-p to 13

  • iris: Disable I915_FORMAT_MOD_Y_TILED_GEN12* on adl-p/display 13

  • intel/blorp: Move most of BLORP_CREATE_NIR_INPUT into a function

  • intel/blorp: Add compute support to BLORP_CREATE_NIR_INPUT

  • intel/blorp: Add shader_pipeline to brw_blorp_base_key

  • intel/blorp: Add brw_blorp_init_cs_prog_key

  • intel/compiler: Use INTEL_DEBUG=blorp to dump blorp compute shaders

  • intel/blorp: Add subgroup_id input for compute programs

  • intel/blorp: Add blorp_compile_cs

  • intel/blorp: Split out ps specific sampler state into a separate function

  • intel/blorp: Split out surface setup from state emission

  • blorp: Add blorp_alloc_general_state

  • intel/blorp: Emit compute program based on BLORP_BATCH_USE_COMPUTE

  • intel/gfx7: Change GPGPU Mode to bool

  • intel/blorp: Add blorp_get_cs_local_y, blorp_set_cs_dims

  • intel/blorp: Change discard terminology to bounds

  • intel/blorp: Add blorp_check_in_bounds()

  • intel/blorp: Use blorp_check_in_bounds for discards

  • blorp: Set view usage to ISL_SURF_USAGE_STORAGE_BIT for compute

  • blorp/clear: Simplify rbg-as-red channel packing

  • intel/blorp: Convert blorp_clear color_write_disable to a bitmask

  • intel/blorp: Support compute for slow clears

  • intel/blorp/blit: Rename wm_prog_key and prog_key to key

  • intel/blorp: Support some image/buffer blit operations using compute

  • anv: Store anv_queue_family type in cmd-pool

  • anv: Prevent starting a render pass on compute queues

  • anv/blorp: Make sure blorp type is supported by the queue

  • anv/blorp: Select pipeline based on BLORP_BATCH_USE_COMPUTE

  • anv/blorp: Add anv_blorp_batch_init, anv_blorp_batch_finish

  • anv/blorp: Force compute blorp on compute-only queues

  • anv/slice_hash: Don’t allocate more than once with multiple queues

  • intel/isl: Add mocs settings for DG2

  • Revert “iris: Disable I915_FORMAT_MOD_Y_TILED_GEN12* on adl-p/display 13”

Jose Maria Casanova Crespo (8):

  • Revert “ci: disable Broadcom CI”

  • v3d/driconf: Expose non-MSAA texture limits for mutter and gnome-shell

  • v3d: export supported prim types by v3d

  • v3d: remove primconvert

  • vc4: export supported prim types by vc4

  • vc4: remove primconvert

  • v3d: Enable PIPE_CAP_PRIMITIVE_RESTART

  • v3d: Enable PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE

Joshua Ashton (26):

  • lavapipe: Use common Vulkan format helpers

  • radv: Fix DCC image store check

  • radv: Disable DCC on storage images that cannot support DCC image stores

  • ac/surface: Add modifiers capable of DCC image stores

  • ac/surface: Add ac_modifier_supports_dcc_image_stores helper

  • radv: Expose modifiers that support DCC image stores with STORAGE_IMAGE_BIT

  • radv: Push box traversal results onto stack in correct order

  • radv: Add noatocdithering option to RADV_DEBUG

  • vulkan/util: Cast vk_alloc pointers

  • radv: Rename radv_subpass_barrier function to radv_emit_subpass_barrier

  • radv: Define extern “C” linkage if C++

  • ac/surface: Add helper for checking if a surface supports DCC Image stores

  • radv: Use common DCC image store check

  • radeonsi: Use common DCC image store check

  • radv: Remove assert in radv_rt_bind_tables

  • radv: Do not pass result to insert_traversal_aabb_case

  • radv: Implement build_node_to_addr for GFX8 and below

  • radv: Implement software emulation for intersect_ray

  • radv: Enable raytracing extensions on older generations

  • radv: Add force_emulate_rt perftest option

  • ac/surface: Use 64 && 128 for GFX10_3 on non-modifier path

  • ac/surface: Add ac_modifier_max_extent

  • radeonsi: Check if modifier supports the image extent

  • radv: Respect max extent for modifiers

  • ac/surface: Expose modifiers capable of DCC image stores first

  • radv: Do early and late tests for fast clears

Joshua Watt (1):

  • v3d, vc4: Fix dmabuf import for non-scanout buffers

José Fonseca (1):

  • llvmpipe: Add a linear rasterizer optimized for 2D rendering.

Juan A. Suarez Romero (35):

  • broadcom/compiler: emit TMU flush before a jump

  • ci/v3dv: update expected results

  • ci/v3d: add piglit flake test

  • v3d: handle debug options with debug_named_value

  • v3dv: assert job->cmd_buffer is valid

  • ci/v3dv: update vulkan expected results

  • broadcom: remove v3dv3 from neon library

  • ci: update to VK-GL-CTS 1.2.7.0

  • drm-uapi: add v3d performance counters

  • v3d: check if device supports performance monitors

  • v3d: attach performance monitor to jobs

  • v3d: move queries to pipe queries

  • v3d: add fence wait function

  • v3d: implement performance counter queries

  • v3d/simulator: implement performance counters

  • gallium/hud: initialize query

  • ci/v3dv: update expected results

  • broadcom/compiler: change current block on setting spill base

  • v3d: print error on perfmon destroy error

  • ci/vc4: update piglit expected results

  • broadcom/compiler: set current block on incrementing unifa

  • ci/v3dv: update flakes

  • v3dv: initialize CL submission structure

  • v3d/ci: add piglit flake

  • broadcom/ci: use deqp-runner suites for gles

  • broadcom/qpu: remove duplicated opcode variable

  • broadcom/compiler: check instruction belongs to current block

  • mesa: fix default texture buffer format

  • broadcom: make vir_emit_last_thrsw() private

  • broadcom/compiler: force a last thrsw for spilling

  • broadcom/compiler: add V3D_DEBUG_NO_LOOP_UNROLL debug option

  • broadcom: add cl_nobin debug option

  • ci/v3dv: update flakes

  • ci/v3d: add piglit flake

  • ci/vc4: add piglit timeout

Kai Wasserbäch (3):

  • gallivm: add new wrapper around Module::setOverrideStackAlignment()

  • gallivm: fix FTBFS on i386 with LLVM >= 13, StackAlignmentOverride is gone

  • fix(clover/llvm): update code to build with recent versions of LLVM 14 (Git)

Karol Herbst (4):

  • nv50/ir/nir: fix smem size for GL

  • nv30: fix emulated vertex index buffers

  • clover: Local memory needs to be aligned.

  • spirv: Don’t add 0.5 to array indices for OpImageSampleExplicitLod

Keith Packard (1):

  • iris: Map scanout buffers WC instead of WB [v2]

Kenneth Graunke (29):

  • gallium: Remove dead pb_malloc_buffer_create function prototype

  • iris: Rename bo->gtt_offset to bo->address

  • iris: Improve the memory layout of iris_bo by fixing pahole issues

  • iris: Drop dead drm_ioctl prototype

  • iris: Don’t try to CPU read imported clear color BOs

  • iris: Use the new I915_USERPTR_PROBE API

  • iris: Allow SET_DOMAIN to fail when allocating new GEM objects

  • iris: Stop using SET_DOMAIN on discrete GPUs altogether

  • iris: Bypass the BO cache when allocating buffers for aux map tables

  • iris: Mark the aux table buffers with EXEC_OBJECT_CAPTURE.

  • i965: Only call lower_blend_equation_advanced for fragment shaders

  • glsl: Assert that lower_blend_equation_advanced is only called for FS

  • iris: Rewrite bo->index comment to refer to exec_bos[]

  • iris: Track written BOs via a bitfield rather than exec_object2 entries

  • iris: Defer construction of the validation (exec_object2) list

  • iris: Add some accessor wrappers for a few fields.

  • intel: Finish off the last scraps of bacon

  • iris: Move some iris_bo entries into a union

  • iris: Handle multiple BOs backed by the same GEM object in execbuf code

  • iris: Begin handling slab-allocated wrapper BOs in various places

  • iris: Introduce a BO_ALLOC_NO_SUBALLOC flag and set it in a few places

  • iris: Change the validation list debug code to print the BO list instead

  • iris: Move suballocated resources to a dedicated allocation on export

  • iris: Suballocate BO using the Gallium pb_slab mechanism

  • iris: Delete the MI_COPY_MEM_MEM resource_copy_region implementation.

  • iris: Require a 4K alignment for extra clear color BOs.

  • iris: Fix MOCS for buffer copies

  • iris: Fix parameters to iris_copy_region in reallocate_resource_inplace

  • intel/genxml: Fix MI_FLUSH_DW to actually specify the length properly

Kostiantyn Lazukin (1):

  • util/u_trace: Replace Flag with IntEnum to support python3.5

Kyle Brenneman (2):

  • Add copyright comments to the GLVND-related files.

  • Remove the shebang from eglFunctionList.py.

Leandro Ribeiro (8):

  • vulkan/wsi/wayland: check directly if we got globals successfully

  • vulkan/wsi/wayland: do not perform roundtrip when not querying formats

  • vulkan/wsi/wayland: fix crash when force_bgra8_unorm_first is true

  • vulkan/wsi/wayland: fold wsi_wl_display_swrast and wsi_wl_display_dmabuf into parent

  • vulkan/wsi/wayland: always initialize format vector

  • vulkan/wsi/wayland: add helper function find_format()

  • vulkan/wsi/wayland: create swapchain using vk_zalloc()

  • vulkan/wsi/wayland: memset members of image to zero

Leo Liu (8):

  • frontends/va: Add AV1 picture description

  • frontends/va: Add AV1 parameter buffers functions

  • frontends/va: Place AV1 picture and slice parameter buffers functions

  • frontends/va: Add AV1 profile main to the config

  • radeon/vcn: Enable the AV1 decode p010 mode

  • frontends/va: Reallocate p010 buffer for AV1 10 bits decode

  • radeon/vcn: reuse the dpb buffers when with the same size.

  • radeon/vcn: add a handling of error for incorrect reference lists

Lepton Wu (3):

  • gallium: Reset {d,r}Priv in dri_unbind_context

  • i965: Enable RGBX8888_SRGB format.

  • virgl: Add an option to disable coherent

Lionel Landwerlin (67):

  • isl: fix mapping of format->stringname

  • loader/dri3: create linear buffer with scanout support

  • nir/lower_shader_calls: adding missing stack offset alignment

  • anv: fix submission batching with perf queries

  • drm-shim: implement stat/fstat when xstat variants are not there

  • intel/disasm: fix missing oword index decoding

  • anv: don’t try to access Android swapchains

  • nir/lower_shader_calls: remove empty phis

  • anv/android: handle image bindings from gralloc buffers

  • genxml: add more INSTDONE registers for Gfx12.5

  • intel/error-decode: printout more registers

  • nir: prevent peephole from generating invalid NIR

  • intel/fs: fix framebuffer reads

  • microsoft/clc: small compile fix on Linux

  • clc: use the defined version for the parser

  • spirv: don’t fail on CapabilitySubgroupDispatch if supported

  • spirv: avoid shadowing local variable

  • spirv: workaround LLVM-SPIRV Undef variable initializers

  • spirv: don’t bother initializing variables to Undef

  • microsoft/clc: drop LLVM dependency to version < 12

  • nir: fix opt_memcpy src/dst mixup

  • spirv: switch Groups capability to non AMD specific field

  • microsoft/clc: drop MSVC specific function

  • microsoft/clc: fix compiler warning on uninitialized variable use

  • meson: extract libversion checks from clc & clover

  • anv: honor INTEL_DEBUG=sync

  • clc: add allowed extension for compile parameter

  • clc: print warnings/errors on their own line

  • clc: let user specify the targeted SPIRV version

  • anv: enable UBO indexing

  • intel/compiler: add missing line returns to logs

  • anv: remove redundant VertexURBEntryReadLength setting

  • nir/lower_io: preserve all metadata when no progress

  • anv: move GetBufferMemoryRequirement with other buffer functions

  • anv: implement vkGetDeviceBufferMemoryRequirementsKHR

  • anv: remove unused function

  • anv: move VkImage object allocation to anv_CreateImage

  • anv: implement vkGetDeviceImageMemoryRequirementsKHR

  • anv: implement vkGetDeviceImageSparseMemoryRequirementsKHR

  • anv: enable VK_KHR_maintenance4

  • vulkan: put generated defines into their own header

  • vulkan: handle new VK_KHR_synchronization2 image layouts

  • vulkan: remove unused VkCommand

  • vulkan/util: generate define for a selected few enums

  • vulkan: implement legacy entrypoints on top of VK_KHR_synchronization2

  • anv: add missing transition handling bits

  • anv: make semaphore helper work on a single object

  • anv: improve readability of pipelined states

  • anv: implement VK_KHR_synchronization2

  • spirv: deal with null pointers

  • anv: switch to use VkFormatFeatureFlags2KHR internally

  • intel/nir: allow unknown format in lowering of storage images

  • anv: start computing KHR_format_features2 flags for storage images

  • anv: implement VK_KHR_format_feature_flags2

  • anv: fill correct surface state for lowered storage image

  • isl: only bump the min row pitch for display when not specified

  • vulkan/wsi/wayland: don’t expose surface formats not fully supported

  • anv: fix push constant lowering with bindless shaders

  • intel/dev: fix HSW GT3 number of subslices in slice1

  • intel/dev: don’t forget to set max_eu_per_subslice in generated topology

  • intel/dev: reuse internal functions to set mask

  • intel/dev: fix subslice/eu total computations with some fused configurations

  • intel/perf: fix perf equation subslice mask generation for gfx12+

  • intel/devinfo: fix wrong offset computation

  • intel: remove 2 preproduction pci-id for ADLS

  • anv: don’t forget to add scratch buffer to BO list

  • anv: fix multiple wait/signal on same binary semaphore

Liviu Prodea (1):

  • ci: Add osmesa to Windows GitLab CI

Lone_Wolf (1):

  • clover: TargetRegistry.h was moved to another folder

Lucas Stach (2):

  • renderonly: don’t complain when GPU import fails

  • etnaviv: always try to create KMS side handles for imported resources

Luis Felipe Strano Moraes (2):

  • docs: Clean up environment variable docs for Intel drivers.

  • docs: Add documentation regarding INTEL_MEASURE to envvars doc.

M Henning (1):

  • nouveau: Support nir_intrinsic_*_atomic_fadd

Maniraj D (1):

  • egl: set TSD as NULL after deinit

Mao, Marc (1):

  • iris: declare padding for iris_vue_prog_key

Marcin Ślusarz (51):

  • intel/tools/aubinator_error_decode: tag hanging instruction

  • anv: share some code between vkCmdDrawIndirectCount and vkCmdDrawIndexedIndirectCount

  • glsl: evaluate switch expression once

  • nir/builder: invalidate metadata per function

  • intel/compiler: use nir_shader_instructions_pass in brw_nir_apply_attribute_workarounds

  • d3d12: use nir_metadata_none instead of its value

  • microsoft/clc: preserve only valid metadata in clc_lower_printf_base

  • microsoft/clc: use nir_shader_instructions_pass in clc_nir_dedupe_const_samplers

  • microsoft/compiler: preserve all metadata when upcast_phi doesn’t make progress

  • microsoft/compiler: use nir_shader_instructions_pass in dxil_nir_split_clip_cull_distance

  • microsoft/compiler: use nir_shader_instructions_pass in dxil_nir_lower_double_math

  • zink: use nir_shader_instructions_pass in lower_discard_if

  • zink: use nir_shader_instructions_pass in nir_lower_dynamic_bo_access

  • genxml: add INSTDONE_GEOM register for Gfx12.5

  • intel/error-decode: printout INSTDONE_GEOM register for Gfx12.5

  • glsl/opt_algebraic: disable invalid optimization

  • glsl: refactor code to avoid static analyzer noise

  • freedreno/ir3: use nir_metadata_none instead of its value

  • r600: use nir_shader_instructions_pass in r600_nir_lower_atomics

  • r600: preserve all metadata when passes don’t make progress

  • turnip: use nir_shader_instructions_pass in tu_lower_io

  • intel/compiler: INT DIV function does not support source modifiers

  • vulkan/wsi/x11: fix shm allocation control flow issue

  • glsl: propagate errors from *=, /=, +=, -= operators

  • glsl: break out early if compound assignment’s operand errored out

  • crocus: drop redundant unlikely’s around INTEL_DEBUG

  • intel/compiler: drop redundant likely’s around INTEL_DEBUG

  • anv: drop redundant unlikely’s around INTEL_DEBUG

  • lima: use nir_shader_instructions_pass in lima_nir_split_load_input

  • anv: Set graphics pipeline active_stages earlier

  • anv: Use input assembly state only when pipeline has vertex stage

  • intel/compiler: use nir_shader_instructions_pass in brw_nir_demote_sample_qualifiers

  • intel/compiler: use nir_shader_instructions_pass in brw_nir_clamp_image_1d_2d_array_sizes

  • intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_conversions

  • intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_mem_access_bit_sizes

  • intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_scoped_barriers

  • intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_storage_image

  • intel/compiler: use nir_shader_instructions_pass in brw_nir_opt_peephole_ffma

  • intel/compiler: use nir_metadata_none instead of its value

  • anv: use nir_shader_instructions_pass in anv_nir_add_base_work_group_id

  • anv: use nir_shader_instructions_pass in anv_nir_lower_ycbcr_textures

  • anv: preserve all metadata when anv_nir_lower_multiview doesn’t make progress

  • glsl: preserve all metadata when lower_buffer_interface_derefs doesn’t make progress

  • nir: preserve all metadata when nir_lower_int_to_float doesn’t make progress

  • nir: preserve all metadata when nir_propagate_invariant doesn’t make progress

  • nir: preserve all metadata when nir_opt_vectorize doesn’t make progress

  • anv: allocate zeroed device object

  • nir/print: pad 64-bit constants with zeroes

  • anv: fix potential integer overflow

  • iris: fix scratch address patching for TESS_EVAL stage

  • intel: fix INTEL_DEBUG environment variable on 32-bit systems

Marek Olšák (211):

  • radeonsi: don’t expose no-attachment MSAA 16x on all 1 RB chips due to issues

  • radeonsi: document a missing synchronization for bindless textures

  • st/mesa: inline st_setup_arrays on MSVC too by adding a wrapper

  • mesa: remove unused drawid_offset parameter from DrawGalliumMultiMode

  • mesa: fix incorrect comment in draw_gallium_multimode

  • st/mesa: always use PIPE_USAGE_STAGING for GL_MAP_READ_BIT usage

  • shader_enums,mesa: move VERT_ATTRIB_EDGEFLAG to slot 31 for st/mesa

  • gallium: change pipe_vertex_element::src_format to uint8_t

  • gallium: add multi-component 64-bit UINT formats for raw double vertex attribs

  • gallium: add pipe_vertex_element::dual_slot to move lowering to CSO creation

  • gallium: lower raw 64-bit vertex formats in cso/vbuf instead of st/mesa

  • st/mesa: remove lowering of 64-bit vertex attribs to 32 bits

  • st/mesa: remove st_vertex_program::index_to_input

  • st/mesa: remove st_vertex_program::input_to_index

  • radeonsi: improve viewperf snx performance by forcing staging for VRAM buffers

  • gallium: simplify VRAM uploads by adding PIPE_RESOURCE_FLAG_DONT_MAP_DIRECTLY

  • gallium/noop: implement fences

  • gallium/noop: implement shader buffers and shader images

  • gallium/noop: use threaded_query

  • gallium/noop: use threaded_resource

  • gallium/noop: use threaded_transfer

  • gallium/noop: enable threaded_context to test TC overhead without a driver

  • gallium/noop: update pipe_screen::num_contexts

  • gallium/noop: implement a lot of missing screen functions

  • gallium/noop: implement a lot of missing context functions

  • radeonsi: allow arbitrary swizzle modes for displayable DCC

  • radv: allow arbitrary swizzle modes for displayable DCC

  • ac/surface: allow arbitrary swizzle modes for displayable DCC

  • gallium: add take_ownership into set_sampler_views to skip reference counting

  • st/mesa: set take_ownership = true in set_sampler_views

  • st/mesa: move handling CubeMapSeamless into st_convert_sampler where it belongs

  • gallium: remove vertices_per_patch, add pipe_context::set_patch_vertices

  • radeonsi: remove vertices_per_patch parameter from draw-related functions

  • frontend/dri: add environment variable DRI_NO_MSAA for performance comparisons

  • gallium: use a packed enum to make pipe_prim_mode 1-byte large with __GNUC__

  • gallium: change pipe_draw_info::mode to uint8_t on MSVC to make it 1 byte large

  • glthread: implement glGetUniformLocation without syncing

  • meson: add missing custom target to generate shader_replacement.h

  • mesa: add environment variable MESA_NO_SHADER_REPLACEMENT

  • util/cpu_detect: print num_L3_caches and num_cpu_mask_bits

  • util/cpu_detect: add/guess support for next Zen CPUs

  • vbo: merge draws with GL_LINES regardless of line stippling

  • vbo: check more GL errors when drawing via glCallList

  • mesa: remove unused indices parameter from validate functions

  • mesa: fix gl_DrawID with indirect multi draws using user indirect buffer

  • mesa: skip draw calls with unaligned indices

  • radeonsi: remove unused depth_clamp_any

  • radeonsi: remove instancing support from the prim discard compute shader

  • radeonsi: remove stages_key parameter from si_shader_selector_key

  • radeonsi: move si_vgt_stages_key determination into si_update_vgt_shader_config

  • radeonsi: move as_ls/es/ngg setting out of si_shader_selector_key

  • radeonsi: inline si_get_alpha_test_func

  • radeonsi: stop using AC_EXP_PARAM_UNDEFINED because it’s not useful

  • radeonsi: use memcmp and radeon_emit_array in radeon_opt_set_context_regn

  • radeonsi: correctly use cs instead of gfx_cs in build pm4 helpers

  • radeonsi: simplify memory usage checking by merging vram and gtt counters

  • radeonsi: inline remaining big functions in draw_vbo for better snx perf

  • radeonsi: simplify si_need_gfx_cs_space

  • winsys/amdgpu: clean up amdgpu_cs_check_space

  • radeonsi: inline si_need_gfx_cs_space

  • radeonsi: don’t use SQ_NON_EVENT before GE_PC_ALLOC for better perf on Navi1x

  • radeonsi: add si_print_current_ib function for debugging

  • ac/debug: add an option to disable colors for printed IBs

  • radeonsi: fix a memory leak in si_get_shader_binary_size

  • radeonsi: set gfx10 registers better in si_emit_initial_compute_regs

  • ac/gpu_info: fix detection of smart access memory

  • radeonsi: disable DCC stores on Navi12-14 for displayable DCC to fix corruption

  • radeonsi: enable DCC stores for clear_render_target on gfx10

  • radeonsi: add missing make_CB_shader_coherent for DCC stores into copy_image

  • radeonsi: handle pipe_aligned in compute_expand_fmask

  • radeonsi: rename DCC_WRITE -> ALLOW_DCC_STORE

  • radeonsi: track displayable_dcc_dirty for non-compute shaders

  • radeonsi: enable DCC stores on gfx10.3 APUs for better performance

  • radeonsi: clean up typecasts in compute_copy_image

  • ac/llvm: remove load_tess_coord callback

  • ac/llvm: implement a bunch of NIR AMD intrinsics for NGG

  • ac: remove needless parameters from ac_shader_abi::emit_outputs

  • ac: make ac_shader_abi::inputs an array instead of a pointer

  • ac/llvm: implement nir_intrinsic_overwrite_*_arguments_amd

  • ac/llvm: implement nir_intrinsic_elect

  • ac,radeonsi: load VS inputs at the call site of nir_intrinsic_load_input

  • ac,radv: remove unused inputs array and VS input code

  • radeonsi: don’t set prefer_mono for fetched instance divisors

  • radeonsi: ignore the vertex element count in si_shader_selector_key_vs

  • radeonsi: accurately check if instance divisors need a VS update

  • radeonsi: don’t update shaders if only the vertex element count changes

  • radeonsi: correct index_bias_varies usage

  • radeonsi: remove the primitive discard compute shader

  • winsys/amdgpu: precompute amdgpu_ib_max_submit_dwords

  • radeonsi: reduce the frequency of switching GS fast launch on/off

  • radeonsi: strengthen the VGT_FLUSH condition in begin_new_gfx_cs

  • radeonsi: skip setting some PGM_HI registers by switching to 32-bit addresses

  • winsys/amdgpu: include CS ioctl overhead in RADEON_NOOP

  • radeonsi: enable shader-based prim culling with polygon mode

  • radeonsi: remove a few fields from si_state_rasterizer

  • radeonsi: don’t emit PA_SU_POLY_OFFSET_CLAMP if it has no effect

  • radeonsi: add AMD_DEBUG=ib to print IBs

  • radeonsi: don’t use NGG passthrough if culling is possible for better perf

  • radeonsi: fix DCC image stores with display DCC

  • radeonsi: copy a few nir_shader_compiler_options from RADV

  • driconf: remove leftover code for allow_incorrect_primitive_id

  • radeonsi: fix DCC image stores with image descriptors in user SGPRs

  • radeonsi: add const to the key parameter in si_shader_select_with_key

  • radeonsi: handle NO_OPT_VARIANT in si_shader_select_with_key

  • radeonsi: sink memsets and disable uniform inlining in si_shader_selector_key

  • radeonsi: move PS shader key code into a separate function

  • radeonsi: don’t memset mono and opt in si_update_ps_shader_key

  • radeonsi: don’t memset part in si_update_ps_shader_key

  • radeonsi: divide si_update_ps_shader_key into many separate functions

  • radeonsi: ignore blitter when computing the PS shader key

  • radeonsi: update most of the PS shader key in set & bind functions

  • radeonsi: clean up and clear VS shader key fields related to outputs

  • radeonsi: update the VS shader key in set & bind functions and remove memsets

  • radeonsi: rewrite inlinable uniform states for shader keys in si_context

  • radeonsi: move si_shader_io_get_unique_index calls out of si_get_vs_key_outputs

  • radeonsi: move PS inputs_read computation out of si_get_vs_key_outputs

  • radeonsi: unset SI_PREFETCH_* only when we unbind pm4 shader states

  • radeonsi: make si_update_shaders a C++ template in si_state_draw.cpp

  • radeonsi: optimize scratch buffer size updates using C++ template arguments

  • radeonsi: check flatshade and sprite_coord_enable for spi_map in bind_rs_state

  • radeonsi: move DB_SHADER_CONTROL update for PS out of si_update_shaders

  • radeonsi: move flat shading VRS enablement out of si_update_shaders

  • radeonsi: precompute si_vgt_stages_key for NGG in si_shader

  • radeonsi: deduplicate si_compiler_ctx_state initialization

  • radeonsi: determine num_vbos_in_user_sgprs from template arguments in draw_vbo

  • radeonsi: eliminate a not-found conditional for PrimID in si_get_ps_input_cntl

  • radeonsi: force flat for PrimID early in si_nir_scan_shader

  • radeonsi: restructure si_get_ps_input_cntl for future refactoring

  • radeonsi: interleave si_shader_info::input_* in memory for faster emit_spi_map

  • radeonsi: precompute num_interp for si_emit_spi_map

  • radeonsi: simplify si_emit_spi_map for back-face colors

  • radeonsi: inline si_get_ps_input_cntl because it has only one use

  • radeonsi: unroll loops in si_emit_spi_map using 33 C++ template instantiations

  • radeonsi: precompute more spi_map code

  • radeonsi: set prefer_mono outside of si_shader_selector_key

  • radeonsi: move setting most TCS shader key fields out of si_shader_selector_key

  • radeonsi: move setting one GS shader key field out of si_shader_selector_key

  • radeonsi: put si_pm4_state at the beginning of si_shader

  • radeonsi: eliminate redundant SPI_SHADER_PGM_RSRC3/4_GS register writes

  • radeonsi: convert gfx10_emit_ge_pc_alloc to radeon_opt_set_uconfig_reg

  • radeonsi: use a trick to extract and pack edgeflags using fewer instructions

  • radeonsi: don’t set edgeflags for TES and blit VS

  • radeonsi: fix incorrect comments about VGT_SHADER_STAGES_EN

  • radeonsi: enable NGG passthrough when LDS is used, document the real constraints

  • radeonsi: remove the unused cs parameter from radeon_emit

  • radeonsi: remove the unused cs parameter from radeon_emit_array

  • radeonsi: remove the unused cs parameter from radeon_set_(config|context)_reg

  • radeonsi: remove the unused cs parameter from radeon_set_sh_reg

  • radeonsi: remove the unused cs parameter from radeon_set_uconfig_reg

  • radeonsi: remove the unused cs parameter from remaining packet functions

  • ac/surface: use DCC compatible with image stores for < 4K resolutions

  • ac/surface: correct a comment about DCC image stores

  • radeonsi: fix a depth texturing performance regression on gfx6-7

  • radeonsi: change the units of oversub_pc_factor to integer multiples of 1/4

  • radeonsi: decrease vertex count threshold for shader culling to 128

  • radeonsi: set vs_uses_base_instance using C++ template arguments

  • radeonsi: use the optimal draw packet sequence for VGT_FLUSH

  • radeonsi: reduce NGG culling on/off transitions by keeping it enabled

  • radeonsi: clean prefer_mono for the blit VS

  • radeonsi: don’t check ngg_culling != 0 for fast launch because it’s tautology

  • ac/gpu_info: fix the comment for the NGG->legacy transition bug

  • radeonsi: strengthen the ngg->legacy hw workaround, fix fast launch hangs too

  • radeonsi: fix clearing index_size for NGG fast launch

  • radeonsi: disallow NGG fast launch on Navi1x because VGT_FLUSH makes it slower

  • ac/llvm: pass cull options into cull_bbox directly

  • radeonsi: always use the correct number of vertices in NGG shader code

  • radeonsi: add gfx10 helpers for determining whether edgeflags are enabled

  • ac/llvm: rename ac_cull_triangle -> ac_cull_primitive

  • radeonsi: implement shader-based culling for lines

  • radeonsi: don’t set DX10_DIAMOND_TEST_ENA for better performance

  • util: add util_popcnt_inline_asm

  • util: import u_debug_refcnt, u_hash_table, u_debug_describe from gallium

  • gallium/util: make pipe_vertex_buffer_reference safe for hashing dst

  • gallium: add pipe_vertex_state and draw_vertex_state for display lists

  • gallium/u_threaded: implement draw_vertex_state

  • gallium/trace: add pipe_vertex_state support

  • gallium/util: add util_vertex_state_cache for deduplicating the states

  • st/mesa: add ST_PIPELINE_RENDER_NO_VARRAYS, for future display list support

  • st/mesa: make setup_arrays more reusable for future display list support

  • mesa: use pipe_vertex_state in vbo and st/mesa for lower display list overhead

  • radeonsi: separate VBO descriptor code into a new function (for future work)

  • radeonsi: implement draw_vertex_state for lower display list overhead

  • ac/surface: don’t overwrite DCC settings for imported buffers

  • ac/surface: enable DCC image stores for all displayable DCC on gfx10.3

  • mesa: add missing unlock_texture into generate_texture_mipmap

  • util/slab: use simple_mtx_t

  • util/queue: use simple_mtx_t for finish_lock

  • gallium/pb_cache: use simple_mtx_t

  • gallium/pb_slab: use simple_mtx_t

  • mesa: use simple_mtx_t for TexMutex

  • mesa: use simple_mtx_t for ShaderIncludeMutex

  • gallium/u_threaded: fix draw_vertex_state with multi draws

  • radeonsi: fix a leak in draw_vertex_state if threaded_context is disabled

  • radeonsi: remove duplicate partial_count variable

  • radeonsi: add back a workaround for DCC MSAA on gfx9 due to conformance issues

  • radeonsi: remove GS fast launch

  • util,gallium: put count in pipe_resource & sampler_view on its own cache line

  • radeonsi: align pipe_resource & sampler_view allocations to a cache line

  • radeonsi: fix an out-of-bounds access in si_create_vertex_state

  • ac/surface: always use suboptimal display DCC with DRM <= 3.43.0

  • ac/surface: disallow display DCC for big resolutions

  • ac/surface: enable better display DCC for chips newer than Yellow Carp

  • radeonsi: simplify how VS_OUT_CCDIST is set

  • radeonsi: simplify write_psize code in si_get_vs_out_cntl

  • mesa: fix crashes in the no_error path of glUniform

  • st/mesa: don’t crash when draw indirect buffer has no storage

  • radeonsi: enable shader culling for indirect draws

  • radeonsi: print the border color error message only once

  • radeonsi: fix 2 issues with depth_cleared_level_mask

  • radeonsi: fix a typo preventing a fast depth-stencil clear

  • driconf: disallow 10-bit pbuffers for viewperf2020/maya due to X errors

Marek Vasut (2):

  • freedreno: a2xx: Handle samplerExternalOES like sampler2D

  • freedreno: Handle timeout == PIPE_TIMEOUT_INFINITE and rollover

Marijn Suijten (1):

  • freedreno: Enable Adreno 508, 509 and 512

Mark Janes (3):

  • anv: Use local memory for block pool BO

  • anv: Allocate workaround buffer in local memory if present

  • anv: warn if system memory is used

Martin Krastev (2):

  • svga: enable DRM mks-stats via hooking to the corresponding DRM ioctls

  • meson: introduce option vmware-mks-stats controlling the instrumentations of gallium svga driver

Martin Roukala (néé Peres) (1):

  • radv/ci: mark some tests as flaky on gfx9

Matt Turner (5):

  • tu: Raise maxDescriptorSetUpdateAfterBindUniformBuffersDynamic to 16

  • util: Add unit tests for dag

  • util: Replace recursive DFS with iterative implementation

  • tu: Free device->bo_idx and device->bo_list on init failure

  • tu: Enable VK_KHR_uniform_buffer_standard_layout

Michael Tang (11):

  • spirv_to_dxil: expose version number

  • spirv_to_dxil: Run nir_lower_tex during compilation

  • microsoft/compiler: Add support for SV_SampleIndex intrinsic

  • microsoft/compiler: More robustly handle setting Register=-1

  • microsoft/compiler: Set the SampleFrequency runtime metadata

  • microsoft/compiler: Emit a flat interpolation method for SV_SampleIndex

  • microsoft/compiler: Miscellaneous fixes from running clang-format

  • microsoft/spirv_to_dxil: Add `install : true` to spirv_to_dxil library.

  • gallium/d3d12: move d3d12_lower_bool_input to microsoft/compiler

  • microsoft/spirv_to_dxil: use dxil_nir_lower_bool_input pass

  • microsoft/spirv_to_dxil: turn sysvals into input varyings

Michel Dänzer (2):

  • ci: Drop “success” job

  • ci: Put all container related jobs in a single stage

Michel Zou (6):

  • zink: Fix unused-variable warning

  • meson: don’t use missing dumpbin path

  • radv: fix build with mingw

  • lavapipe: fix missing VKAPI_CALL attribute

  • wgl: fix 32 bits mingw exports

  • docs: mark off missing lavapipe exts

Mike Blumenkrantz (480):

  • zink: improve detection for broken drawids

  • lavapipe: increment drawid for multidraws

  • radv: merge si_write_viewport into radv_emit_viewport

  • radv: pre-calculate viewport transforms

  • radv: remove unused variable from radv_emit_viewport

  • lavapipe: don’t read line stipple info in pipeline creation if stipple is disabled

  • util/tc: make clear calls async

  • util/foz: stop crashing on destroy if prepare hasn’t been called

  • lavapipe: add a padding member to rendering_state

  • lavapipe: implement VK_EXT_color_write_enable

  • features: VK_EXT_color_write_enable for lavapipe

  • zink: check for dedicated allocation requirements during image alloc

  • zink: hook up VK_KHR_dedicated_allocation

  • zink: optimize shader recalc

  • zink: ifdef out some context prototypes/inlines for c++ compile

  • zink: start adding C++ draw templates

  • zink: add draw template for dynamic state

  • zink: make descriptors_update hook return a bool if a flush occurred

  • zink: if descriptor updating flushes, re-call draw/compute

  • zink: add template for starting new cmdbuf

  • zink: split pipeline_changed to use template value separately

  • zink: stop flagging pipeline dirty for line width changes

  • zink: don’t rebind vertex buffers if pipeline changes

  • zink: add a ctx flag for drawid reading

  • zink: flatten descriptor_refs_dirty into BATCH_CHANGED template

  • zink: use drawid_offset directly during draw

  • zink: add a ctx flag for shader reading basevertex

  • zink: remove screen info stuff from draw templates

  • zink: add changed flag for blend states

  • util/tc: add a util function for setting bytes_mapped_limit

  • radeonsi: use new tc util for setting bytes_mapped_limit

  • zink: use new tc util for setting bytes_mapped_limit

  • freedreno: use new tc util for setting bytes_mapped_limit

  • nir/lower_point_size_mov: zero nir_state_slot::swizzle in new variable

  • gallium: add pipe_sampler_state::pad member

  • lavapipe: add support for anisotropic texturing

  • nir: add nir_imm_ivec3 builder

  • zink: add mechanism for generating VkBuffers for rebinding

  • zink: change vbo_bind_count to a mask of slots

  • zink: handle vertex buffer offset overflows

  • zink: split and move maybe_flush_or_stall mechanic

  • zink: split draw_count checking to local variable

  • zink: make zink_end_render_pass public

  • zink: make batch_rp and norp static inlines

  • zink: use a local var for draw mode during draw

  • zink: add a param to check_batch_completion for toggling lock-taking

  • zink: rework oom flushing

  • zink: move mem cache to sub-struct

  • zink: inline mem cache hash table

  • zink: split mem cache per type

  • zink: clamp descriptor allocation bucket sizing to defined limit

  • zink: add define for descriptor alloc clamping

  • zink: improve lazy descriptor pool handling

  • zink: fix cached descriptor allocation clamping

  • nir/validate: refactor validate_assert to have a return value

  • zink: use array size in spirv bo length calculations

  • zink: add screen function for checking usage completion

  • zink: force batch completion check on query result

  • zink: add some resource util functions for batch usage

  • zink: collapse a conditional in zink_batch_resource_usage_set()

  • zink: use resource batch usage helpers in invalidate_buffer()

  • zink: simplify some dumb code in invalidate_buffer

  • zink: use new resource batch usage utils for is_resource_busy

  • zink: replace some direct batch_usage calls with resource abstractions

  • zink: remove no longer used internal resource function

  • zink: more explicitly check shader stages during compile

  • zink: merge draw_count and compute_count, move to batch struct

  • zink: improve oom flushing

  • zink: EXT_vertex_input_dynamic_state

  • zink: change descriptor flushing to assert

  • zink: lower subgroup ballot instructions

  • zink: implement compiler handling for subgroup ballot builtins/intrinsics

  • zink: remove VK_EXT_shader_subgroup_ballot from device info

  • zink: export PIPE_CAP_TGSI_BALLOT

  • zink: add env var to disable timelines

  • ci: add another zink job with timelines disabled

  • zink: use dynamic line stipple

  • zink: use MAP_ONCE for qbo readback

  • zink: rework buffer mapping

  • mesa/st: break up st_GetTexSubImage

  • mesa/st: break up st_choose_matching_format()

  • mesa/st: enable calling st_choose_format() purely for translation

  • mesa/st: add format-finding capabilities to pbo get_dst_format()

  • st/texture: refactor get_src_format() to be more useful

  • zink: never use staging buffer for unsynchronized buffer maps

  • zink: force threadsafe mapping for query results when necessary

  • Revert “zink: simplify some dumb code in invalidate_buffer”

  • zink: simplify some dumb code in invalidate_buffer (v2)

  • lavapipe: rework queue to use u_queue

  • lavapipe: use consistent semaphore variable naming

  • lavapipe: implement timeline semaphores

  • features: mark off timelines for lavapipe

  • zink: add locking for zink_shader::programs

  • zink: sum available memory heaps instead of assigning

  • zink: simplify else clause for mem info gathering

  • nine: don’t memset sampler state during conversion

  • nine: set CSO_NO_USER_VERTEX_BUFFERS for main cso context

  • nine: optimize texture binds a bit

  • nine: split enabled/dummy texture binds into separate iterators

  • nine: update bound sampler mask directly during texture updates

  • nine: track bound sampler count to optimize unbinds

  • nine: enable tc

  • nir: add imm_vec3 to round these out

  • nine: init take_index_buffer_ownership for draws

  • nine: init more draw info members

  • zink: add a suballocator

  • zink: repack zink_resource_object struct

  • zink: stop zeroing structs during resource allocation

  • zink: split transfer_unmap for images and buffers

  • zink: split mem unmap logic for images and buffers

  • zink: make map_count useful for dedicated image allocations

  • zink: remove PIPE_MAP_ONCE from subdata

  • zink: rejigger PIPE_MAP_ONCE for internal qbo reads

  • zink: flake out some tests for now

  • zink: collapse ‘dedicated’ allocation into zink_bo

  • zink: remove duplicated zink_resource_object::mem member

  • zink: split out zink_transfer allocation

  • zink: split buffer and image map functions

  • zink: remove unused variable from image map

  • zink: break out transfer map destroy

  • zink: handle map failures more effectively

  • zink: enable compat contexts

  • zink: ci updates

  • nir/lower_vectorize_tess_levels: set num_components for vectorized loads

  • softpipe: fix ci rule ordering to avoid unnecessarily running jobs

  • zink: simplify get_descriptor_set_lazy params

  • zink: remove redundant asserts from lazy descriptor set populate

  • zink: remove repeated lazy batch dd casts

  • zink: flag the gfx pipeline dirty and unset pipeline shader module on shader change

  • zink: do compute shader change on bind

  • zink: clear current gfx/compute program upon unbinding its shaders

  • zink: clear out all ubo rebinds first if they exist

  • zink: make descriptor update functions return the updated resource

  • zink: split out buffer rebinds to helper functions

  • zink: add bind counts for so bindings

  • zink: count streamout rebinds when doing buffer rebinds

  • zink: rebind all buffers on replacement

  • zink: only force all buffer rebinds if rebinds exist on other contexts

  • zink: defer deletion of no-attachment framebuffers

  • zink: stop referencing framebuffers

  • nine: replace unnecessary dynamic-sized array with bitfield

  • zink: move void format detection function to zink_format

  • zink: make component mapping function a static inline

  • zink: make void swizzle clamping util public

  • zink: add better TODO note for surface swizzles

  • zink: fix program init flag

  • zink: fix pipeline caching

  • zink: verify program key sizes before checking for default variant

  • zink: return early when getting resource modifier if no modifier is used

  • zink: inline program cache structs

  • zink: track mask of bound gfx shader stages

  • zink: split gfx shader cache based on stages present

  • zink: avoid hashing shader stages multiple times for new gfx programs

  • zink: create compute programs on bind

  • zink: simplify a bitmask init

  • zink: stop using dirty_shader_stages for shader binds

  • zink: add some null checks for shader variant key generation

  • zink: set inlinable_uniforms_mask first when binding a shader

  • zink: only remove programs from hash tables on shader deletion if needed

  • zink: implement PIPE_QUERY_GPU_FINISHED

  • zink: always init bordercolor value for sampler

  • zink: require occlusionQueryPrecise for occlusion queries

  • zink: assert precise queries are occlusion queries

  • zink: declare ctx var during blend state bind

  • zink: remove attachment count from pipeline hash

  • zink: pass current program’s shader array, not ctx array

  • zink: remove extra unsetting of ctx->vertex_state_changed

  • zink: reorder gfx program/pipeline/descriptor binds if dynamic state is present

  • zink: init ctx->gfx_prim_mode to nonzero value to trigger pipeline changes

  • zink: use ctx gfx prim mode for draw comparisons

  • zink: remove query flush from memory barrier hook

  • zink: slim down streamout component of mem barrier hook

  • zink: batch mem barrier hooks

  • zink: use dynamic prim type

  • zink: consolidate pipeline hash tables

  • zink: no-op prim changes for pipeline recalc

  • zink: hook up VK_EXT_extended_dynamic_state2

  • zink: template for VK_EXT_extended_dynamic_state2

  • zink: bump dynamic pipeline state count

  • zink: set primitive restart with extended dynamic state2

  • zink: move dynamic state1 pipeline members into substruct

  • zink: move viewport count into dynamic state1 part of pipeline hash

  • zink: zero viewport and scissor count in pipeline with dynamic state1

  • zink: repack zink_rasterizer_hw_state

  • zink: add clip_halfz to rasterizer hw state

  • zink: steal a bit from rast_samples in pipeline state

  • zink: convert rasterizer pipeline components to bitfield

  • zink: repack zink_gfx_pipeline_state

  • zink: make zink_gfx_pipeline_state::vertices_per_patch a bitfield

  • zink: improve threadsafe qbo access

  • zink: move time query ending out to zink_end_query

  • zink: don’t try to sync previous timestamp query qbo values

  • zink: more effectively utilize batch_usage for query destruction

  • zink: avoid pulling in unused push descriptors for cached ubo0

  • zink: remove extra program ref from cached descriptor updates

  • freedreno: export supported primtypes

  • freedreno: remove primconvert

  • freedreno: ci updates

  • zink: only update inlinable constants when they change

  • zink: determine whether the gpu has a resizable BAR at startup

  • zink: implement PIPE_RESOURCE_FLAG_DONT_MAP_DIRECTLY when resizable bar not present

  • radv: use pool stride when copying single query results

  • radv: ignore dynamic line stipple if line stipple isn’t enabled

  • zink: free local shader nirs on program free

  • zink: use VK_WHOLE_SIZE for full-sized bufferviews

  • zink: explicitly end renderpass before running dispatch

  • zink: move alphaToOne warning to a dynamic warning

  • zink: add input attachment thingy for spirv builder

  • zink: emit fbfetch variables as ntv input attachments

  • zink: add a compiler pass to translate fbfetch -> input attachments

  • zink: refactor descriptor layout/template creation a little

  • zink: track fbfetch info on context, update as needed

  • zink: flag color attachment images as input attachments at creation

  • zink: add an input attachment to the gfx push set layout to handle fbfetch

  • zink: fix lazy descriptor deinit

  • zink: add an input attachment to the gfx push set layout to handle fbfetch

  • zink: update push descriptor set anytime fbfetch changes

  • zink: add a renderpass flag for input attachment layout handling

  • zink: enable fbfetch pipe cap

  • docs: mark off ES 3.2 for zink

  • zink: ci updates

  • zink: destroy shader modules on program free to avoid leaking

  • aux/cso: always restore states in atom order

  • gallium/cso: add unbind mask for cso restore

  • zink: directly pass resource pointer to descriptor state updates

  • zink: use tc rebind info for buffer replacements

  • zink: split out stalling from fence-waiting function

  • zink: remove refcounting from batch states

  • zink: ensure gfx shader module states are updated when doing a partial recalc

  • zink: create inner scanout object without scanout binds

  • zink: dynamic vertex input template

  • zink: don’t use dynamic vertex stride with dynamic vertex input

  • zink: incrementally hash gfx shader stages

  • zink: incrementally hash module variants in pipeline

  • zink: incrementally hash vertex state into pipeline hash

  • zink: incrementally hash all pipeline component hashes

  • zink: inline gfx pipeline hash table

  • zink: track compatible render passes

  • zink: use compatible renderpass state in pipeline hash

  • zink: clamp lazy pools to 500 descriptors and allocate more slowly

  • zink: remove ZINK_HEAP_HOST_VISIBLE_ANY

  • mesa/st: create new surfaces before destroying old ones when updating attachments

  • radv: just use UINT64_MAX when getting absolute timeout for that value

  • radv: add some asserts for descriptor updating

  • lavapipe: support EXT_primitive_topology_list_restart

  • docs: update features for lavapipe

  • lavapipe: unbreak imageless framebuffer

  • zink: move get_framebuffer() to zink_framebuffer.c

  • zink: store some image creation metadata to object struct

  • zink: store some surface metadata to struct during creation

  • zink: use imageless framebuffers

  • lavapipe: unbreak push descriptor templates

  • zink: add a piglit ci job for lazy descriptors

  • tgsi_to_nir: force int type for LAYER output

  • zink: hash blend state pointers on creation

  • zink: remove tcs shader keys

  • zink: move sample part of fs key to renderpass

  • zink: add pipeline state flag for determining if output type is points

  • zink: move point sprite rasterizer bits to unhashed pipeline state

  • zink: move drawid_broken to unhashed pipeline state

  • zink: always emit sample id 0 for non-msaa texel pointers in ntv

  • zink: fix PIPE_CAP_DRAW_PARAMETERS export

  • zink: add 8bit alu handling

  • zink: hook up 8/16bit storage exts

  • zink: lower 32_2x16_split pack/unpack instructions

  • zink: implement nir_op_pack_half_2x16_split

  • zink: handle 8/16bit ssbo storage

  • zink: handle bo struct types that are just a runtime array

  • zink: fix PIPE_SHADER_CAP_FP16_DERIVATIVES handling

  • zink: clamp query results to 500 per qbo on 32bit

  • util/primconvert: force restart rewrites if original primtype wasn’t supported

  • lavapipe: fix primitive restart with indexed indirect draws

  • zink: hook up VK_EXT_primitive_topology_list_restart

  • zink: use EXT_primitive_topology_list_restart where available

  • zink: use dispatch table for (almost) all vulkan calls

  • zink: fix some pipe caps for max instructions

  • mesa/st: use uint for instance_divisor instead of int

  • aux/trace: dump more pipe_vertex_element members

  • mesa: skip fallback draw call if no primitives are being drawn

  • aux/trace: use private refcounts for samplerviews

  • zink: reorganize cached descriptor updating a bit

  • zink: split out lazy set updating

  • zink: fall back to lazy descriptors if too many cache misses in a row

  • zink: add “nofallback” descriptor mode

  • zink: document ZINK_DESCRIPTORS env var

  • zink: ci updates

  • zink: move resource unrefs to flush thread

  • zink: remove batch params from renderpass functions

  • zink: remove batch params from resource copy functions

  • zink: remove unused barrier function

  • zink: remove batch params from barrier functions

  • zink: clamp instance divisors to max value

  • zink: add 8/16bit ubo handling

  • zink: export PIPE_SHADER_CAP_FP16_CONST_BUFFERS

  • zink: initialize zink_descriptor_layout_key::use_count on create

  • Revert “zink: ci updates”

  • zink: set vbo resource usage on bind

  • zink: add inline for checking whether a resource has any binds

  • zink: replace a couple checks for bind counts with new inline

  • zink: add some asserts for buffer replacement

  • zink: add a batch ref when replacing a buffer that has binds and usage

  • zink: move batch ref when possible during buffer replacement

  • zink: make a local screen var for buffer replace

  • zink: use better check for determining bufferview rebinds

  • zink: remove ZINK_RESOURCE_USAGE_STREAMOUT

  • zink: use bind_stages for pipeline barrier generation

  • zink: don’t generate more pipeline stages if vertex bit is already set

  • zink: use more accurate generation for buffer barrier pipeline stages

  • zink: remove bind_stages and bind_history from zink_resource

  • zink: remove zink_get_resource_for_descriptor()

  • zink: use descriptor info for ubo hashing

  • zink: fix ZINK_MAX_DESCRIPTORS_PER_TYPE to stop exploding the stack

  • zink: add function for decomposing vertex format to single component

  • zink: decompose vertex attribs into single components when not supported

  • zink: use smallest int type possible for decompose shader key

  • zink: hook up dmabuf ext

  • zink: add dmabuf modifier query hooks for screen

  • zink: hook up VK_EXT_queue_family_foreign

  • zink: split import and export fd handle types

  • zink: set a flag for dmabuf init

  • zink: handle image creation for dmabufs

  • zink: fix import pNext attachment during image creation

  • zink: use foreign queue import for dmabufs

  • zink: add dmabuf fd handling

  • zink: fix dmabuf cap export

  • zink: unconditionally support conditional rendering

  • zink: fix some return values

  • zink: add return values for resource usage unsetting

  • zink: move barrier info to resource object struct

  • zink: unset barrier info if resource object no longer has usage after reset

  • zink: unset src access in barriers if there’s no src pipeline stages

  • zink: assert surface geometry

  • zink: add a resource reference for bufferviews

  • zink: move surface and bufferview caches onto resources

  • zink: wrap framebuffer surfaces to preserve gallium expectations

  • zink: be smarter about fb surface rebinds

  • zink: force imageless fb rebind if rebinding an attachment

  • zink: update surface info when rebinding to storage

  • zink: add some debug asserts to validate imageless framebuffer correctness

  • compiler/spirv: add a fail if tex instr coord components aren’t dimensional enough

  • zink: don’t copy inner surface refcount

  • zink: stop setting nr_samples for null surfaces

  • zink: fix enabled vertex buffer mask calculation

  • zink: move pending prim type to gfx pipeline struct

  • zink: make tcs shader generation take screen param

  • zink: remove ctx references from shader compile path

  • zink: remove some ctx references from shader/pipeline compile

  • zink: only update gfx pipeline cache after creating a real pipeline

  • zink: simplify flagging last vertex stage for updating

  • zink: move xfb updates to just before draw

  • zink: move shader keys to be persistent on pipeline state

  • zink: move uniform size calc for shader keys into keybox

  • zink: store shader key to shader module

  • zink: stop using hash table for compute programs

  • zink: move shader cache to gfx program struct

  • zink: replace shader module hash table with a list

  • zink: remove default_variants storage in program struct

  • zink: split out inlined uniform shader variants into separate cache

  • zink: simplify shader variant update loop

  • zink: cap max shader variants with inlined uniforms

  • zink: store drm fd to screen

  • zink: unbreak dmabuf handling

  • zink: pre-filter multi-plane modifiers

  • zink: pass all modifiers through to image creation

  • zink: zero VkImageCreateInfo::queueFamilyIndexCount on creation

  • features: fix listing for GL_ARB_parallel_shader_compile

  • util/tc: rename tc_replace_buffer_storage_func::num_rebinds and document

  • zink: don’t leak drm fd on drmPrimeFDToHandle failure

  • zink: disable miplevel tests in ci completely for now

  • zink: fix regex syntax from previous ci commit

  • build: fix nine compilation with only zink enabled as a gallium driver

  • zink: always use type size for query result copy stride

  • zink: fix ci skips

  • zink: don’t use legacy scanout with modifiers

  • zink: clean up texture_barrier hook a little

  • zink: check for pending memory barrier before trying to flush it

  • zink: enable timeline ext features

  • zink: split vk debug logging into separate functions

  • zink: repack zink_render_pass_state

  • zink: add ZINK_HEAP_DEVICE_LOCAL_LAZY

  • zink: add ZINK_BIND_TRANSIENT

  • zink: improve handling of buffer rebinds using tc info

  • zink: reorder draw state updates

  • zink: remove fbfetch layout thingy from zs renderpass init

  • zink: move fb attachment init to new function

  • zink: stop setting nr_samples for shader image surface creation

  • zink: implement GL_EXT_multisampled_render_to_texture

  • docs: mark off GL_EXT_multisampled_render_to_texture for zink

  • zink: remove duplicated struct member set

  • zink: force lazy descriptor set rebinds if pipeline compatibility changes

  • zink: split out bvci creation from object creation

  • zink: don’t add resource to pending barrier set if no barrier will be generated

  • zink: refactor some shader image code to make it reusable

  • zink: handle bindless images and samplers in ntv

  • zink: hook up VK_EXT_descriptor_indexing

  • zink: implement bindless textures

  • zink: export PIPE_CAP_BINDLESS_TEXTURE

  • features: mark off bindless texture for zink

  • lavapipe: add support for KHR_shader_float_controls

  • anv: assert that legacy_scanout isn’t used with explicit modifiers

  • wsi/x11: fix uninit value by using zalloc for swapchain

  • zink: make a local resource var in fb_clears_apply_internal

  • zink: break out surface info init to helper function

  • anv: support EXT_primitive_topology_list_restart

  • zink: stop using VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT

  • zink: ensure fences are released before reusing them

  • zink: support 16bit rgbx formats

  • ci: updates

  • lavapipe: inherit from vk_image

  • lavapipe: EXT_4444_formats support

  • lavapipe: remove display extension support

  • build: unify vulkan cpp platform args

  • build: also remove wayland wsi flags from c++ build

  • features: be explicit about EXT_color_buffer_half_float support

  • zink: wait on thread queue before destroying context

  • zink: split out fb state updating to helper function

  • zink: wait in the flush thread when ETOOMANY batches are out

  • zink: move semaphore reset handling to submit

  • zink: remove zink_context::curr_batch

  • zink: stop leaking buffers on replacement

  • zink: switch remaining direct access of zink_resource_object::(reads|writes) to util

  • zink: remove reads/writes members from zink_resource_object

  • zink: stop leaking resource surface cache hash tables

  • zink: rework in-use batch states hash table to be a singly-linked list

  • zink: ci updates

  • zink: move glx@glx-multi-window-single-context to flakes

  • radv: don’t use invalid stride for triggering vertex state change

  • radv: dynamically calculate misaligned_mask for dynamic vertex input

  • radv: pre-calc “simple” dynamic vertex input values

  • radv: add a mask of bound descriptor buffers for dynamic vertex input

  • radv: move alpha_adjust into conditional during vertex input updating

  • aux/pb: add a tolerance for reclaim failure

  • aux/pb: more correctly check number of reclaims

  • zink: use static array for detecting VK_TIME_DOMAIN_DEVICE_EXT

  • zink: add a read barrier for indirect dispatch

  • zink: fully zero surface creation struct

  • zink: rescue surfaces/bufferviews for cache hits during deletion

  • zink: clear descriptor refs on buffer replacement

  • zink: assert compute descriptor key is valid before hashing it

  • zink: don’t update lazy descriptor states in hybrid mode

  • zink: move push descriptor updating into lazy-only codepath

  • zink: add an early return for zink_descriptors_update_lazy_masked()

  • zink: move last of lazy descriptor state updating back to lazy-only code

  • zink: detect prim type more accurately for tess/gs lines

  • zink: don’t break early when applying fb clears

  • zink: only reset zink_resource::so_valid on buffer rebind

  • zink: don’t check rebind count outside of buffer/image rebind function

  • zink: stop exporting PIPE_SHADER_CAP_FP16_DERIVATIVES

  • zink: don’t add dynamic vertex pipeline states if no attribs are used

  • zink: fix gl_SampleMaskIn spirv generation

  • zink: more accurately update samplemask for fs shader keys

  • nir/lower_samplers_as_deref: rewrite more image intrinsics

  • zink: add better handling for CUBE_COMPATIBLE bit

  • zink: use align64 for allocation sizes

  • zink: set aspectMask for renderpass2 VkAttachmentReference2 structs

  • zink: always use explicit lod for texture() when legal in non-fragment stages

  • zink: be more permissive for injecting LOD into texture() instructions

  • zink: inject LOD for sampler version of OpImageQuerySize

  • zink: flag renderpass change when toggling fbfetch

  • zink: don’t clamp cube array surfaces to cubes

  • zink: don’t clamp 2D_ARRAY surfaces to 2D

  • zink: error when trying to allocate a bo larger than heap size

  • zink: clamp max buffer sizes to smallest buffer heap size

  • zink: explicitly enable VK_EXT_shader_subgroup_ballot

  • zink: add more int/float types to cast switching in ntv

  • zink: force float dest types on some alu results

  • zink: stop double printing validation messages

  • zink: add SpvCapabilityStorageImageMultisample for multisampled storage images

  • zink: reject all storage multisampling if the feature is unsupported

  • zink: add queue locking

  • build: add sha1_h to llvmpipe build

  • zink: set fbfetch state on lazy batch data when enabling it

  • zink: always use lazy (non-push) updating for fbfetch descriptors

  • zink: clamp PIPE_SHADER_CAP_MAX_INPUTS for xfb

  • aux/primconvert: handle singular incomplete restarts

  • zink: rework cached fbfetch descriptor fallback

  • aux/trace: fix vertex state tracing

  • zink: be more consistent about applying module hash for gfx pipeline

  • zink: update gfx pipeline shader module pointer even if the program is unchanged

  • zink: always add VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT for 3D images

Mykhailo Skorokhodov (3):

  • iris: Fix compute shader leak

  • iris: Add missed tile flush flag

  • Revert “iris: add tile cache flush to iris_copy_region”

Nanley Chery (41):

  • anv: Add genX(cmd_buffer_emit_gfx12_depth_wa)

  • iris: Add genX(emit_depth_state_workarounds)

  • iris: Update the clear value in cso_z->packets

  • iris: Emit clear_params as part of cso_z->packets

  • iris: Update clear_params only when HiZ is enabled

  • intel: Move the D16 workarounds out of ISL

  • iris: Use constants for emitting cso_z->packets

  • iris: Optimize genX(emit_depth_state_workarounds)

  • anv: Optimize genX(cmd_buffer_emit_gfx12_depth_wa)

  • intel: Use env_var_as_boolean for INTEL_NO_HW

  • intel: Parse INTEL_NO_HW for devinfo construction

  • intel/isl: Add msaa_layout param to isl_tiling_get_info

  • intel/isl: Define ISL_TILING_4/64 for XeHP

  • intel/isl: Update image alignments on XeHP

  • intel/isl: Size Tile64 surfaces with 4 dimensions

  • intel/isl: Drop extra assert on array_pitch_el_rows

  • intel/isl: Drop ISL_SURF_USAGE_DISPLAY_*_BIT

  • intel/isl: Use an allow-list in gfx6_filter_tiling

  • intel/isl: Update tiling filter functions for XeHP

  • intel: Support Tile4/64 in depth/stencil state

  • intel: Support Tile4/64 in surface states

  • intel/blorp: Fix faked RGB image alignment on XeHP

  • intel/blorp: Fix Gfx7 stencil surface state valign

  • intel/isl: Fix halign/valign of uncompressed views

  • intel/isl: Use a switch for HALIGN/VALIGN encoding

  • intel: Update surface states for XeHP alignments

  • intel: Add underscores to HALIGN and VALIGN enums

  • intel/isl: Disable I915_FORMAT_MOD_Y_TILED on XeHP+

  • iris: Disable tiled memcpy for Tile4

  • anv/image: Don’t assert that HiZ can be added

  • iris: Delete iris_resource_get_clear_color

  • iris: Support NULL aux BOs in fill_surface_state

  • iris: Split clear color and aux BO checks

  • iris: Simplify an iris_use_pinned_bo call

  • iris: Allow NULL aux BOs in aux-state functions

  • iris: Don’t add a clear color BO for MC_CCS

  • iris: Add and use get_num_planes

  • iris: Finish aux import in iris_resource_from_handle

  • anv: Allow HIZ_CCS_WT with subpass self-dependencies

  • anv: Tile cache flush for depth before fast clear

  • iris: Tile cache flush for depth before fast clear

Neha Bhende (4):

  • aux/draw: use nir_to_tgsi for draw shader in llvm path

  • svga/drm: use pb_usage_flags instead of pipe_map_flags in vmw_svga_winsys_buffer_map

  • auxiliary/indices: convert primitive type PIPE_PRIM_PATCHES

  • st: Fix 64-bit vertex attrib index for TGSI path

Neil Roberts (1):

  • v3d: Update prim_counts when prims generated query in flight without TF

Olivier Fourdan (1):

  • radeonsi: Check aux_context on si_destroy_screen()

Paulo Zanoni (10):

  • iris: mark the workaround_bo as asynchronous

  • iris: don’t bump the seqno for the workaround_bo

  • iris: assign bo->index to the aux map BOs too

  • iris: extract the code that adds BOs to the batch lists

  • iris: add the workaround_bo directly to the batch

  • iris: use add_bo_to_batch() when adding batch->bo

  • iris: syncobjs are now owned by bufmgr instead of screen

  • iris: give each screen of a bufmgr a unique ID

  • iris: switch to explicit busy tracking

  • iris: signal the syncobj after a failed batch

Pavel Asyutchenko (3):

  • vulkan/overlay: Fix violation of VUID-VkMappedMemoryRange-size-01389

  • llvmpipe: fix crash when doing FB fetch + gl_FragDepth write in one shader

  • lavapipe: Fix vkWaitForFences for initially-signalled fences

Philipp Zabel (3):

  • etnaviv: fix gbm_bo_get_handle_for_plane for multiplanar images

  • etnaviv: fix dirty bit check for baselod emission

  • etnaviv: add mov for direct depth store output from load input

Pierre Moreau (5):

  • clover: Do not advertise OpenCL x.y when unsupported

  • clover/spirv: Increase max amount of function args

  • clover/spirv: Properly size 3-component vector args

  • clover/api: Interleave details in dispatch table

  • clover/nir: Set constant buffer pointer size to host

Pierre-Eric Pelloux-Prayer (78):

  • mesa: fix bindless uniform samplers update

  • dlist: don’t handle unmerged draws as merged

  • mesa: move gl_program::is_arb_asm to shader_info

  • radeonsi: preserve derivatives after discards for ARB shaders

  • gallium/va: don’t use key=NULL in hash tables

  • amd/registers: fix fields conflict detection

  • dlist: upload vertices in compile_vertex_list

  • dlist: implement vertices deduplication

  • radeonsi: add a script to run piglit/glcts/deqp tests

  • radeonsi: add expected tests results for Navi10 GPU

  • st/pbo: only use x coord when reading a PIPE_TEXTURE_1D

  • st/pbo: set nir_tex_instr::is_array field

  • st/pbo: add a fast pbo download code-path

  • radeonsi: fix test script’s output

  • radeonsi: add -t option to the test script

  • radeonsi: don’t create an infinite number of variants

  • nir: add a pass to optimize “gl_FragDepth = gl_FragCoord.z” away

  • radeonsi/test: fix test script args handling

  • radeonsi/test: format radeonsi-run-test.py with black

  • radeonsi/test: allow to pass a filename as a test filter value

  • radeonsi/test: prettier output

  • radeonsi/test: add Sienna Cichlid expected results

  • vbo/dlist: simplify add_vertex function

  • vbo/dlist: apply start_offset after indices construction

  • vbo/dlist: move VAO update at the end

  • vbo/dlist: use buffer_in_ram_size

  • vbo/dlist: use a single buffer object

  • vbo/dlist: remove vbo_save_vertex_store::bufferobj

  • vbo/dlist: don’t store prim_store

  • vbo/dlist: use prim_store directly

  • vbo/dlist: realloc prims array instead of free/malloc

  • vbo/dlist: don’t force list compilation if out of prim space

  • vbo/dlist: remove vbo_save_context::buffer_ptr

  • vbo/dlist: reset vertex_store::used in reset_counters

  • vbo/dlist: remove vbo_save_context::buffer_map

  • vbo/dlist: realloc vertex stores

  • vbo/dlist: remove vbo_save_context::max_vert

  • vbo/dlist: limit allocation sizes

  • vbo/dlist: don’t force list compilation if out of vertex space

  • vbo/dlist: rework out of memory

  • vbo/dlist: fix max_index_count value

  • vbo/dlist: remove vbo_save_copied_vtx

  • vbo/dlist: remove vbo_save_context::vert_count

  • vbo/dlist: add documentation

  • vbo/dlist: remove unused functions

  • vbo/dlist: rework buffer sizes

  • vbo/dlist: rework primitive store handling

  • vbo/dlist: rework vertex_store management

  • vbo/dlist: fix indentation in vbo_save_api.c

  • vbo/dlist: reallocate the vertex buffer on vertex upgrade

  • Revert “ci/v3d: add piglit flake”

  • radeonsi/test: fix typo in the test script

  • radeonsi/test: update expected results

  • radeonsi/sqtt: export wave size and scratch size

  • radeonsi/sqtt: add si_se_is_disabled

  • radeonsi/test: don’t require a folder name

  • radeonsi/test: use -t for deqp tests

  • radeonsi/test: print default values in help

  • radeonsi/test: allow to specify a baseline folder

  • radeonsi/test: sanitize output_folder

  • radeonsi/test: add --gpu to select the GPU to test

  • radeonsi/test: add Raven expected results

  • radeonsi/test: add sanity checks

  • gallium: add PIPE_CAP_PREFER_BACK_BUFFER_REUSE

  • loader/dri3: avoid reusing the same back buffer with DRI_PRIME

  • radeonsi: disable PIPE_CAP_PREFER_BACK_BUFFER_REUSE

  • radeonsi: don’t clear G_028644_OFFSET

  • radeonsi: implement si_sdma_copy_image for gfx7+

  • radeonsi: add an async compute context

  • gallium: add a is_dri_blit_image bool to pipe_blit_info

  • radeonsi: make the DRI_PRIME dGPU -> iGPU copy async

  • radeonsi: use viewport offset in quant_mode determination

  • radeonsi: treat nir_intrinsic_load_constant as a VMEM operation

  • radeonsi/sdma: fix bogus assert

  • ac/surface: don’t validate DCC settings if DCC isn’t possible

  • vbo/dlist: free copied.buffer if no vertices were copied

  • mesa: always call _mesa_update_pixel

  • radeonsi/sqtt: fix shader stage values

Qiang Yu (20):

  • nir/inline_uniforms: add uniforms in condition atomically

  • nir/inline_uniforms: support vector uniform

  • nir/loop_analyze: move nir_is_supported_terminator_condition() to header

  • nir/loop_analyze: record induction variables for each loop

  • nir/loop_analyze: skip unsupported induction variable early

  • nir/inline_uniforms: support loop

  • egl/dri2: separate EGLImage validate and lookup

  • gbm/dri: implement image lookup extension version 2

  • gallium/dri: add dri_screen egl image validate hooks

  • gallium/api: add validate_egl_image interface

  • mesa: add ValidateEGLImage driver callback

  • mesa: fix glthread deadlock when EGL multi thread shared context

  • nir/lower_io_to_vector: check centroid & sample when merge variable

  • nir/linker: pack varyings with different interpolation qualifier

  • radeonsi: enable nir option pack_varying_options

  • radeonsi: fix ps SI_PARAM_LINE_STIPPLE_TEX arg

  • loader/dri3: fix swap out of order when changing swap interval

  • mesa/st: delay nir spirv link

  • nir/linker: support uniform when optimizing varying

  • nir/linker: rename replace_constant_input to replace_varying_input_by_constant_load

Quantum (1):

  • main: allow all external textures for BindImageTexture

Rhys Perry (108):

  • aco: don’t create v_madmk_f32/v_madak_f32 from v_fma_legacy_f16

  • ac/llvm: implement v2f16 fsat

  • radv: set image_dim and image_array intrinsic indices

  • aco: use image_dim and image_array intrinsic indices

  • aco: calculate correct register demand for branch instructions

  • nir/algebraic: fix imod by negative power-of-two

  • nir/algebraic: don’t optimize umod/imod/irem if lower_bitops=true

  • nir/algebraic: add optimizations for imul(a, INT_MIN)

  • nir/search: don’t consider INT_MIN a negative power-of-two

  • nir/algebraic: improve irem by power-of-two optimization

  • nir/idiv_const: improve idiv(n, INT_MIN)

  • nir/idiv_const: optimize imod/irem

  • nir: fix signed overflow for iadd constant folding

  • nir/tests: add tests for umod/imod/irem optimizations

  • radv: enable DCC with signedness reinterpretation

  • nir: remove src/compiler/nir/nir_control_flow

  • nir: swap fadd operands in nir_atan()

  • spirv: swap fadd operands in build_asin() and matrix_multiply()

  • nir/algebraic: add various ffma optimizations

  • nir/algebraic: reassociate add chains for more MAD/FMA-friendly code

  • nir/algebraic: add is_used_once to dot product reassociation optimization

  • nir: add ffma creation helpers

  • nir: create ffma from builders more often

  • nir: lower fdot to ffma if lower_ffma=false

  • spirv: create ffma more often

  • nir,glsl_to_nir: use nir_fdot()

  • ci: update trace hashes

  • aco: fix validation of DPP v_cndmask_b32/v_addc_co_u32

  • aco: add can_use_DPP() and convert_to_DPP()

  • aco: move a bunch of helpers into aco_ir.h/aco_ir.cpp

  • aco: make optimize_postRA() work across blocks

  • aco: handle DPP in the optimizer

  • aco: combine DPP into VALU before RA

  • aco: combine DPP into VALU after RA

  • aco/tests: add tests for pre-RA DPP combining

  • aco/tests: add tests for post-RA DPP combining

  • aco: fix vectorized 16-bit load_input/load_interpolated_input

  • aco: remove label_extract if the extract is used by a non-VALU

  • aco/scheduler: allow moving down VMEM stores to below VMEM loads

  • nir/lower_io: use nir_vector_insert_imm()

  • radv: use nir_vector_insert_imm in lower_intrinsics

  • nir: consider push constant loads as always dynamically uniform

  • nir/gcm: pin some instructions which require uniform sources

  • aco: include utility in isel

  • aco: don’t constant propagate to DPP instructions

  • aco/tests: test copy propagation with DPP instructions

  • aco: remove DPP when applying constants/literals/sgprs

  • aco: don’t coalesce constant copies into non-power-of-two sizes

  • aco/spill: add temporary operands of exec phis to next_use_distances_end

  • nir: separate lower_add_sat

  • nir: add sdot_2x16 and udot_2x16 opcodes

  • spirv: use sdot_2x16 and udot_2x16 opcodes

  • ac/gpu_info: add has_accelerated_dot_product

  • ac/llvm: implement nir_op_pack_32_4x8

  • ac/llvm,radv: implement uadd_sat/iadd_sat

  • ac/llvm: implement udot_4x8/sdot_4x8/udot_2x16/sdot_2x16 opcodes

  • radv: refactor handling of nir_options

  • radv,aco: implement iadd_sat

  • aco: implement nir_op_pack_32_4x8

  • aco: implement udot_4x8/sdot_4x8/udot_2x16/sdot_2x16 opcodes

  • aco/ra: allow v1b operands with 16-bit instructions

  • radv: expose VK_KHR_shader_integer_dot_product

  • aco/ra: don’t use ds_write_b8_d16_hi/ds_write_b16_d16_hi on GFX8

  • nir: fix serialization of loop/if control

  • radv: fix pipeline caching with robust buffer access

  • aco: add RegClass::is_linear_vgpr helper

  • aco: add and use RegClass::resize helper

  • aco: rewrite print_reg_class()

  • aco: find a scratch register for sub-dword copies on GFX7 if scc is empty

  • aco: find scratch reg for sub-dword pseudo instructions which read sgprs

  • aco/tests: fix finish_ra_test()

  • aco/tests: add regalloc.scratch_sgpr.create_vector

  • aco: implement linear vgpr copies

  • aco: allow live-range splits of linear vgprs in top-level blocks

  • aco/nops: use up-to-date mask_size

  • aco/nops: create handle_raw_hazard_instr helper

  • aco/nops: add State

  • aco/nops: fix handle_raw_hazard_internal when visiting the current block

  • nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants

  • aco/tests: add idep_amdgfxregs_h

  • nir: add nir_src_components_read()

  • nir/opt_if: add opt_if_rewrite_uniform_uses

  • radv: don’t require a GS copy shader to use the cache with NGG VS+GS

  • radv: workaround incorrect image format with World War Z

  • radv: move ngg culling determination earlier

  • nir: add _amd suffix to fragment_mask_fetch and fragment_fetch texops

  • nir/lower_tex: add lower_to_fragment_fetch_amd

  • radv: don’t create blit pipelines for multisampled 3D images

  • aco: return 0x76543210 for NULL FMASK fetch

  • ac/nir: return 0x76543210 for NULL FMASK fetch

  • aco: use correct dim for FMASK fetches

  • radv,aco: use lower_to_fragment_fetch

  • radv,aco: don’t include FMASK in the storage descriptor

  • ac/llvm: fix image_samples with null descriptors

  • radv/llvm: fix parameter index for layer exports

  • aco: fix vadd32() when b is neither a constant nor temporary

  • radv: add and use radv_vs_input_alpha_adjust

  • radv: add radv_translate_vertex_format()

  • radv: add radv_shader_variant_get_va and radv_find_shader_variant helpers

  • radv: add segregated fit shader memory allocator

  • radv: move VS specific input SGPRs first

  • radv: implement dynamic vertex input state using vertex shader prologs

  • radv: add pre-compiled vertex shader prologs for common states

  • aco: implement aco_compile_vs_prolog

  • aco: implement VS input loads with prologs

  • radv: implement VK_EXT_vertex_input_dynamic_state

  • radv: enable VK_EXT_vertex_input_dynamic_state

  • aco: consider pseudo-instructions reading exec in needs_exec_mask()

Rob Clark (81):

  • freedreno/registers: update dsi registers to support tpg

  • freedreno/a6xx: Add missing PC_CCU_INVALIDATE_x

  • driconfig: Add support for device specific config

  • driconf: Add force_gl_renderer override

  • freedreno: Support per-device driconf overrides

  • freedreno: Unleash the dragon!

  • freedreno: Move generated device table to .h

  • freedreno: Drop device_id

  • freedreno: Reduce use of screen->gpu_id

  • freedreno/ir3: Reduce use of compiler->gpu_id

  • freedreno/ir3/lower_io_offsets: Drop gpu_id param

  • freedreno/all: Introduce fd_dev_id

  • freedreno: Make chip_id 64b

  • freedreno: Device matching based on chip_id

  • freedreno: Use correct key for binning pass shader

  • freedreno: Add a680 support

  • freedreno/cffdec: Fix indentation

  • freedreno/cffdec: Fix gpuaddr comparison

  • freedreno/crashdec: Decode full RB in verbose mode

  • freedreno/crashdec: Quiet spammy print in query mode

  • freedreno/common: Fix comment typo

  • freedreno/a6xx: Set type for PC_HS_INPUT_SIZE

  • freedreno/a6xx: Register updates for a6xx gen3

  • freedreno/a6xx: Rast updates for a6xx gen3

  • freedreno/a6xx: Fix streamout with tess_use_shared

  • freedreno/a6xx: Updates for tess_use_shared

  • freedreno/a6xx: Register updates for a6xx gen4

  • freedreno/a6xx: Fix a6xx gen4 compute shaders

  • freedreno/ci: Add a status variable for CI farm

  • freedreno/ci: Take fd farm offline for moving day

  • freedreno/ci: Bring fd farm back online after move

  • clover: Don’t remove sampler/image uniforms

  • nir/lower_amul: Handle load/store_global

  • nir/lower_amul: Fix usage of nir_foreach_src()

  • freedreno/ir3: Update physical_successors after retargetting jumps

  • freedreno/ir3: Fix physical successors for break out of loop

  • freedreno/ir3: Fix double printing of branch suffix

  • freedreno/ir3: Validate physical successors

  • freedreno/ir3: Improve error msg for block level validation

  • freedreno/ir3: Update physical_predecessors for streamout block

  • freedreno: Remove unused function

  • freedreno: Cleanup primtypes/primtypes_mask

  • freedreno: Move a6xx specific screen init

  • freedreno/drm: Garbage collect unused bo_cache

  • freedreno/drm: Rename bo->flags to bo->reloc_flags

  • freedreno/drm: Consider allocation flags in bo-cache

  • freedreno/drm: Don’t return shared/control bo’s to cache

  • freedreno/drm: Add cached-coherent bo support

  • freedreno/drm: Use cached-coherent cmdstream buffers

  • freedreno/drm: Use cached-coherent for control bo

  • freedreno: Used cached coherent for staging resources

  • freedreno: Add perf warning for WC readback

  • freedreno/a6xx: Pre-bake SO-disable stateobj

  • freedreno/ir3: Fix sched debug msgs

  • freedreno/ir3: Cleanup liveness lifetime

  • freedreno/ir3: Fix generation check

  • freedreno/computerator/a4xx: Fix enum mismatch warning

  • freedreno: Add info->a6xx.has_shading_rate

  • turnip: Fix uninitialized cs->device

  • turnip: Rast updates for a6xx gen4

  • turnip: Fix a6xx gen4 compute shaders

  • isaspec: Remove unused leftovers

  • isaspec: Fix comment

  • isaspec: Split encode_bitset() into its own template

  • isaspec: De-duplicate bitset encoding

  • freedreno: Get shader variant msgs in perf debug output

  • freedreno: Optimize no-op submits

  • freedreno: Fix some indentation

  • freedreno/ir3: Remove used unused

  • freedreno: Handle cso==NULL in bind_sampler_states

  • freedreno: Handle PIPE_FORMAT_NONE buffers

  • gallium/u_threaded: Get reset status without sync

  • freedreno: Disable TC syncs for get_device_reset_status()

  • zink: Disable TC syncs for get_device_reset_status()

  • Revert “freedreno: Fix autotune regression since batch-cache rework.”

  • Revert “freedreno: Remove dead fd_batch_reset().”

  • Revert “freedreno: Use a BO bitset for faster checks for resource referenced.”

  • Revert “freedreno: Remove the submit lock locking.”

  • Revert “freedreno: Move the batch cache to the context.”

  • gallium/u_threaded: Split out options struct

  • freedreno/drm: Move pipe unref after fence removal

Rohan Garg (7):

  • virgl: Add more meta data to cached resources

  • Revert “Revert “virgl: Cache depth and stencil buffers””

  • virgl: Enable caching for sampler views and render targets

  • i965: Take into account the offset when marking a valid data region

  • i965: Write a custom allocator for the intel memobj struct

  • ci: Fix a minor issue in prepare-artifacts.sh script

  • ci: Use FDO_DISTRIBUTION_TAG where possible

Roland Scheidegger (7):

  • llvmpipe/linear: don’t try to use tgsi analysis for nir shaders

  • llvmpipe: always use draw_regions intersection

  • llvmpipe: fix nir dot products (fsum op)

  • aux/cso: try harder to keep cso state in sync on cso context unbind

  • gallium: add rasterizer depth_clamp enable bit

  • lavapipe: implement VK_EXT_depth_clip_enable

  • lavapipe: Fix crashes with transform feedback when using VK_WHOLE_SIZE

Roman Stratiienko (7):

  • kmsro: Add ‘kirin’ driver support

  • AOSP: Extract version from libdrm instead of hardcoding it.

  • AOSP: Upgrade libLLVM dependency to v12

  • AOSP: Update timestamps of target binaries

  • AOSP: Add panfrost vulkan library suffix

  • lima: Implement lima_resource_get_param() callback

  • meson_options: Bump max value of platform-sdk-version to 31

Ryan Neph (1):

  • virgl: disallow null-terminated debug messages

Sagar Ghuge (19):

  • nir: Add new opcode for ternary addition

  • intel/compiler: Add support for ternary add instruction on XeHP

  • intel/compiler: Make decision based on source type instead of opcode

  • intel/compiler: Allow ternary add to promote source to immediate

  • nir: Add optimizations for iadd3

  • intel/compiler: Enable has_iadd3 option on XeHP

  • intel/compiler: Fix missing break in switch

  • intel/compiler: Handle ternary add in lower_simd_width

  • genxml/gen12: Update debug register fields according to HW

  • genxml/gen125: Update debug register fields according to HW

  • anv: Fix VK_EXT_memory_budget to consider VRAM if available

  • intel/compiler: Add 64-bit A64 float logical opcode support

  • anv: Advertise support for shaderBufferFloat64AtomicMinMax

  • intel/compiler: Add support to handle 64-bit atomics with A32 messages

  • anv: No need to lower to A64 messages for 64-bit atomics

  • iris: Enable atomic operations on compressed surfaces

  • intel/genxml: Add new bit fields Render Compression Format

  • isl: Add helper to return render compression format encoding

  • isl: Use software programmable render compression format encoding

Samuel Pitoiset (215):

  • radv: only init the TC-compat ZRANGE metadata for the depth aspect

  • radv: fix bounds checking for zero vertex stride on GFX6-7

  • radv: report APUs as discrete GPUs for Red Dead Redemption 2

  • radv: fix specifying the stencil layout for separate depth/stencil layouts

  • radv: allow unused VkSpecializationMapEntries

  • aco: implement VK_EXT_shader_atomic_float2

  • radv: implement VK_EXT_shader_atomic_float2

  • radv: reduce number of emitted DWORDS for contiguous context registers

  • radv: do not use radeon_set_context_reg_seq() for only one register

  • radv: init radv_image::l2_coherent when creating the layout

  • ac: introduce a structure to store DCC address equations for GFX9

  • amd/addrlib: expose CMASK address equations to drivers on GFX9

  • ac/surface: add tests for CmaskAddrFromCoord prototype outside of addrlib

  • ac/surface: store CMASK pitch and height to radeon_surf

  • ac/surface: copy the CMASK equation to radeon_surf

  • ac/surface: implement CmaskAddrFromCoord in NIR

  • radv: fix selecting the first active CU when profiling with SQTT

  • radv: fix missing cache flushes when clearing HTILE levels on GFX10+

  • amd/addrlib: expose CMASK address equations to drivers on GFX10+

  • ac/surface: add tests for CmaskAddrFromCoord on GFX10+

  • ac/surface: implement CmaskAddrFromCoord in NIR on GFX10+

  • radv: rework DCC, FMASK and FCE decompress path

  • radv: perform a FCE for MSAA images that might have been fast-cleared

  • radv: allow DCC MSAA fast clears if a FCE is needed

  • radv: fix initializing the DS clear metadata value for separate aspects

  • radv: remove unnecessary FIXME about custom sample locations

  • radv: flush caches before performing separate depth/stencil aspect init

  • radv: bump maxFragmentSizeAspectRatio to 2

  • radv: disable fragmentShadingRateWithCustomSampleLocations

  • radv: bump maxFragmentShadingRateCoverageSamples to 32

  • radv: fix reported sample counts for VRS 1x1

  • radv: use more explicit DCC clear codes

  • radv: pass an image view to vi_get_fast_clear_parameters()

  • radv: add RADV_DCC_CLEAR_SINGLE

  • radv: determine if an image supports fast clears using comp-to-single

  • radv: implement DCC fast clears with comp-to-single

  • radv: skip FCE for images that are fast-cleared using comp-to-single

  • radv: enable DCC fast-clears with comp-to-single on GFX10+

  • radv: allow fast clears for concurrent images if comp-to-single is supported

  • radv: fix pre-computing viewport xform when setting new viewports

  • radv: fix fast clearing depth images with mips on GFX10+

  • radv: determine if an image supports comp-to-single at creation time

  • radv: remove useless check about the FCE predicate offset

  • radv: do not allocate the FCE predicate for images that use comp-to-single

  • radv: remove unnecessary check in radv_layout_is_htile_compressed()

  • radv: remove incorrect comment about compressed writes to HTILE on GFX10+

  • radv: fix copying depth+stencil images on compute

  • radv: remove unused fast depth-stencil gfx clear path with expclear

  • radv: remove useless DISABLE_{ZMASK,SMEM}_EXPCLEAR_OPTIMIZATION state

  • radv: don’t use SQ_NON_EVENT before GE_PC_ALLOC for better perf on Navi1x

  • radv: allocate shaders to 32-bit address to skip PGM_HI

  • nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)

  • Revert “nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)”

  • nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(fabs(b), -a)

  • ci: update the list of expected failures/skips for RADV

  • radv: allow storage images with VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 on GFX10.3+

  • ci: update the list of skipped tests for Fiji/RADV

  • radv: remove outdated radv_finishme() in the HW resolve path

  • radv: remove useless check about number of samples in the HW resolve path

  • radv: remove unnecessary radv_finishme() for invalid color formats

  • radv: disable DCC image stores on Navi12-14 for displayable DCC corruption

  • radv: do not load/store the clear value for comp-to-single images

  • radv: do not allocate a clear value for images that support comp-to-single

  • radv: add support for clearing multi layers with normal gfx clear path

  • vulkan: Update the XML and headers to 1.2.190

  • radv: advertise VK_EXT_primitive_topology_list_restart

  • ac/llvm: adjust assertion for nir_intrinsic_terminate

  • ac/llvm: fix huge alignment when loading from shared memory

  • radv/llvm: fix invalid IR when converting triangle strips to indices

  • radv: use radeon_set_sh_reg_seq() more for initial gfx/compute state

  • radv: call nir_lower_int64() for LLVM

  • radv: track if shader image 32-bit float atomics are enabled

  • radv: do not disable DCC for storage images if atomics aren’t enabled

  • vulkan: add common entrypoints for sparse image requirements/properties

  • radv: use common entrypoints for sparse image requirements/properties

  • radv: use common vkGetPhysicalDevice{Image}FormatProperties()

  • radv: use common vkGetDeviceQueue()

  • radv: use common vkBind{Buffer,Image}Memory()

  • radv: use common vkGet{Buffer,Image}MemoryRequirements()

  • radv: fix determining the maximum number of waves that can use scratch

  • radv: remove NGG streamout support in LLVM

  • radv: allow to conditionally read HTILE value when copying VRS rates

  • radv: optimize copying VRS rates to the global HTILE buffer

  • radv: pass the HTILE buffer to radv_copy_vrs_htile()

  • radv: optimize VRS when no depth stencil attachment is bound

  • radv/llvm: rework VS input loads and implement the callback

  • ac/llvm: fix build with LLVM 14

  • radv: add MSAA support to the comp-to-single fast clear path

  • radv: enable comp-to-single for MSAA images

  • radv: reduce SQTT traffic when instruction timing is disabled

  • radv/llvm: fix using Wave32

  • radv/llvm: fix vertex input fetches with 16-bit floats

  • ac/llvm: implement nir_intrinsic_image_deref_atomic_{fmin,fmax}

  • ac/llvm: implement nir_intrinsic_ssbo_atomic_{fmin,fmax}

  • ac/llvm: implement nir_intrinsic_shared_atomic_{fmin,fmax}

  • ac/llvm: implement nir_intrinsic_global_atomic_{fmin,fmax}

  • radv: advertise EXT_shader_atomic_float2 with LLVM 14+

  • radv/ci: add a list of expected failures for VanGogh

  • ac/rgp, radv: report scratch memory size for shaders

  • ac/rgp, radv: report wave size for shaders

  • radv: rename radv_decompress_depth_stencil()

  • radv: implement depth/stencil expand on compute

  • radv: add support for copying compressed depth/stencil images on compute

  • radv: keep depth/stencil images compressed for TRANSFER_DST on compute

  • radv: replicate THREAD_TRACE_CTRL config when stopping SQTT

  • radv: make the SQTT BO a resident buffer

  • radv: remove useless assertions in the SQTT path

  • radv: do not use a different disk cache key for LLVM

  • radv: do not store meta shaders to the default shader disk cache

  • radv: remove useless shader variant key copies for VS+TCS

  • radv: stop loading invocation ID for NGG vertex shaders

  • radv: remove unused radv_tcs_variant_key:primitive_mode

  • radv: stop using the shader keys for as_ls/as_es/as_ngg when possible

  • radv: remove useless as_ngg_passthrough init when lowering NGG in NIR

  • radv/llvm: stop using vs_common_out.as_ngg_passthrough

  • radv: add export_clip_dists for VS and TES to radv_shader_info

  • radv,aco: stop using vs_common_out.export_clip_dists

  • radv/llvm: stop using vs_common_out.export_prim_id

  • radv: store the topology instead of the output primitive type in the key

  • radv: store the CS subgroup size to radv_shader_info

  • radv: rework layout of radv_pipeline_key

  • radv: pass the pipeline key to the backend compilers

  • radv: cleanup uses of VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT

  • radv: remove unused radv_nir_compiler_options fields

  • radv: remove unnecessary vs_common_out.export_viewport_index

  • radv: remove unnecessary vs_common_out.export_layer_id

  • radv: remove unnecessary radv_shader_info:{vs,tes}.export_prim_id

  • radv: remove unnecessary init of outinfo.export_prim_id for GS

  • radv: remove vs_common_out:export_prim_id

  • radv: remove vs_common_out:export_clip_dists

  • radv: pass the pipeline key to the shader info pass

  • radv: use the pipeline key more when possible

  • radv: stop using vs_common_out.{as_es/as_ls/as_ngg*} shader keys

  • radv: remove radv_shader_variant_key completely

  • radv: fix missing features for BDA

  • radv: remove the LLVM stat about the number of private VGPRs

  • radv: fix adjusting the frag coord when RADV_FORCE_VRS is enabled

  • radv: fix selecting the hash when RADV_FORCE_VRS is enabled

  • radv: make sure to load the Primitive ID for VS+GS as NGG

  • radv: fix vk_object_base_init/finish for the internal pipeline cache

  • radv: fix vk_object_base_init/finish for internal buffer views

  • radv: fix vk_object_base_init/finish for the internal push descriptors

  • radv: fix vk_object_base_init/finish for internal image views

  • radv: fix vk_object_base_init/finish for internal buffers

  • radv: set export_clip_dists for the GS copy shader

  • radv: determine the VS output parameters in the shader info pass

  • radv: disable the DX10 diamond test for better line rasterization perf

  • radv: get the float controls execution mode from NIR for LLVM

  • radv: do not declare an extra user SGPR for sample positions and PS

  • radv: move ngg early prim export determination earlier

  • radv: move ngg lds bytes determination earlier

  • radv: move ngg passthrough determination earlier

  • radv: remove unnecessary ac_nir_ngg_config output struct

  • radv: constify radv_shader_info for radv_lower_{io_to_mem,ngg}()

  • radv: move forcing discard to demote to the graphics pipeline key

  • radv: move forcing invariant geometry to the graphics pipeline key

  • radv: move forcing MRT output NaN fixup to the graphics pipeline key

  • radv: move forcing VRS rates to the graphics pipeline key

  • radv: move use of NGG to the graphics pipeline key

  • radv: remove redundant check of needs_multiview_view_index for PS

  • radv: remove useless loads_dynamic_offsets when emitting push constants

  • radv: determine the ES type (VS or TES) for GS earlier

  • ci: enable building RADV in debian-release

  • radv: fix vk_object_base_init/finish for push descriptors

  • radv: fix writing combined image/sampler descriptor

  • radv: fix vk_object_base_init/finish for internal device memory objects

  • radv/llvm: fix exporting VS parameters

  • radv: do not set TRAP_PRESENT(1) for fragment shaders

  • aco: fix load_barycentric_at_{offset,sample}

  • radv: declare the shader user locs from the shader arguments

  • radv: determine if a shader uses indirect descriptors from the SGPR loc

  • radv: determine if a shader loads push constants from the SGPR loc

  • radv: remove unnecessary radv_shader_info:base_inline_push_consts

  • radv: remove unnecessary radv_shader_info:num_inline_push_consts

  • radv: do not overwrite loads_push_constants when declaring shader args

  • radv: gather more information about PS in the shader info pass

  • radv,aco: compute and store the SPI PS input in radv_shader_info

  • aco: prevent using undeclared shader arguments for PS

  • radv,aco: remap PS inputs when declaring shader arguments

  • aco: constify radv_shader_{info,args}

  • radv: remove radv_pipeline::layout

  • radv: implement vkGetDeviceBufferMemoryRequirementsKHR()

  • radv: implement vkGetDeviceImageMemoryRequirementsKHR()

  • radv: implement vkGetDeviceImageSparseMemoryRequirementsKHR()

  • radv: advertise VK_KHR_maintenance4

  • radv: use nir_image_deref_{load,store} in the DCC retile compute path

  • radv: remove useless coordinate computation in the compute clear path

  • radv: remove few useless nir_channels() in meta shaders

  • radv: use get_global_ids() to compute coordinates in meta shaders

  • radv: use nir_ssa_undef() for unused image components in meta shaders

  • radv: move ac_shader_config to radv_shader_binary instead of legacy

  • radv: store the post-processed shader binary config to the cache

  • radv,aco: remove nir_intrinsic_load_layer_id

  • radv: remove no-op about the view index in the shader info pass

  • radv: rename needs_multiview_view_index to uses_view_index

  • radv: stop gathering output GS info for vertex shaders

  • aco: cleanup setup_vs_output_info()

  • radv: do not initialize is_ngg_passthrough for geometry shaders

  • radv: remove duplicated code about NGG passthrough determination

  • radv: switch to VK_FORMAT_FEATURE_2_XXX/VkFormatProperties3KHR

  • radv: implement VK_KHR_format_feature_flags2

  • aco: do not return an empty string when disassembly is not supported

  • radv: fix removing PSIZ when it’s not emitted by the last VGT stage

  • radv: fix OpImageQuerySamples with non-zero descriptor set

  • radv: do not remove PSIZ for streamout shaders

  • aco: fix invalid IR generated for b2f64 when the dest is a VGPR

  • aco: fix emitting stream outputs when the first component isn’t zero

  • aco: fix loading 64-bit inputs with fragment shaders

  • radv: re-emit prolog inputs when the nontrivial divisors state changed

  • radv: fix build errors with Android

  • aco: only load streamout buffers if streamout is enabled

  • radv: do not expose buffer features for depth/stencil formats

  • radv/sqtt: fix GPU hangs when capturing from the compute queue

  • radv: fix a sync issue on GFX9+ by clearing the upload BO fence

  • nir: fix constant expression of ibitfield_extract

Sergii Melikhov (2):

  • iris: Fix Null pointer dereferences

  • dri2: Fix Null pointer dereferences

Shmerl (1):

  • vulkan/overlay: don’t display histogram and range for device and format

Simon Ser (18):

  • EGL: sync headers with Khronos

  • egl: add support for EGL_EXT_device_drm_render_node

  • etnaviv: fix renderonly check in etna_resource_alloc

  • etnaviv: fail in get_handle(TYPE_KMS) without a scanout resource

  • freedreno: fail in get_handle(TYPE_KMS) without a scanout resource

  • panfrost: fail in get_handle(TYPE_KMS) without a scanout resource

  • lima: fail in get_handle(TYPE_KMS) without a scanout resource

  • vulkan/wsi/wayland: use drm_fourcc.h for formats

  • vulkan/wsi/wayland: drop support for wl_drm

  • vulkan/wsi/wayland: generalize modifier handling

  • etnaviv: add stride, offset and modifier to resource_get_param

  • panfrost: implement resource_get_param

  • vc4: implement resource_get_param

  • v3d: implement resource_get_param

  • vulkan/wsi/x11: add driconf option to not wait under Xwayland

  • gbm: consistently use the same name for BO flags

  • gbm: add gbm_{bo,surface}_create_with_modifiers2

  • gbm: assume USE_SCANOUT in create_with_modifiers

Simon Zeni (5):

  • gbm: add GBM_FORMAT_R16

  • i915: remove use of backtrace and backtrace_symbols

  • glapi/gl_gentable.py: drop call to backtrace on no op

  • util/u_debug_symbol: remove debug_symbol_name_glibc and execinfo dependency

  • meson: stop searching for execinfo

Stéphane Marchesin (1):

  • virgl: Flush context before waiting on fences

Tapani Pälli (22):

  • crocus: take a reference to memobj bo in crocus_resource_from_memobj

  • crocus: disable depth and d+s formats with memory objects

  • iris: handle depth-stencil import with a wrapper function

  • anv: disable aux for exportable images without modifiers

  • anv: allow stencil memory export

  • anv/android: fix build error due to refactoring

  • mesa: fix timestamp enum with EXT_disjoint_timer_query

  • mesa: GL_ARB_ES3_2_compatibility GL compat profile support

  • anv: remove a format assert when setting up attachments

  • vulkan: provide common functions to check device features

  • anv: remove feature checks from device creation

  • radv: remove feature checks from device creation

  • turnip: remove feature checks from device creation

  • v3dv: remove feature checks from device creation

  • lavapipe: remove feature checks from device creation

  • panvk: remove feature checks from device creation

  • intel/blorp: fix a compile warning about uninitialized use

  • intel/isl: FXT1 support was removed on Gfx12.5

  • swrast: Fix another warning from gcc 11

  • anv/android: fix parameters given for vk_common_QueueSubmit

  • anv: use vk_object_zalloc for wsi fences created

  • iris: clear bos_written when resetting a batch

Thomas H.P. Andersen (1):

  • nine: Fix assert in tx_src_param

Thomas Wagner (6):

  • gallium: add utility and interface for memory fd allocations

  • llvmpipe: add support for EXT_memory_object(_fd)

  • lavapipe: add support for KHR_external_memory_fd

  • llvmpipe: enable EXT_memory_object(_fd)

  • lavapipe: enable KHR_external_memory_fd

  • util: use anonymous file for memory fd creation

Thong Thai (15):

  • gallium: add temporal layers cap enum

  • frontends/va: check number of temporal layers supported by encoder

  • gallium: update h264 struct to track temporal layers

  • radeon/vcn/enc: H.264 SVC encode

  • radeonsi: enable H.264 temporal encoding support for VCN

  • frontends/va: handle h264 num_temporal_layers for SVC encoding

  • gallium: change rate ctrl struct to array

  • r600: change rate ctrl struct to array

  • radeon/vce: change rate ctrl struct to array

  • radeon/vcn/enc: change to per-temporal layer rate control

  • frontends/omx: change rate ctrl struct to array

  • frontends/va: change to per-layer rate control

  • gallium/auxiliary/vl: Add additional deinterlace enum and tracking

  • gallium/util: add half texel offset param to util_compute_blit

  • frontends/va/postproc: Keep track of deinterlacing method being used

Timothy Arceri (20):

  • util: document that workaround also fixes Riptale

  • glsl: replace some C++ code with C

  • nir/gcm: be less destructive with instruction order

  • intel/compiler: call nir_opt_dead_cf() after we have finished all opts

  • intel/compiler: Use GCM in nir_optimize

  • util: add workaround for Full Bore

  • glsl: relax rule on varying matching for shaders older than 4.20

  • intel/compiler: make sure swizzle is applied to if condition

  • nir: add indirect loop unrolling to compiler options

  • nir: move nir_block_ends_in_break() to nir.h

  • nir: add heuristic for instructions in loops with GCM

  • nir: fix GCM when GVN enabled

  • glsl: fix variable scope for instructions inside case statements

  • mesa: fix mesa_problem() call in _mesa_program_state_flags()

  • glsl: fix variable scope for loop-expression

  • glsl: handle scope correctly when inlining loop expression

  • glsl: fix variable scope for do-while loops

  • util/cache: run basic cache tests on the single file cache

  • util/cache: test simple cache put and get between instances

  • mesa: fix buffer overrun in SavedObj texture obj array

Timur Kristóf (71):

  • radv: Use 128-sized vertex grouping for NGG shaders.

  • radv: Don’t compile NGG culling into shaders that write viewport index.

  • radv: Remove num_viewports from radv_skip_ngg_culling.

  • aco: Swap s_and operand order for ballot.

  • aco: Allow elect to take advantage of knowing when all lanes are active.

  • aco: Remove s_and with exec when all lanes are active.

  • radv: Use pre-computed viewport transform for NGG culling state.

  • aco: Fix how p_elect interacts with optimizations.

  • aco, nir, ac: Simplify sequence of getting initial NGG VS edge flags.

  • ac/nir: Use es_accepted variable after culling.

  • ac/nir: Use gs_accepted variable after culling.

  • ac/nir: Don’t count vertices and primitives in wave after culling.

  • nir, aco: Remove vertex and primitive count overwrite intrinsic.

  • ac/nir: Remove unhelpful nir_opt_cse from ac_nir_lower_ngg_nogs.

  • aco: Use Navi 10 empty NGG output workaround on NGG culling shaders.

  • radv: Don’t toggle PC oversubscription for NGG culling.

  • radv: Use ac_compute_late_alloc in radv_pipeline.

  • ac: Remove deprecated use_late_alloc field as nobody uses it anymore.

  • radv: Write RSRC2_GS for NGGC when pipeline is dirty but not emitted.

  • aco: Fix to_uniform_bool_instr when operands are not suitable.

  • radv, ac, aco: Use indices 0-2 of gs_vtx_offset argument array on GFX9+.

  • radeonsi: Change GS vertex offset arguments to use gs_vtx_offset array.

  • ac: Calculate workgroup sizes of HW stages that operate in workgroups.

  • radv: Calculate workgroup sizes in radv_pipeline.

  • radv: Remove superfluous workgroup size calculations.

  • aco: Use workgroup size from input shader info.

  • aco: Consider LDS usage by PS inputs in MaxWaves calculation.

  • aco: Consider maximum number of workgroups per CU/WGP on Navi.

  • aco: Emit zero for the derivatives of uniforms.

  • aco: Unset 16 and 24-bit flags from operands in apply_extract.

  • nir: Add unsigned upper bound for extract opcodes.

  • nir: Fix local_invocation_index upper bound for non-compute-like stages.

  • nir: Add comment to explain the sad_u8x4 opcode.

  • aco: Fix invalid usage of std::fill with std::array.

  • ac/nir/ngg: Delete unused struct.

  • ac/nir/nggc: Don’t stop applying reusable variables at prim export.

  • ac/nir/nggc: Only repack arguments that are needed.

  • ac/nir/nggc: Move gs_alloc_req up in NGG culling shaders.

  • aco: Use Builder reference in emit_copies_block.

  • aco: Skip code paths to emit copies when there are no copies.

  • aco/optimize_postRA: Use iterators instead of operator[] of std::array.

  • aco: Add some useful info to the README for debugging.

  • radv: Remove PSIZ output when it isn’t needed.

  • aco: Add ability to optimize v_lshl + v_sub into v_mad_i32_i24.

  • aco/isel: Fix emit_vop2_instruction to apply 16/24-bit flags properly.

  • ac/nir: Remove byte permute from prefix sum of the repack sequence.

  • ac/nir: Fix match_mask to work correctly for VS outputs.

  • nir: Exclude non-generic patch variables from get_variable_io_mask.

  • radv: Disable HW generated edge flags for NGG shaders.

  • ac/nir: Emit edge flag instructions conditionally.

  • radv/llvm: Don’t read edge flags anymore.

  • radv: Fix gs_vgpr_comp_cnt for NGG culling in vertex shaders.

  • ac/nir/nggc: Refactor save_reusable_variables.

  • ac/nir/nggc: Don’t reuse uniform values from divergent control flow.

  • radv: Select PC oversubscription rate based on number of PS params.

  • radv: Reduce NGG culling small draw threshold to 128.

  • aco: Allow p_extract to have different definition and operand sizes.

  • aco: Implement integer conversions using p_extract.

  • aco: Omit p_extract after ds_read with matching bit size.

  • aco: Don’t write m0 register for LDS instructions on GFX9+.

  • aco: Fix small primitive precision.

  • aco: Fix determining whether any culling is enabled.

  • radv: Don’t declare ngg_gs_state when there is no API GS.

  • radv: Enable NGG culling by default on GFX10.3, add nonggc debug flag.

  • ac/nir/cull: Accept NaN and +/- Inf in face culling.

  • ac/nir/nggc: Write undef to variables in non-repacked ES threads.

  • aco/optimizer: Skip SDWA on v_lshlrev when unnecessary in apply_extract.

  • drirc: Fix indentation.

  • drirc: Apply radv_invariant_geom workaround to Resident Evil Village.

  • drirc: Apply radv_invariant_geom workaround to World War Z games.

  • aco: Fix how p_is_helper interacts with optimizations.

Tomeu Vizoso (40):

  • panvk: Don’t try to update samplers if they are immutable

  • panvk: Start a new batch when the job index gets above the limit

  • panvk: Close batch when ending a command buffer

  • panvk: Move check for fragment requirement up to the draw

  • panvk: A pipeline might not be bound when the render pass is ended

  • panvk: Expose panvk_cmd_alloc_fb_desc and panvk_cmd_alloc_tls_desc

  • panvk: Implement vkCmdClearAttachments

  • docs/ci: Update http cache config to let Authorization headers pass through

  • freedreno/ci: Move rules for restricted jobs to test-source-dep.yml

  • ci: Update canvas_text trace

  • virgl/ci: Have LLVMPipe use more threads for rendering

  • virgl/ci: Rebalance concurrency

  • virgl/ci: Wait a bit before shutting the VM down

  • virgl/ci: Set NIR_VALIDATE=0 on the host

  • panfrost: Add padding to pan_blit_blend_shader_key

  • iris/ci: Add manual jobs for tracking performance

  • panvk: Initialize timestamp for disk cache

  • freedreno/ci: Correctly set freq governors to max

  • iris/ci: Correctly set freq governors to max

  • panvk/ci: Build-test panvk

  • ci: Ensure the DRM device is open

  • lavapipe: add xfails for whole of CTS

  • vulkan: Read len attribute of parameters to functions

  • vulkan: Generate code to place commands in a queue

  • vulkan: Generate entrypoints that enqueue commands

  • lavapipe: Use generated command queue code

  • lavapipe: Use c_msvc_compat_args

  • vulkan: Remove dependency on Python 3.9+

  • Revert “lavapipe: unbreak imageless framebuffer”

  • vulkan: Copy pNext structures when enqueuing commands

  • ci: Uprev piglit to 99be1b06ff36

  • ci: Stop adding link to tracie dashboard

  • panfrost/ci: Enable test runs on G72

  • panvk: Move CmdClear* impl to a separate file

  • panfrost/ci: Move CI files to src/panfrost

  • panfrost/ci: Test panvk on Mali G52

  • ci: Rebuild kernel with Amlogic KMS support

  • panfrost/ci: Run Piglit’s quick_gl tests on G52

  • ci: Add support for lazor Chromebooks

  • ci: Let manual LAVA jobs have a longer timeout than others

Tony Wasserka (24):

  • radv: Rename radv_shader_helper.h to radv_llvm_helper.h

  • aco: Separate LLVM/CLRX asm printers more cleanly

  • aco: Extend set of supported GPUs that can be disassembled with CLRX

  • radv: Build code which depends on LLVM only when enabled

  • radv: Disable shader disassembly when no disassembler is available

  • aco/tests: Assert that the requested IR is actually provided

  • aco/spill: Avoid unneeded copies when iterating over maps

  • aco: Use std::vector for the underlying container of std::stack

  • aco/spill: Remove unused container

  • aco/spill: Replace map[] with map::insert

  • aco/spill: Avoid copying next_use maps more often than needed

  • aco/spill: Persist memory allocations of local next use maps

  • aco/spill: Avoid destroying local next use maps over-eagerly

  • aco/spill: Replace vector<map> with vector<vector> for local_next_use

  • aco/spill: Prefer unordered_map over map for next use distances

  • aco/spill: Avoid copying current_spills when not needed

  • aco/spill: Reduce redundant std::map lookups

  • aco/spill: Replace an std::map to booleans with std::set

  • aco/spill: Store remat list in an std::unordered_map instead of std::map

  • aco/spill: Change worklist to a single integer

  • aco/spill: Reduce allocations in next_uses_per_block

  • aco/spill: Clarify use of long-lived references by adding const

  • aco/spill: Use unordered_map for spills_exit

  • aco/spill: Use std::unordered_map for spills_entry

Vadym Shovkoplias (3):

  • driconf, glsl: Add a vs_position_always_precise option

  • drirc: Set vs_position_always_precise for Assault Android Cactus

  • intel/fs: Fix a cmod prop bug when cmod is set to inst that doesn’t support it

Vasily Khoruzhick (2):

  • lima: handle fp16 vertex formats

  • lima: split_load_input: don’t split unaligned vec2

Veerabadhran Gopalakrishnan (2):

  • radeon/vcn: Add FW header flag to enable VP9 header parsing

  • gallium/va: Remove VP9 header parsing for secure playback

Vinson Lee (17):

  • nv50/ir: Initialize Value member id in constructor.

  • asahi: Move assignment after null check.

  • spirv_to_dxil: Fix missing-prototypes build error.

  • meson: Remove duplicate xvmc in build summary.

  • nir: Initialize evaluate_cube_face_index_amd dst.x.

  • zink: Remove unnecessary null checks.

  • nv50/ir: Add FlatteningPass constructor.

  • freedreno: Require C++17.

  • broadcom/compiler: Fix qpu.flags.muf typo.

  • glx: Fix unused-variable warning with macOS build.

  • draw/tess: Fix unused-function warning with draw-use-llvm=disabled.

  • nv50/ir: Add DeadCodeElim constructor.

  • pps: Avoid duplicate elements in with_datasources array.

  • freedreno: Add valgrind dependency.

  • anv: Fix assertion.

  • radv: Fix memory leak on error path.

  • virgl: Allocate qdws after virgl_init_context to avoid leak.

Witold Baryluk (2):

  • zink: Do not access just freed zink_batch_state

  • zink: Fully initialize VkBufferViewCreateInfo for hashing

Yevhenii Kharchenko (1):

  • iris: fix layer calculation for TEXTURE_3D ReadPixels() on mip-level>0

Yevhenii Kolesnikov (19):

  • glsl: Add operator for .length() method on implicitly-sized arrays

  • glsl: Properly handle .length() of an unsized array

  • vulkan: Add a common vk_command_buffer structure

  • anv: Use a common vk_command_buffer structure

  • radv: Use a common vk_command_buffer structure

  • turnip: Use a common vk_command_buffer structure

  • v3dv: Use a common vk_command_buffer structure

  • lavapipe: Use a common vk_command_buffer structure

  • vulkan: Add a common vk_queue structure

  • anv: Use a common vk_queue structure

  • radv: Use a common vk_queue structure

  • turnip: Use a common vk_queue structure

  • v3dv: Use a common vk_queue structure

  • lavapipe: Use a common vk_queue structure

  • vulkan: Implement VK_EXT_debug_utils

  • vulkan/enum_to_str: Add generator for VkObjectType to Vulkan Handle

  • vulkan: Add vk_asprintf and vk_vasprintf helpers

  • vulkan: Add convenience debug message helpers

  • anv: Switch to new debug message helpers

Yipeng Chen (Jasber) (1):

  • radeonsi: do not use staging texture for APU

Yiwei Zhang (24):

  • venus: cache ahb backed buffer memory type bits requirement

  • venus: fix all missing vn_object_base_fini

  • venus: scrub ignored fields of pipeline info when rasterization is disabled

  • venus: refactor failure path for sets allocation

  • venus: add vn_descriptor_set_layout_init

  • venus: descriptor layout to track more binding infos

  • venus: layout to track variable descriptor count binding info

  • venus: descriptor pool to track pool state

  • venus: descriptor set to track descriptor count of last binding

  • venus: check descriptor allocations against pool resource

  • venus: conditionally enable async descriptor set allocation

  • venus: set maxMipLevels to 1 for ahb images

  • venus: renderer to check map size only when mappable

  • venus: workaround a blob_mem mappable size check issue

  • venus: suggest the proper sampler ycbcr model conversion based on format

  • docs: update vn extension list

  • venus: amend supported extensions list

  • venus: properly check and fill ahb buffer properties

  • util: fix sign comparison

  • radv/anv android: rename buffer usage camera mask

  • android_stub: update platform headers to include atrace

  • venus: update to latest venus-protocol to include tracing

  • dri_interface: remove obsolete interfaces

  • dri_interface: remove gl header

Yogesh Mohan Marimuthu (2):

  • radeonsi: remove redundant setting scratch_state atom dirty

  • radeonsi: set scratch_state dirty only if ctx->scratch_buffer allocated

Yogesh Mohanmarimuthu (1):

  • vulkan/device-select: select correct default device for xcb apiVersion 1.0

Zachary Michaels (1):

  • X11: Ensure that VK_SUBOPTIMAL_KHR propagates to user code

Zhu Yuliang (1):

  • gallium/vl: don’t leak fd in vl_dri3_screen_create

byte[] (1):

  • i965: Explicitly abort instead of exiting on batch failure

liuyujun (1):

  • gallium: fix surface->destroy use-after-free

mattvchandler (1):

  • gallium/osmesa: fix buffer resizing

mwezdeck (1):

  • mesa: validate texture format against GL/ES ctx

orbea (1):

  • build: add sha1_h for lp_texture.c

suijingfeng (4):

  • gallivm: add basic mips64 support and set mcpu to mips64r5 on ls3a4000

  • pass egl-symbols-check test on mips64el

  • gallivm: fix pass init order on mips64 with llvm 8

  • llvmpipe: correct the debug information printed with GALLIVM_PERF=nopt

xantares (1):

  • lavapipe: Fix 32bits windows build