Mesa 20.1.0 Release Notes / 2020-05-27

Mesa 20.1.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 20.1.1.

Mesa 20.1.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
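
Because the reported version depends on the driver and on the kind of context requested, an application may want to check the GL_VERSION string at run time rather than assume 4.6. The following is a minimal sketch in Python, not part of Mesa. The helper names and the sample strings are hypothetical, and it assumes the version string has already been obtained from glGetString(GL_VERSION) through some binding.

```python
import re

def parse_gl_version(version_string):
    """Return (major, minor) taken from the first 'major.minor' pair found.

    Desktop GL strings begin with the version number; OpenGL ES strings
    carry an 'OpenGL ES ' prefix, so a search is used, not a strict match.
    """
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"no version number in {version_string!r}")
    return int(match.group(1)), int(match.group(2))

def meets_requirement(version_string, required=(4, 6)):
    """True if the reported version is at least `required`."""
    return parse_gl_version(version_string) >= required

# Hypothetical strings, for illustration only; real output varies by driver.
core_string = "4.6 (Core Profile) Mesa 20.1.0"
compat_string = "3.0 Mesa 20.1.0"

print(meets_requirement(core_string))
print(meets_requirement(compat_string))
```

Comparing tuples keeps the check correct for versions such as 3.10 versus 3.2, which a plain string or float comparison would get wrong.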

Mesa 20.1.0 implements the Vulkan 1.2 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
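
The apiVersion field is a packed 32-bit integer, not a string. As a rough illustration, the sketch below decodes it in Python using the bit layout of the VK_MAKE_VERSION and VK_VERSION_MAJOR/MINOR/PATCH macros (major in the bits above 22, minor in 10 bits, patch in 12 bits). The function names are hypothetical and the sample value is constructed here, not read from a real device.

```python
def make_vk_version(major, minor, patch):
    """Pack a version the way VK_MAKE_VERSION does."""
    return (major << 22) | (minor << 12) | patch

def decode_vk_version(api_version):
    """Unpack a VkPhysicalDeviceProperties.apiVersion-style integer."""
    major = api_version >> 22
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return major, minor, patch

# Constructed example value; a real driver reports its own apiVersion.
sample = make_vk_version(1, 2, 131)
major, minor, patch = decode_vk_version(sample)
print(f"{major}.{minor}.{patch}")

# A driver exposing only Vulkan 1.1 would fail this kind of check.
print(decode_vk_version(sample)[:2] >= (1, 2))
```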

SHA256 checksum

2109055d7660514fc4c1bcd861bcba9db00c026119ae222720111732dba27c83  mesa-20.1.0.tar.xz
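
One way to check a downloaded tarball against the checksum above is sketched here in Python with the standard hashlib module. The function names are hypothetical, and the sketch assumes the tarball has already been downloaded to a local path.

```python
import hashlib

# Checksum published in these release notes for mesa-20.1.0.tar.xz.
EXPECTED_SHA256 = "2109055d7660514fc4c1bcd861bcba9db00c026119ae222720111732dba27c83"

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

With the tarball in the current directory, `matches_checksum("mesa-20.1.0.tar.xz")` should return True for an intact download. Reading in chunks keeps memory use flat regardless of file size.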

New features

  • GL_ARB_compute_variable_group_size on i965.

  • GL_EXT_depth_bounds_test on Iris.

  • GL_EXT_texture_shadow_lod on radeonsi, nvc0.

  • GL_NV_alpha_to_coverage_dither_control on radeonsi.

  • GL_NV_copy_image on all gallium drivers.

  • GL_NV_pixel_buffer_object on all gallium drivers, i915, i965, swrast.

  • GL_NV_viewport_array2 on nvc0 (GM200+).

  • GL_NV_viewport_swizzle on nvc0 (GM200+).

  • VK_AMD_memory_overallocation_behavior on RADV.

  • VK_KHR_shader_non_semantic_info on Intel, RADV.

  • GL_EXT_draw_instanced on gles2

  • VK_KHR_8bit_storage for ACO on GFX8+

  • VK_KHR_16bit_storage for ACO on GFX8+ (storageInputOutput16 is still unsupported)

  • shaderInt16 for ACO on GFX9+

  • VK_KHR_shader_float16_int8 for ACO on GFX8+ (shaderFloat16 is still unsupported)

  • VK_EXT_robustness2 on Intel, RADV.

  • Add Rocket Lake (RKL) support on anvil and iris.

Bug fixes

  • Reproducible i915 gpu hang Intel Iris Plus Graphics (Ice Lake 8x8 GT2)

  • glsl: regression affecting shader compilation time

  • freedreno: glamor issue with x11 desktops

  • [gles3] supertuxkart: some textures are incorrect

  • Double lock in fbobject.c

  • [bisected] Steam crashes when newest Iris built with LTO

  • i965/vec4: opt_cse_local cause the out of bound array access

  • NIR: Regression on shader using 8/16-bit integers

  • lp_bld_intr.c:70:16: error: use of undeclared identifier ‘LLVMFixedVectorTypeKind’; did you mean ‘LLVMVectorTypeKind’?

  • Deadlock in anv_timelines_wait()

  • post_version.py does not work with release candidates

  • radv regression on android

  • src\util\meson.build:294:4: ERROR: Program or command ‘winepath’ not found or not executable

  • debug builds are massively broken on Windows

  • heavy glitches on amd ryzen 5 since version 20.x

  • zink asserts with 32-bit boolean

  • Dirt: Showdown bad performance and broken rendering with enabled advanced lighting

  • gravit & Firefox WebGL broken since 3dc2ccc14c0e035368fea6ae3cce8c481f3c4ad2 “ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE”

  • mesa 20.0.5 causing kitty to crash

  • radeonsi: “Torchlight II” trace showing regression on mesa-20.0.6 [bisected]

  • [RADV/LLVM/ACO/Regression] After mesa commit a3dc7fffbb7be0f1b2ac478b16d3acc5662dff66 all games get stuck at start

  • Android building error after commit 2ab45f41

  • iris: Crash when trying to capture window in OBS Studio

  • Properly annotate control flow convergence points

  • intel/compiler: Register coalesce doesn’t move conditional modifiers

  • [bisected] [iris] mpv under wayland: failed to import supplied dmabufs: Unsupported buffer format 808669784

  • [Bisected][Iris] piglit.spec.!opengl 1_1.max-texture-size crashes on x32 platform

  • anv : android deqp assert dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.image#export_import_bind_bind

  • GL cts gtf30.GL3Tests.sgis_texture_lod.sgis_texture_lod_basic_getter failure

  • freedreno/a6xx: texture cache vs realloc_bo()

  • [Bisected] dEQP-VK.subgroups.ballot_mask.ext_shader_subgroup_ballot.* failures

  • dEQP-VK.subgroups.size_control.compute.* crashes on HSW and TGL

  • zink: framebuffer and pipeline caches accumulate due to zink_create_surface()

  • FTBFS due to LLVM commit 2dea3f129878 (LLVMVectorTypeKind is gone)

  • [r600/Turks] 20.0.2: modesetting/radeon driver SIGABRT at loading X (kernel 5.5.10, ppc64)

  • piglit spec.!opengl 1.0.gl-1.0-fpexceptions crash on Iris

  • ci: Update the Wine version

  • SPIR-V: Failure in dEQP-VK.graphicsfuzz.control-flow-switch

  • SPIR-V: OpConvertUToPtr from spec constant fails to compile

  • ACO: Regression: Texture corruption

  • radv: Reading ViewportIndex in fragment shader returns garbage

  • piglit spec.arb_gpu_shader_fp64.execution.arb_gpu_shader_fp64-vs-non-uniform-control-flow-ssbo crash on Iris

  • piglit spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-neg-abs.shader_test failure on IVB

  • [ANV] gfxbench Aztec Ruins misrenders on gen11+

  • glxinfo cmd crashed

  • radeonsi: GL_LINES rendering is affected by GL_POINT_SPRITE

  • nir: nir_lower_returns can’t handle nested loops

  • Graphic artifacts with Mesa 20.0.4 on intel HD 510 GPU

  • [Iris] [Bisected] Some KHR-GL46.arrays_of_arrays_gl. tests are failing

  • Mesa 20 regression makes Lightsprint demos crash

  • metro redux games crash upon loading certain levels on amdgpu

  • dri_common.h:58:8: error: unknown type name ‘__GLXDRIdrawable’

  • Graphical glitches on Intel Graphics when Xorg started on Iris driver

  • GL/GLES test crashes on G33/i915 platforms

  • SIGSEGV src/compiler/glsl/ast_function.cpp:53

  • manywin aborts with “i965: Failed to submit batchbuffer: Invalid argument”

  • v3d: transform feedback issue

  • radv: Enable TC-compat HTILE in VK_IMAGE_LAYOUT_GENERAL.

  • radv: dEQP-VK.binding_model.descriptorset_random.sets4.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.noia.0 segfault

  • radv: RAVEN fails dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy

  • buffer overflow in nouveau driver on mesa 20.0.2

  • xmlconfig sha1 code has overflow and possible bug

  • enable storageBuffer16BitAccess feature in radv for SI and CIK

  • Build Fails with Clang Shared Library

  • Thousands of 32 bit regressions in VulkanCTS and GL test suites due to handling of cross-invocation

  • anv: isl assert when running dEQP-VK.geometry.layered.3d.*.readback

  • Weston drm-backend.so seems to fail with Mesa master and LIBGL_ALWAYS_SOFTWARE=1

  • freedreno/turnip: Don’t request pixlodenable when we don’t use it

  • VulkanCTS uniform_buffer_block_geom spins forever

  • freedreno: dEQP-GLES3.functional.fbo.msaa.4_samples.r16f flakiness in CI

  • src\util\meson.build:291:4: ERROR: Program or command ‘winepath’ not found or not executable

  • RADV: flickering textures in Q.U.B.E. 2 through Proton

  • Missing ENDBR in entry_x86-64_tls.h, entry_x86_tls.h and entry_x86_tsd.h

  • [regression][bisected] Android build test fails: marshal_generated.c’, missing and no known rule to make it

  • Missing ENDBR in rtasm_x86sse.c

  • src/intel/tools/aubinator_viewer.cpp:383:52: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘uint64_t {aka long long unsigned int}’ [-Werror=format=]

  • src/compiler/glsl/ast_to_hir.cpp:2134: ir_rvalue* ast_expression::do_hir(exec_list*, _mesa_glsl_parse_state*, bool): Assertion `result != NULL || !needs_rvalue' failed.

  • process_test fails on macOS

  • Vulkan Overlay is blinking

  • Regression: 9d64ad2fe79 broke Rocket League

  • GameMaker games (Memoranda and Undertale) + amdgpu — Segmentation fault on launch

  • Civilization VI - Animated leader characters small black squares artifacts

  • [ACO] Reliable crash with RPCS3 that is not present with LLVM

  • [RADV] vkCmdBindTransformFeedbackBuffersEXT pSizes optional parameter not handled

  • [RadeonSI] - Curse of the Dead Gods (1123770) - Lighting is not rendering correctly.

  • soft-fp64: __fsat64 incorrectly returns NaN for a NaN input. It should return zero.

  • Hang when using glWaitSync with multithreaded shared GL contexts

  • RPCS3 / Persona 5 - Performance regression [RADV / Navi]

  • [ANV] Rendering corruption in Shadow of the Tomb Raider

  • src/compiler/glsl/glcpp/glcpp-parse.y:1297: _token_print: Assertion `!"Error: Don't know how to print token."' failed.

  • [CTS] dEQP-VK.descriptor_indexing.* fails on RADV/LLVM

  • Unigine Valley failure / assert

  • [Gen9/icl] [Bisected] [Regression] dEQP-GLES3.functional.shaders.loops.short_circuit.do_while_fragment fail

  • [RadeonSI][gfx10/navi] Kerbal Space Program crash: si_draw_vbo: Assertion `0’ failed

  • Budget Cuts hits VK_AMD_shader_fragment_mask assert

  • Follow-up from “i965/blorp: Don’t resolve HiZ unless we’re reinterpreting”

  • crash in vc4_write_uniforms with shaders involving YUV textures

  • Corrupted output with vaapi 10 bit -> 8 bit transcoding on AMD RAVEN

  • tessellator.cpp:78:7: error: ‘fmin’ is missing exception specification ‘noexcept’

  • Please add Raspberry Pi 4 to features.txt

  • Build failure with bison 2.3.

  • Mesa build fails on 32 bit architecture

  • Incorrect rendering with vaapi + uyvy422

  • V3D/Broadcom (Raspberry Pi 4) - GLES 3.1 - GL_EXT_texture_norm16 advertised, but not usable

  • mesa-20.0.0/src/amd/compiler/aco_instruction_selection.cpp:7221:55: style: Same expression on both sides of ‘&&’

  • i965 assertion failure in fallback_rgbx_to_rgba

  • vaapi bob deinterlacer produces wrong output height on AMD

  • Compute copies do not handle SUBSAMPLED formats

  • Please document RADV_TEX_ANISO variable in envvars.html

  • unexpected CI failure

  • Multiple glapi_mapi_tmp.h

  • drisw crashes on calling NULL putImage on EGL surfaceless platform (pbuffer EGLSurface)

  • VRAM leak with vulkan external memory + opengl memory objects

  • [radeonsi][vaapi][bisected] invalid VASurfaceID when playing interlaced DVB stream in Kodi

  • [RADV] GPU hangs while the cutscene plays in the game Assassin’s Creed Origins

  • ACO: The Elder Scrolls Online crashes on startup (Navi)

  • Broken rendering of glxgears on S/390 architecture (64bit, BigEndian)

  • aco: sun flickering with Assassin’s Creed Origins

  • !1896 broke ext_image_dma_buf_import piglit tests with radeonsi

  • aco: wrong geometry with Assassin’s Creed Origins on GFX6

  • valgrind errors since commit a8ec4082a41

  • src/broadcom/qpu/qpu_pack.c:962:25: error: implicit declaration of function ‘ffs’ is invalid in C99 [-Werror,-Wimplicit-function-declaration] mux_b = ffs(desc->mux_b_mask) - 1;

  • X fails to start with amdgpu and Mesa 20.1 on Fedora

  • GPU hangs in Factorio on Radeon RX 5700 XT (MSI GAMING X)

  • OSMesa osmesa_choose_format returns a format not supported by st_new_renderbuffer_fb

  • Build error with VS on WIN

  • Using EGL_KHR_surfaceless_context causes spurious “libEGL warning: FIXME: egl/x11 doesn’t support front buffer rendering.”

  • !3460 broke texsubimage test with piglit on zink+anv

  • VERSION needs to be bumped for trunk master

  • The screen is black when using ACO

Changes

Abhishek Kumar (1):

  • anv/android: fix assert in anv_import_ahw_memory

Adam Jackson (1):

  • gallium: enable EGL_EXT_image_dma_buf_import_modifiers unconditionally

Albert Astals Cid (5):

  • cube_face_coord: Use fabsf instead of fabs since we know it’s floats

  • cube_face_index: Use fabsf instead of fabs since we know it’s floats

  • aco: Minor optimization in spill_ctx constructor

  • aco: pass vars by const &

  • Fix promotion of floats to doubles

Alejandro Piñeiro (7):

  • docs/features: add v3d driver

  • nir/linker: remove reference to just SPIR-V linking

  • v3d/tex: don’t configure tmu config 1 if not needed

  • v3d/tex: Configuration Parameter 1 can be only skipped if P2 can be skipped too

  • v3d/packet: fixing TMU_Config_Parameter_2 definition

  • nir: add nir_tex_instr_need_sampler helper

  • v3d: support for textureQueryLOD

Alexandros Frantzis (3):

  • gitlab-ci: Automated testing with OpenGL traces

  • gitlab-ci: Fix traces caching in tracie

  • gitlab-ci: Check the Mesa version used for tracie tests

Alyssa Rosenzweig (505):

  • pan/midgard: Break out one-src read_components

  • pan/midgard: Implement mixed-type constant packing

  • panfrost: Avoid overlapping copy

  • pan/midgard: Check for null consts

  • pan/midgard: Remove unused variable

  • panfrost: Use size0 when calculating the offset to a depth level

  • pan/midgard: Fix scheduling issue with csel + render target reference

  • panfrost: Simplify swizzle translation

  • panfrost: Update comment about magic number relating to barriers

  • panfrost: Ensure compute shader_meta is zeroed

  • panfrost: Identify mali_shared_memory structure

  • panfrost: Unify bifrost_scratchpad with mali_shared_memory

  • panfrost: Rename bifrost_framebuffer->mali_framebuffer

  • panfrost: Rename unknown2_8 to padding

  • panfrost: Allocate RAM backing of shared memory

  • pan/midgard: Track pressure when scheduling ld/st

  • pan/midgard: Fix missing prefixes

  • pan/midgard: Fix swizzles harder

  • pan/midgard: Implement barriers

  • pan/midgard: Allow jumping out of a shader

  • pan/midgard: Fix 32/64 mixed swizzle packing

  • pan/midgard: Use dummy tag for empty shaders

  • pan/midgard: Improve barrier disassembly

  • pan/midgard: Overhaul tag handling

  • pan/midgard: Imply next tags

  • pan/midgard: Infer tags entirely

  • pan/midgard: Set xyzx swizzle for load_compute_arg

  • pan/midgard: Identify stack barrier flag

  • pan/midgard: Don’t crash with constants on unknown ops

  • pan/midgard: Use fprintf instead of printf for constants

  • pan/decode: Remove extraneous newline

  • pan/decode: Add `minimal` mode

  • pan/decode: Cleanup pandecode_jc

  • panfrost: Implement PAN_DBG_SYNC with pandecode/minimal

  • panfrost: Print synced traces to stderr

  • panfrost: Rewrite scoreboarding routines

  • panfrost: Update scoreboarding notes

  • panfrost: Cleanup transfer_map

  • panfrost: Avoid reading GPU memory when packing vertices

  • panfrost: Debitfieldize mali_uniform_buffer_meta

  • panfrost: Remove enum panfrost_memory_layout

  • panfrost: Remove dirty tracking

  • panfrost: Remove old comment

  • panfrost: Remove old hack

  • panfrost: Remove flush_frontbuffer

  • pan/midgard: Identify clamp(x, -1.0, 1.0) flag

  • panfrost: Move checksum routines to root panfrost

  • panfrost: Move pan_afbc.c to root

  • panfrost: Move format translation to root

  • panfrost: Rewrite texture descriptor creation logic

  • nir: Add SSBO->global lowering pass

  • pan/midgard: Lower SSBOs in NIR

  • pan/midgard: Implement nir_intrinsic_get_buffer_size

  • pan/midgard: Implement load/store_shared

  • panfrost: Combine get_index_buffer with bound computation

  • panfrost: Implement index buffer cache

  • pan/decode: Dump scratchpad size if present

  • pan/midgard: Don’t spill near a branch

  • panfrost: Fix gl_VertexID/InstanceID

  • panfrost: Fix padded_vertex_count generation

  • panfrost: Update spilling comment framebuffer->shared

  • panfrost: Don’t set shared->unk0

  • panfrost: Fix param getting

  • panfrost: Default to 256 threads for TLS

  • panfrost: Reserve an extra page for spilling

  • panfrost: Simplify stack shift calculation

  • panfrost: Expose PIPE_CAP_PRIMITIVE_RESTART

  • panfrost: Add PAN_MESA_DEBUG=gles3 option

  • panfrost: Increase SSBO/image limit from 4->8

  • pan/midgard: Allow inverted inverted ops

  • pan/midgard: Allow fusing inverted sources for inverted ops

  • pan/midgard: Partially fix 64-bit swizzle alignment

  • pan/midgard: Extract nir_ssa_index helper

  • pan/midgard: Add LDST_ADDRESS property

  • pan/midgard: Fix load/store argument sizing

  • pan/midgard: Round up bytemasks when promoting uniforms

  • pan/midgard: Force address alignment

  • pan/midgard: Add address analysis framework

  • pan/midgard: Use address analysis for globals, etc

  • pan/decode: Calm an assert to a pandecode error

  • pan/decode: Restore bifrost sample_locations

  • pan/decode: Fix tiler weights printing

  • pan/decode: Skip analysis for Bifrost tiler structures

  • pan/bi: Add discard ops

  • pan/bi: Add ICMP.GL.NEQ op

  • pan/bi: Move notes on FMA opcodes from disassembler

  • pan/bi: Introduce CSEL4 class

  • pan/bi: Move notes on ADD ops to notes file

  • pan/bi: Decode FMA_SHIFT properly

  • pan/bi: Add v4i8 mode to FMA_SHIFT

  • pan/bi: Identify extended FMA opcodes

  • pan/bi: Decode ADD_SHIFT properly

  • pan/bi: Combine LOAD_VARYING_ADDRESS instructions by type

  • pan/bi: Squash LD_ATTR ops together

  • pan/bi: Structify FMA_FADD

  • pan/bi: Move some definitions from disasm to bifrost.h

  • panfrost: Add note about preloaded varyings

  • pan/bi: Gut old compiler

  • pan/bi: Stub out new compiler

  • pan/bi: Add the control flow graph

  • pan/bi: Add src/dest fields to bifrost_instruction

  • pan/bi: Add class properties

  • pan/bi: Add modifiers to bi_instruction

  • pan/bi: Add BI_GENERIC property

  • pan/bi: Factor out enum bifrost_minmax_mode

  • pan/bi: Add a bifrost_roundmode field

  • pan/bi: Add bifrost_minmax_mode field

  • pan/bi: Add bi_load structure

  • pan/bi: Pull out bifrost_load_var

  • pan/bi: Add bi_load_vary structure

  • pan/bi: Add PAN_SCHED_* flags

  • pan/bi: Add bi_clause, bi_bundle abstractions

  • pan/bi: Add dest_type field to bifrost_instruction

  • pan/bi: Add special indices

  • pan/bi: Add constant field to bi_instruction

  • pan/bi: Add class-specific ops

  • pan/bi: Add clause header fields to bi_clause

  • pan/bi: Clarify special op scheduling

  • pan/bi: Add swizzles

  • pan/bi: Add source type for conversions

  • pan/bi: Add EXTRACT, MAKE_VEC synthetic ops

  • pan/bi: Add constants to bi_clause

  • pan/bi: Add pred/successors to build CFG

  • pan/bi: Extract bifrost_branch structure

  • pan/bi: Add bi_branch data

  • pan/bi: Add CSEL condition

  • pan/bi: Add high-latency property for classes

  • pan/bi: Add quirks system

  • pan/bi: Add IR iteration macros

  • pan/bi: Move some print routines out of the disasm

  • pan/bi: Add BIR manipulation routines to bir.c

  • pan/bi: Move bi_interp_mode_name to bi_print

  • pan/bi: Add bi_instruction printing

  • pan/bi: Add bi_print_bundle for printing bi_bundle

  • pan/bi: Add bi_print_clause

  • pan/bi: Add bi_print_block

  • pan/bi: Add bi_print_shader

  • pan/bi: Lower and optimize NIR

  • pan/bi: Walk through the NIR control flow graph

  • pan/bi: Improve block printing

  • pan/bi: Don’t print types for unconditional branches

  • pan/bi: Print branch target

  • pan/bi: Add instruction emit/remove helpers

  • pan/bi: Call nir_lower_io_to_temporaries in cmdline

  • pan/bi: Add support for if-else blocks

  • pan/bi: Handle loops when ingesting CFG

  • pan/bi: Handle jumps (breaks, continues)

  • pan/bi: Fix destination printing

  • pan/bi: Implement nir_intrinsic_load_interpolated_input

  • pan/bi: Add blend_location to IR for BI_BLEND

  • pan/bi: Add bi_schedule_barrier helper

  • pan/bi: Implement store_output for fragment shaders

  • pan/bi: Implement load_input for vertex shaders

  • pan/bi: Add helpers for creating temporaries

  • pan/bi: Implement store_vary for vertex shaders

  • pan/bi: Add preliminary LOAD_UNIFORM implementation

  • pan/bi: Implement load_const

  • pan/bi: Add dummy scheduler

  • pan/bi: Rename next-wait to simply ‘wait’

  • pan/bi: Fix Android.mk

  • panfrost: Move mir_to_bytemask to common code

  • pan/bi: Generalize swizzles to avoid extracts

  • pan/bi: Introduce writemasks

  • pan/bi: Remove bi_load

  • pan/bi: Lower vec* to writemasks in NIR

  • pan/bi: Add initial handling of ALU ops

  • pan/bi: Allow inlining constants

  • pan/bi: Implement fsat as mov.sat

  • pan/bi: Add a bunch of ALU ops

  • pan/bi: Add BI_SPECIAL_* enum

  • pan/bi: Handle special ops in NIR->BIR

  • pan/bi: Implement fabs, fneg as fmov with mods

  • pan/bi: Disable lower_sub

  • pan/bi: Add isub op

  • pan/bi: Import algebraic pass from midgard

  • pan/bi: Implement nir_op_bcsel

  • pan/bi: Lower b2f to bcsel

  • pan/bi: Specify comparison op for BI_CMP

  • pan/bi: Print source types unconditionally

  • pan/bi: Implement comparison opcodes via BI_CMP

  • panfrost: Promote midgard_program to panfrost/util

  • pan/midgard: Remove unused iterators

  • pan/midgard: Adjust sysval-related prototypes

  • pan/midgard: Remove indexing dependency of sysvals

  • pan/midgard: Decontextualize midgard_nir_assign_sysval_body

  • pan/midgard: Remove dest_override sysval argument

  • panfrost: Move Midgard sysval code to common Panfrost

  • pan/bi: Switch to panfrost_program

  • pan/bi: Implement sysvals

  • pan/midgard: Localize `visited` tracking

  • pan/midgard: Decontextualize liveness analysis core

  • pan/midgard: Sync midgard_block field names with Bifrost

  • pan/midgard: Subclass midgard_block from pan_block

  • panfrost: Move liveness analysis to root panfrost/

  • panfrost: Sync Midgard/Bifrost control flow

  • pan/bi: Paste over bi_has_arg

  • pan/bi: Add bi_bytemask_of_read_components helpers

  • pan/bi: Add bi_next/prev_op helpers

  • pan/bi: Add bi_max_temp helper

  • pan/bi: Add liveness analysis pass

  • pan/bi: Add dead code elimination pass

  • pan/bi: Implement nir_op_ffma

  • pan/bi: Fix swizzle for second argument to ST_VARY

  • panfrost: Move lcra to panfrost/util

  • pan/midgard: Remove incorrect comment in RA

  • pan/bi: Minor fixes in iteration macros

  • pan/bi: Fix vector handling of readmasks

  • pan/bi: Fix missing src_types

  • pan/bi: Add register allocator

  • pan/bi: Interpret register allocation results

  • pan/bi: Setup initial clause packing

  • pan/bi: Sketch out instruction word packing

  • pan/bi: Add packing for register control field

  • pan/bi: Pack register fields

  • pan/bi: Add missing __attribute__((packed))

  • pan/bi: Assign registers to ports

  • pan/bi: Route through first_instruction field

  • pan/bi: Model 3-bit Bifrost srcs in IR

  • pan/bi: Add struct bifrost_fma_fma

  • pan/bi: Pack BI_FMA ops

  • pan/bi: Pack fadd32

  • pan/bi: List ADD classes in bi_pack_add

  • pan/bi: Generalize bi_get_src a bit

  • pan/bi: Pass second src for load_vary ops

  • pan/bi: Emit load_vary ops

  • pan/bi: Skip over data registers in port assignment

  • pan/bi: Route through clause header

  • pan/bi: Pretty-print clause types in disassembler

  • pan/bi: Don’t hide SCHED_ADD inside HI_LATENCY

  • pan/bi: Track clause types during scheduling

  • pan/bi: Flesh out ATEST in IR

  • pan/bi: Add ATEST packing

  • pan/bi: Flesh out BI_BLEND

  • pan/bi: Pack BI_BLEND

  • pan/bi: Implement FMA/MOV without modifiers

  • pan/bi: Add bi_emit_before helper

  • pan/bi: Add move lowering pass

  • pan/bi: Pack a constant quadword

  • pan/bi: Document constant related errata(?)

  • pan/bi: Index out constants in instructions

  • pan/bi: Include UBO index for sysval reads

  • pan/bi: Add bi_load32_components helper

  • pan/bi: Pack ld_ubo ops

  • pan/bi: Pack ld_var_addr

  • pan/bi: Flesh out st_vary IR

  • pan/bi: Generalize data register setting

  • pan/bi: Add store_channels property

  • pan/bi: Pack st_vary

  • pan/bi: Pack LD_ATTR

  • pan/bi: Lower bool to ints

  • pan/bi: Remove hacks for 1-bit booleans in IR

  • pan/bi: Add `soft` NIR->BIR condition translation

  • pan/bi: Implement csel fusing

  • pan/bi: Respect shift when printing immediates

  • pan/bi: Use bi_lookup_immediate when packing

  • pan/bi: Default csel to “!= 0” mode

  • pan/bi: Pack csel4 opcodes

  • pan/bi: Ingest vecN directly (again)

  • pan/bi: Lower combines to rewrites for scalars

  • pan/bi: Rewrite aligned vectors as well

  • panfrost: Split panfrost_device from panfrost_screen

  • panfrost: Isolate panfrost_bo_access_for_stage to pan_cmdstream.c

  • panfrost: Inline reference counting routines

  • panfrost: Move pan_bo to root panfrost

  • pan/bit: Link standalone compiler with en/decoder

  • panfrost: Move device open/close to root panfrost

  • pan/bit: Open up the device

  • panfrost: Stub out G31/G52 quirks

  • pan/bit: Submit a WRITE_VALUE job as a sanity check

  • pan/bit: Begin generating a vertex job

  • pan/bi: Fix overzealous write barriers

  • pan/bi: Fix off-by-one in scoreboarding packing

  • pan/bi: Enable precision lowering in standalone compiler

  • panfrost: Enable PIPE_SHADER_CAP_FP16 on Bifrost

  • pan/bi: Handle f2f* opcodes

  • pan/bi: Ignore swizzle in unwritten component

  • pan/bi: Finish FMA structures

  • pan/bi: Fix missing type for fmul

  • pan/bi: Add FMA16 packing

  • pan/bi: Pack outmod and roundmode with FMA

  • pan/bi: Expand out FMA conversion opcodes

  • pan/bi: Enumerate conversions

  • pan/bi: Handle standard FMA conversions

  • pan/bi: Add bifrost_fma_2src generic

  • pan/bi: Add one-source f32->f16 op

  • pan/bi: Assert out i16 related converts for now

  • pan/bi: Handle round opcodes in frontend

  • pan/bi: Add v2f16 versions of rounding ops

  • pan/bi: Structify fadd/min/max16

  • pan/bi: Handle core faddminmax16 packing

  • pan/bi: Handle abs packing for fp16/FMA add/min

  • pan/bi: Handle fp16/abs scheduling restriction

  • pan/bi: Fix handling of constants with COMBINE

  • pan/bit: Add `run` mode to the cmdline

  • pan/bit: Wire through I/O

  • pan/bi: Fix writes_component for VECTOR

  • pan/bi: Use STAGE srcs for scheduler nops

  • pan/bi: Don’t set the back-to-back bit yet

  • pan/bi: Add cmdline option for verbose disassembly

  • pan/bi: Fix unused port swapping

  • pan/bi: Handle fmov class ops

  • pan/bi: Fix outmod/roundmode flip

  • pan/bi: Export bi_class_name

  • pan/bi: Fix duplicated source in ADD.v2f16

  • pan/bi: Fix negation in ADD.v2f16

  • pan/bi: Don’t gobble zero ports

  • pan/bi: Allow BI_FMA to take mods

  • pan/bi: Handle BIFROST_FIRST_WRITE_FMA_P2_READ_P3

  • pan/bi: Add helper to debug port assignment

  • pan/bi: Match CSEL argument order with hw

  • pan/bit: Stub out BIR interpreter

  • pan/bit: Handle read/write

  • pan/bit: Add preliminary FMA/ADD/MOV implementations

  • pan/bit: Implement outmods

  • pan/bit: Implement floating source mods

  • pan/bit: Add packing test framework

  • pan/bit: Add helper for generating floating mod tests

  • pan/bit: Add verbose printing for tests

  • pan/bit: Add 16-bit fmod tests

  • pan/bit: Add FMA tests

  • pan/bit: Add CSEL to interpreter

  • pan/bit: Add csel tests

  • pan/bit: Make run more useful

  • pan/bit: Add mode to run unit tests

  • pan/bi: Remove nontrivial SPECIAL ops

  • pan/bi: Add 32-bit _FAST packing

  • pan/bi: Add fp16 support for frcp/frsq

  • pan/bit: Add special op interpreting

  • pan/bit: Add special unit test

  • pan/bi: Implement min/max on FMA

  • pan/bi: Structify ADD unit add/min/max

  • pan/bi: Add ADD add/min/max fp32 packing

  • pan/bi: Set BI_MODS for MINMAX

  • pan/bi: Fix incorrect abs flip in fma/fadd16

  • pan/bi: Force ADD scheduling for MINMAX

  • pan/bit: Unify test frontends

  • pan/bit: Add min/max support to interpreter

  • pan/bit: Enable more debug for `run`

  • pan/bit: Add fmin/max16 tests

  • pan/bit: Wire up add/add op+test

  • panfrost: Add IS_BIFROST quirk

  • panfrost: Populate bifrost-specific structs within mali_shader_meta

  • panfrost: Staticize a few cmdstream functions

  • panfrost: Unify vertex/tiler structures

  • panfrost: Set mfbd.msaa.sample_locations on Bifrost

  • panfrost: Call the Bifrost compiler on bi devices

  • pan/bi: Fix nondeterministic register packing

  • pan/midgard: Remove unused max_varying variable

  • panfrost: Move varying linking to cmdstream

  • panfrost: Move uniform_count to pan_assemble

  • panfrost: Pass compiler-appropriate options

  • pan/bi: Fix backwards registers ports

  • panfrost: Fix BI_BLEND packing

  • pan/bi: Let !b2b imply branch_cond

  • pan/decode: Print Bifrost blend descriptor

  • panfrost: Drop dependency on nonexistent write_value

  • pan/bi: Lower fsqrt

  • pan/midgard: Fix f2u naming confusion

  • pan/bi: Set BI_ROUNDMODE for BI_CONVERT

  • pan/bi: Fix incorrect swizzle packing assert

  • pan/bi: Rewrite conversion packing

  • pan/bi: ADD packing for CONVERT

  • pan/bit: Add BI_CONVERT interpretation

  • pan/bit: Add BI_CONVERT tests

  • pan/bi: Add disasm for ADD.i8

  • pan/bi: Disable FMA scheduling for CONVERT

  • pan/bi: Add BI_TABLE for fast table accesses

  • pan/bi: Add special op for exp2

  • pan/bi: Add op for ADD_FREXPM

  • pan/bi: Add FLOG2_U op to disassembler

  • pan/bi: Add log_frexpe op to IR

  • pan/bi: Add frexp_log packing

  • pan/bi: Add bi_pack_fma_2src helper

  • pan/bi: Pack ADD_FREXPM

  • pan/bi: Add log2_help packing

  • pan/bi: Add _MSCALE flag for FMA/ADD

  • pan/bi: Structify FMA_MSCALE

  • pan/bi: Pack FMA_MSCALE

  • pan/bi: Add fexp2_fast packing

  • pan/bi: Split src/dest index printing

  • pan/bi: Ensure CONSTANT srcs have types

  • pan/bi: Fix bi_get_immediate with multiple imms

  • pan/bi: Fix packing with multiple constants

  • pan/bi: Fix packing with low-nibble-set on hi constant

  • pan/bi: Fix lower_combine swizzle rewrite

  • pan/bi: Add fexp2 implementation

  • pan/bi: Implement flog2

  • pan/bi: Fix vec2/3 handling

  • pan/bi: Handle st_vary with <4 components

  • pan/bi: Try to reuse constants in ALU

  • pan/bi: Workaround constant packing errata

  • pan/bi: Structify add and min/max fp16 ADD

  • pan/bi: Pack ADD.v2f16

  • pan/bi: Pack MAX.v2f16

  • pan/bi: Dump extra bits for disasm

  • pan/bi: Round constants to 32-bit

  • pan/bi: Lower special ops to 32-bit

  • pan/bit: Add FREXP interp support

  • pan/bit: Add frexp_log test

  • pan/bit: Add BI_REDUCE_FMA interp

  • pan/bit: Add FMA_REDUCE test

  • pan/bit: Add log2 helper interp

  • pan/bit: Add BI_TABLE test

  • pan/bit: _MSCALE interp

  • pan/bit: Add FMA_MSCALE test

  • pan/bit: Add fexp2_fast interp

  • pan/bit: Add fexp2_fast test

  • pan/bit: Add constants test

  • pan/bit: Add fp16 min/max tests

  • pan/bi: Print tex_compact coordinates

  • pan/bi: Document when dual-tex is triggered

  • pan/bi: Disassemble f16 dual tex

  • pan/bi: Structify TEX compact

  • pan/bi: Include TEX_COMPACT f16 opcode

  • pan/bi: Feed data register to BI_TEX

  • pan/bi: Add normal/compact/dual switch to IR

  • pan/bi: Stub out tex_compact logic

  • pan/bi: Generate TEX_COMPACT instruction

  • pan/bi: Pack TEX compact instructions

  • pan/bi: Assert out multiple textures

  • panfrost: Fix crashes with small BOs

  • panfrost: Assert on unimplemented fragcoord etc

  • panfrost: Set clear_color_[12] in the extra fb desc

  • panfrost: Add tentative bifrost_texture_descriptor

  • panfrost: decode textures and samplers on bifrost

  • pan/decode: Remove is_zs weirdness

  • panfrost: Identify texture layout field

  • panfrost: The texture descriptor has a pointer to a trampoline

  • pan/bi: Pack fp16 ATEST

  • pan/bi: Passthrough type for ATEST

  • pan/bi: Passthrough blend types

  • pan/bi: Assign blend descriptor for BLEND op

  • pan/bi: Add missing BI_VECTOR

  • pan/bi: Fix ADD.v4i8 opcode

  • pan/bi: Eliminate writemasks in the IR

  • pan/bi: Rename BI_SWIZZLE to BI_SELECT

  • pan/bi: Pack FMA SEL16

  • pan/bi: Pack FMA SEL8

  • pan/bi: Pack ADD SEL16

  • pan/bi: Force BI_SELECT arguments scalar

  • pan/bit: Interpret BI_SELECT

  • pan/bit: Add SELECT tests

  • pan/bi: Fix RA wrt 16-bit swizzles

  • pan/bi: Implement 16-bit COMBINE lowering

  • nir: Move nir_lower_mediump_outputs from ir3

  • ir3: Use shared mediump output lowering

  • pan/bi: Add bool->float opcodes

  • pan/bi: Add CSEL.64 opcode

  • pan/bi: Add some 8-bit compares

  • pan/bi: Add 64-bit int compares

  • pan/bi: Add FCMP.GL.v2f16 on ADD opcode

  • pan/bi: Add CSEL.8 opcode

  • pan/bi(t): Fix SELECT tests

  • pan/bi: Deduplicate csel/cmp cond

  • pan/bi: Remove bi_round_op

  • pan/bi: Structify FMA FCMP

  • pan/bi: Structify ADD FCMP 32

  • pan/bi: Structify FMA FCMP16

  • pan/bi: Structify ADD FCMP16

  • pan/bi: Structify FMA ICMP 32

  • pan/bi: Structify FMA ICMP 16

  • pan/bi: Structify ADD ICMP 32

  • pan/bi: Fix source mod testing for CMP

  • pan/bi: Pack FMA 32 FCMP

  • pan/bi: Factor out fp16 abs logic

  • pan/bi: Pack fma.fcmp16

  • pan/bi: Relax double-abs condition

  • pan/bit: Prepare condition evaluation for vectors

  • pan/bit: Interpret CMP

  • pan/bi: Add initial fcmp test

  • pan/bi: Add bitwise modifiers

  • pan/bi: Pack BI_BITWISE

  • pan/bi: Handle iand/ior/ixor in NIR->BIR

  • pan/bit: Interpret BI_BITWISE

  • pan/bit: Add BITWISE test

  • panfrost: Fix BO reference counting

  • panfrost: Move Bifrost IR indexing to common

  • pan/bi: Use common IR indices

  • pan/mdg: Remove nir_alu_src_index

  • pan/mdg: Use PAN_IS_REG

  • pan/mdg: SSA_FIXED_MINIMUM already covered by PAN_IS_REG

  • pan/mdg: Don’t break SSA

  • pan/mdg: Remove goofy 16-bit comment

  • pan/mdg: Remove old hack

  • pan/mdg: Set lower_flrp16

  • pan/bi: Share ALU type printing

  • pan/mdg: Add type fields to IR

  • pan/mdg: Track ALU src types

  • pan/mdg: Track ALU dest type

  • pan/mdg: Another goofy comment gone

  • pan/mdg: Track a primary type for I/O

  • pan/mdg: Denoise prints

  • pan/mdg: Track v_mov type (force uint32 for now?)

  • pan/mdg: Track texture types

  • pan/mdg: Set texture full fields at pack time

  • pan/mdg: Move sampler_type emission to pack time

  • pan/mdg: Lower specials to 32-bit

  • pan/mdg: Specialize swizzle to type

  • pan/mdg: Always print the mask

  • pan/mdg: Make some branch targets more explicit

  • pan/mdg: Don’t crash on unknown branch target

  • pan/mdg: Pass through some types from scheduling

  • pan/mdg: Move condense_writemask to disasm

  • pan/mdg: Ensure fdot is scalar out in disasm

  • pan/mdg: Replicate 16-bit swizzles

Andreas Baierl (8):

  • lima/parser: Fix RSW depth test parsing

  • lima/parser: Extend AUX0 findings

  • lima/parser: Change value name in RSW parser

  • lima/parser: Extend rsw parsing showing strings instead of numbers

  • gitlab-ci: lima: Add flaky tests to the skips list

  • gitlab-ci: Enable the lima job again

  • gitlab-ci: Add a set of lima flakes

  • lima: Add etc1 support

Andres Gomez (27):

  • tracie: correct typo

  • gitlab-ci: add missing popd to the build-deqp-vk.sh script

  • gitlab-ci: build gfxreconstruct into the Vulkan testing container

  • gitlab-ci: build VulkanTools into the Vulkan testing container

  • gitlab-ci: Change devices format to <api-vendor-deviceId>

  • gitlab-ci: Add gfxreconstruct traces support

  • gitlab-ci: Add jobs to be able to test Vulkan

  • gitlab-ci: Fix indentation and dangerous “" in the last multiline line

  • gitlab-ci: Remove unneeded python3-pilkit dependency

  • gitlab-ci: Sort packages to install alphabetically

  • gitlab-ci: add python3-requests to the test-vk container

  • gitlab-ci/traces: Add Vulkan sample entries for POLARIS10

  • gitlab-ci: Don’t use buster-backports packages by default for x86_test-vk

  • gitlab-ci: add Wine, win64’s apitrace and DXVK to the Vulkan testing container

  • gitlab-ci: add apitrace’s DXGI traces support

  • gitlab-ci: replay apitrace traces in headless mode

  • gitlab-ci: add Wine and DXVK env variables to Vulkan’s tracie runner

  • gitlab-ci/traces: Add D3D11 sample entry for POLARIS10

  • gitlab-ci: Vulkan tracie runner to return last command exit code

  • gitlab-ci: protect usage of shell variables with double quotes

  • gitlab-ci: make explicit tracie is gitlab specific

  • gitlab-ci: adapt query_traces_yaml to gitlab specific changes

  • gitlab-ci: install winehq-stable to get 5.0 instead of 4.0

  • Revert “meson,ci: Disable sparse_array tests on windows”

  • gitlab-ci: update tracie README after changes in main script

  • gitlab-ci: create always the “results” directory with tracie

  • gitlab-ci: correct tracie behavior with replay errors

Andrii Simiklit (2):

  • Revert “glx: convert glx_config_create_list to one big calloc”

  • i965/vec4: Ignore swizzle of VGRF for use by var_range_end()

Anuj Phogat (2):

  • intel/gen12+: Reserve 4KB of URB space per bank for Compute Engine

  • intel/gen12+: Set way_size_per_bank to 4

Arcady Goldmints-Orlov (7):

  • compiler/nir: Add support for variable initialization from a pointer

  • compiler/spirv: Add support for non-constant initializers

  • Rename nir_lower_constant_initializers to nir_lower_variable_initializers

  • spirv: Remove outdated SPIR-V decoration warnings

  • nir: Lower returns correctly inside nested loops

  • anv: increase minUniformBufferOffsetAlignment to 64

  • intel/compiler: fix alignment assert in nir_emit_intrinsic

Axel Davy (1):

  • gallium/util: Fix leak in the live shader cache

Bas Nieuwenhuizen (29):

  • radv: Allow non-dedicated linear images and buffer.

  • radv: Do not set SX DISABLE bits for RB+ with unused surfaces.

  • radv: Optimize emitting index buffer changes.

  • radv: Do not redundantly set the RB+ regs on pipeline switch.

  • radeonsi: Fix compute copies for subsampled formats.

  • amd/llvm: Fix divergent descriptor indexing. (v3)

  • amd/llvm: Fix divergent descriptor regressions with radeonsi.

  • radv: Store 64-bit availability bools if requested.

  • radv: Consider maximum sample distances for entire grid.

  • radv: Whitespace fixup.

  • radv: Use correct buffer count with variable descriptor set sizes.

  • winsys/amdgpu: Retrieve WC flags from imported buffers.

  • drm-uapi,radv,radeonsi: Add amdgpu_drm.h header.

  • vulkan/wsi: Add callback to set ownership of buffer.

  • radv: Add WSI buffers to BO list only if they can be used.

  • st/dri: Set next in template instead of after creation. (v2)

  • radeonsi: Count planes for imported textures.

  • radv: Use actual memory type count for setting app-visible bitset.

  • radv: Stop using memory type indices.

  • radv/winsys: Add function to get domains/flags from fd.

  • radv: Determine memory type for import based on fd.

  • radv: Expose 4G element texel buffers.

  • radv: Fix implicit sync with recent allocation changes.

  • radv: Extend tiling flags to 64-bit.

  • radv: Provide a better error for permission issues with priorities.

  • radv/winsys: Remove extra sizeof multiply.

  • radv: Handle failing to create .cache dir.

  • radv: Do not close fd -1 when NULL-winsys creation fails.

  • radv: Implement vkGetSwapchainGrallocUsage2ANDROID.

Bernd Kuhls (1):

  • util/os_socket: Include unistd.h to fix build error

Blaž Tomažič (1):

  • radeonsi: Fix omitted flush when moving suballocated texture

Boris Brezillon (45):

  • pan/midgard: Add an enum to describe the render targets

  • pan/midgard: Make sure we pass the right RT id to emit_fragment_store()

  • pan/midgard: Lower bitfield extract to shifts

  • pan/midgard: Don’t check ‘branch && branch->writeout’ twice in mir_schedule_alu()

  • pan/midgard: Stop leaking instruction objects in mir_schedule_alu()

  • panfrost: Fix the damage box clamping logic

  • pan/midgard: Turn Z/S stores into zs_output_pan intrinsics

  • pan/midgard: Add nir_intrinsic_store_zs_output_pan support

  • panfrost: Z24 variants should be sampled as R32UI

  • panfrost: Add the MALI_WRITES_{Z,S} flags

  • panfrost: Set the MALI_WRITES_{Z,S} flags when needed

  • Revert “panfrost: Z24 variants should be sampled as R32UI”

  • panfrost: Pass the sampler view format when creating a tex descriptor

  • panfrost: Assign primitive_size.pointer only if writes_point_size() returns true

  • panfrost: Add a helper to retrieve the currently active shader state

  • panfrost: Move the batch stack size adjustment out of panfrost_queue_draw()

  • panfrost: Move viewport desc emission out of panfrost_emit_for_draw()

  • panfrost: Move the const buf emission logic out of panfrost_emit_for_draw()

  • panfrost: Move shared mem desc emission out of panfrost_launch_grid()

  • panfrost: Dissociate shader meta patching from the desc emission

  • panfrost: Move panfrost_attach_vt_framebuffer() to pan_cmdstream.c

  • panfrost: Stop using panfrost_emit_for_draw() for compute jobs

  • panfrost: Simplify panfrost_emit_for_draw() and make it private

  • panfrost: Add a helper to update the occlusion query part of a tiler job desc

  • panfrost: Add a helper to update the rasterizer part of a tiler job desc

  • panfrost: Prepare things to get rid of panfrost_shader_state.tripipe

  • panfrost: Prepare shader_meta descriptors at emission time

  • panfrost: Add a panfrost_sampler_desc_init() helper

  • panfrost: Move sampler/tex descs emission helpers to pan_cmdstream.c

  • panfrost: Add a helper to emit a pair of vertex/tiler jobs

  • panfrost: Drop initial mali_attr_meta.src_offset assignment

  • panfrost: Ignore BO start addr when adjusting src_offset

  • panfrost: Prepare attribute for builtins at state creation time

  • panfrost: Emit attribute descriptors after patching the templates

  • panfrost: Move the mali_attr.src_offset adjustment to a sub-function

  • panfrost: Rename panfrost_stage_attributes()

  • panfrost: Move streamout offset update out of panfrost_draw_vbo()

  • panfrost: Move vertex/tiler payload initialization out of panfrost_draw_vbo()

  • panfrost: Inline panfrost_queue_draw() and panfrost_emit_for_draw()

  • panfrost: Move panfrost_emit_vertex_data() to pan_cmdstream.c

  • panfrost: Move panfrost_emit_varying_descriptor() to pan_cmdstream.c

  • panfrost: Re-init the VT payloads at draw/launch_grid() time

  • panfrost: Use ctx->active_prim in panfrost_writes_point_size()

  • panfrost: Get rid of ctx->payloads[]

  • vtn/opencl: add rint-support

Brian Ho (17):

  • turnip: Promote tu_cs_get_size/is_empty to header

  • turnip: Execute main cs for secondary command buffers

  • turnip: Advertise 8 bit subpixel precision

  • ir3: Disable copy prop for immediate ldlw offsets

  • turnip: Set has_gs in ir3_shader_key

  • turnip: Emit geometry shader obj and related consts

  • turnip: Configure VPC for geometry shaders

  • turnip: Configure VFD_CONTROL with gsheader and primitiveid

  • turnip: Set up REG_A6XX_SP_GS_CONFIG

  • turnip: Selectively configure GRAS_LAYER_CNTL

  • turnip: Update maxGeometryShaderInvocations to match blob

  • turnip: Populate tu_pipeline.active_stages

  • turnip: Enable geometry shaders for CP_DRAWs

  • turnip: Enable geometryShader device feature

  • turnip: Correctly set layer stride for 3D images

  • turnip: Emit geometry shader descriptor consts

  • freedreno/turnip: Update GRAS_LAYER_CNTL to GRAS_MAX_LAYER_INDEX

Caio Marcelo de Oliveira Filho (46):

  • anv: Advertise VK_KHR_shader_non_semantic_info

  • radv: Advertise VK_KHR_shader_non_semantic_info

  • intel/gen12: Take into account opcode when decoding SWSB

  • spirv: Be consistent when checking for Shader/Kernel

  • anv: Use intel_debug_flag_for_shader_stage()

  • anv: Add pipe_state_for_stage() helper

  • nir/builder: Add nir_scoped_memory_barrier()

  • nir: Add the alias NIR_MEMORY_ACQ_REL

  • nir/tests: Use nir_scoped_memory_barrier() helper

  • nir, intel: Move use_scoped_memory_barrier to nir_options

  • anv: Remove unused field xfb_used from anv_pipeline

  • anv: Remove unused field `urb.total_size`

  • nir: Don’t skip a bit in nir_memory_semantics

  • nir: Reorder nir_scopes so wider scope has larger numeric value

  • nir: Add pass to combine adjacent scoped memory barriers

  • intel/fs: Combine adjacent memory barriers

  • anv: Add a new enum to identify the pipeline type

  • anv: Use pipeline type to decide whether or not lower multiview

  • anv: Use a dynamic array for storing executables in pipeline

  • anv: Keep the shader stage in anv_shader_bin

  • anv: Pass the right pipe_state to flush_descriptor_sets()

  • anv: Remove redundant check in flush_descriptor_sets() helpers

  • anv: Decouple flush_descriptor_sets() helpers from pipeline struct

  • anv: Decouple flush_descriptor_sets() from pipeline struct

  • anv: Use a separate field in the pipeline for compute shader

  • anv: Split graphics and compute bits from anv_pipeline

  • anv: Reduce compute pipeline batch_data size

  • anv: Remove duplicate code in anv_cmd_buffer_bind_descriptor_set

  • intel/blorp: Plumb the stage through blorp upload_shader

  • mesa/main: Fix overflow in validation of DispatchComputeGroupSizeARB

  • nir: Add per_view attribute to nir_variable

  • intel/gen12: Add XML description for 3DSTATE_PRIMITIVE_REPLICATION

  • intel/fs: Allow multiple slots for position

  • anv/gen12: Lower VK_KHR_multiview using Primitive Replication

  • intel/compiler: Replace cs_prog_data->push.total with a helper

  • anv: Stop using cs_prog_data->threads

  • iris: Stop using cs_prog_data->threads

  • intel/compiler: Remove cs_prog_data->threads

  • intel/fs,vec4: Properly account SENDs in IVB memory fence

  • spirv: Fix propagation of OpVariable access flags

  • spirv: Handle instruction aliases in vtn_gather_types

  • spirv: Update the headers from latest Khronos master

  • intel/fs: Allow FS_OPCODE_SCHEDULING_FENCE stall on registers

  • intel/fs,vec4: Pull stall logic for memory fences up into the IR

  • intel/fs: Only stall after sending all memory fence messages

  • i965: Use correct constant for max_variable_local_size

Chad Versace (12):

  • anv: Drop unused anv_image_get_surface_for_aspect_mask()

  • anv: Rename param make_surface::dev to device

  • anv: Delete anv_image::ccs_e_compatible

  • anv: Clarify behavior of anv_image_aspect_to_plane()

  • anv: Respect ISL_SURF_USAGE_DISABLE_AUX_BIT in make_surface()

  • turnip: Add magic register values to tu_physical_device

  • turnip: Add a618 support

  • anv: Drop anv_image.c:get_surface()

  • anv: Add anv_image_plane_needs_shadow_surface() (v2)

  • anv: Refactor creation of aux surfaces (v2)

  • anv: Flatten the logic add_aux_surface_if_supported (v3)

  • anv: Use isl_drm_modifier_get_default_aux_state()

Chia-I Wu (2):

  • egl/android: require ANDROID_native_fence_sync for buffer age

  • egl/android: enable/disable KHR_partial_update correctly

Chris Lord (2):

  • vc4: fix vc4_yuv_blit overwriting fragment constant buffer slot 0

  • vc4: Fix query_dmabuf_modifiers mis-reporting external_only property

Chris Wilson (1):

  • iris: Fix import sync-file into syncobj

Christian Gmeiner (44):

  • etnaviv: enable texture upload memory throttling

  • etnaviv: update headers from rnndb

  • etnaviv: fix alpha test on GC3000

  • etnaviv: add etna_constbuf_state object

  • etnaviv: ask kernel for max number of supported varyings

  • etnaviv: update headers from rnndb

  • etnaviv: increase number of supported varyings to 16

  • etnaviv: implement emit_string_marker

  • etnaviv: get rid of etna_spec in etna_context

  • etnaviv: enable shareable shaders

  • freedreno: calculate modified bit mask only once

  • freedreno: simplify fd_set_shader_buffers(..)

  • freedreno: ssbo: keep track if a buffer gets written

  • freedreno: ssbo: mark resource read or written depending on usage

  • etnaviv: get rid of SE_CLIP_*

  • etnaviv: rework clipping calculation to be a derived state

  • etnaviv: do the left shift by 16 at emit time

  • etnaviv: get rid of struct compiled_scissor_state

  • etnaviv: s/scissor_s/scissor

  • etnaviv: compiled_framebuffer_state: get rid of SE_SCISSOR_*

  • etnaviv: rename hw queries to acc queries

  • etnaviv: rework etna_acc_sample_provider

  • etnaviv: explicitly call resource_written(..)

  • etnaviv: reset no_wait_cnt after triggered flush

  • etnaviv: rework wait/flush logic

  • etnaviv: extend acc query provider with supports(..) function

  • etnaviv: make use of a fixed size array to keep track of all acc query providers

  • etnaviv: extend result(..) to return if data is ready

  • etnaviv: extend acc sample provider with an allocate(..)

  • etnaviv: move generic perfmon functionality into own file

  • etnaviv: convert perfmon queries to acc queries

  • etnaviv: drop redundant calls to etna_acc_query_suspend(..)

  • etnaviv: change begin_query(..) to a void function

  • etnaviv: remove the “active” member of queries

  • etnaviv: anisotropic filtering is supported starting with HALTI0

  • etnaviv: update headers from rnndb

  • etnaviv: add anisotropic filter support

  • docs/features: mark GL_ARB_texture_filter_anisotropic as done for etnaviv

  • etnaviv: drop default state for FE_HALTI5_ID_CONFIG

  • etnaviv: call util_blitter_save_fragment_constant_buffer_slot(..)

  • etnaviv: support for using generic blit path

  • ci: bare-metal: power down device after tests

  • etnaviv: fix SAMP_ANISOTROPY register value

  • etnaviv: do not use int filter when anisotropic filtering is used

Christopher Egert (1):

  • radv: use util_float_to_half_rtz

Christopher James Halse Rogers (1):

  • egl/wayland: Fix zwp_linux_dmabuf usage

Connor Abbott (55):

  • freedreno: Fix CP_COND_REG_EXEC bit positions

  • freedreno: Add CP_REG_WRITE documentation

  • freedreno: Fix CP_COND_EXEC

  • tu: Move vsc_data and vsc_data2 allocation into the device

  • tu: Don’t emit initial render target state in tile_load_ib

  • tu: Properly set UBWC flags in RB_RENDER_CNTL

  • tu/blit: Support blits in secondary cmdstreams

  • tu: Support multisample image clears

  • tu: Disable linear depth attachments

  • tu: Sysmem rendering

  • tu: Add helper for CP_COND_REG_EXEC

  • tu: Handle vkCmdClearAttachments() with sysmem

  • tu: Support resolve ops with sysmem rendering

  • tu: Support input attachments with sysmem

  • tu: Force sysmem with mipmapped non-aligned linear stores

  • tu: Rewrite border color handling

  • lima/gpir: Make lima_gpir_node_insert_child() useful

  • lima/gpir: Optimize conditional break/continue

  • lima/gpir: Optimize nots created from branch lowering

  • tu: Fix border color with compute shaders

  • freedreno/fdl: Add base_align

  • tu: Return the correct alignment for images

  • freedreno: Cleanup event names

  • freedreno: Rename RB_DONE_TS

  • tu: Dump out shader assembly when requested

  • tu: ir3: Emit push constants directly

  • freedreno/a6xx: Add UBO size field

  • freedreno/a6xx: Add registers for the bindless model

  • ir3: Add bindless instruction encoding

  • ir3: Plumb through support for a1.x

  • ir3: Also don’t propagate immediate offset with LDC

  • ir3: LDC also has a destination

  • ir3: Plumb through bindless support

  • ir3: Rewrite UBO push analysis to support bindless

  • tu: Switch to the bindless descriptor model

  • tu: Emit CP_LOAD_STATE6 for descriptors

  • tu: Add missing code for immutable samplers

  • tu: Implement descriptor set update templates

  • ir3: Fix txs with bindless

  • ir3: Fix LDC offset units

  • ir3: Handle load_ubo_ir3 when promoting to constants

  • tu: Align GMEM resolve blit scissor

  • tu: Use tu_cs_add_entries() with non-render-pass secondaries

  • ir3/ra: Fix off-by-one issues with live-range extension

  • freedreno/a6xx: Expand various varying-count bitfields

  • tu: Fix the advertised maxFragmentInputComponents

  • ir3: Don’t double-insert the first block

  • ir3: Fix bug with shaders that only exit via discard

  • freedreno/a6xx: Document PrimID passthrough registers

  • ir3: Skip missing VS outputs in VS out map when linking

  • tu: Implement PrimID passthrough

  • freedreno/a6xx: Implement PrimID passthrough

  • st/nir: Fix assigning PointCoord location with !PIPE_CAP_TEXCOORD

  • ir3: Remove VARYING_SLOT_PNTC remapping hack

  • tu: Don’t invert point coords

D Scott Phillips (6):

  • intel/tools/aubinator_error_decode: read HW Context before other batches

  • intel/tools/aubinator_error_decode: Decode ring buffers from HEAD to TAIL

  • util/sparse_array: don’t stomp head’s counter on pop operations

  • intel/fs: Update location of Render Target Array Index for gen12

  • anv,iris: Fix input vertex max for tcs on gen12

  • anv/gen11+: Disable object level preemption

Daniel Schürmann (73):

  • aco: fix image_atomic_cmp_swap

  • nir: gather info whether a shader uses demote_to_helper

  • nir: add pass to lower discard() to demote()

  • amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation

  • radeonsi: lower discard to demote when FS_CORRECT_DERIVS_AFTER_KILL is enabled

  • radv: use nir_lower_discard_to_demote to work around game bugs

  • amd: join emit_kill() from radv and radeonsi in ac_nir_to_llvm

  • nir: fix unpack_64_4x16 in lower_alu_to_scalar()

  • aco: add comparison operators for PhysReg

  • aco: add sub-dword regclasses

  • aco: refactor regClass setup for subdword VGPRs

  • aco: validate p_create_vector with subdword elements properly

  • aco: validate register alignment of subdword operands and definitions

  • aco: validate uninitialized operands

  • aco: validate RA of subdword assignments

  • aco: print subdword registers

  • aco: fix Temp and assignment of renamed operands during RA

  • aco: remove unnecessary reg_file.fill() operation in get_reg_create_vector()

  • aco: add notion of subdword registers to register allocator

  • aco: create helper function to collect variables from register area

  • aco: adapt register allocation for subdword registers

  • aco: align subdword registers during RA when necessary

  • aco: small refactoring of shuffle code lowering

  • aco: add builder function for subdword copy()

  • aco: lower subdword shuffles correctly.

  • aco: don’t propagate SGPRs into subdword PSEUDO instructions

  • aco: don’t assume split_vector(create_vector) has the same number of elements when optimizing

  • aco: don’t vectorize 8/16bit load/store_ssbo

  • aco: add missing conversion operations for small bitsizes

  • aco: add byte_align_scalar() & trim_subdword_vector() helper functions

  • aco: prepare helper functions for subdword handling

  • aco: implement vec2/3/4 with subdword operands

  • aco: implement storagePushConstant8 & storagePushConstant16

  • aco: implement 8bit/16bit load_buffer

  • aco: implement 8bit/16bit store_ssbo

  • aco: use MUBUF to load subdword SSBO

  • aco: guarantee that Temp fits in 4 bytes

  • aco: add explicit padding for all Instruction sub-structs

  • aco: improve hashing for value numbering

  • aco: improve register assignment when live-range splits are necessary

  • aco: replace assignment hashmap by std::vector in register allocation

  • aco: during RA only insert into renames table if a variable got renamed

  • aco: improve speed of live_var_analysis

  • aco: refactor try_remove_trivial_phi() in RA

  • aco: change some std::map to std::unordered_map in register_allocation

  • aco: change live_out variables to std::unordered_set

  • aco: move all needed helper containers to ra_ctx

  • aco: RA - move all std::function objects into proper functions

  • aco: setup subdword regclasses for ssa_undef & load_const

  • aco: ensure correct bit representation of subdword constants

  • aco: don’t constant-propagate into subdword PSEUDO instructions

  • aco: lower subdword phis with SGPR operands

  • aco: rename aco_lower_bool_phis() -> aco_lower_phis()

  • aco: make some reg_file helpers private and fix their uses

  • aco: fix p_extract_vector optimization in presence of unequally sized vector operands

  • aco: use v_subrev_f32 for fsub with an sgpr operand in src1

  • aco: fix 64bit fsub

  • aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel

  • aco: simplify operand handling in RA

  • aco: refactor get_reg() to take Temp instead of RegClass

  • aco: refactor get_reg() to also handle affinities

  • aco: create pseudo dummy instruction in RA to be used for live-range splits

  • aco: create and use DefInfo struct in RA

  • aco: use DefInfo in more places to simplify RA

  • aco: move attempt to find strided register into get_reg_simple()

  • aco: allocate full register for subdword definitions if HW doesn’t support it

  • aco: don’t create vector affinities for operands which are not killed or are duplicates

  • aco: refactor get_reg_simple() to return early on exact matches

  • aco: stop get_reg_simple after reaching max_used_gpr

  • aco: try to always find a register with stride for even sizes

  • aco: use upper part of gap in register file if it is beneficial for striding

  • aco: coalesce v_mad’s accumulator with definition’s affinities

  • aco: either copy-propagate or inline create_vector operands

Daniel Stone (15):

  • Revert “gitlab-ci: disable panfrost runners”

  • egl/wayland: Don’t invalidate buffers on no-op resize

  • util/test: Use MAX_PATH on Windows

  • CI: Add native Windows VS2019 build

  • CI: Windows: Fix Docker tag argument inversion

  • CI: Disable Panfrost Mali-T820 jobs

  • CI: Avoid htz4 runner for VS2019

  • meson: Add VS 4624 warning exclusion to remove piles of LLVM warnings

  • CI: Re-enable Windows VS2019 builds

  • EGL: Add eglSetDamageRegionKHR to GLVND dispatch list

  • meson: Make shared-llvm into a tri-state boolean

  • CI: Disable Windows/VS2019 builds

  • Revert “CI: Disable Windows/VS2019 builds”

  • ci/windows: Make Chocolatey installs more reliable

  • CI: Disable Lima jobs due to lab unhealthiness

Danylo Piliaiev (29):

  • i965: Do not set front_buffer_dirty if there is no front buffer

  • st/mesa: Handle the rest of the renderbuffer formats from OSMesa

  • osmesa/tests: Cover OSMESA_RGB GL_UNSIGNED_BYTE case

  • st/nir: Unify inputs_read/outputs_written before serializing NIR

  • brw_nir: Cast bitshift to unsigned

  • brw_fs: Avoid zero size vla

  • intel/compiler: Do not qsort zero sized array

  • intel/bufmgr: Cast bitshift to unsigned

  • glsl/blob: Do not call memcpy if there is nothing to copy

  • iris: Do not dereference nullptr with pipe_reference

  • i965: Do not generate D16 B5G6R5_UNORM configs on gen < 8

  • intel/tools: Fix compilation with UBSan

  • glsl: do not crash if string literal is used outside of #include/#line

  • st/mesa: Fix signed integer overflow when using util_throttle_memory_usage

  • intel/aub_viewer: Fix format specifier for uint64_t

  • nir: Fix breakage of foreach_list_typed_safe assumptions in loop unrolling

  • anv: Do not sample from 3d depth image with HiZ

  • glsl/list: Fix undefined behaviour of foreach_* macros

  • st/mesa: Update shader info of ffvp/ARB_vp after translation to NIR

  • st/mesa: Re-assign vs in locations after updating nir info for ffvp/ARB_vp

  • spirv: Expand workaround for OpControlBarrier on old GLSLang

  • st/mesa: Treat vertex inputs absent in inputMapping as zero in mesa_to_tgsi

  • iris/bufmgr: Check if iris_bo_gem_mmap failed

  • i965: Fix out-of-bounds access to brw_stage_state::surf_offset

  • anv: Translate relative timeout to absolute when calling anv_timelines_wait

  • anv: Fix deadlock in anv_timelines_wait

  • meson: Disable GCC’s dead store elimination for memory zeroing custom new

  • mesa: Fix double-lock of Shared->FrameBuffers and usage of wrong mutex

  • intel/fs: Work around dual-source blending hangs in combination with SIMD16

Dave Airlie (69):

  • llvmpipe/query: add support for indexed queries

  • gallivm/swr: add stream_id to geom epilogue emit

  • gallivm/nir: add support for multiple vertex streams

  • draw: change geom shader output to an array of outputs.

  • draw/gs: track emitted prims + verts per stream.

  • draw: emit multiple streams to streamout.

  • draw: don’t emit vertex to streams with no outputs

  • llvmpipe: advertise 4 vertex streams

  • gallivm/s390: fix pass init order on s390 with llvm 8 (v2)

  • ci: bump debian image and change llvm deps to 8

  • dri: add another get shm variant.

  • glx/drisw: add getImageShm2 path

  • glx/drisw: return false if shmid == -1

  • glx/drisw: fix shm put image fallback

  • gallivm/tgsi: fix stream id regression

  • gallivm/nir: fix integer divide SIGFPE

  • gallivm/nir: handle mod 0 better.

  • gallium/auxiliary: add the microsoft tessellator and a pipe wrapper.

  • gallivm/nir: split out 64-bit splitting code

  • gallivm/nir: add support for tess system values

  • gallivm/nir: align store_var param order with load_var

  • gallivm/tgsi/swr: add mask vec to the tcs store

  • gallivm/nir: add tessellation i/o support.

  • draw: add JIT context/functions for tess stages.

  • draw: add main tessellation code

  • draw: hook up final bits of tessellation

  • gallium/nir/tgsi: only scan fragment shader inputs for usage_mask

  • llvmpipe: add support for tessellation shaders

  • gallivm/tessellator: use private functions for min/max to avoid namespace issues

  • gallium: fix build with latest meson and gcc10

  • gallivm/s3tc: split out dxt5 alpha code

  • gallivm: add support for rgtc/latc fetches.

  • gallium/llvmpipe: add an optimised 32-bit memset

  • gallivm/rgtc: fix the truncation to 8-bit

  • gallivm/rgtc: enable fast path for snorm types.

  • Revert “gallivm: disable rgtc/latc SNORM accellerated fetches”

  • llvmpipe: fixup context leaks.

  • draw: collect tessellation invocations statistics

  • llvmpipe: report tessellation shader statistics.

  • llvmpipe/query: fix transform feedback overflow any queries.

  • gallivm: fix left over shader vote debug

  • gallivm/nir: lower implicit lod to tex.

  • gallivm/draw: calloc prim id to avoid undef

  • llvmpipe: fix no tokens detections.

  • draw: fix tessellation stats query

  • llvmpipe/setup: move line stats collection earlier.

  • draw/cull: run pipeline for culled points.

  • draw: fix user culling pipeline order. (v2)

  • u_blitter: fix stencil blitting

  • draw: free the NIR IR.

  • draw/tess: free the NIR

  • llvmpipe/nir: free the nir shader

  • nir/linking: fix issue with two compact variables in a row. (v2)

  • gallivm/nir: fix image store conversions

  • gallivm/nir: add helper invocation support

  • util/indirect: handle stride less than number of parameters.

  • llvmpipe: bump max images to 16

  • llvmpipe: fix ssbo alignment

  • draw/tess: fix TES patch vertices in.

  • llvmpipe: fix d32 unorm depth conversions.

  • llvmpipe/setup: add point size clamping

  • llvmpipe: enable stencil only formats. (v2)

  • llvmpipe: clamp color storage for integer types.

  • gallivm: fix stencil border

  • vulkan: add initial device selection layer. (v6.1)

  • ci: add llvmpipe paths to virgl rules

  • draw/tess: free tessellation control shader i/o memory.

  • llvmpipe/nir: free compute shader NIR

  • llvmpipe: compute shaders work better with all the threads.

David Stevens (1):

  • egl/android: set window usage flags

Denys (1):

  • gitlab: add bug report template

Dominik Behr (1):

  • meson: fix debug build on Android

Drew Davenport (1):

  • radv: Filter extensions not whitelisted for Android

Duncan Hopkins (2):

  • zink: Added storage CIS to descriptor pool. Added storage in descriptor pool for combined image samplers as well as uniform buffers. Stops some shaders from running through a pool’s storage faster than zink’s internal tracking.

  • zink: zero out zink_render_pass_state

Dylan Baker (48):

  • docs/release-calendar: 20.0.0-rc1 has been released

  • docs: Mark 20.0-rc2 as done

  • docs: Add release notes for 19.3.4

  • docs: Add SHA256 sum for 19.3.4

  • docs: Mark 19.3.4 as done

  • docs: Mark 20.0.0-rc3 as done

  • Docs: Add 20.0.0 release notes

  • docs: Update index, relnotes, and release-calendar for 20.0

  • docs: Update stable process around using fixes: and gitlab

  • docs/submittingpatches: Fix confusing typo + missing pronoun

  • docs: Update release notes with current process

  • bin/post_version.py: Update the release calendar as well

  • bin/post_version.py: Pretty print the html

  • bin/post_version.py: Make the git commit as well.

  • docs: update releasing to cover updated post_version.py

  • docs: add relnotes for 20.0.1

  • docs: Add sha256sums for 20.0.1

  • docs: update news, calendar, and link release notes for 20.0.1

  • Docs: Add release notes for 20.0.2

  • docs/relnotes: Add sha256 sums for 20.0.2

  • docs: update calendar, add news item, and link releases notes for 20.0.2

  • docs/release-calendar: Add calendar for 20.1 Release candidates

  • bin/gen_release_notes.py: Fix version detection for .0 release

  • bin/pick-ui: Add a new maintainer script for picking patches

  • replace _mesa_is_pow_two with util_is_power_of_two_*

  • replace _mesa_next_pow_two_* with util_next_power_of_two_*

  • replace _mesa_logbase2 with util_logbase2

  • replace LOG2 with util_fast_log2

  • u_math: add x86 optimized version of ifloor

  • replace IFLOOR with util_ifloor

  • Replace IROUND_POS with _mesa_roundevenf

  • mesa/main: remove unused IROUNDD

  • replace IROUND with util functions

  • move windows strtok_r define to u_string

  • Replace IS_INF_OR_NAN with util_is_inf_or_nan

  • replace malloc macros in imports.h with u_memory.h versions

  • util: Add an aligned realloc function

  • replace imports memory functions with utils memory functions

  • mesa|mapi: replace _mesa_[v]snprintf with [v]snprintf

  • mesa: move ADD_POINTERS to macros.h

  • dri/nouveau: replace assert with unreachable

  • remove final imports.h and imports.c bits

  • meson: update llvm dependency logic for meson 0.54.0

  • docs: Add relnotes for 20.0.5

  • docs: Add sha256 sums for 20.0.5

  • docs: update calendar, add news item, and link releases notes for 20.0.5

  • mesa: Follow OpenGL conversion rules for values that exceed storage size

  • tests: Make tests aware of meson test wrapper

Edmondo Tommasina (1):

  • radv/sqtt: fix RADV_THREAD_TRACE_BUFFER_SIZE spelling

Eduardo Lima Mitev (3):

  • turnip/pipeline: Don’t assume tu_shader is a valid object

  • turnip: Instance can be NULL resolving ‘GetInstanceProcAddr’ entry point

  • anv/radv: Resolving ‘GetInstanceProcAddr’ should not require a valid instance

Eli Schwartz (1):

  • docs: fix typo in v20 release notes

Elie Tournier (3):

  • spirv2nir: print nir shader if translation succeeds

  • spirv2nir: Add kernel spirv support

  • docs/features: Update virgl OpenGL 4.5 features. GL_ARB_clip_control and GL_KHR_robustness are now exposed in the guest.

Emil Velikov (11):

  • meson: glx: drop with_glx == dri check

  • glx: set the loader_logger early and for everyone

  • egl/drm: reinstate (kms_)swrast support

  • Revert “egl/dri2: Don’t dlclose() the driver on dri2_load_driver_common failure”

  • loader: use a maximum of 64 drmDevices

  • loader: simplify loader_get_user_preferred_fd()

  • loader: simplify codeflow in drm_get_pci_id_for_fd

  • loader: move “using driver…” message to loader_get_kernel_driver_name

  • loader: fallback to kernel name, if PCI fails

  • glx: omit loader_loader() for macOS

  • egl: simplify client/platform extension handling

Emmanuel Gil Peyrot (1):

  • Expose EGL_KHR_platform_* when EXT is supported

Eric Anholt (144):

  • gallium/osmesa: Fix a typo in the unit test’s test names.

  • gallium/osmesa: Fix MakeCurrent of non-8888 contexts.

  • gallium/osmesa: Fill out other format tests.

  • gallium/osmesa: Try to fix the test for big-endian.

  • util: Make helper functions for pack/unpacking pixel rows.

  • mesa/st: Use direct util_format_pack/unpack instead of u_tile.

  • gallium/util: Remove pipe_get_tile_z/put_tile_z.

  • softpipe: Drop the raw_to* part of the tile cache interface.

  • softpipe: Refactor pipe_get/put_tile_rgba_* paths.

  • gallium: Add and use a helper for packing uc from a color_union.

  • gallium: Refactor some single-pixel util_format_read/writes.

  • util: Drop unpacking from int signed to unsigned and vice versa.

  • freedreno: Move the layout debug under FD_MESA_DEBUG=layout.

  • freedreno: Include the layer size in layout debug.

  • freedreno: Rename the UBWC layer size field and store it as bytes.

  • freedreno/a6xx: Disable the core layer-size setup.

  • freedreno: Swap the whole resource layout in shadowing.

  • freedreno: Blit all array levels when uncompressing UBWC.

  • freedreno: Disable UBWC on Z24S8 if not TEXTURE_2D.

  • freedreno: Allow UBWC on textures with multiple mipmap levels.

  • mesa: Clean up some endianness adapters for shader image formats.

  • intel/isl: Move iris’s pipe-to-isl format function to isl.

  • glsl,nir: Switch the enum representing shader image formats to PIPE_FORMAT.

  • mesa/st: Move the SYSTEM_VALUE -> TGSI_SEMANTIC map to tgsi_from_mesa.

  • nouveau: Reuse tgsi_get_sysval_semantic().

  • nouveau: reuse tgsi_get_gl_frag_result_semantic().

  • nouveau: Reuse tgsi_get_gl_varying_semantic().

  • u_tile: Skip the packed temporary and just store tiles directly.

  • ci: Disable a bunch of tests on freedreno a630.

  • ci: Bump the GLES CTS version to 3.2.6.1.

  • Revert “gallium: Fix big-endian addressing of non-bitmask array formats.”

  • ci: Extend the a630 flake list to reduce spurious failures.

  • radv: Squelch possibly-undefined warning

  • llvmpipe: Fix real uninitialized use of “atype” for SEMANTIC_FACE

  • llvmpipe: Silence “possibly uninitialized value” warning for ssbo_limit.

  • llvmpipe: Silence uninitialized variable warning about “chan”

  • llvmpipe: Fix warning about uninitialized “op” in the NIR path.

  • llvmpipe: Silence uninitialized variable warning about “vals”

  • llvmpipe: Silence uninitialized variable warning about “scissor”

  • llvmpipe: Fix another uninitialized value warning, on init_val.

  • gallium: Only define PIPE_ALIGNSTACK on x86.

  • ci: prepare-artifacts: Make the indent here match previously in the file

  • ci: Make sure that we have a proper shell prompt for LAVA.

  • ci: Make LAVA job fails emit the full list of unexpected test results.

  • ci: Document how LAVA runners work.

  • ci: Don’t bother generating deqp junit results since we don’t present it.

  • ci: Remove a useless filtering of the lava logs.

  • nir: Rename gl_nir_lower_bindless_images.c in preparation for extending it.

  • nir: Make image lowering optionally handle the !bindless case as well.

  • gallium: Add a cap for enabling lowering of image load/store intrinsics.

  • v3d: Ask the state tracker to lower image accesses off of derefs.

  • glsl: Factor out the sampler dim coordinate components switch statement.

  • spirv_to_nir: Reuse glsl_sampler_dim_coordinate_components().

  • freedreno/ir3: Reuse glsl_get_sampler_dim_coordinate_components() in tex_info.

  • tgsi_to_nir: Reuse glsl_get_sampler_dim_coordinate_components().

  • prog_to_nir: Reuse glsl_get_sampler_dim_coordinate_components().

  • freedreno/ir3: Fix the arg to ir3_get_num_components_for_image_format()

  • nir: Move intel’s intrinsic_image_coordinate_components() to core nir.

  • freedreno: Switch to using lowered image intrinsics.

  • ci: Blacklist another freedreno flaky test.

  • meson: Disable bison’s -Wdeprecated since we still support old bison.

  • turnip: Fix compiler warning about casting a nondispatchable handle.

  • freedreno/computerator: Fix defined-but-not-used warnings from lex/yacc.

  • ci: Remove LLVM from ARM test drivers.

  • ci: Stop disabling ACPI in the LAVA arm64 kernel build.

  • ci: Shrink the arm64 kernel build a bit.

  • ci: Include db410c support in the ARM container.

  • aco: Fix signed-vs-unsigned warning.

  • ci: Enable -Werror on meson-vulkan and meson-testing.

  • ci: Switch testing on db410c over to LAVA.

  • ci: Add a disabled-by-default job for GLES3 testing on db410c.

  • ci: Flip db410c back to docker mode.

  • ci: Print the renderer/version that our dEQP invocation is using.

  • ci: Fix installation of firmware for db410c’s nic.

  • ci: Make a simple little bare-metal fastboot mode for db410c.

  • glsl/tests: Catch mkdir errors to help explain when they happen.

  • glsl/tests: Fix waiting for disk_cache_put() to finish.

  • ci: Update the ci-templates commit.

  • ci: Enable ccache in the container builds.

  • ci: Enable ccaching of CMake builds as well.

  • ci: Enable testing GLES2-3 on a530 (Dragonboard 820c).

  • freedreno/a5xx: Fix min-vs-mag filtering decisions on non-mipmap tex.

  • gallium/util: Switch util_float_to_half to _mesa_float_to_half()’s impl.

  • ci: Ban the recent popular freedreno a630 flakes.

  • ci: Disable tests that showed intermittent fails on a530 in day 1.

  • ci: Only run the freedreno baremetal tests when freedreno/core changes.

  • freedreno: Switch to exposing only half-integer pixel centers.

  • ci: Move db820c and db410c’s gles3 tests to manual, like radv did.

  • glsl: Restore the IsES flag on the shader when reading from cache.

  • ci: Ban the recent popular freedreno a630 intermittent failure.

  • freedreno: Remove always-true return from per-gen begin_query.

  • freedreno: Remove the “active” member of queries.

  • freedreno: Fix acc query handling in the presence of batch reordering.

  • freedreno: Associate the acc query bo with the batch.

  • freedreno: Count blits in GL_TIME_ELAPSED and perf counter queries.

  • freedreno/a6xx: Fix timestamp queries.

  • freedreno: Rename “is_blit” to “is_discard_blit”

  • freedreno: Fix detection of being in a blit for acc queries.

  • freedreno: Work around UBWC flakiness.

  • freedreno: Drop an unnecessary include marked “this should go away”

  • freedreno/turnip: Use the NIR info to decide if we need helper invocations.

  • loader: Warn when we fail to open a device node due to permissions.

  • ci: Consistently use -j4 across x86 build jobs and -j8 on ARM.

  • freedreno/a6xx: Sink the per-level size temps inside the loop.

  • freedreno/a6xx: Remove the “aligned_height” temporary.

  • freedreno/a6xx: Drop the “alignment” layout temporary.

  • freedreno: Add the outline of a test for a6xx texture layout.

  • freedreno/a6xx: Set a level’s pitch based on minified level0 pitch, not width0.

  • freedreno: Fix leak of binning shader variants.

  • freedreno/ir3: Stop doing b2n on the SEL condition.

  • freedreno/ir3: CSE the up/downconversion of SEL’s cond’s size.

  • freedreno/a5xx+: Skip compiling the old gmem blit programs.

  • freedreno/drm-shim: Add support for faking other adreno chips.

  • freedreno/ir3: Drop handling FRAG_RESULT_DEPTH writing to .z

  • freedreno: Introduce a “cpp_shift” value for cpp divs/muls.

  • freedreno: Make the slice pitch be bytes, not pixels.

  • drm-shim: Let the driver choose to overwrite the first render node.

  • nir/lower_two_sided_color: Fix picking of new driver location.

  • nir/lower_clip: Fix picking of unused driver locations.

  • gallium: Fix setup of pstipple frag coord var.

  • freedreno/ir3: Fix driver_location of the added vertex_flags varying.

  • freedreno/ir3: Fix sizing of the inputs/outputs array.

  • vc4: Use NIR shader’s num_outputs for generating our new output.

  • ci: Drop redundant freedreno stage specification.

  • ci: Enable GLES3 testing on db410c/db820c (freedreno a306 and a530).

  • freedreno: Fix derivatives without texturing on a3xx-a5xx.

  • ci: Enable GLES 3.1 testing on db820c (a530).

  • freedreno/ir3: Fix the disasm of half-float STG dests.

  • freedreno/ir3: Print a space after nop counts, like qcom’s disasm.

  • freedreno/ir3: Add a unit test for our disassembler.

  • freedreno/ir3: Convert remaining disasm src prints to reginfo.

  • freedreno/ir3: Refactor out print_reg_src().

  • freedreno/ir3: Add support for disasm of cat2 float32 immediates.

  • ci: Enable --compact-display false on all dEQP runs.

  • ci: Add sanity checking that dEQP gets the expected GL_RENDERER.

  • freedreno: Fix calculation of the const buffer cmdstream size.

  • ci: Allow namespacing of dEQP run results files.

  • ci: Clean up some excessive use of pipes in dEQP results processing.

  • ci/freedreno: Add a test run of a few driver options.

  • util/ra: Sanity check that the driver selected a valid reg.

  • util/ra: Sanity check that we’re adding a valid reg to a class.

  • util/ra: Use util_dynarray for the adjacency list.

  • util/ra: Use util_dynarray for handling the conflict lists.

  • util/ra: Improve ra_set_finalize() performance.

Eric Engestrom (58):

  • VERSION: bump after 20.0 branch point

  • egl: put full path to libEGL_mesa.so in GLVND json

  • gitlab-ci: disable a630 tests as mesa-cheza is down

  • util/os_socket: fix header unavailable on windows

  • freedreno/perfcntrs: fix fd leak

  • dri: delete gen-symbol-redefs.py

  • util/disk_cache: check for write() failure in the zstd path

  • meson: don’t bother trying `python2`

  • Revert “egl: put full path to libEGL_mesa.so in GLVND json”

  • egl: directly access static members instead of using _egl{Get,Set}ConfigKey()

  • meson: explicitly disallow unsupported build directory layout

  • docs: fix typos in the release docs

  • bin/gen_release_notes.py: fix commit list command

  • gen_release_notes: fix vulkan version reported

  • docs/relnotes/19.3: fix vulkan version reported

  • docs/relnotes/20.0: fix vulkan version reported

  • Revert “docs/relnotes/19.3: fix vulkan version reported”

  • docs: trivial fix for html structure

  • docs/releasing: add missing </li> tags

  • docs: add release notes for 19.3.5

  • docs: update calendar, add news item, and link releases notes for 19.3.5

  • vulkan/wsi: fix cleanup when dup() fails

  • gen_release_notes: fix version in “you should wait” message

  • gen_release_notes: resolve ambiguity by renaming `version` to `previous_version` and `next_version` to `this_version`

  • meson: use existing variables in inc_common

  • meson: inline `inc_common`

  • vulkan: drop unused include directories

  • intel: drop unused include directories

  • scons: prune unused Makefile.sources

  • docs: add release notes for 20.0.3

  • docs/relnotes: add sha256sum for 20.0.3

  • docs: update calendar, add news item, and link releases notes for 20.0.3

  • docs: add release notes for 20.0.4

  • docs/relnotes: add sha256sum for 20.0.4

  • docs: update calendar, add news item, and link releases notes for 20.0.4

  • glx: fix 630 times -Wlto-type-mismatch when building with LTO enabled

  • glx: use anonymous namespace to avoid -Wodr issues when building with LTO enabled

  • pick-ui: auto-scroll the feedback window

  • pick-ui: compute .pick_status.json path only once

  • pick-ui: make .pick_status.json path relative to the git root instead of the script

  • pick-ui: show commit sha in the pick list

  • VERSION: bump to 20.1.0-rc1

  • .pick_status.json: Update to af55bdd05d94eda59ee1c9331a50045000da5db5

  • .pick_status.json: Update to 57796946985de60204189426ca8eb7bbfa97c396

  • .pick_status.json: Mark 3fac55ce0d066d767d6c6c8308f79d0c3e566ec0 as denominated

  • .pick_status.json: Update to 29da52128090a1ef8ef782188c0f67c7f5ec8d19

  • VERSION: bump to 20.1.0-rc2

  • .pick_status.json: Update to 772b15ad3227e08bb4e18932ac9ecf4c29271160

  • .pick_status.json: Update to 56f955e4850035d915a2a87e2ebea7fa66ab5e19

  • .pick_status.json: Update to c1c0cf7a66905e8d7ad506842a41b0ad0c5b10da

  • VERSION: bump to 20.1.0-rc3

  • .pick_status.json: Update to 5a6beb6a24aa084adfd6c57edd0a64f0a044611a

  • post_version.py: fix branch name construction for release candidates

  • post_version.py: invert `is_point` into `is_first_release` to make its purpose clearer

  • post_version.py: stop adding release candidates to the index and relnotes

  • VERSION: bump to 20.1.0-rc4

  • .pick_status.json: Update to a91306677c613ba7511b764b3decc9db42b24de1

  • tree-wide: fix deprecated GitLab URLs

Erik Faye-Lund (154):

  • zink: enable texture-buffer objects

  • zink: implement load_instance_id

  • zink: implement support for derivative-control

  • zink: be more careful about the mask-check

  • zink: disallow depth-stencil blits with format-change

  • st/mesa: use uint-result for sampling stencil buffers

  • zink: lower away fdph

  • zink: fixup sampler-usage

  • zink: replace unset buffer with a dummy-buffer

  • zink: emit blend-target index

  • zink: only inspect dual-src limit if feature enabled

  • Revert “nir: Add a couple trivial abs optimizations”

  • zink: do not use SpvDimRect

  • zink: fix binding-usage

  • zink: do not report texture-samplers for unsupported stages

  • zink/spirv: do not reinvent store_dest

  • zink/spirv: prefer store_dest over store_dest_uint

  • zink/spirv: rename functions a bit

  • zink/spirv: unit_value -> raw_value

  • zink/spirv: uint -> raw

  • zink: do not convert bools to/from uint

  • util: promote u_debug_memory.c to src/util

  • util: move debug_memory_{begin,end} to os_memory_debug.h

  • gallium/util: do not use debug_print_format

  • gallium/util: remove unused debug_print_foo helpers

  • zink/spirv: do not use bitwise operations on booleans

  • pipebuffer: clean up cast-warnings

  • rbug: clean up cast-warnings

  • rbug: do not return void-value

  • vtn/opencl: fully enable OpenCLstd_Clz

  • compiler/nir: move build_exp helper into builtin-builder

  • compiler/nir: move build_log helper into builtin-builder

  • vtn/opencl: add native exp/log-support

  • vtn/opencl: add native exp10/log10-support

  • vtn/opencl: add native exp2/log2-support

  • nv50: remove unused variable

  • meson: disable some more warnings on msvc

  • mesa/main: correct extension-checks for GL_BLACKHOLE_RENDER_INTEL

  • mesa/main: clean-up extension-checks for point-sprites

  • mesa/main: clean up extension-check for GL_VERTEX_PROGRAM

  • mesa/main: clean up extension-check for GL_VERTEX_PROGRAM_TWO_SIDE

  • mesa/main: clean up extension-check for GL_VERTEX_PROGRAM_POINT_SIZE

  • mesa/main: clean up extension-check for GL_TEXTURE_RECTANGLE

  • mesa/main: clean up extension-check for GL_STENCIL_TEST_TWO_SIDE

  • mesa/main: clean up extension-check for GL_DEPTH_BOUNDS_TEST

  • mesa/main: clean up extension-check for AMD_depth_clamp_separate

  • mesa/main: clean up extension-check for GL_FRAGMENT_SHADER_ATI

  • mesa/main: clean up extension-check for GL_TEXTURE_CUBE_MAP_SEAMLESS

  • mesa/main: clean up extension-check for GL_RASTERIZER_DISCARD

  • mesa/main: clean up extension-check for GL_TEXTURE_EXTERNAL

  • mesa/main: remove unused macro

  • wgl: drop pointless debug_printf

  • wgl: drop unused member

  • wgl: move screen-init to a helper

  • wgl: do not create screen from DllMain

  • st/dri: make sure software color-buffers are linear

  • zink: be less picky about tiled resources

  • .mailmap: add an alias for Alan Swanson

  • .mailmap: add an alias for Alyssa Rosenzweig

  • .mailmap: add an alias for Andrii Simiklit

  • .mailmap: add an alias for Anuj Phogat

  • .mailmap: add an alias for Axel Davy

  • .mailmap: add an alias for Boris Brezillon

  • .mailmap: add an alias for Bruce Cherniak

  • .mailmap: update aliases for Carl-Philip Hänsch

  • .mailmap: add an alias for Chad Versace

  • .mailmap: add a couple of aliases for Chandu Babu Namburu

  • .mailmap: add alias for Chenglei Ren

  • .mailmap: add an alias for Christian Gmeiner

  • .mailmap: add an alias for Christian Inci

  • .mailmap: add a few aliases for Christoph Haag

  • .mailmap: add an alias for Colin McDonald

  • .mailmap: specify spelling for Constantine Kharlamov

  • .mailmap: add an alias for Craig Stout

  • .mailmap: add an alias for Daniel Schürmann

  • .mailmap: add an alias for Danylo Piliaiev

  • .mailmap: add an alias for Dave Airlie

  • .mailmap: add an alias for Dylan Baker

  • .mailmap: add a couple of aliases for Dylan Noblesmith

  • .mailmap: add an alias for Emmanuel Gil Peyrot

  • .mailmap: add an alias for Erik Faye-Lund

  • .mailmap: specify spelling for Francesco Ansanelli

  • .mailmap: specify spelling for Gurchetan Singh

  • .mailmap: add an alias for Haihao Xiang

  • .mailmap: add an alias for Harish Krupo

  • .mailmap: specify spelling for Heinrich Fink

  • .mailmap: specify spelling for Henri Verbeet

  • .mailmap: add an alias for Igor Gnatenko

  • .mailmap: add an alias for Illia Iorin

  • .mailmap: specify spelling for James Zhu

  • .mailmap: add an alias for Jan Beich

  • .mailmap: clean up aliases for Jeremy Huddleston

  • .mailmap: add an alias for Julien Isorce

  • .mailmap: add a few aliases for Karol Herbst

  • .mailmap: add a few aliases for Kevin Rogovin

  • .mailmap: add a few aliases for Kristian Høgsberg

  • .mailmap: add an alias for Lionel Landwerlin

  • .mailmap: specify spelling for Liviu Prodea

  • .mailmap: update aliases for Marc-André Lureau

  • .mailmap: add alias for Matthias Groß

  • .mailmap: add an alias for Neha Bhende

  • .mailmap: add an alias for Neil Roberts

  • .mailmap: specify spelling for Nian Wu

  • .mailmap: add an alias for Nicholas Bishop

  • .mailmap: update aliases for Nicolai Hähnle

  • .mailmap: add an alias for Philipp Zabel

  • .mailmap: update aliases for Pierre-Eric Pelloux-Prayer

  • .mailmap: add an alias for Plamena Manolova

  • .mailmap: add an alias for Qiang Yu

  • .mailmap: specify spelling for Randy Xu

  • .mailmap: add an alias for Renato Caldas

  • .mailmap: add an alias for Rob Clark

  • .mailmap: add an alias for Rodrigo Vivi

  • .mailmap: add an alias for Samuel Li

  • .mailmap: add an alias for Sergii Romantsov

  • .mailmap: specify spelling for Sonny Jiang

  • .mailmap: add a couple of aliases for Steinar H. Gunderson

  • .mailmap: add a couple of aliases for Suresh Guttula

  • .mailmap: add an alias for Thierry Reding

  • .mailmap: add an alias for Timo Aaltonen

  • .mailmap: add a couple of aliases for Timothy Arceri

  • .mailmap: add an alias for Tim Wiederhake

  • .mailmap: add an alias for Tom Stellard

  • .mailmap: add an alias for Tomasz Figa

  • .mailmap: add an alias for Topi Pohjolainen

  • .mailmap: add an alias for Vadym Shovkoplias

  • .mailmap: add an alias for Varad Gautam

  • .mailmap: specify spelling for Vivek Kasireddy

  • .mailmap: specify spelling for Wladimir J. van der Laan

  • .mailmap: add an alias for Xavier Bouchoux

  • .mailmap: add an alias for Yaakov Selkowitz

  • .mailmap: add alias for Zhaowei Yuan

  • .mailmap: add an alias for Zhongmin Wu

  • meson: use override_options to change warning-level

  • wgl: silence some cast-warnings

  • util/tests: initialize variable

  • mesa: fixup cast expression

  • vbo: avoid including wingdi.h on win32

  • meson: tell flex that we support c99

  • gtest: Update to 1.10.0

  • meson: do not disable incremental linking for debug-builds

  • docs: remove outdated sentence

  • mesa/gallium: do not use enum for bit-allocated member

  • meson: correct windows-version define

  • mesa/main: do not store unrecognized extensions in context

  • mesa/main: do not pass context to one-time extension init

  • mesa/main: do not init remap-table per api

  • mesa/main: Do not pass context to one_time_init

  • mesa/main: one_time_init() -> _mesa_initialize()

  • mesa/st: call _mesa_initialize() early

  • zink: lower b2b to b2i

  • util/os_memory: never use os_memory_debug.h

  • zink: implement i2b1

  • zink: use general-layout when blitting to/from same resource

Francisco Jerez (57):

  • intel/fs/cse: Make HALT instruction act as CSE barrier.

  • intel/fs/gen7: Fix fs_inst::flags_written() for SHADER_OPCODE_FIND_LIVE_CHANNEL.

  • intel/fs: Add virtual instruction to load mask of live channels into flag register.

  • intel/fs/gen12: Workaround unwanted SEND execution due to broken NoMask control flow.

  • intel/fs/gen12: Fixup/simplify SWSB annotations of SIMD32 scratch writes.

  • intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow.

  • intel/fs: Set src0 alpha present bit in header when provided in message payload.

  • intel/fs/gen11: Work around dual-source blending hangs in combination with SIMD32.

  • intel/fs: Make sample_mask_reg() local to brw_fs.cpp and use it in more places.

  • intel/fs: Use helper for discard sample mask flag subregister number.

  • intel/fs/gen7+: Swap sample mask flag register and FIND_LIVE_CHANNEL temporary.

  • intel/fs: Refactor predication on sample mask into helper function.

  • intel/fs: Return consistent UW types from sample_mask_reg() in fragment shaders.

  • intel/fs/gen7+: Implement discard/demote for SIMD32 programs.

  • intel/compiler: Move base IR definitions into a separate header file

  • intel/compiler: Reverse inclusion dependency between brw_cfg.h and brw_shader.h

  • intel/compiler: Nest definition of live variables block_data structures

  • intel/compiler: Reverse inclusion dependency between brw_fs_live_variables.h and brw_fs.h

  • intel/compiler: Reverse inclusion dependency between brw_vec4_live_variables.h and brw_vec4.h

  • intel/compiler: Introduce simple IR analysis pass framework

  • intel/compiler: Introduce backend_shader method to propagate IR changes to analysis passes

  • intel/compiler: Define more detailed analysis dependency classes

  • intel/compiler: Pass detailed dependency classes to invalidate_analysis()

  • intel/compiler: Mark virtual_grf_interferes and vars_interfere as const

  • intel/compiler: Move all live interval analysis results into fs_live_variables

  • intel/compiler: Move all live interval analysis results into vec4_live_variables

  • intel/compiler: Restructure live intervals computation code

  • intel/compiler: Pass single backend_shader argument to the fs_live_variables constructor

  • intel/compiler: Pass single backend_shader argument to the vec4_live_variables constructor

  • intel/compiler/fs: Add live interval validation pass

  • intel/compiler/vec4: Add live interval validation pass

  • intel/compiler/fs: Switch liveness analysis to IR analysis framework

  • intel/compiler/vec4: Switch liveness analysis to IR analysis framework

  • intel/compiler: Drop invalidate_live_intervals()

  • intel/compiler: Move idom tree calculation and related logic into analysis object

  • intel/compiler: Move dominance tree data structure into idom_tree object

  • intel/compiler: Simplify new_idom reduction in dominance tree calculation

  • intel/compiler: Move register pressure calculation into IR analysis object

  • intel/compiler: Calculate num_instructions in O(1) during register pressure calculation

  • intel/fs: Fix workaround for VxH indirect addressing bug under control flow.

  • intel/fs/gen12: Fix interaction of SWSB dependency combination with EU fusion workaround.

  • intel/fs/gen12: Fix hangs with per-sample SIMD32 fragment shader dispatch.

  • intel/fs/gen12: Work around dual-source blending hangs in combination with SIMD32.

  • intel/fs/gen12: Fix Render Target Read header setup for new thread payload layout.

  • intel/ir: Add missing initialization of backend_reg::offset during construction.

  • intel/fs: Rename half() helpers to quarter(), allow index up to 3.

  • intel/fs: Fix constness of argument of fs_instruction_scheduler::is_compressed().

  • intel/fs: Replace fs_visitor::bank_conflict_cycles() with stand-alone function.

  • intel/vec4: Fix constness of vec4_instruction::reads_flag() and ::writes_flag().

  • intel/ir: Import shader performance analysis pass.

  • intel/fs: Heap-allocate fs_visitors in brw_compile_fs().

  • intel/fs: Implement performance analysis-based SIMD32 heuristic for fragment shaders.

  • intel/fs: Add INTEL_DEBUG=no32 debugging flag.

  • intel/ir: Use brw::performance object instead of CFG cycle counts for codegen stats.

  • intel/ir: Pass block cycle count information explicitly to disassembler.

  • intel/ir: Remove scheduling-based cycle count estimates.

  • intel/ir: Update performance analysis parameters for memory fence codegen changes.

Fritz Koenig (3):

  • Revert “gitlab-ci: disable a630 tests as mesa-cheza is down”

  • Revert “gitlab-ci: disable a630 tests as mesa-cheza is down (again)”

  • freedreno: allow FMT6_8_UNORM as a UBWC format

Georg Lehmann (3):

  • Correctly wait in the fragment stage until all semaphores are signaled

  • Vulkan Overlay: Don’t try to change the image layout to present twice

  • Vulkan overlay: use the corresponding image index for each swapchain

Gert Wollny (63):

  • r600: force new CF with TEX only if any texture value is written

  • r600: Increase space for IO values to agree with PIPE_MAX_SHADER_IN/OUTPUTS

  • r600: Add NIR compiler options

  • r600: Update state code to accept NIR shaders

  • r600/sfn: Add a basic nir shader backend

  • r600: enable NIR backend DEBUG flag for supported architectures

  • r600/sfn: Add the VS in and FS out vectorization

  • r600/sfn: Add the WaitAck instruction

  • r600/sfn: add live range evaluation for the GPR

  • r600/sfn: add register remapping

  • r600/sfn: Add lowering arrays to scratch and according instructions

  • r600/sfn: Add a load GDS result instruction

  • r600/sfn: Add MemRingOut instructions

  • r600/sfn: add emitVertex instructions

  • r600/sfn: Add support for geometry shader

  • r600/sfn: Add VS for TCS shader skeleton

  • r600/sfn: Add compute shader skeleton

  • r600/sfn: Add GDS instructions

  • r600/sfn: Add lowering UBO access to r600 specific codes

  • r600: Make sure LLVM is not used for DRAW

  • r600/sfn: Add support for atomic instructions

  • r600/sfn: Add support for SSBO load and store

  • r600/sfn: Add .editorconfig file

  • r600/sfn: Add some documentation

  • r600/sfn: Avoid using dynamic_cast to identify type

  • r600/sfn: Use static_cast when type is already known

  • r600/sfn: Don’t try to catch exceptions, the driver doesn’t throw any

  • gallium/tgsi_to_nir: Set nir_intrinsic_align_mul to 16 and offset to 0

  • r600: Dump a few more variables when requested

  • r600/sfn: Reduce array limit for scratch usage

  • r600/sfn: Fix setting alignments when lowering UBOs

  • r600/sfn: Implementing instructions blocks

  • r600/nir: Pin interpolation results to channel

  • r600/sfn: Fix null pointer deref in live range evaluation

  • r600/sfn: Handle b2b1 like it was a mov

  • r600/sfn: Fix handling of GS inputs

  • r600/sfn: Fix using the result of a fetch instruction in next fetch

  • r600/sfn: Count only literals that are not inline to split instruction groups

  • r600/sfn: use new temp register allocation when loading single value temporaries

  • nir: Add r600 specific intrinsics for tesselation shader IO

  • nir: Add umad24 and umul24 opcodes

  • r600: Handle texcoord semantics in LDS index evaluation

  • r600/sfn: simplify UBO lowering pass

  • r600/sfn: Don’t emit inline constants in the r600 IR

  • r600/sfn: Add LDS IO instructions to r600 IR

  • r600/sfn: Add LDS instruction to assembly conversion

  • r600/sfn: Add TF write instruction

  • r600/sfn: Add IR instruction to fetch the TESS parameters

  • r600/sfn: Handle umul24 and umad24

  • r600/sfn: Emit some LDS instructions

  • r600/sfn: Move emission of barrier from compute shader to shader base

  • r600/sfn: Add methods to valuepool to get a vector of values

  • r600/sfn: Move some shader base methods to the public interface

  • r600/sfn: extract class to handle the VS export to different stages

  • r600/sfn: derive the GS from the vertex stage for a common interface

  • r600/sfn: Handle LDS output in VS

  • r600/sfn: Move removing of unused variables

  • r600/sfn: Add lowering passes for Tessellation IO

  • r600/sfn: Add tessellation shaders

  • r600: Enable tessellation for NIR

  • r600: Fix nir compiler options, i.e. don’t lower IO to temps for TESS

  • r600/sfn: Fix printing vertex fetch instruction flags

  • r600: Fix duplicated subexpression in r600_asm.c

Greg V (3):

  • amd/addrlib: fix build on non-x86 platforms

  • r600: add missing <array> include

  • svga: fix build on FreeBSD

H.J. Lu (2):

  • x86_init_func_common: Add ENDBR at function entry

  • x86: Add ENDBR at function entries

Hanno Böck (1):

  • Properly check mmap return value

Hyunjun Ko (27):

  • freedreno/ir3: fix printing half constant registers.

  • freedreno/ir3: Add cat4 mediump opcodes

  • freedreno/ir3: put the conversion back for half const to the right place.

  • freedreno/ir3: Fold const only when the type is float

  • freedreno/ir3: Add new ir3 pass to fold out fp16 conversions

  • nir: Add optimization for removing f16/f32 conversions

  • freedreno/ir3: handle half registers for arrays during register allocation.

  • turnip: support indirect draw

  • glsl: Handle fp16 unary operations when lowering matrix operations

  • glsl/lower_instructions: Handle fp16 for MOD_TO_FLOOR

  • turnip: Gather information for transform feedback

  • turnip: Define structs for transform feedback

  • turnip: Setup stream-output when linking program

  • turnip: Implement stream-out emit and vkApis for transform feedback

  • turnip: Implement an empty function vkCmdDrawIndirectByteCountEXT

  • turnip: Enable VK_EXT_transform_feedback

  • turnip: Add tu6_control struct.

  • turnip: Fix wrong assignment of xfb output’s offset.

  • turnip: Do gathering xfb info after nir_remove_dead_variables

  • freedreno: Enable mediump lowering

  • freedreno/ir3: enable nir_opt_loop_unroll on a6xx

  • nir: fix wrong assignment to buffer in xfb_varyings_info

  • turnip: make the struct slot_value of queries get 2 values

  • turnip: Implement and enable VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT

  • turnip: Fix wrong offset calculation for xfb buffer.

  • turnip: Skip unused regs when setting up streamout buffers

  • turnip: Fix crashes when geometry shader constants aren’t used

Iago Toral Quiroga (1):

  • nir: add a bool bitsize lowering pass

Ian Romanick (62):

  • intel/fs: Don’t count integer instructions as being possibly coissue

  • nir: Mark fmin and fmax as commutative and associative

  • mesa/draw: Make sure all the unused fields are initialized to zero

  • nir/search: Use larger type to hold linearized index

  • intel/fs: Correctly handle multiply of fsign with a source modifier

  • intel/fs: Do cmod prop again after scheduling

  • intel/fs: Allow NOT instructions in conditional discard optimization

  • intel/fs: Fix NULL destinations on 3-source instructions again after late DCE

  • nir/algebraic: Simplify logic to detect sign of an integer

  • nir/algebraic: optimize ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b), 0)

  • nir/algebraic: Generalize some and-of-shift-right patterns [v2]

  • nir/algebraic: Constant reassociation for bitwise operations too

  • nir/algebraic: Simplify a contradiction that can occur in __flt64_nonnan

  • soft-fp64/b2f: Reimplement using bitwise logic ops

  • soft-fp64: Don’t open-code umulExtended

  • soft-fp64: Simplify __countLeadingZeros32 function

  • soft-fp64: Pick a single idiom for treating sign value as a Boolean

  • soft-fp64: Store sign value as 0 or 0x80000000

  • soft-fp64/fneg: Don’t treat NaN specially

  • soft-fp64/flt: Perform checks in a different order

  • soft-fp64/fsat: Correctly handle NaN

  • soft-fp64/fsat: Micro-optimize x < 0 test

  • soft-fp64/fsat: Micro-optimize x >= 1 test

  • soft-fp64: Relax the way NaN is propagated

  • soft-fp64/ffloor: Simplify the >= 0 comparison

  • soft-fp64: Optimize __fmin64 and __fmax64 by using different evaluation order [v2]

  • soft-fp64/fadd: Instead of tracking “b < a”, track sign of the difference

  • soft-fp64/fadd: Massively split the live range of zFrac0 and zFrac1

  • soft-fp64/fadd: Pick zero or non-zero result based on subtraction result

  • soft-fp64/fadd: Just let the subtraction happen when the result will be zero

  • soft-fp64/fadd: Delete a redundant condition check

  • soft-fp64/fadd: Reformat after previous commit

  • soft-fp64/fadd: Combine an if-statement into the preceding else-clause

  • soft-fp64/fadd: Rename aFrac and bFrac variables

  • soft-fp64/fadd: Use absolute value of expDiff

  • soft-fp64/fadd: Move common code out of both branches of an if-statement

  • soft-fp64/fadd: Common code optimization for differing sign case

  • soft-fp64: Split a block that was missing a cast on a comparison

  • intel/vec4: Allow late copy propagation on vec4

  • nir/algebraic: Change the default cursor location when replacing a unary op

  • nir/algebraic: Distribute source modifiers into instructions

  • nir/algebraic: Use value range analysis to convert fmax to fsat

  • nir/algebraic: Remove a redundant fabs pattern

  • tnl: Don’t dereference NULL obj pointer in bind_indices

  • tnl: Don’t dereference NULL obj pointer in replay_init

  • tnl: Don’t dereference NULL obj pointer in t_rebase_prims

  • tnl: Silence unused parameter ‘attrib’ warning in convert_half_to_float

  • tnl: Silence unused parameter warnings in _tnl_draw_prims

  • tnl: Silence unused parameter warnings in dump_draw_info

  • tnl: Silence unused parameter warnings in _tnl_split_inplace

  • tnl: Code formatting in t_draw.c

  • tnl: Code formatting in t_rebase.c

  • intel/compiler: Silence unused parameter warnings in vec4_tcs_visitor

  • intel/compiler: Silence unused parameter warning in fs_live_variables::setup_one_read

  • intel/compiler: Silence unused parameter warning in update_inst_scoreboard

  • intel/compiler: Only GE and L modifiers are commutative for SEL

  • intel/compiler: CSEL can do saturate

  • intel/compiler: Fixup operands in fs_builder::emit() that takes array

  • nir/algebraic: Detect some kinds of malformed variable names

  • nir/algebraic: Require operands to iand be 32-bit

  • nir/algebraic: Optimize ushr of pack_half, not ishr

  • anv/tests: Don’t rely on assert or changing NDEBUG in tests

Icecream95 (16):

  • panfrost: Fix non-debug builds

  • panfrost: Inline panfrost_get_default_swizzle

  • panfrost: LogicOp support

  • nir: Allow nir_format conversions to work on 32-bit values

  • panfrost: LogicOp fixes and non 8-bit format support

  • mesa/format_utils: Add a fast-path for RGBA to BGRA

  • panfrost: Extend the tiled store fast-path to loads

  • panfrost: Mark 64-bit formats as unsupported

  • panfrost: Add support for B5G5R5X1

  • st/mesa: Fall back on R3G3B2 for R3_G3_B2

  • panfrost: Add support for R3G3B2

  • panfrost: Correctly identify format 0x4c

  • pan/midgard: Fix a divide by zero in emit_alu_bundle

  • panfrost: Fix GL_EXT_vertex_array_bgra

  • panfrost: Enable PIPE_CAP_VERTEX_COLOR_UNCLAMPED

  • panfrost: Fix background showing when using discard

Icenowy Zheng (3):

  • lima: remove its hash table entry when invalidating a resource

  • lima: expose fragment shader derivatives capability

  • lima: implement zsbuf reload

Ilia Mirkin (24):

  • nv50: report max lod bias of 15.0

  • gitlab-ci: disable panfrost runners

  • mesa: fix _mesa_draw_nonzero_divisor_bits to return nonzero divisors

  • nv50,nvc0: add newly added PIPE_CAP’s to list

  • st/mesa: allow TXB2/TXL2 to work with cube array shadow textures

  • nvc0: enable EXT_texture_shadow_lod

  • st/vdpau: avoid asserting on new VDP_YCBCR_* formats

  • st/vdpau: make query test for 2D support

  • nv50: don’t try to upload MSAA settings for BUFFER textures

  • gallium: add viewport swizzling state and cap

  • mesa: add GL_NV_viewport_swizzle support

  • st/mesa: add NV_viewport_swizzle support

  • nvc0: add NV_viewport_swizzle support for GM200+

  • compiler: add VARYING_SLOT_VIEWPORT_MASK

  • glsl: add NV_viewport_array2 support

  • mesa: add NV_viewport_array2 enable, attach to glsl

  • gallium: add TGSI_SEMANTIC_VIEWPORT_MASK

  • gallium: add TGSI_PROPERTY_LAYER_VIEWPORT_RELATIVE

  • gallium: add PIPE_CAP_VIEWPORT_MASK

  • st/mesa: add support for GL_NV_viewport_array2

  • nvc0: enable GL_NV_viewport_array2

  • nv50,nvc0: update with latest caps

  • docs: update for recently-added nvc0 features

  • mesa: add interaction between compute derivatives and variable local sizes

Indrajit Kumar Das (4):

  • glapi/copyimage: Implement CopyImageSubDataNV

  • gallium: prepare framework for supporting AlphaToCoverageDitherControlNV

  • mesa: add support for AlphaToCoverageDitherControlNV

  • radeonsi: enable support for AlphaToCoverageDitherControlNV

Ivan Molodetskikh (1):

  • egl: allow INVALID format for linux_dmabuf

James Xiong (2):

  • iris: handle the failure of converting unsupported yuv formats to isl

  • gallium: let the pipe drivers decide the supported modifiers

James Zhu (1):

  • radeonsi: fix Segmentation fault during vaapi enc test

Jan Palus (1):

  • targets/opencl: fix build against LLVM>=10 with Polly support

Jan Vesely (2):

  • clover: Use explicit conversion from llvm::StringRef to std::string

  • clover: Check if the detected clang libraries are usable

Jan Zielinski (8):

  • gallium/swr: Fix various asserts and security issues

  • gallium/swr: fix corruptions in Unigine Heaven

  • gallium/swr: use ElementCount type arguments for getSplat()

  • gallium/gallivm: Remove workaround disabling AVX code for newer CPUs

  • gallium/gallivm: fix compilation issues with llvm 11

  • gallium/gallivm: remove unused header include for newer LLVM

  • gallium/swr: Fix LLVM 11 compilation issues

  • gallium/swr: Fix crashes and failures in vertex fetch

Faith Ekstrand (202):

  • genxml: Add a new 3DSTATE_SF field on gen12

  • anv,iris: Set 3DSTATE_SF::DerefBlockSize to per-poly on Gen12+

  • intel/genxml: Drop SLMEnable from L3CNTLREG on Gen11

  • iris: Set SLMEnable based on the L3$ config

  • iris: Store the L3$ configs in the screen

  • iris: Use the URB size from the L3$ config

  • i965: Re-emit l3 state before BLORP executes

  • intel: Take a gen_l3_config in gen_get_urb_config

  • intel/blorp: Always emit URB config on Gen7+

  • iris: Consolidate URB emit

  • anv: Emit URB setup earlier

  • intel/common: Return the block size from get_urb_config

  • intel/blorp: Plumb deref block size through to 3DSTATE_SF

  • anv: Plumb deref block size through to 3DSTATE_SF

  • iris: Plumb deref block size through to 3DSTATE_SF

  • anv: Always fill out the AUX table even if CCS is disabled

  • intel/eu/validate: Don’t validate regions of sends

  • intel/disasm: SEND has two sources on Gen12+

  • intel/tools: Handle strides better when dumping buffers

  • intel/fs: Write the address register with NoMask for MOV_INDIRECT

  • anv/blorp: Use the correct size for vkCmdCopyBufferToImage

  • anv: No-op submit and wait calls when no_hw is set

  • anv: Reject modifiers on depth/stencil formats

  • vulkan: Update the XML and headers to 1.2.133

  • nir: Fix the nir_builder include path for nir_builtin_builder

  • nir/builder: Return an integer from nir_get_texture_size

  • intel/isl: Add isl_aux_info.c to Makefile.sources

  • anv: Always enable the data cache

  • nir: Drop nir_tex_instr::texture_array_size

  • anv: Use the PIPE_CONTROL instead of bits for the CS stall W/A

  • anv: Use a proper end-of-pipe sync instead of just CS stall

  • anv: Do end-of-pipe sync around MCS/CCS ops instead of CS stall

  • nir: Flush to zero with OOB low exponents in ldexp

  • isl: Set 3DSTATE_DEPTH_BUFFER::Depth correctly for 3D surfaces

  • iris: Allow HiZ on blit sources

  • blorp: Write to depth/stencil images as depth/stencil when possible

  • anv: Enable HiZ for VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL

  • iris: Enable CCS for copies from HiZ+CCS depth buffers

  • iris: Enable HiZ and stencil CCS for blorp blit destinations

  • iris: Don’t skip fast depth clears if the color changed

  • anv: Parse VkPhysicalDeviceFeatures2 in CreateDevice

  • anv: Mark max_push_range UNUSED and simplify the code

  • anv: Pass buffer addresses into emit_push_constant*

  • anv: Delete some pointless break statements

  • anv: Align UBO sizes to 32B

  • anv: Add an align_down_u32 helper

  • anv: Bounds-check pushed UBOs when robustBufferAccess = true

  • vulkan/wsi: Don’t leak the FD when GetImageDrmFormatModifierProperties fails

  • vulkan/wsi: Return an error if dup() fails

  • intel/isl: Clean up some aux surface logic

  • intel/isl: Add a separate ISL_AUX_USAGE_HIZ_CCS_WT

  • intel/blorp: Allow HIZ_CCS_WT in copy sources

  • iris: Use ISL_AUX_USAGE_HIZ_CCS_WT to indicate write-through HiZ

  • intel/isl: Require ISL_AUX_USAGE_HIZ_CCS_WT for HZ+CCS WT mode

  • intel/isl: Add a separate ISL_AUX_USAGE_STC_CCS

  • intel/blorp: Allow STC_CCS in blit sources

  • iris: Use ISL_AUX_USAGE_STC_CCS for stencil CCS

  • intel: Require ISL_AUX_USAGE_STC_CCS for stencil CCS

  • intel/isl: Set DepthStencilResource based on aux usage

  • anv: Dump push ranges via VK_KHR_pipeline_executable_properties

  • anv: Fix the comparison in an assert

  • anv: Push UBO ranges relative to the start of the binding

  • anv: Do an end-of-pipe sync before updating AUX table entries

  • intel/isl: Don’t align linear images to 64K on Gen12+

  • intel/blorp: Add support for swizzling fast-clear colors

  • anv: Swizzle fast-clear values

  • intel/iris: Always initialize CCS to 0

  • anv: Only add END_OF_PIPE_SYNC if we actually have AUX_INVAL

  • util/sparse_array: Finish the sparse_array in the tests

  • util/sparse_array: Add a node_size_log2 temporary

  • meson,ci: Disable sparse_array tests on windows

  • util/sparse_array: Stash the node level in the node pointer

  • anv: Stop fetching the timestamp frequency ourselves

  • intel/dump_gpu: Add an ensure_device_info helper

  • intel/dump_gpu: Handle a bunch of getparam in the no-HW case

  • intel/nir: Run copy-prop and DCE after lower_bool_to_int32

  • nir: Add b2b opcodes

  • aco: Implement b2b32 and b2b1

  • nir: Use b2b opcodes for shared and constant memory

  • nir: Insert b2b1s around booleans in nir_lower_to

  • anv: Set alignments on descriptor and constant loads

  • nir: Validate that memory load/store ops work on whole bytes

  • nir: Set UBO alignments in lower_uniforms_to_ubo

  • nir/opt_loop_unroll: Fix has_nested_loop handling

  • nir/lower_int64: Lower 8 and 16-bit downcasts with nir_lower_mov64

  • nir/algebraic: Add downcast-of-pack opts

  • nir: Add a nir_op_is_vec helper

  • nir: Copy propagate through vec8s and vec16s

  • nir: Handle vec8/16 in bool_to_bitsize

  • nir: Handle vec8/16 in gather_ssa_types

  • nir: Handle vec8/16 in lower_phis_to_scalar

  • nir: Handle vec8/16 in lower_regs_to_ssa

  • nir: Handle vec8/16 in opt_split_alu_of_phi

  • nir: Treat vec8/16 as select in opt_peephole_select

  • nir: Handle vec8/16 in opt_undef_vecN

  • nir: Handle vec8/16 in nir_shrink_array_vars

  • anv: Account for the header in anv_state_stream_alloc

  • anv/allocator: Use util_dynarray for blocks in anv_state_stream

  • spirv: Implement OpCopyObject and OpCopyLogical as blind copies

  • Revert “spirv: Implement OpCopyObject and OpCopyLogical as blind copies”

  • anv/image: Use align_u64 for image offsets

  • nir/from_ssa: Only chain movs when a src is also a dest

  • intel/fs: Choose memory message type based on bit size

  • anv: Improve brw_nir_lower_mem_access_bit_sizes

  • iris: Set alignments on cbuf0 and constant reads

  • intel/nir: Lower memory access bit sizes later

  • nir/load_store_vectorize: Fix shared atomic info

  • nir/load_store_vectorize: Use nir_iadd_imm for offsets

  • nir/load_store_vectorize: Add support for nir_var_mem_global

  • intel/nir: Enable load/store vectorization

  • spirv: Add a vtn_block() helper

  • spirv: Add cast and loop helpers for vtn_cf_node

  • spirv: Make vtn_case a vtn_cf_node

  • spirv: Make vtn_function a vtn_cf_node

  • spirv: Add a parent field to vtn_cf_node

  • spirv: Rewrite CFG construction

  • Revert “spirv: Rewrite CFG construction”

  • nir: Assert memory loads are aligned

  • anv: Advertise SEND count through VK_EXT_pipeline_executable_properties

  • anv: Fix UBO range detection in anv_nir_compute_push_layout

  • nir: Add an alignment to nir_intrinsic_load_constant

  • nir: Add some sanity assertions in opt_large_constants

  • intel: Add _const versions of prog_data cast helpers

  • anv: Report correct SLM size

  • intel/batch_decoder: Stop printing to stdout

  • intel/cfg: Add first/last_block helpers

  • anv: Emit pushed UBO bounds checking code in the back-end compiler

  • intel/blorp: Delete an unused enum

  • spirv: Handle OOB vector extract operations

  • spirv,nir: Add a better vector_insert

  • spirv: Error if OpCompositeInsert/Extract has OOB indices

  • nir/builder: Handle any bit-size selector in nir_extract

  • spirv: Call nir_builder directly for vector_extract

  • spirv,nir: Move the SPIR-V vector insert code to NIR

  • anv: Move vb_emit setup closer to where it’s used in flush_state

  • anv: Apply any needed PIPE_CONTROLs before emitting state

  • nir/dominance: Better handle unreachable blocks

  • nir/gcm: Loop over blocks in pin_instructions

  • nir/gcm: Use an array for storing the early block

  • nir/gcm: Move block choosing into a helper function

  • nir/gcm: Add a real concept of “progress”

  • nir/gcm: Delete dead instructions

  • nir/gcm: Prefer the instruction’s original block

  • intel/fs: Rename block to scan_block in can_coalesce_vars

  • intel/fs: Coalesce when the src live range is contained in the dst

  • glsl: Hard-code noise to zero in builtin_functions.cpp

  • nir: Delete the fnoise opcodes

  • meta,i965: Rip GL_EXT_texture_multisample_blit_scaled support out of meta

  • spirv: Allow constants and NULLs in SpvOpConvertUToPtr

  • anv: Properly handle all sizes of specialization constants

  • radv: Properly handle all sizes of specialization constants

  • turnip: Properly handle all sizes of specialization constants

  • spirv: Use nir_const_value for spec constants

  • nir/opt_deref: Remove certain sampler type casts

  • spirv: Fix passing combined image/samplers through function calls

  • anv: Drop an assert

  • nir/lower_subgroups: Mask off unused bits in ballot ops

  • anv: Add a vk_image_layout_to_usage_flags helper

  • anv: Move vk_image_layout_is_read_only higher

  • anv: Be more conservative about image view usage

  • anv: Rework anv_layout_to_aux_state

  • anv/blorp: Do less hard-coding of aux usages

  • anv: Generalize some aux usage checks

  • intel/blorp: Allow more HiZ usages in hiz_clear_depth_stencil

  • anv: Simplify a case in layout_to_aux_usage

  • anv/cmd_buffer: Move anv_image_init_aux_tt higher

  • intel/isl: Delete a misleading comment

  • intel/isl: Refactor isl_surf_get_ccs_surf

  • anv: Add support for HiZ+CCS

  • spirv: Rewrite CFG construction

  • intel/devinfo: Compute the correct L3$ size for Gen12

  • anv: Expose CS workgroup sizes based on a maximum of 64 threads

  • anv: Return an error if allocating attachment memory fails

  • anv: Add TRANSFER_SRC to pass usage not subpass usage

  • anv: Stop filling out the clear color in compute_aux_usage

  • anv: Assert surface states are valid

  • anv: Use ANV_FROM_HANDLE for pInheritanceInfo fields

  • anv: Mark images written in end_subpass

  • anv: Split command buffer attachment setup in three

  • anv: Allocate surface states per-subpass

  • intel: Move swizzle_color_value from blorp to ISL

  • anv: Disallow fast-clears which require format-reinterpretation

  • anv: Stop allowing non-zero clear colors in input attachments

  • anv: Refactor cmd_buffer_setup_attachments

  • anv: Rework depth_stencil_attachment_compute_aux_usage

  • anv: Split color_attachment_compute_aux_usage in two

  • anv: Use anv_layout_to_aux_usage for color during render passes

  • anv: Allow all clear colors for texturing on Gen11+

  • vulkan: Update Vulkan XML and headers to 1.2.139

  • nir/copy_prop_vars: Handle volatile better

  • nir/copy_prop_vars: Report progress when deleting self-copies

  • nir/dead_write_vars: Handle volatile

  • nir/combine_stores: Handle volatile

  • anv: Handle NULL descriptors

  • anv: Handle null vertex buffer bindings

  • anv: Claim VK_EXT_robustness2 support

  • intel/fs: Don’t delete coalesced MOVs if they have a cmod

  • vulkan: Allow destroying NULL debug report callbacks

  • anv:gpu_memcpy: Emit 3DSTATE_VF_INDEXING on Gen8+

  • nir/lower_double_ops: Rework the if (progress) tree

  • nir/opt_deref: Report progress if we remove a deref

  • nir/copy_prop_vars: Record progress in more places

Jesse Natalie (3):

  • wgl: add official gldrv.h header-file

  • wgl: use gldrv.h instead of stw_icd.h

  • util/ralloc: fix ralloc alignment on Win64

John Stultz (7):

  • freedreno: Add ir3_cf.c and ir3_delay.c to Makefile.sources

  • panfrost: Move pan_afbc.c file to the right Makefile.source file

  • gallium: hud_context: Fix scalar initializer warning.

  • Android.mk: Tweak MESA_ENABLE_LLVM checks

  • etnaviv: Avoid shift overflow

  • vc4_bufmgr: Remove duplicative VC definition

  • r600: Fix build error in sfn_nir_lower_fs_out_to_vector.cpp

Jon Turney (1):

  • Fix util/process test on Cygwin

Jonathan Marek (79):

  • freedreno/a6xx: use single format enum

  • freedreno/a6xx: fix Z24_UNORM_S8_UINT_AS_R8G8B8A8

  • freedreno: name sysmem color/depth flush events

  • freedreno/a6xx: document some unknown bits

  • turnip: add option to force use of hw binning

  • turnip: fix COND_EXEC reserved size in tu_query

  • turnip: add tu_device pointer to tu_cs

  • turnip: automatically reserve cmdstream space in emit_pkt4/emit_pkt7

  • turnip: remove marker seqno

  • turnip: make cond_exec helper easier to use

  • turnip: move tile_load_ib/sysmem_clear_ib into draw_cs

  • hud: add GALLIUM_HUD_SCALE

  • turnip: enable sampleRateShading feature

  • turnip: enable fullDrawIndexUint32/independentBlend/dualSrcBlend/logicOp

  • etnaviv: disable INT_FILTER for ASTC

  • util/format: add missing BC4/BC5 vulkan formats

  • turnip: rework format table to support r5g5b5a1_unorm/b5g5r5a1_unorm

  • turnip: add r5g5b5a1_unorm/b5g5r5a1_unorm formats

  • turnip: check the right alignment requirement on shader iova

  • turnip: move some constant state to tu6_init_hw

  • turnip: remove unnecessary MRT_CONTROL fill

  • turnip: minify image_view extent

  • turnip: fix hw binning + render_area offset interaction

  • turnip: fix srgb MRT

  • turnip: don’t hardcode gmem base for input attachment

  • turnip: remove unnecessary fb size check

  • turnip: fall back to sysmem when attachments don’t fit into gmem

  • turnip: increase array sizes in tu_descriptor_map

  • turnip: improve binning pipe layout config

  • turnip: fix tile->slot calculation

  • etnaviv: nir: add compile_check_limits

  • freedreno/registers: more GRAS_CL_CNTL bits, Z_CLAMP

  • turnip: fix znear clipping

  • turnip: implement depth clamp

  • turnip: implement timestamp query

  • turnip: fix compute shaders crashing after geometry shader change

  • turnip: improve vertex input handling

  • turnip: use buffer size instead of bo size for VFD_FETCH_SIZE

  • freedreno/registers: add RB_CCU_CNTL bitfields

  • freedreno/a6xx: set bypass RB_CCU_CNTL value for blitter

  • turnip: RB_CCU_CNTL fixes

  • turnip: split up gmem/tile alignment

  • turnip: fix nir validate failure from push constant lowering

  • turnip: disable 8x msaa

  • turnip: save attachment samples in renderpass state

  • turnip: use dirty bits for dynamic viewport/scissor state

  • turnip: rework format helpers

  • turnip: add vk_format_is_snorm/is_float

  • turnip: new clear/blit implementation with shader path fallback

  • freedreno/computerator: support nop prefix

  • freedreno/computerator: support bindless sampler instructions

  • freedreno/ir3: fix emit_tex_info split_dest

  • freedreno/ir3: don’t overwrite wrmask in ir3_SAM

  • turnip: compute render_components/srgb_cntl at renderpass creation time

  • turnip: don’t limit framebuffer size to image size

  • turnip: image_view rework

  • nir: add common convert_ycbcr for vulkan csc

  • nir: convert_ycbcr: preserve alpha channel

  • anv: use common nir_convert_ycbcr

  • radv: use common nir_convert_ycbcr

  • turnip: fix GMEM resolve in CmdNextSubpass

  • turnip: disable depth test for S8_UINT attachment

  • turnip: improve GMEM load/store logic

  • turnip: enable VK_FORMAT_S8_UINT as stencil format

  • turnip: set shader key msaa field

  • turnip: implement VK_EXT_sample_locations

  • turnip: implement VK_EXT_filter_cubic

  • turnip: enable cube arrays

  • turnip: implement VK_EXT_sampler_filter_minmax

  • turnip: divide cube map depth by 6

  • freedreno/ir3: fix 16-bit ssbo access

  • freedreno/ir3: set even bit for f2f16_rtne

  • freedreno/ir3: fix incorrect conversion folding

  • turnip: remove unused RB_UNKNOWN_8E04_blit

  • turnip: use RESOLVE_TS event

  • turnip: add adreno 650

  • nir: add pack_32_2x16_split/unpack_32_2x16_split lowering

  • freedreno/ir3: run nir_lower_pack

  • turnip: fix wrong substream size in parse_multisample_and_color_blend

Jordan Justen (6):

  • intel/compiler: Restrict cs_threads to 64

  • intel: Update TGL PCI strings

  • intel: Add TGL PCI ID

  • intel/dev: Split .num_subslices out of GEN12_FEATURES macro

  • intel/dev: Add device info for RKL

  • docs/relnotes/new_features.txt: Add RKL to 20.1 release notes

Jose Maria Casanova Crespo (5):

  • broadcom: Fix implicit declaration of ffs for Android build

  • v3d: Sync on last CS when non-compute stage uses resource written by CS

  • v3d: Primitive Counts Feedback needs an extra 32-bit padding.

  • v3d: Fix swizzle in DXT3 and DXT5 formats

  • v3d: Include supported DXT formats to enable s3tc/dxt extensions

Joshua Ashton (3):

  • radv: Use TRUNC_COORD on samplers

  • radv: Pass logical device to si_emit_graphics

  • radeonsi: Use TRUNC_COORD on samplers

José Fonseca (4):

  • meson: Avoid duplicate symbols.

  • scons: Prune out unnecessary targets.

  • gitlab-ci: Prune all SCons jobs except scons-win64, and allows failures.

  • appveyor: Remove Meson job.

Juan A. Suarez Romero (6):

  • nir/lower_double_ops: add note for lowering mod

  • nir/lower_double_ops: relax lower mod()

  • nir/algebraic: coalesce fmod lowering

  • anv: use urb_setup_attribs in SBE

  • intel/compiler: store the FS inputs in WM prog data

  • anv/pipeline: allow more than 16 FS inputs

Karol Herbst (18):

  • clover: add trivial clCreateCommandQueueWithProperties implementation

  • nir/lower_ssbo: handle atomics

  • gallium: make handles of set_global_binding 64 bit

  • Revert “gallium: make handles of set_global_binding 64 bit”

  • nv50, nvc0: fix must_check warning of util_dynarray_resize_bytes

  • clover: fix build with single library clang build

  • gallium: add PIPE_CAP_SYSTEM_SVM

  • clover: add stubs for SVM

  • clover: implement CL_DEVICE_SVM_CAPABILITIES

  • clover: implement clSetKernelArgSVMPointer

  • clover: implement SVM functions for devices with fine grained system SVM support

  • clover: implement cl_arm_shared_virtual_memory

  • clover: expose cl_arm_shared_virtual_memory for devices with SVM support

  • nvc0: enable ASTC and ETC on GM20B

  • mesa: fix enum value of VIEWPORT_SWIZZLE_POSITIVE_W_NV

  • gallium: initialize viewport swizzle in cso_set_viewport_dims

  • Revert “nvc0: fix line width on GM20x+”

  • st/mesa: properly guard fallback_copy_texsubimage against failed maps

Kenneth Graunke (14):

  • intel/genxml: Drop “reserved” enum

  • isl: Fix the android build.

  • iris: Dump frame markers with INTEL_DEBUG=submit

  • iris: Trim “../../src/gallium/drivers/iris/” out of debug dump filenames

  • iris: Make mocs an inline helper in iris_resource.h

  • iris: Fix BLORP vertex buffers to respect ISL MOCS settings

  • iris: Set MOCS for constant packets on Gen12+

  • intel/compiler: Drop nir_lower_to_source_mods() and related handling.

  • intel/compiler: Put back saturate on [iu]add_sat opcodes

  • intel/compiler: Don’t copy prop source mods into PICK_HIGH_32BIT

  • intel/compiler: Delete abs/neg handling in fsign code

  • intel/compiler: Don’t create 64-bit src1 immediates in opt_peephole_sel

  • nir: Actually do load/store vectorization beyond vec2

  • iris: Fix downcast of bound_vertex_buffers from uint64_t to int

Konrad Dybcio (1):

  • freedreno/a4xx: enable A405

Kristian Høgsberg (39):

  • nir: Delete unused is_var_constant() helper

  • nir: Make unroll pragma work on clang

  • freedreno/fdperf: Cast away some ignored return values

  • spirv/opencl: Cast opcode up front to avoid warnings

  • glsl: Use ‘using’ to be explicit about visitor overloads

  • nir: Remove always-true assert

  • turnip: Be explicit about converting vk compare func to a6xx

  • freedreno/a6xx: Add fd6_resource_screen_init()

  • freedreno: Set up supported modifiers in fd*_resource_screen_init()

  • freedreno: Add layout_resource_for_modifier screen vfunc

  • freedreno/a6xx: Implement layout for DRM_FORMAT_MOD_QCOM_COMPRESSED

  • turnip: Drop explicit configure opt-in for turnip

  • ci: Drop turnip opt-in option

  • freedreno/ir3: Set IR3_REG_HALF flag on src as well in immediate MOV

  • Mark a few static inline helpers with ASSERTED

  • main/get: Converted type conversion macros to inline functions

  • nir/types: Add glsl_float16_type() helper

  • freedreno/ir3: Lower output precision

  • Revert “glsl: Use a simpler formula for tanh”

  • Revert “spirv: Use a simpler and more correct implementation of tanh()”

  • freedreno/ir3: Don’t fold conversions into sign

  • glsl: Add ir_constant constructor for fp16

  • glsl: Add fp16 case for ir_triop_lrp optimization

  • glsl: Implement constant propagation for fp16

  • glsl: Expand fp16 to float before constant expression evaluation

  • glsl: Add type queries for fp16+float and fp16+float+double

  • glsl/lower_instructions: Handle fp16 for FDIV_TO_MUL_RCP

  • radeonsi: Stop exposing PIPE_SHADER_CAP_FP16

  • turnip: Add missing VKAPI_ATTR annotations

  • turnip: Stub out VK_KHR_external_{fence,semaphore}_fd

  • turnip: Make Android platform build

  • turnip: Drop dep_llvm from dependencies

  • freedreno/ir3: Fix sz vs class confusion

  • freedreno/computerator: Decouple ir3 assembler

  • freedreno/ir3: Move ir3 assembler to backend compiler

  • freedreno/ir3: Parse, but ignore @in, @out and @tex headers

  • freedreno/ir3: Reset lex line number when we start parsing

  • freedreno/ir3: Print @tex write mask using 0x%x

  • freedreno: Use the right amount of &’s

Krzysztof Raszkowski (10):

  • gallium/swr: fix gcc warnings

  • gallium/swr: Fix gcc 4.8.5 compile error

  • gallium/swr: Fix llvm11 compilation issues

  • gallium/swr: simplify environment variable expansion code

  • gallium/swr: fix rdtsc debug statistics mechanism

  • gallium/swr: Fix min/max range index draw

  • Revert “gallium/swr: Fix min/max range index draw”

  • gallium/swr: Fix vcvtph2ps llvm intrinsic compile error

  • gallium/swr: Fix array stride problem.

  • gallium/swr: Re-enable scratch space for client-memory buffers

Leandro Ribeiro (1):

  • i965: remove duplicated comment

Leo Liu (1):

  • radeon/jpeg: fix the jpeg dt_pitch with YUYV format

Lepton Wu (1):

  • virgl: Use ETC2 formats directly when possible.

Lionel Landwerlin (49):

  • iris: implement gen12 post sync pipe control workaround

  • anv: implement gen9 post sync pipe control workaround

  • anv: implement gen12 post sync pipe control workaround

  • anv: set MOCS on push constants

  • mesa: add INTEL_blackhole_render

  • i965: enable INTEL_blackhole_render

  • st: add support for INTEL_blackhole_render

  • iris: add support INTEL_blackhole_render

  • intel/tools/aub_dump: move aub file initialization to maybe_init()

  • intel/tools/aub_dump: fix crash when using the default legacy context

  • intel/aub_dump: stub the waits when overriding the device

  • intel/tools/dump_gpu: fix getparam values

  • anv: stop storing prog param data into shader blobs

  • intel/decoder: don’t consider header fields past dword0

  • isl: implement linear tiling row pitch requirement for display

  • isl: properly filter supported display modifiers on Gen9+

  • isl: only apply main surface ccs pitch constraint with CCS

  • isl: drop min row pitch alignment when set by the driver

  • intel: add new TGL pci ids

  • i965/iris: fix crash when calling GetPerfQueryDataINTEL

  • vulkan/overlay: Add a workaround semaphore for application presenting without one

  • intel/perf: move register definition to special file

  • intel/perf: break GL query stuff away

  • intel/perf: move mdapi query definitions to their own file

  • intel/perf: document meaning of query field

  • intel/perf: store the probed i915-perf version

  • isl: set bpb for Y8_UNORM

  • isl: don’t warn in physical extent calculation for yuv formats

  • intel/aub_viewer: fix access to freed memory

  • drm-shim: return device platform as specified

  • drm-shim: stub libdrm’s use of realpath()

  • iris: properly free resources on BO allocation failure

  • iris: share buffer managers across screens

  • iris: make resources take a ref on the screen object

  • i965: store DRM fd on intel_screen

  • i965: share buffer managers across screens

  • iris: drop cache coherent cpu mapping for external BO

  • intel/perf: Enable MDAPI queries for Gen12

  • anv: skip writing perfcntr in results on Gen12+

  • util/sparse_free_list: manipulate node pointers using atomic primitives

  • iris: fail screen creation when kernel support is not there

  • include/drm-uapi: bump headers

  • intel/perf: store default sseu configuration

  • intel/perf: specify sseu configuration when supported

  • anv: force whole EU array to be powered for perf queries

  • drm-shim: provide a valid fake syncobj handle at creation

  • drm-shim: stub syncobj wait ioctl

  • iris: don’t assert on unfinished aux import in copy paths

  • anv: don’t expose VK_INTEL_performance_query without kernel support

Liviu Prodea (2):

  • scons/windows: Support build with LLVM 10.

  • util: Make process_test path compatible with mingw native toolchains

Louis-Francis Ratté-Boulianne (7):

  • glsl/linker: add DisableTransformFeedbackPacking workaround

  • glsl/linker: handle array/struct members for DisableXfbPacking

  • glsl/linker: add xfb workaround for modified built-in variables

  • gallium: add PIPE_CAP_PACKED_STREAM_OUTPUT

  • gallium: add PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED

  • gallium: add PIPE_CAP_PSIZ_CLAMPED

  • panfrost: fix transform feedback

Lucas Stach (1):

  • etnaviv: retarget transfer to render resource when necessary

Marek Olšák (254):

  • vbo: move GLvertexformat initialization into a template header file for reuse

  • vbo: use the template for noop GLvertexformat initialization

  • vbo: use the template for save GLvertexformat initialization

  • vbo: move reusable code from vbo_attrib_tmp.h into vbo_util.h

  • mesa: implement missing display list functions while switching to the template

  • radeonsi: don’t report that multi-plane formats are supported

  • radeonsi: fix the DCC MSAA bug workaround

  • radeonsi: don’t update states for the DCC MSAA bug on GFX6-7

  • glx: print FPS with 2 decimal places

  • mesa: fix incorrect uses of FLUSH_CURRENT

  • mesa: remove FLUSH_CURRENT calls that have no effect

  • mesa: import PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET handling

  • vbo: create the immediate mode buffer only in vbo_exec_vtx_map

  • vbo: skip FlushMappedBufferRange for glBegin/End by using a persistent mapping

  • vbo: don’t unmap persistent buffer mappings for glBegin/End

  • vbo: remove immediate mode code that doesn’t do anything and simplify stuff

  • vbo: interleave attrsz, attrtype, and active_sz in memory

  • vbo: remove a funky recursive call in glBegin

  • vbo: don’t check ctx->NewState twice in glBegin

  • vbo: keep the immediate mode buffer always mapped for simplicity

  • vbo: don’t set FLUSH_UPDATE_CURRENT for glVertex

  • vbo: pass only either uint32_t or uint64_t into ATTR_UNION

  • vbo: don’t store glVertex values temporarily into exec

  • vbo: optimize resizing vertex attributes during immediate mode

  • vbo: fix resizing 64-bit vertex attributes

  • vbo: use FlushVertices flags properly and clear NeedFlush correctly

  • vbo: increase the size of the immediate mode buffer to decrease draw count

  • vbo: add/update unlikely statements in ATTR_UNION

  • vbo: delay flagging FLUSH_STORED_VERTICES until glEnd

  • vbo: also map the immediate mode buffer for read

  • vbo: clean up resetting vertex attribs

  • vbo: merge use_buffer_objects into vbo_CreateContext to skip the big malloc

  • i965: don’t use _mesa_prim::is_indirect

  • mesa: remove unused _mesa_prim::is_indirect

  • mesa: don’t use bitfields in _mesa_prim

  • st/mesa: optimize st_update_array with ALWAYSINLINE

  • radeonsi: don’t wait for shader compilation to finish when destroying a context

  • mesa: translate into gallium vertex formats in mesa/main

  • mesa: remove unused _mesa_draw_indirect

  • st/mesa: always inline the code setting non-64bit vertex elements

  • st/mesa: simplify determination whether a draw has user vertex buffers

  • st/mesa: simplify determination whether a draw needs min/max index

  • st/mesa: change some loops from while to do..while in st_atom_array.c

  • st/mesa: make st_setup_current static

  • st/mesa: simplify releasing the current attrib buffer

  • gallium/u_upload_mgr: reduce dereferences by adding buffer_size

  • gallium/u_upload_mgr: don’t do align twice in the u_upload_alloc fast path

  • gallium/u_vbuf: adjust the heuristic for unrolling indices

  • gallium/cso_hash: inline a bunch of functions

  • gallium/cso_hash: make cso_hash declared within structures instead of alloc’d

  • gallium/cso_hash: remove always constant variable nodeSize

  • gallium/cso_hash: cosmetic changes, no behavior changes

  • gallium/cso_hash: remove another layer of pointer indirection

  • st/mesa: try to fix MSVC build failure due to ALWAYS_INLINE

  • vbo: remove dead code in vbo_can_merge_prims

  • vbo: remove redundant code in vbo_exec_fixup_vertex

  • mesa: document _mesa_prim::begin/end

  • mesa: don’t use memset in glDrawArrays

  • mesa: fix immediate mode with tessellation and varying patch vertices

  • gallium/util: remove unused u_surfaces.c/h

  • util: remove the dependency on kcmp.h

  • nir: fix gl_nir_lower_images for bindless images

  • tgsi_to_nir: set num_images and num_samplers with holes correctly

  • gallium/hash_table: consolidate hash tables with pointer keys

  • gallium/hash_table: consolidate hash tables with FD keys

  • gallium/hash_table: use the same callback signatures as util/hash_table

  • gallium/hash_table: turn it into a wrapper around util/hash_table

  • gallium/hash_table: remove some function wrappers

  • mesa: remove leftovers from ARB_shadow_ambient

  • mesa: call FLUSH_VERTICES before updating CoordReplace

  • i965: stop using “indirect” parameter from Driver.Draw (non-indirect)

  • mesa: remove unused “indirect” parameter from Driver.Draw

  • gallium/cso_hash: pack cso_node better

  • gallium/cso_hash: inline struct cso_hash_data

  • gallium: pass cso_velems_state into cso_context instead of pipe_vertex_element

  • gallium/u_threaded: fix uploading user indices with start != 0

  • gallium/u_threaded: convert dividing by index_size to a bit shift

  • mesa/i965: remove _mesa_prim::indirect_offset

  • mesa: remove redundant _mesa_prim::is_indexed

  • mesa: move num_instances and base_instance out of _mesa_prim

  • mesa: clean up glMultiDrawElements code, use alloca for small draw count (v2)

  • mesa: don’t unroll glMultiDrawElements if one count is 0

  • mesa: optimize glMultiDrawArrays, call Draw only once (v2)

  • mesa: fix incorrect prim.begin/end for glMultiDrawElements

  • nir: replace GCC unroll with an option that works on GCC < 8.0

  • gallivm: fix 5 warnings

  • nir: fix 5 warnings

  • mesa: fix 11 warnings

  • gallium/u_vbuf: silence a warning by using unreachable

  • mesa: add index_size_shift = log2(index_size) into _mesa_index_buffer

  • mesa: replace some index_size multiplications and divisions with shifts

  • vbo: don’t look at the second draw’s count when merging 2 glBegin/End draws

  • vbo: deduplicate copy_vertices functions

  • vbo: clean up vbo_copy_vertices

  • vbo: handle GS and tess primitive types when splitting Begin/End

  • vbo: clean up conditional blocks in ATTR_UNION

  • vbo: fold code from vbo_exec_fixup_vertex to vbo_exec_wrap_upgrade_vertex

  • Revert “mesa: check for z=0 in _mesa_Vertex3dv()”

  • mesa: remove _mesa_index_buffer::index_size in favor of index_size_shift

  • mesa: optimize get_index_size

  • mesa: deduplicate draw indirect functions

  • vbo: merge more primitive types for glBegin/End (v2)

  • vbo: merge draws even when begin==0 or end==0

  • glthread: don’t generate the sync fallback if the call size is not variable

  • glthread: don’t prefix variable_data with const

  • glthread: inline _mesa_unmarshal_dispatch_cmd and convert the switch to a table

  • glthread: reduce pointer dereferences in glthread_unmarshal_batch

  • glthread: use int instead of size_t where it’s OK

  • glthread: simplify repeated function sequences in marshal_generated.c

  • glthread: don’t insert _mesa_post_marshal_hook into every function

  • glthread: don’t increment variable_data if it’s the last variable-size param

  • glthread: add GL_DRAW_INDIRECT_BUFFER tracking and generator support

  • glthread: add/update count and marshal fields for many GL functions

  • glthread: handle complex pointer parameters and support GL functions with strings

  • glthread: check the size of all variable params and clean up the code

  • glthread: replace custom ClearBuffer marshalling with generated one

  • glthread: add support for TexParameteri and SamplerParameteri functions

  • glthread: add support for glFog, glLight, glLightModel, glTexEnv, glTexGen

  • glthread: add support for glClearNamedFramebuffer, glMaterial, glPointParameter

  • glthread: add support for glCallLists, glPatchParameterfv

  • glthread: add support for glMemoryObjectParameteriv, glSemaphoreParameterui64v

  • glthread: don’t insert an empty line after (void) cmd;

  • glthread: add marshal_call_after and remove custom glFlush and glEnable code

  • glthread: track for each VAO whether the user has set a user pointer

  • glthread: sync instead of disabling glthread for non-VBO pointers

  • glthread: replace custom glBindBuffer marshalling with generated one

  • glthread: merge glBufferData and glNamedBufferData into 1 set of functions

  • glthread: merge glBufferSubData and glNamedBufferSubData into 1 set of functions

  • glthread: add custom marshalling for glNamedBuffer(Sub)DataEXT

  • glthread: fix a crash with incorrect glShaderSource parameters

  • glthread: fall back if a param size is non-zero and a pointer param is NULL

  • radeonsi: add a bug workaround for NGG - LATE_ALLOC_GS

  • ac: add a bug workaround for the 100% NGG culling case

  • radeonsi: determine uses_bindless_samplers correctly

  • st/mesa: flush the bitmap cache before st/dri and vbo flushes

  • st/mesa: fix a possible crash with selection and feedback modes

  • gallium/cso_context: remove cso_delete_xxx_shader helpers to fix the live cache

  • st/mesa: keep serialized NIR instead of nir_shader in st_program

  • vbo: use vbo_exec_wrap_upgrade_vertex for glVertex in ATTR_UNION

  • vbo: fix transitions from glVertexN to glVertexM where M < N

  • vbo: fix vbo_copy_vertices for GL_PATCHES and adjacency primitive types

  • gallium: add PIPE_CAP_DRAW_INFO_START_WITH_USER_INDICES

  • mesa: don’t unroll glMultiDrawElements with user indices for gallium

  • radeonsi/gfx10: cache metadata in L2 on small chips

  • radeonsi: set better tessellation tunables on gfx9 and gfx10

  • radeonsi: tune primitive binning for small chips

  • ac: add radeon_info::use_late_alloc to control LATE_ALLOC globally

  • ac: disable late alloc on small gfx10 chips

  • gallium/u_threaded: don’t sync the thread for all unsynchronized mappings

  • gallium/u_vbuf: simplify the first if statement in u_vbuf_upload_buffers

  • ac: unify denorm setting enforcement

  • ac: set new LLVM denormal flags

  • ac: don’t set old denormals flags with LLVM >= 11

  • nir: fix clip/cull_distance_array_size in nir_lower_clip_cull_distance_arrays

  • mesa: use vbo_attrib_tmp.h to generate display list vertex attrib functions

  • mesa: remove redundant api_loopback functions

  • glthread: align the batch buffer to 8 bytes for pointers and doubles again

  • glthread: enable display lists

  • glthread: track VAOs created by CreateVertexArrays

  • glthread: don’t execute any custom VAO and BindBuffer code in the Core profile

  • glthread: remove debug_print_marshal function

  • glthread: clean up debug_print_sync code

  • glthread: don’t declare unmarshal functions as inline

  • winsys/radeon: change to 3-space indentation

  • driconf: enable glthread for “From The Depths”

  • glthread: remove _mesa_post_marshal_hook, because it’s not very useful

  • glthread: simplify printing safe_mul in gl_marshal.py

  • glthread: autogenerate prototypes for custom-marshalled functions

  • glthread: move buffer functions into glthread_bufferobj.c

  • glthread: rename marshal.h/c to glthread_marshal.h and glthread_shaderobj.c

  • mesa: put gl_thread_state inside gl_context to remove pointer indirection

  • glthread: handle buffer unbinding via glDeleteBuffers

  • glthread: rename non_vbo helper functions

  • glthread: track which vertex array attribs are enabled

  • glthread: ignore vertex arrays with user pointers if they’re disabled

  • glthread: remove the marshal_fail XML attribute

  • vbo,gallium: make glBegin/End buffer size configurable by drivers

  • ac: fix fast division

  • st/mesa: fix use of uninitialized memory due to st_nir_lower_builtin

  • glthread: inline SET_func and add -O1 to build _mesa_create_marshal_table faster

  • glthread: declare marshal and unmarshal functions as non-static

  • glthread: compile marshal_generated.c faster by breaking it up into 8 files

  • nir: add and gather shader_info::writes_memory

  • glsl_to_tgsi: set shader_info::writes_memory

  • mesa: allow out-of-order drawing to optimize immediate mode if it’s safe

  • radeonsi: enable full out-of-order drawing when allow_draw_out_of_order is set

  • mesa: try to fix the android build

  • Move compiler.h and imports.h/c from src/mesa/main into src/util

  • mesa: don’t use <> for including internal headers

  • util: stop including files from mesa/main

  • radv: stop including files from mesa/main

  • util: don’t include p_defines.h and u_pointer.h from gallium

  • util: remove duplicated MALLOC_STRUCT and CALLOC_STRUCT

  • radeonsi: remove obsolete TODO comment related to compute-based culling

  • radeonsi: fix incorrect ordered_wave_id initialization for compute-based culling

  • radeonsi: set amdgpu-gds-size for mode == 2 of compute-based culling

  • radeonsi: always create wait_mem_scratch for compute-based culling

  • radeonsi: add num_vbos_in_user_sgprs into the shader cache key

  • radeonsi/gfx10: don’t use NGG culling if compute-based culling is used

  • radeonsi/gfx10: fix ds.ordered.add intrinsic for compute-based culling

  • radeonsi/gfx10: use correct ACQUIRE_MEM packet for compute-based culling

  • radeonsi/gfx10: fix the wave size for compute-based culling

  • radeonsi/gfx10: fix descriptors and compute registers for compute-based culling

  • gallium/u_threaded: call the driver to pin threads to L3 immediately

  • st/mesa: add environment variable pin_app_thread for faster glthread on AMD Zen

  • driconf: whitelist more games for glthread

  • mesa: optimize initialization of new VAOs

  • mesa: don’t ever set NullBufferObj in gl_vertex_array_binding

  • mesa: don’t ever bind NullBufferObj for glBindBuffer targets

  • mesa: don’t ever bind NullBufferObj to glBindBuffer(Base,Range) slots

  • mesa: remove NullBufferObj

  • mesa: remove no longer needed _mesa_is_bufferobj function

  • mesa: precompute _mesa_primitive_restart_index during state changes

  • mesa: split _mesa_primitive_restart_index into a function without gl_context

  • vbo: expose helper function vbo_get_minmax_index_mapped for glthread

  • util: move and adjust the vertex upload heuristic equation from u_vbuf

  • st/mesa: fix a crash due to passing a draw vertex shader into the driver

  • ac: out-of-order rasterization is not supported on gfx10

  • ac,radeonsi: simplify checking for Navi1x chips

  • radeonsi: use pipe_blend_state::max_rt to update fewer blend registers

  • ac: force enable -structurizecfg-skip-uniform-regions for LLVM 11

  • ac: update and document fast math flags used by radeonsi

  • ac: generate FMA for inexact instructions for radeonsi

  • ac: reassociate FP expressions for inexact instructions for radeonsi

  • mesa: replace _NEW_EVAL with vbo_exec_update_eval_maps

  • mesa: reset primitive restart state in glClientAttribDefaultEXT

  • mesa: remove exec=”dynamic” from Draw functions that are not really dynamic

  • glthread: use 32-bit align instead of 64-bit ALIGN

  • glthread: reduce dereferences of the next batch

  • glthread: use GLenum16 in batch buffers to save space

  • glthread: sort variables in marshal structures to pack them optimally

  • gallium: add PIPE_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE for glthread

  • mesa: add Const.BufferCreateMapUnsynchronizedThreadSafe & MESA_MAP_THREAD_SAFE

  • mesa: add offset_is_int32 param into _mesa_bind_vertex_buffer for glthread

  • mesa: extend _mesa_bind_vertex_buffer to take ownership of the buffer reference

  • mesa: replace GLenum target with gl_shader_stage in NewProgram

  • ac/surface: rename micro tile mode enums like gfx10 uses them

  • ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it’s always set

  • ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE

  • ac/surface: match get_display_flag() with expectations for is_displayable

  • ac/surface: don’t compute DCC if it’s unsupported by DCN on gfx9+

  • ac/surface: move non-displayable DCC to the end of the buffer

  • ac/surface: add code for gfx10 displayable DCC

  • ac/surface: validate that DCC is enabled correctly on gfx9+

  • ac: enable displayable DCC on Navi12 & Navi14

  • mesa: report GL_INVALID_OPERATION for invalid glTextureBuffer target

  • st/mesa: expose more SPIR-V capabilities

  • radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size

  • radeonsi: revert an accidental change in si_clear_buffer

  • Revert “ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it’s always set”

  • Revert “ac: reassociate FP expressions for inexact instructions for radeonsi”

  • ac/surface: fix MSAA crash with FORCE_SWIZZLE_MODE on gfx9

  • radeonsi: fix compilation of monolithic PS

  • radeonsi: don’t expose 16xAA on chips with 1 RB due to an occlusion query issue

Marek Vasut (4):

  • etnaviv: Destroy rsc->pending_ctx set in etna_resource_destroy()

  • etnaviv: Emit PE.ALPHA_COLOR_EXT* on GPUs with half-float support

  • etnaviv: Fix depth stencil ops on GC880/GC2000

  • etnaviv: Disable seamless cube map on GC880

Mark Janes (2):

  • nir: check shader type before writing to shaderinfo.tess union

  • nir: place aligned members after bitfields in shader_info.tess

Mark Menzynski (2):

  • util/blob: Add overwrite function for uint8

  • tgsi/util: Change boolean for bool

Martin Fuzzey (3):

  • freedreno: android: fix build failure on android due to python version

  • freedreno: android: add a6xx-pack.xml.h generation to android build

  • freedreno: android: fix build of perfcounters.

Mathias Fröhlich (19):

  • egl: Implement getImage/putImage on pbuffer swrast.

  • mesa: Fix FLUSH_VERTICES in SubpixelPrecisionBiasNV.

  • egl: Fix A2RGB10 platform_{device,surfaceless} PBuffer configs.

  • egl: Factor out dri2_add_pbuffer_configs_for_visuals {device,surfaceless}.

  • mesa: Check for OpenGL state change before flushing vertices.

  • mesa: Flush vertices before changing the OpenGL state.

  • i965: Move down genX_upload_sbe in profiles.

  • iris: Move down iris_emit_sbe_swiz in profiles.

  • i965: Use 32 bit u_bit_scan for vertex attribute setup.

  • i965: Use the VAOs binding information in array setup.

  • i965: Test original vertex array pointer to skip array upload.

  • i965: Split merge_inputs and clear_buffers.

  • i965: Reorder workaround flags computation.

  • i965: Remove glbinding from brw_vertex_element.

  • mesa: Remove now unused _mesa_draw_attrib_and_binding.

  • mesa: Remove now unused _mesa_draw_attrib.

  • mesa: Provide gl_vertex_format accessors.

  • i965: Make use of the vertex format functions in i965.

  • i965: Use gl_vertex_format in brw_vertex_element.

Matt Turner (11):

  • intel/tools: Do not print type/qualifiers/name for c_literal

  • intel/vec4: Make implied_mrf_writes() a vec4_instruction method

  • intel/compiler: Remove unnecessary local variables

  • intel/compiler: Make instructions_to_schedule a local variable

  • intel/compiler: Mark some methods and parameters const

  • intel/compiler: Mark visitor parameters to scheduler const

  • intel/compiler: Pass backend_shader * to cfg_t()

  • intel/compiler: Pass shader_stats for each SIMD mode

  • intel/compiler: Discount NOPs from instruction counts

  • isl: Avoid EXPECT_DEATH in unit tests

  • meson: Specify the maximum required libdrm in dri.pc

Mauro Rossi (5):

  • android: gallium/auxiliary: fix “Unused source files” in tesselator

  • android: aco: fix PIPE_FORMAT related building errors

  • android: r600/sfn: fix includes and libmesa_nir dependency

  • android: r600/sfn: Add GDS instructions

  • android: aco: add various compiler statistics

Michel Dänzer (33):

  • gitlab-ci: Update to latest ci-templates HEAD

  • gitlab-ci: Pass -j4 to make

  • gitlab-ci: Merge ccache and libxml2-utils into main apt-get install

  • gitlab-ci: Add ppc64el and s390x cross-build jobs

  • gitlab-ci: Build radeonsi & RADV in the ppc64el job

  • llvmpipe: Bump test timeout to 180 seconds

  • gitlab-ci: Only use gstreamer runners for the s390x job for now

  • gitlab-ci: Sort random failure softpipe skips

  • gitlab-ci: Add three more dEQP-GLES31 tests to softpipe skips

  • st/vdpau: Only call is_video_format_supported hook if needed

  • winsys/amdgpu: Make local variable r signed

  • util: Change os_same_file_description return type from bool to int

  • gitlab-ci: Drop “test-” prefix from llvmpipe/softpipe job names

  • gitlab-ci: Distribute jobs across more stages

  • gitlab-ci: Always name artifacts archive after the job producing it

  • gitlab-ci: Don’t restrict ppc64el/s390x build jobs to gstreamer runners

  • gitlab-ci: Don’t use buster-backports packages by default for x86_build

  • gitlab-ci: Fold scons-swr job into scons job

  • gitlab-ci: Move classic driver testing to a new meson-classic job

  • llvmpipe: Use uintptr_t for pointer values

  • gitlab-ci: Enable more Gallium drivers in meson-i386 job

  • gitlab-ci: Restrict s390x/ppc64el jobs to packet runners

  • gitlab-ci: Update to current templates

  • gitlab-ci: Rename “paths” YAML anchor to “all_paths”

  • gitlab-ci/lava: Add needs: for container image to test jobs (again)

  • gitlab-ci: Don’t require triggering build/test jobs manually

  • gitlab-ci: Run merge request pipelines automatically only for Marge Bot

  • gitlab-ci: Use all_paths in .test-manual rules

  • gbm/dri: Propagate queryDmaBufModifiers return value

  • amd/addrlib: Use enum instead of sparse chars to identify dimensions

  • mesa: Skip 3-byte array formats in _mesa_array_format_flip_channels

  • Revert “ac,radeonsi: fix compilations issues with LLVM 11”

  • Revert “gallium/gallivm: fix compilation issues with llvm 11”

Mike Blumenkrantz (6):

  • zink: set UBO alignments in nir_intrinsic_load_uniform lowering

  • zink: remove framebuffer cache

  • zink: explicitly unref old fb object when setting new one

  • iris: move iris_vtable to iris_screen

  • gallium: add pipe cap for scissored clears and pass scissor state to clear() hook

  • iris: handle PIPE_CAP_CLEAR_SCISSORED

Nanley Chery (6):

  • isl: Add a module which manages aux resolves

  • iris: Use isl_aux_usage_has_fast_clear()

  • iris: Use ISL’s access preparation functions

  • iris: Use isl_aux_state_transition_write()

  • i965: Use ISL’s access preparation functions

  • i965: Use isl_aux_state_transition_write()

Nataraj Deshpande (1):

  • dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_R8G8B8X8_UNORM

Neha Bhende (2):

  • svga: fix size of format_conversion_table[]

  • svga: Use pipe_shader_state_from_tgsi to set shader state

Neil Armstrong (4):

  • gitlab-ci/lava: fix handling of lava tags

  • Revert “ci: Remove T820 from CI temporarily”

  • gitlab-ci: add FILES_HOST_URL and move FILES_HOST_NAME into jobs

  • gitlab-ci: re-enable mali400/450 and t820 jobs

Neil Roberts (17):

  • nir/opcodes: Add nir_op_f2fmp

  • glsl: Add support for float16 types in the IR tree

  • glsl: Add IR conversion ops for 16-bit float types

  • glsl: Add b2f16 and f162b conversion operations

  • glsl: Add ir_unop_f2fmp

  • glsl/validate: Allow float16 in the expression tree

  • glsl/lower_instructions: Use float16 constants when appropriate

  • glsl/opt_minmax: Add support for float16

  • glsl: Add a method to get precision from a deref instruction

  • glsl/hierarchical_visitor: Call leave_callback on leaf nodes

  • glsl: Add an IR lowering pass to convert mediump operations to 16-bit

  • glsl/standalone: Add an option to lower the precision

  • glsl: Add unit tests for the lower_precision pass

  • freedreno/ir3: Lower bools to bitsize

  • glsl: Inline builtins in a separate pass

  • glsl/lower_precision: Lower builtins depending on arguments

  • glsl/lower_precision: Use vector.back() instead of vector.end()[-1]

Paulo Zanoni (8):

  • intel: fix the gen 11 compute shader scratch IDs

  • intel: fix the gen 12 compute shader scratch IDs

  • intel/device: bdw_gt1 actually has 6 eus per subslice

  • anv: multiply the scratch space by 4 on gen9-10 like iris and i965

  • iris: remove hole from struct iris_bo

  • iris: remove unnecessary forward declaration

  • iris: remove useless bo->gtt_offset assignment

  • iris: make BATCH_SZ smaller by BATCH_RESERVED bytes

Peng Huang (1):

  • radeonsi: make si_fence_server_signal flush pipe without work

Pierre Moreau (1):

  • clover/nir: Check the result of spirv_to_nir

Pierre-Eric Pelloux-Prayer (44):

  • radeonsi/ngg: add VGT_FLUSH when enabling fast launch

  • radeonsi: test subsampled format in testdma

  • format: add format_to_chroma_format

  • gallium/video: remove pipe_video_buffer.chroma_format

  • gallium/vl: add 4:2:2 support

  • radeonsi: fix surf_pitch for subsampled surface

  • st/va: enable 4:2:2 chroma format

  • st/va: add support YUY2

  • radeonsi: remove AMD_DEBUG=sisched option

  • omx: fix build with gcc 10

  • meson: enable -fno-common by default

  • gitlab-ci: rules:changes to test on tested drivers changes

  • vdpau: remove bogus assert

  • st/mesa: disallow deferred flush if there are multiple contexts

  • radeonsi: enable glsl_zero_init for Curse of the Dead Gods

  • radeonsi: clarify the conditions when FLUSH_AND_INV_DB is needed

  • util/os_file: extend os_read_file to return the file size

  • util/u_process: add util_get_process_exec_path

  • util/xmlconfig: add new sha1 application attribute

  • radeonsi: enable workarounds for YoYo engine based games

  • util/u_process: fix Windows build

  • nir: update uses_demote flag in discard_to_demote pass

  • ac: fix ac_build_is_helper_invocation when postponed_kill is null

  • util: fix process_test path

  • ddebug: add missing forward declaration

  • radeon: fix includes

  • radeonsi: switch to 3-spaces style

  • radeon: switch to 3-spaces style

  • gallium/util: let shader live cache users know if a hit occurred

  • radeonsi: dump shader stats when hitting the live cache

  • util/xmlconfig: fix sha1 comparison code

  • mesa: update pipeline when re-linking a program in use

  • gallium/u_threaded: flush batch when hitting mapping limit

  • radeonsi: use thread_context::bytes_mapped_limit

  • radeonsi: don’t assume ctx is always a threaded_context

  • radeonsi: skip vs output optimizations for some outputs

  • mesa: fix crash in find_value

  • gallium/utils: silence strncpy warning

  • st/omx: fix gcc warnings

  • radeonsi: fix export count

  • mesa: add gl_context::ForceIntegerTexNearest

  • driconf: add force_integer_tex_nearest option

  • radeonsi: don’t print gs_copy_shader stats for shaderdb

  • amd/addrlib: fix forgotten char -> enum conversions

Plamena Manolova (2):

  • intel/compiler: Add support for variable workgroup size

  • i965: Implement ARB_compute_variable_group_size

Qiang Yu (35):

  • lima: remove definition of lima_is_scanout

  • lima: use util_copy_framebuffer_state

  • lima: always add texture bo to submit

  • lima: remove lima_ctx_buff_va submit flags (v2)

  • lima: pass array as parameter to PLBU and VS command macros

  • lima: delay add plb buffer to submit when flush

  • lima: delay plbu head command generation to flush stage (v2)

  • lima: add render target to submit by dirty buffer flags

  • lima: add missing resolve check for damage and reload

  • lima: move syncobj from lima_submit to lima_context

  • lima: merge gp/pp submit

  • lima: put hardware related info to lima_gpu.h

  • lima: move flush code to lima_submit.c

  • lima: pass submit parameter for functions in lima_submit.c (v2)

  • lima: add lima_submit_create_stream_bo

  • lima: adjust pp_stream to use lima_submit_create_stream_bo

  • lima: use lima_submit_create_stream_bo for plbu/vs_cmd and pp_stack

  • lima: add lima_submit_get

  • lima: make lima_submit one time use drop data (v3)

  • lima: track write submits of context (v3)

  • lima: move plbu/vs_cmd_array into lima_submit

  • lima: move resolve into lima_submit

  • lima: move pp_max_stack_size to lima_submit

  • lima: move damage_rect into lima_submit

  • lima: move clear into submit (v2)

  • lima: move framebuffer info to lima_submit

  • lima: use per submit dump file

  • lima: optional flush submit in lima_clear

  • lima: enable multi submit optimization

  • lima: move dump check to macro for lima_dump_command_stream_print

  • lima: rename lima_submit to lima_job

  • lima: fix buffer import with offset

  • lima: also check tiled and depth case when import

  • lima: set offset when export resource

  • panfrost: don’t always build bifrost_compiler

Quentin Glidic (1):

  • meson: Use dependency.partial_dependency()

Rafael Antognolli (18):

  • intel: Load the driver even if I915_PARAM_REVISION is not found.

  • intel/tools: Update aubinator_error_decode.

  • intel/blorp: Implement GEN:BUG:1605967699.

  • iris: Apply the flushes when switching pipelines.

  • anv: Wait for the GPU to be idle before invalidating the aux table.

  • iris: Split aux map initialization from invalidation.

  • iris: Wait for the GPU to be idle before invalidating the aux table.

  • intel/isl: Implement D16_UNORM workarounds.

  • intel/gen12+: Disable mid thread preemption.

  • iris: Enable EXT_depth_bounds_test extension.

  • drm-uapi: Update headers from Linux 5.7-rc1.

  • i965/bufmgr: Factor out GEM_MMAP ioctl from mmap_cpu and mmap_wc.

  • iris/bufmgr: Factor out GEM_MMAP ioctl from mmap_cpu and mmap_wc.

  • i965/bufmgr: Add support for MMAP_OFFSET ioctl.

  • iris/bufmgr: Add support for MMAP_OFFSET ioctl.

  • anv: Add anv_device parameter to anv_gem_munmap.

  • anv: Add support for new MMAP_OFFSET ioctl.

  • anv: Enable HiZ on multi-layer depth buffers.

Rhys Perry (118):

  • aco: fix gfx10_wave64_bpermute

  • aco: gfx10_wave64_bpermute reduce op to print_ir

  • aco: disable some instruction combining if it could change an exec operand

  • aco: improve SCC handling in some SALU combines

  • nir: fix nir_const_value_as_uint bit size in load/store vectorizer tests

  • gitlab-ci: remove load_store_vectorizer from expected s390x test failures

  • aco: add RegisterFile

  • aco: add some helpers for filling/testing register ranges

  • aco: improve GFX9 1D ddx/ddy assertion

  • spirv: improve creation of memory_barrier

  • spirv: fix memory_barrier_tcs_patch emission

  • aco: keep track of which events are used in a barrier

  • aco: fix carry-out size for wave32 v_add_co_u32_e64

  • aco: handle v_add_co_u32_e64 in parse_base_offset()

  • aco: add new NOP insertion pass for GFX6-9

  • aco: improve get_wait_states()

  • aco: consider non-hazard writes in handle_raw_hazard_internal

  • aco: improve control flow handling in GFX6-9 NOP pass

  • aco: only reserve sgprs for vcc if it’s used

  • aco: fix uninitialized data error in waitcnt pass

  • glsl/list: use uintptr_t for exec_node_data()’s subtraction

  • aco: add helpers for moving instructions for scheduling

  • aco: add helpers for ensuring correct ordering while scheduling

  • aco: allow barriers to be skipped during scheduling

  • aco: don’t stop scheduling at exports

  • aco: move some register demand helpers into aco_live_var_analysis.cpp

  • aco: add a late kill flag

  • aco: set late kill for v_interp_p1_f32 for some APUs

  • aco: fix instruction encoding for LS VGPR init bug workaround

  • aco: fix operand order for LS VGPR init bug workaround

  • nir/gather_info: handle emit_vertex_with_counter

  • radv: call nir_shader_gather_info again

  • radv/winsys: set has_syncobj_wait_for_submit in the null winsys

  • aco: set has_divergent_branch for discards in loops

  • aco: handle missing second predecessors at merge block phis

  • aco: handle when ACO adds new continue edges

  • aco: skip NIR in unreachable merge blocks

  • aco: improve check for unreachable loop continue blocks

  • aco: emit IR in IF’s merge block instead if the other side ends in a jump

  • aco: fix boolean undef regclass

  • nir/gather_info: fix per-vertex handling in try_mask_partial_io

  • aco: remove dead code in handle_operands()

  • aco: implement 64-bit VGPR constant copies in handle_operands()

  • aco: look at p_{extract,split}_vector’s definitions in pred_by_exec_mask()

  • glsl: fix race in instance getters

  • util/u_queue: fix race in total_jobs_size access

  • radv: add code for exposing compiler statistics

  • aco: add various compiler statistics

  • aco: add vmem/smem score statistic

  • radv, aco: collect statistics if requested but executables are not

  • radv: fix null winsys gpu_info array

  • aco: make PhysReg in units of bytes

  • aco: add SDWA_instruction

  • aco: print and validate opsel

  • aco: add emission support for register-allocated sdwa sels

  • aco: remove divergence check in sanitize_if()

  • aco: zero-initialize Temp

  • aco: improve vector optimization with sub-dword vectors

  • aco: fix p_extract_vector validation

  • aco: improve p_create_vector RA for sub-dword operands

  • aco: clear moved operands in get_reg_create_vector()

  • aco: fix 1D textureGrad() on GFX9

  • aco: implement various 8/16-bit conversions

  • aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y

  • aco: fix copy statistic for 64-bit vgpr constant copy

  • aco: add VOP3P_instruction

  • aco: implement sub-dword swaps

  • aco: implement 64-bit sgpr swaps

  • nir/lower_bit_size: fix lowering of shifts

  • nir/lower_bit_size: fix lowering of {imul,umul}_high

  • nir/algebraic: don’t undo lowering of 8/16-bit comparisons to 32-bit

  • aco: decrease the uses of other copy operations after splitting/removing

  • aco: copy-propagate p_create_vector copies of vectors

  • aco: remove copy in load_input_from_temps()

  • aco: move call to store_output_to_temps in store_ls_or_es_output earlier

  • aco: combine VALU and SALU into various VOP3 instructions

  • aco: improve code for 32-bit isign

  • aco: fix v_or(s_lshl) and v_add(s_lshl) optimizations

  • aco: fix outdated label_vec from p_create_vector labelling

  • radv: align buffer descriptor sizes to dword

  • radv: allocate larger shader memory slabs if needed

  • aco: be more careful about using SMEM for load_global

  • aco: add and use RegClass::get() helper

  • aco: add emit_load helper

  • aco: refactor load_lds to use new helpers

  • aco: use emit_load helper for VMEM/SMEM loads

  • aco: add helpers for splitting stores

  • aco: refactor store_lds() to use new helpers

  • aco: refactor store_vmem_mubuf() to use new helpers

  • aco: refactor visit_store_ssbo() to use new helpers

  • aco: refactor visit_store_global() to use new helpers

  • aco: refactor visit_store_scratch() to use new helpers

  • aco: add and use get_buffer_store_op() helper

  • aco: allow 8/16-bit shared loads

  • aco: vectorize global loads/stores

  • aco: handle undef p_create_vector operands in the optimizer

  • aco: clobber scc in s_bfe_u32 in get_alu_src()

  • aco: improve sub-dword emit_split_vector() with sgprs

  • aco: lower 8/16-bit integer arithmetic

  • radv/aco: enable 8/16-bit storage and int8/int16 on GFX8+

  • aco: make RegisterFile::block() take a regclass

  • aco: check alignment of non-subdword registers in get_reg_specified()

  • aco: fix neighboring register check in get_reg_simple()

  • aco: split self-intersecting copies instead of swapping

  • aco: don’t recurse in sub-dword get_reg_simple()

  • aco: improve RA for uneven p_split_vector

  • aco: add missing adjust_max_used_regs()

  • aco: fix sub-dword out-of-bounds check in RA validator

  • aco: fix sub-dword overwrite check in RA validator

  • aco: add various GFX10 int16 opcodes

  • aco: improve clamped integer addition disassembly workaround

  • aco: fix vgpr nir_op_vecn with sgpr operands

  • aco: consider blocks unreachable if they are in the logical cfg

  • aco: remove use of f-strings

  • aco: add message to static_assert

  • nir: add missing group_memory_barrier handling

  • nir/opt_if: run opt_peel_loop_initial_if after all other optimizations

  • nir: fix lowering to scratch with boolean access

Rob Clark (147):

  • freedreno/drm: readonly cmdstream

  • freedreno/ir3: shuffle a few ir3_register fields

  • freedreno/ir3: cleanup after lower_locals_to_regs

  • freedreno/ir3: fix crash when no non-input instructions

  • freedreno/ir3: split out delay helpers

  • freedreno/ir3: move nop padding to legalize

  • freedreno/ir3: move block-scheduling into legalize

  • freedreno/ir3: move atomic fixup after RA

  • freedreno/ir3: a bit more optmsgs debug

  • freedreno/ir3/ra: make use()/def() functions instead of macros

  • freedreno/ir3: fix kill scheduling

  • freedreno/ir3: post-RA sched pass

  • freedreno/ir3: number instructions from one

  • freedreno/ir3: add is_tex_or_prefetch()

  • freedreno/ir3: don’t precolor unused inputs

  • freedreno/ir3: two pass register allocation

  • freedreno/a6xx: fix lrz overflow

  • freedreno/ir3: add RA sanity check

  • freedreno/ir3: remove unused tex arg harder

  • freedreno/ir3: create fragcoord instructions in input block

  • freedreno/ir3: simplify split from collect

  • freedreno/ir3: fix a dirty lie

  • freedreno: allow ctx->batch to be NULL

  • freedreno/ir3: fold const conversion into consumer

  • freedreno: allow INVALID modifier

  • freedreno/registers: teach gen_header.py about a3xx_regid

  • freedreno/a6xx: few register updates

  • freedreno: quiet INFO_MSG

  • freedreno/registers: cleanup CP_SET_MARKER

  • freedreno/computerator: import parser/lexer from fdre-a3xx

  • freedreno/computerator: polish out some of the rust

  • freedreno/computerator: rename prefix asm->ir3

  • freedreno/ir3: allow block->predecessors to be null

  • freedreno/computerator: add computerator

  • freedreno/computerator: fix build dependency

  • freedreno/ir3: remove from_tgsi

  • freedreno/a6xx: remove unused param

  • freedreno/a6xx: emit LRZ clear in sysmem too

  • freedreno/a6xx: whitespace fix

  • freedreno/a6xx: don’t emit YIELD packet

  • freedreno/a6xx: enable SKIP_IB2_ENABLE properly

  • freedreno: honor FD_MESA_DEBUG=nogrow

  • freedreno/ir3: remove regmask_set_if_not()

  • freedreno/ir3: rewrite regmask to better support a6xx+

  • freedreno/ir3: don’t hide latency when there is none to hide

  • freedreno/ir3: track half-precision live values

  • freedreno/ir3: update SFU delay

  • freedreno/ir3: fix crash with samgq workaround

  • freedreno/ir3: don’t precolor unassigned inputs

  • freedreno/ir3: fix assert with getinfo

  • freedreno/ir3: add assert

  • nir/print: show variable precision

  • freedreno/ir3: also lower lowp frag outputs

  • freedreno/computerator: add hrsq/hlog2/hexp2

  • freedreno/ir3: remove extra nops inserted in scheduler

  • freedreno/ir3: add simplified stall estimation

  • freedreno: fix FD_MESA_DEBUG=inorder

  • util/ra: spiff out select_reg_callback

  • util/ra: move NO_REG to header

  • freedreno/ir3: split out has_latency_to_hide()

  • freedreno/ir3: fix has_latency_to_hide

  • freedreno/ir3: track register usage in first RA pass

  • freedreno/ir3: round-robin RA

  • freedreno/ir3: try to avoid syncs

  • freedreno/computerator: add performance counter support

  • freedreno/fdperf: set locale

  • freedreno/a6xx: register update

  • freedreno/ir3: small cleanup and comments

  • freedreno/ir3: add bary_ij as src for meta:tex_prefetch

  • freedreno/ir3: remove unused helper

  • freedreno/ir3: fix bogus register footprint with tess/gs

  • freedreno/ir3: reformat disasm output

  • freedreno/ir3: convert debug bitfield to BITFIELD_BIT()

  • freedreno/ir3/ra: add debug option for RA debug msgs

  • freedreno/ir3/ra: split-up

  • freedreno/ir3/ra: add helper to map name to instruction

  • freedreno/ir3/ra: fix target register calculation

  • freedreno/ir3/ra: add helper to map name to array

  • freedreno/ir3/ra: drop extending output live-ranges

  • freedreno/ir3/ra: add def/use iterators

  • freedreno/ir3/ra: fix array liveranges

  • freedreno/ir3/ra: compute register target from liveranges

  • freedreno/ir3/ra: pick higher numbered scalars in first pass

  • freedreno/ir3/ra: split building regs/classes and conflicts

  • freedreno/ir3/ra: re-work a6xx merged register file conflicts

  • gitlab-ci: disable vs2019 build

  • freedreno: remove some obsolete debug options

  • util: fix u_fifo_pop()

  • freedreno: add logging infrastructure

  • freedreno/a6xx: timestamp logging support

  • freedreno: add some initial fd_log tracepoints

  • freedreno/a6xx: add some more tracepoints

  • freedreno/log: avoid duplicate ts’s

  • util: move ALIGN/ROUND_DOWN_TO to u_math.h

  • freedreno/ir3: fix android build

  • freedreno/log: fix build error

  • nir: fix definition of imadsh_mix16 for vectors

  • freedreno/ir3/cf: handle widening too

  • freedreno/ir3: fixup cat3 32b vs 16b

  • freedreno/ir3/cf: skip array load/store

  • freedreno/ir3: add a pass to collect SSA uses

  • freedreno/ir3/cf: use ssa-uses

  • freedreno/a6xx: add some compute logging

  • freedreno: fix missing locking

  • freedreno/ir3: also precompile compute shaders for shaderdb

  • freedreno: limit fp16 to frag and compute

  • glsl: don’t limit fp16 lowering to frag

  • nir: add some swizzle helpers

  • nir/lower_amul: fix slot calculation

  • freedreno/log: android support

  • freedreno/log: spiff out parser some more

  • freedreno/log: better decoding for multiple chunks per batch

  • freedreno/ir3: spiff out disasm a bit

  • freedreno/ir3: make falsedep use’s optional

  • freedreno/ir3: simplify grouping pass

  • freedreno/ir3: fix location of inserted mov’s

  • freedreno/ir3: new pre-RA scheduler

  • freedreno/ir3/sched: awareness of partial liveness

  • freedreno/ir3/postsched: remove some leftovers

  • freedreno/ir3/postsched: avoid moving tex ahead of kill

  • freedreno/ir3: add mov/cov stats

  • freedreno/ir3/ra: handle array case for SFU select_reg opt

  • freedreno/ir3: better cleanup when removing unused instructions

  • freedreno/ir3: rename depth->dce

  • freedreno/ir3/ra: cleanup some leftovers

  • mesa: avoid redundant VBO updates

  • mesa/st: avoid u_vbuf for GLES

  • gallium: add # of MRT to blend state

  • freedreno/computerator: add script to test widening/narrowing

  • freedreno/ir3/ra: remove unused variable

  • freedreno/ir3/ra: use ir3_debug_print helper

  • freedreno/ir3/ra: split out helper for array assignment

  • freedreno/ir3/ra: only assign array base in first pass

  • freedreno/a6xx+tu: rename VSC_DATA/VSC_DATA2

  • freedreno: add helper to estimate # of bins per pipe

  • freedreno/a6xx: pre-calculate expected vsc stream sizes

  • freedreno/log-parser: support to read gzip’d logs

  • freedreno: small whitespace fix

  • freedreno: don’t realloc idle bo’s

  • freedreno: mark more state dirty when rebinding resources

  • freedreno: optimize rebind_resource()

  • freedreno: rebind resource in all contexts

  • freedreno: rebind_resource() *before* bo changes

  • freedreno/a6xx: invalidate tex state cache entries on rebind

  • freedreno: fix buffer import

  • freedreno/ir3: fix indirect cb0 load_ubo lowering

  • freedreno: clear last_fence after resource tracking

Rohan Garg (5):

  • ci: Split out radv build-testing on arm64

  • ci: Drop the git dependency in tracie

  • tracie: Switch to using shutil.move for cross filesystem moves

  • tracie: Print results in a machine readable format

  • tracie: Reformat code to fix indentation

Roland Scheidegger (7):

  • gallivm: fix crash with bptc border color sampling

  • gallivm: fix crash in emit_get_buffer_size

  • gallivm: disable rgtc/latc SNORM accelerated fetches

  • gallium/util: Add back (and rename) util_float_to_half implementation

  • gallivm: fix rgtc2 format

  • gallivm: switch the mask6/mask7 cases for signed rgtc formats

  • gallivm: fix stream id fetch

Roman Stratiienko (3):

  • panfrost: Align Android makefiles with recent changes

  • lima: Add missing source file to Android.mk

  • panfrost: Align Android makefiles with recent changes

Sagar Ghuge (13):

  • intel/isl: Move get_format_encoding function to isl

  • intel/isl: Switch to R8_UNORM format for compatibility

  • intel/tools: Handle illegal instruction

  • intel/tools: Handle STATE_REG in typed source operand

  • intel/tools: Set correct address register file and number in i965_asm

  • intel/tools: Add test for address register as source

  • intel/tools: Add test for state register as source

  • intel/tools: Print c_literals 4 byte wide

  • intel/tools: Allow i965_disasm to disassemble c_literal input type

  • intel/genxml: Add patch count threshold field on gen12

  • intel/compiler: Track patch count threshold

  • anv: Set patch count threshold in 3DSTATE_HS

  • iris: Set patch count threshold in 3DSTATE_HS

Samuel Iglesias Gonsálvez (2):

  • radv: check buffer size in vkCreateBuffer()

  • radv: set sparseAddressSpaceSize to RADV_MAX_MEMORY_ALLOCATION_SIZE

Samuel Pitoiset (197):

  • aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6

  • aco: do not use ds_{read,write}2 on GFX6

  • gitlab-ci: disable a630 tests as mesa-cheza is down (again)

  • aco: fix waiting for scalar stores before “writing back” data on GFX8-GFX9

  • radv: make sure to not submit any IBs when RADV_FORCE_FAMILY is set

  • radv: set the chip name to GCN-NOOP when RADV_FORCE_FAMILY is set

  • aco: fix creating v_madak if v_mad_f32 has two sgpr literals

  • nir: do not use De Morgan’s Law rules for flt and fge

  • radv: fix line width range and granularity

  • radv: implement VK_EXT_line_rasterization

  • radv: remove LLVM sicheduler enable for The Talos Principle

  • radv: remove RADV_DEBUG=nosisched and RADV_PERFTEST=sisched

  • radv: remove unused RADV_HASH_SHADER_IS_GEOM_COPY_SHADER

  • radv: remove unnecessary RADV_DEBUG=nobatchchain option

  • docs/new_features: empty the feature list for the 20.1 cycle

  • radv: enable shaderStorageImageMultisample on GFX6-GFX7

  • radv: enable VK_EXT_sampler_filter_minmax on GFX6

  • radv: enable VK_NV_compute_shader_derivatives on GFX6-GFX7

  • radv: add a comment about VK_AMD_mixed_attachment_samples on GFX6-GFX7

  • docs/envvars: document RADV_TEX_ANISO

  • radv/winsys: add a new flag that requests zerovram allocations

  • radv: use RADEON_FLAG_ZERO_VRAM when creating the trace BO

  • radv: add the trace BO to the BO list at submit time

  • radv: implement a dummy winsys for creating devices without AMDGPU

  • ac,radeonsi: add ac_gpu_info::lds_size_per_cu

  • ac: add more ac_gpu_info related shader fields

  • radv/gfx10: adjust the number of simd per compute unit

  • radv/gfx10: adjust SGPRs/VGPRs related info

  • radv/gfx10: adjust the LDS size used to compute waves

  • radv/gfx10: adjust the number of VGPRs used to compute waves

  • radv: make use of ac_gpu_info::max_wave64_per_simd

  • radv: fix creating null devices if KHR_display is enabled

  • ac/llvm: fix 64-bit fmed3

  • ac/llvm: fix 16-bit fmed3 on GFX8 and older gens

  • ac/llvm: flush denorms for nir_op_fmed3 on GFX8 and older gens

  • ac: add more fields to ac_gpu_info

  • ac/registers: add definitions for thread trace

  • radv: add a small helper that allows to submit internal CS

  • radv: add initial SQ Thread Trace support for GFX9

  • radv: emit thread trace markers after every draw/dispatch call

  • radv: add initial SQTT files generation support

  • radv: allow to capture SQTT traces with RADV_THREAD_TRACE=<start_frame>

  • radv: fix 32-bit build failure in radv_queue_internal_submit()

  • radv: fix size of sqtt_file_chunk_asic_info on 32-bit system

  • radv/rgp: adjust trace memory/shader clocks to fix frame duration

  • radv/sqtt: do not assume that the number of shader engines is 4

  • radv/sqtt: update SPI_CONFIG_CNTL.EXP_PRIORITY_ORDER value

  • ac/registers: add definitions for thread trace on GFX10

  • radv/sqtt: add support for GFX10

  • radv: update entrypoints generation from ANV

  • ac: rename lds_size_per_cu to lds_size_per_workgroup

  • ac: rename vgpr_alloc_granularity to wave64_vgpr_alloc_granularity

  • ac: rename min_vgpr_alloc to min_wave64_vgpr_alloc

  • aco: fix image load/store with lod and 1D images

  • gitlab-ci: build Fossilize in the test image for VK

  • gitlab-ci: add Fossilize support to detect compiler regressions

  • gitlab-ci: enable building the test image for VK unconditionally

  • gitlab-ci: add a job that runs Fossilize on RADV/Polaris10

  • radv/winsys: fix missing initializations of shader info in the null device

  • radv/sqtt: fix wrong check in radv_is_thread_trace_complete()

  • radv/sqtt: tidy up radv_emit_thread_trace_{start,stop}

  • radv/sqtt: add radv_copy_thread_trace_info_regs() helper

  • ac/registers: adjust some definitions for thread trace on GFX8

  • radv/sqtt: add support for GFX8

  • radv/sqtt: abort if SQTT is used on GFX6-GFX7

  • ac: add ac_gpu_info::cu_mask to store bitmask of compute units

  • radv/rgp: report correct cu_mask info

  • radv/rgp: report correct system ram size

  • nir/lower_input_attachments: remove bogus assert in try_lower_input_texop()

  • radv/entrypoints: declare a driver internal layer for SQTT

  • radv: use device entrypoints from the SQTT layer if enabled

  • radv/sqtt: add a helper that emits thread trace userdata markers

  • radv: initial implementation of the driver internal layer SQTT

  • radv/sqtt: describe begin/end command buffers with user markers

  • radv/sqtt: describe draw/dispatch and emit event markers

  • radv/sqtt: describe render pass color/depthstencil clears

  • radv/rgp: bump the instrumentation spec version to 1

  • radv/sqtt: describe pipeline and wait events barriers

  • gitlab-ci: add rules:changes for RADV

  • radv: do not recursively begin/end render pass for meta operations

  • radv: fix 32-bit build (again)

  • gitlab-ci: build RADV in meson-i386 to avoid 32-bit build failures

  • ac/llvm: add missing optimization barrier for 64-bit readlanes

  • radv/sqtt: describe begin/end subpass barriers with user markers

  • radv/sqtt: describe layout transitions with user markers

  • radv/gfx10: cache metadata in L2 on small chips

  • radv: use better tessellation tunables on GFX9+

  • radv: tune primitive binning for small chips

  • radv: rewrite late alloc computation

  • radv: use ac_gpu_info::use_late_alloc

  • radv: cleanup occurrences of use_aco everywhere

  • radv: remove radv_shader_variant::aco_used

  • radv: remove unnecessary LLVM includes

  • radv: add llvm_compiler_shader() helper

  • gitlab-ci: remove useless ‘patch’ package in the VK test image

  • gitlab-ci: allow deqp-runner to use the maximum number of jobs

  • gitlab-ci: do not set the number of deqp-parallel jobs for RADV CTS

  • gitlab-ci: bump Vulkan CTS to 1.2.1.0

  • radv/sqtt: handle thread trace capture in sqtt_QueuePresentKHR()

  • radv: only inject implicit subpass dependencies if necessary

  • radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control

  • radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control

  • radv: fix random depth range unrestricted failures due to a cache issue

  • radv: remove wrong assert that checks compute subgroup size

  • radv: fix optional pSizes parameter when binding streamout buffers

  • radv/winsys: fix wrong PCI ID for Vega10 in the null winsys

  • radv/winsys: spoof some values for num_render_backends in the null winsys

  • gitlab-ci: compile fossils with both RADV compiler backends (LLVM/ACO)

  • gitlab-ci: compile fossils with more ASICs

  • gitlab-ci: add a new stage for RADV CI

  • gitlab-ci: add a bunch of new fossils from the Sascha Vulkan demos

  • radv/llvm: fix subgroup shuffle for chips without bpermute

  • radv: enable VK_KHR_8bit_storage on GFX6-GFX7

  • ac/nir: use llvm.amdgcn.rcp for nir_op_frcp

  • ac/nir: use llvm.amdgcn.rsq for nir_op_frsq

  • ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv()

  • nir/algebraic: add fexp2(fmul(flog2(a), 0.5)) -> fsqrt(a) optimization

  • aco: only break SMEM clauses if XNACK is enabled (mostly APUs)

  • aco: always optimize v_mad to v_madak in presence of literals

  • ac/nir: split 8-bit load/store to global memory on GFX6

  • ac/nir: split 8-bit SSBO stores on GFX6

  • radv/llvm: enable 8-bit storage features on GFX6-GFX7

  • ac/nir: split 16-bit load/store to global memory on GFX6

  • ac/nir: split 16-bit SSBO stores on GFX6

  • radv/llvm: enable 16-bit storage features on GFX6-GFX7

  • radv: rename decompress/resummarize depth/stencil functions

  • radv: rename extra graphics pipeline decompress/resummarize fields

  • radv: cleanup creating the decompress/resummarize pipelines

  • radv: remove radv_layout_has_htile() helper

  • radv: enable lowering of GS intrinsics for the LLVM backend

  • ac,radv: add ac_gpu_info::has_double_rate_fp16

  • radv: only expose shaderFloat16 for chips with double rate fp16

  • radv: only expose storageInputOutput16 for chips with double rate fp16

  • radv: only expose fp16 control features for chips with double rate fp16

  • radv: only enable TC-compat HTILE for images readable by a shader

  • radv: allow TC-compat HTILE with GENERAL outside of render loops

  • aco: implement 16-bit nir_op_frexp_sig/nir_op_frexp_exp

  • aco: implement 16-bit nir_op_ffract

  • aco: implement 16-bit nir_op_fexp2/nir_op_flog2

  • aco: implement 16-bit nir_op_ftrunc/nir_op_fround_even

  • aco: implement 16-bit nir_op_fsqrt/nir_op_frcp/nir_op_frsq

  • aco: implement 16-bit nir_op_ffloor/nir_op_fceil

  • aco: implement 16-bit nir_op_fmax/nir_op_fmin

  • aco: implement 16-bit nir_op_fabs/nir_op_fneg

  • aco: implement 16-bit nir_op_fsub/nir_op_fadd

  • aco: implement 16-bit nir_op_fcos/nir_op_fsin

  • aco: implement 16-bit nir_op_fmul

  • aco: implement 16-bit nir_op_fsat

  • aco: implement 16-bit nir_op_fsign

  • aco: implement 16-bit nir_op_bcsel

  • aco: implement 16-bit nir_op_f2i32/nir_op_f2u32

  • aco: implement 16-bit nir_op_ldexp

  • aco: implement 16-bit nir_op_fmax3/nir_op_fmin3/nir_op_fmed3

  • aco: implement 16-bit comparisons

  • aco: implement nir_op_b2f16/nir_op_i2f16/nir_op_u2f16

  • aco: fix f2i64/f2u64 with sgprs if the exponent computation overflows

  • aco: implement 16-bit nir_op_f2i64/nir_op_f2u64

  • aco: fix nir_op_pack_32_2x16_split if one operand is a constant

  • radv: add radeon_set_context_reg_rmw() helper

  • radv: use RMW packets for updating the maximum sample distance

  • aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents

  • radv/aco: do not advertise VK_KHR_shader_subgroup_extended_types

  • aco: implement nir_op_f2i8/nir_op_f2u8

  • aco: fix emitting stream output with tess eval shaders

  • radv: do not abort with unknown/unimplemented descriptor types

  • radv: fix geometry shader primitives query with ACO on GFX10

  • radv: set missing SHARED_VGPR_CNT for NGG VS and ACO

  • radv/llvm: fix exporting the viewport index if the fragment shader needs it

  • aco: fix exporting the viewport index if the fragment shader needs it

  • nir/lower_int64: lower imin3/imax3/umin3/umax3/imed3/umed3

  • nir/opt_algebraic: lower 64-bit fmin3/fmax3/fmed3

  • gitlab-ci: add a list of excluded tests for RADV

  • radv: make sure to export the viewport index if FS needs it

  • radv: simplify checking for Navi1x chips

  • radv: adjust the supported subgroup stages

  • radv: fix robust_buffer_access if enabled via VkPhysicalDeviceFeatures2

  • gitlab-ci: add lists of expected failures for RADV CI

  • ac,radeonsi: fix compilation issues with LLVM 11

  • radv: do not expose GTT as device local memory mostly for APUs

  • radv: enable FMASK for color attachments only

  • radv: remove unused radv_device_memory::map_size field

  • radv: track memory heaps usage if overallocation is explicitly disallowed

  • radv: advertise VK_AMD_memory_overallocation_behavior

  • ac/llvm: fix nir_texop_texture_samples with NULL descriptors

  • aco: fix nir_texop_texture_samples with NULL descriptors

  • aco: fix adjusting the sample index with FMASK if value is negative

  • radv: handle NULL descriptors

  • radv: handle NULL vertex bindings

  • radv: advertise VK_EXT_robustness2

  • gitlab-ci: add a list of expected failures for FIJI with ACO

  • ci: fix reporting the number of unexpected/flakes

  • radv: report INITIALIZATION_FAILED when the amdgpu winsys init failed

  • radv: don’t report error with other vendor DRM devices

  • aco: fix 64-bit trunc with negative exponents on GFX6

  • radv: limit the Vulkan version to 1.1 for Android

  • radv: handle different Vulkan API versions correctly

  • radv: update the list of allowed Android extensions

Satyajit Sahu (1):

  • st/va: GetConfigAttributes: check profile and entrypoint combination

Simon Ser (1):

  • mesa: add support for NV_pixel_buffer_object

Simon Zeni (1):

  • mesa: enable GL_EXT_draw_instanced for gles2

Sonny Jiang (1):

  • radeonsi: enable EXT_texture_shadow_lod

Szymon Andrzejuk (1):

  • virgl: Use align_free for align_malloc allocated buffer

Tapani Pälli (27):

  • intel/vec4: fix valgrind errors with vf_values array

  • glsl: fix a memory leak with resource_set

  • iris: fix aux buf map failure in 32bits app on Android

  • mesa: introduce boolean toggle for EXT_texture_norm16

  • i965: toggle on EXT_texture_norm16

  • mesa/st: toggle EXT_texture_norm16 based on format support

  • mesa/st: fix formats required for EXT_texture_norm16

  • nir: fix compilation warning on glsl_get_internal_ifc_packing

  • iris: toggle on PIPE_CAP_MIXED_COLOR_DEPTH_BITS

  • nir/glsl: gather bitmask of images used by program

  • iris: use the images_used mask in resolve pass

  • intel/compiler: detect if atomic load store operations are used

  • iris: provide dummy iris_image_view_aux_usage

  • iris: move existing image format fallback as a helper function

  • iris: determine aux usage during predraw and state setup

  • isl: allow compression for storage images on gen12+

  • iris: allow compression conditionally for images on gen12

  • glsl: set error_emitted true if type not ok for assignment

  • mesa/st: unbind shader state before deleting it

  • mesa/st: release variants for active programs before unref

  • mesa: remove redundant check

  • mesa: remove redundant assignment

  • glsl: remove redundant assignment

  • glsl: stop processing function parameters if error happened

  • mesa/st: initialize all winsys_handle fields for memory objects

  • anv: remove assert from GetImageMemoryRequirements[2]

  • st/mesa: destroy only own program variants when program is released

Thomas Hellstrom (5):

  • svga: Fix banded DMA upload

  • svga, winsys/svga: Fix persistent memory discard maps

  • svga: Treat forced coherent maps as maps of persistent memory

  • gallium/pipebuffer: Use persistent maps for slabs

  • winsys/svga: Optionally avoid caching buffer maps

Thong Thai (7):

  • Revert “st/va: Convert interlaced NV12 to progressive”

  • gallium/auxiliary/vl: fix bob compute shaders for deint yuv

  • st/va: remove unneeded code

  • st/va/postproc: reallocate interlaced destination buffer

  • radeonsi: add 10-bit HEVC encode support for VCN2.0 devices

  • radeon: add support for 10-bit HEVC encoding to VCN 2.0

  • st/va: add check for P010 and P016 encode/decode support

Timothy Arceri (51):

  • glsl: fix gl_nir_set_uniform_initializers() for image arrays

  • glsl: fix possible memory leak in nir uniform linker

  • glsl: set the correct number of samplers in a shader

  • glsl: set the correct number of images in a shader

  • glsl: fix resizing of the uniform remap table

  • glsl: reset next_image_index count for each shader stage

  • glsl: fix sampler index calculation in nir linker

  • glsl: add some error checks to the nir uniform linker

  • glsl: move nir link uniforms struct defs earlier

  • glsl: move add_parameter() earlier in nir link uniforms

  • glsl: move get_next_index() earlier in nir link uniforms

  • glsl: add name support to nir uniform linker

  • glsl: correctly find block index when linking glsl with nir linker

  • nir: add glsl_get_internal_ifc_packing() helper

  • nir: add glsl_get_std140_base_alignment() helper

  • nir: add glsl_get_std140_size() helper

  • nir: add glsl_get_std430_base_alignment() helper

  • nir: add glsl_get_std430_size() helper

  • glsl: add std140 and std430 layouts to nir uniform linker

  • glsl: correctly set explicit offsets for struct members

  • glsl: find the base offset for block members from unnamed blocks

  • glsl: nir linker fix setting of ssbo top level array

  • glsl: set ShaderStorageBlocksWriteAccess in the nir linker

  • glsl: add support for builtins to the nir uniform linker

  • glsl: don’t try to assign uniform storage for uniform blocks

  • glsl: add subroutine support to nir linker

  • glsl: fix varying packing for 64bit integers

  • nir: fix packing of TCS varyings not read by the TES

  • nir: fix crash in varying packing on interface mismatch

  • glsl_to_nir: remove dead code

  • radeonsi: don’t lower constant arrays to uniforms in GLSL IR

  • nir: make opt_if_loop_terminator() less strict

  • nir: add matrix_layout to nir_variable data

  • glsl: fix struct offsets in the nir uniform linker

  • glsl: tidy up uniform storage value count code in NIR linker

  • Revert “glsl: fix resizing of the uniform remap table”

  • glsl: fix explicit locations for the glsl linker

  • glsl: error check max user assignable uniform locations

  • glsl: fix block index in NIR uniform linker

  • glsl: pull mark_array_elements_referenced() out into common helper

  • glsl: only set stage ref when uniforms referenced in stage

  • nir/gcm: allow derivative dependent intrinsics to be moved earlier

  • nir/gcm: be more conservative about moving instructions from loops

  • nir/gcm: don’t move movs unless we can replace them later with their src

  • glsl: add bindless support to nir uniform linker

  • glsl: fix gl_nir_set_uniform_initializers() for bindless textures

  • st/glsl_to_nir: make use of nir linker for linking uniforms

  • glsl: some nir uniform linker fixes

  • glsl: remove some duplicate code from the nir uniform linker

  • glsl: stop cascading errors if process_parameters() fails

  • glsl: fix slow linking of uniforms in the nir linker

Timur Kristóf (90):

  • aco/optimizer: Don’t combine uniform bool s_and to s_andn2.

  • radv: Move some helper functions to the radv_shader.h header file.

  • aco: Extract setup_gs_variables into a separate function.

  • aco: Setup tessellation control shader variables.

  • aco: Implement load_tess_coord.

  • aco: Implement load_primitive_id for tessellation shaders.

  • aco: Implement load_patch_vertices_in.

  • aco: Implement load_invocation_id for tessellation control shaders.

  • aco: Implement control_barrier for tessellation control shaders.

  • aco: Implement memory_barrier_tcs_patch.

  • aco: Implement load_view_index for TCS and TES.

  • aco: Setup correct HW stages when tessellation is used.

  • aco: Use mesa shader stage when loading inputs.

  • aco: Remove vertex_geometry_gs assertion from merged shaders.

  • aco: Extract LDS alignment calculation to a separate function.

  • aco: Remove esgs_itemsize from LDS alignment calculation.

  • aco: Introduce new VMEM load/store helpers.

  • aco: Introduce new helpers for calculating address offsets.

  • aco: Refactor load_per_vertex_input in preparation for tessellation.

  • aco: Refactor VS output stores in preparation for tessellation.

  • aco: Slight fix to lds_store and lds_load.

  • aco: Fix combining DS additions in the optimizer.

  • aco: Implement tessellation control shader input/output.

  • aco: Store VS outputs correctly when tessellation is used.

  • aco: Fix LS VGPR init bug on affected hardware.

  • radv: Enable ACO for tessellation control shaders.

  • aco: Setup tessellation evaluation shader variables.

  • aco: Use TES output info when TES runs on the VS stage.

  • aco: Store TES outputs when TES runs on the HW VS stage.

  • aco: Enable streamout when TES runs on the HW VS stage.

  • aco: Implement loading TES inputs.

  • radv: Enable ACO for TES when there is no GS.

  • aco: Enable running TES as ES, including merged TES+GS.

  • radv: Enable ACO on all stages.

  • aco: Don’t generate an if when the first part of a merged HS or GS is empty.

  • aco: Store tess factors in VMEM only at the end of the shader.

  • aco: Only write TCS outputs to LDS when they are read by the TCS.

  • aco: Don’t store TCS outputs to LDS when we’re sure that none are read.

  • nir: Add ability to lower non-const quad broadcasts to const ones.

  • radv: Enable lowering dynamic quad broadcasts.

  • radv: Enable subgroup shuffle on GFX10 when ACO is used.

  • aco: Create null exports in instruction selection instead of assembler.

  • aco: Extract tcs_driver_location_matches_api_mask to separate function.

  • aco: Fix handling of tess factors.

  • aco: Allow combining TCS output VMEM stores.

  • aco: Allow combining LDS loads when loading tess factors.

  • aco: Skip 2nd read of merged wave info when TCS in/out vertices are equal.

  • aco: Use more optimal sequence at the beginning of merged shaders.

  • nir: Collect if shader uses cross-invocation or indirect I/O.

  • aco: Treat outputs of the previous stage as inputs of the next stage.

  • aco: Change isel inputs/outputs to a flat array.

  • aco: Zero-fill undefined elements in create_vec_from_array.

  • aco: Extract setup_tcs_info to a separate function.

  • aco: Fix workgroup size calculation.

  • aco: Extract store_output_to_temps into a separate function.

  • aco: When LS and HS invocations are the same, pass LS outputs in temps.

  • aco: Don’t store LS VS outputs to LDS when TCS doesn’t need them.

  • aco: Fix crash in insert_wait_states.

  • aco: Extract uniform if handling to separate functions.

  • aco: Print block_kind_export_end.

  • aco: Extract merged_wave_info_to_mask to its own function.

  • aco: Treat s_setprio as a scheduling barrier.

  • aco/ngg: Add new stage for hw_ngg_gs.

  • aco/ngg: Initialize exec mask for NGG VS and TES.

  • aco/ngg: Fix exports for NGG VS and TES.

  • aco/ngg: Setup NGG VS and TES stages.

  • aco/ngg: Implement NGG VS and TES.

  • aco/ngg: Schedule position exports of NGG VS/TES.

  • aco/ngg: Run GS_ALLOC_REQ on priority 3 for NGG VS and TES.

  • radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.

  • aco: Print shader stage in aco_print_program.

  • radv: Print shader stage before disassembly.

  • radv: Add inputs read by TES to radv_shader_info.

  • aco: Only store TCS outputs to VMEM when they are read by TES.

  • aco: Increase barrier_count to 7 to include barrier_barrier.

  • aco: Abort when RA can’t find a register.

  • aco: Const correctness for get_barrier_interaction.

  • aco: Const correctness for aco_print_ir.

  • aco: Use 24-bit multiplication in TCS I/O

  • aco: Use 24-bit multiplication for NGG wave id and thread id.

  • aco: Move s_setprio to correct place after the gs_alloc_req.

  • radv: Refactor calculate_tess_lds_size and get_tcs_num_patches.

  • aco: Use context variables instead of calculating TCS inputs/outputs.

  • aco: Remember VS/TCS output driver locations.

  • aco: Calculate workgroup size of legacy GS.

  • aco: Set config->lds_size when TES or VS is running on HW ESGS.

  • nir: Add new linking helper to set linked driver locations.

  • radv: Use new linking helper to set default driver locations.

  • aco: Use new default driver locations.

  • radv: Use smaller esgs_itemsize for ACO.

Tobias Jakobi (1):

  • meson: Link Gallium Nine with ld_args_build_id

Tomasz Pyra (1):

  • gallium/swr: spin-lock performance improvement

Tomeu Vizoso (34):

  • panfrost: Print intended field when decoding

  • panfrost: Add more info to some assertions

  • pan/midgard: Handle nir_intrinsic_load_barycentric_centroid

  • panfrost: Use DBG macro to avoid noise in the console

  • panfrost: Fix decoding of tiled 3D textures

  • panfrost: Only clamp the LOD to disable mipmapping when needed

  • gitlab-ci: Switch kernel for LAVA jobs to 5.5

  • gitlab-ci: Disable the lima job for now

  • gitlab-ci: Run GLES3 tests in dEQP on Panfrost

  • panfrost: Remove some more prints to stdout

  • gitlab-ci: Move to 5.5 kernel plus fixes for Panfrost

  • gitlab-ci: Use PAN_MESA_DEBUG=gles3 for Panfrost

  • gitlab-ci: Remove GLES3 test from Panfrost fails list

  • gitlab-ci: Skip dEQP-GLES3.functional.shaders.derivate.*

  • gallium: Add forgotten docs for new CAPs related to transform feedback

  • gitlab-ci: Update renderdoc

  • gitlab-ci: Use surfaceless platform also for apitrace

  • gitlab-ci: Place files from the Mesa repo into the build tarball

  • gitlab-ci: Serve files for LAVA via separate service

  • gitlab-ci: Disable jobs for Collabora’s LAVA lab

  • Revert “gitlab-ci: Disable jobs for Collabora’s LAVA lab”

  • panfrost: Remove most usage of midgard_payload_vertex_tiler

  • panfrost: Pass IS_BIFROST to pandecode_jc

  • panfrost: Don’t emit write_value jobs on Bifrost

  • panfrost: On Bifrost, set the right tiler descriptor

  • gitlab-ci: Test virgl driver

  • panfrost: Clean up the tiler structs for Bifrost a bit

  • panfrost: Emit sampler descriptor on bifrost

  • panfrost: Emit texture descriptor on bifrost

  • gitlab-ci: Update virglrenderer in the x86_test-gl image

  • gitlab-ci: Allow test jobs to add options to the dEQP invocation

  • gitlab-ci: Test OpenGL ES 3.1 on virgl

  • gitlab-ci: Test Virgl with traces

  • panfrost: Add Bifrost texture trampoline BO to batch

Uros Bizjak (1):

  • doc: Update features.txt for r600 with misc supported features

Vasily Khoruzhick (19):

  • lima: handle early-z and pixel kill better

  • lima: implement PLB PP stream cache

  • lima: add RGBA5551 and RGBA4444 formats

  • lima: don’t disable tiling if there’s linear modifier in list

  • lima: gpir: enforce instruction limit earlier

  • panfrost: split index cache into shared part

  • lima: enable minmax cache for index buffers

  • lima: print gp uniforms if gp debug is enabled

  • lima/gpir: improve disassembler output

  • lima/gpir: print acc ops even if we have only one source

  • lima/gpir: kill dead writes to regs in DCE

  • lima/gpir: add better lowering for ftrunc

  • lima/gpir: fix crash in schedule_insert_ready_list()

  • lima: disable Z16 format

  • lima: decode depth/stencil write bits in RSW

  • lima: split pixel and texel format tables

  • lima: add support for R and RG formats

  • lima: Implement lima_texture_subdata

  • lima: avoid situations when scissor minx > maxx or miny > maxy

Veerabadhran (1):

  • radeon/vce: Move global function pointer si_get_pic_param to local encoder structure (the multi-GPU use case was broken when the function was global)

Vilya Harvey (1):

  • zink: Don’t set incorrect sType in VkImportMemoryFdInfoKHR struct

Vinson Lee (16):

  • swr: Fix build with GCC 10.

  • lima: Fix build with GCC 10.

  • swr: Fix GCC 4.9 checks.

  • panfrost: Remove unused anonymous enum variables.

  • meson: Enable -Wno-deprecated only for bison > 2.3.

  • swr: Fix non-pod-varargs error.

  • st/nine: Fix incompatible-pointer-types-discards-qualifiers errors.

  • panfrost: Fix gnu-empty-initializer error.

  • util/u_process: Add util_get_process_exec_path for macOS.

  • mesa: Change _mesa_exec_malloc argument type.

  • gallivm: Add missing header for powf.

  • swr/rasterizer: Use private functions for min/max to avoid namespace issues.

  • swr: Remove Byte Order Mark.

  • r600/sfn: Initialize VertexStageExportForGS m_num_clip_dist member variable.

  • r600/sfn: Use correct setter method.

  • freedreno: Add missing va_end.

Yevhenii Kolesnikov (1):

  • intel/compiler: fix cmod propagation optimisations

Zhang, Boyuan (1):

  • radeonsi: Add support for midstream bitrate change in encoder

luc (1):

  • zink: fix confused compilation macro usage for zink in target helpers.