path: root/src/gallium
Commit message | Author | Age | Files | Lines
* radeonsi: Allow PIPE_TEXTURE_2D_ARRAY in si_texture_from_handle | Michel Dänzer | 2019-07-23 | 1 | -2/+3
  Needed for the following st/mesa fix.
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* panfrost: Fake CAPs for dEQP-GLES31 | Alyssa Rosenzweig | 2019-07-23 | 1 | -2/+14
  We still have some big ticket items left on GLES 3.0, but it's often helpful to be able to access higher dEQP levels for debugging features that just don't quite match a particular API. Plus, this opens up a whole slew of new features to poke at if boredom overtakes, ahem.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* nvc0/ir: Fix assert accessing null pointer | Mark Menzynski | 2019-07-23 | 1 | -1/+1
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111007
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111167
  Signed-off-by: Mark Menzynski <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Tobias Klausmann <[email protected]>
* lima/ppir: fix branch codegen register encode | Erico Nunes | 2019-07-23 | 1 | -2/+2
  The branch instruction has 6 bits per register operand which allows it to specify a component in the register. Fix codegen so that it outputs the right component, otherwise it always outputs the x component.
  Signed-off-by: Erico Nunes <[email protected]>
  Reviewed-by: Vasily Khoruzhick <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
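A minimal sketch of what a 6-bit operand with a component selector could look like. The bit layout (register index in the upper 4 bits, component in the lower 2) and both function names are assumptions made for illustration; they are not taken from the lima source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits [5:2] hold the register index and bits [1:0]
 * select the component (0 = x, 1 = y, 2 = z, 3 = w). */
static inline uint8_t
encode_branch_operand(unsigned reg_index, unsigned component)
{
   assert(reg_index < 16);
   assert(component < 4);
   return (uint8_t)((reg_index << 2) | component);
}

/* The bug pattern described above: the component bits are dropped, so
 * every operand ends up selecting component x. */
static inline uint8_t
encode_branch_operand_buggy(unsigned reg_index, unsigned component)
{
   (void)component;
   return (uint8_t)(reg_index << 2);
}
```

Under this assumed layout, register 3 component z encodes differently in the two versions, which is the kind of mismatch the fix addresses.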
* lima/ppir: fix debug logs in regalloc | Erico Nunes | 2019-07-23 | 1 | -2/+2
  The macros already prepend "ppir: ", remove them from the actual strings so it doesn't appear duplicated.
  Signed-off-by: Erico Nunes <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: fix alignment on regalloc spilling loads | Erico Nunes | 2019-07-23 | 1 | -1/+1
  The spilling code spills entire vec4 registers regardless of the components used by the spilled uses. The inserted store code forces all 4 components, but these loads were using a variable number of components, causing bugs on loading the spilled registers.
  Signed-off-by: Erico Nunes <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
* gallium: remove boolean from state tracker APIs | Ilia Mirkin | 2019-07-22 | 20 | -144/+142
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* gallium: switch boolean -> bool at the interface definitions | Ilia Mirkin | 2019-07-22 | 160 | -769/+769
  This is a relatively minimal change to adjust all the gallium interfaces to use bool instead of boolean. I tried to avoid making unrelated changes inside of drivers to flip boolean -> bool to reduce the risk of regressions (the compiler will much more easily allow "dirty" values inside a char-based boolean than a C99 _Bool).
  This has been build-tested on amd64 with:
  Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d vc4 i915 svga virgl swr panfrost iris lima kmsro
  Gallium st: mesa xa xvmc xvmc vdpau va
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Acked-by: Alyssa Rosenzweig <[email protected]>
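The remark about "dirty" values can be illustrated with a small sketch. The `legacy_boolean` typedef below is a stand-in for a char-based boolean and is named for illustration only; the conversion rules shown are those of standard C99.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a legacy char-based boolean typedef (illustrative name). */
typedef unsigned char legacy_boolean;

static inline legacy_boolean
to_legacy_boolean(int flags)
{
   /* Keeps the raw low byte: 4 stays 4 (truthy but not equal to 1), and
    * 0x100 is truncated to 0 when char is 8 bits wide. */
   return (legacy_boolean)flags;
}

static inline bool
to_c99_bool(int flags)
{
   /* Conversion to _Bool normalizes every nonzero value to 1. */
   return (bool)flags;
}
```

Code that stores a masked flag such as `flags & 0x100` into the char-based type silently gets a false value, while the same store into `bool` yields true. That behavioral difference is one reason flipping the types inside drivers carries some regression risk.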
* radeonsi: fix warning: ‘ret’ may be used uninitialized | Marek Olšák | 2019-07-22 | 1 | -1/+1
  Reviewed-by: Dave Airlie <[email protected]>
* tgsi: fix warning: ‘interp’ may be used uninitialized | Marek Olšák | 2019-07-22 | 1 | -0/+1
  Reviewed-by: Dave Airlie <[email protected]>
* gallivm: fix warning: ‘op’ may be used uninitialized | Marek Olšák | 2019-07-22 | 1 | -0/+3
  Reviewed-by: Dave Airlie <[email protected]>
* iris: Support storage images that have matching typed formats for reads | Kenneth Graunke | 2019-07-22 | 1 | -3/+2
  Even if we don't directly support typed reads on a format, we can often translate them to a reasonable matching format. Advertise those too.
* iris: Stop advertising MSAA storage images by mistake | Kenneth Graunke | 2019-07-22 | 1 | -1/+1
  st_extensions.c sets const->MaxImageSamples (GL_MAX_IMAGE_SAMPLES) by looping over [16, 15, .. 1x] MSAA modes, and RGBA/BGRA/ARGB/ABGR 8888 color formats, calling pipe->is_format_supported() for each, with the usage set to PIPE_BIND_SHADER_IMAGE. If any are supported, it selects that number of samples.
  We were checking if sample_count <= 1, which meant that we were getting a value of 1x MSAA, rather than the expected 0x (feature doesn't exist). But, only on Icelake because Gen11 adds support for typed read messages for R8G8B8A8_UNORM. The lack of typed read messages for these formats was tricking the check on Gen9 to say no correctly.
  This caused some Icelake conformance failures, because we don't implement this feature. Just check for sample_count == 0 instead.
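The effect of the two checks on the probing loop can be sketched as follows. This is a simplified model: the callback type and all function names are hypothetical, and the real loop also iterates over color formats and goes through pipe->is_format_supported().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical driver callback: is this sample count supported for
 * shader images? */
typedef bool (*image_support_fn)(unsigned sample_count);

/* Probe from 16x down to 1x and return the first supported count,
 * or 0 if no multisampled mode is supported. */
static unsigned
probe_max_image_samples(image_support_fn is_supported)
{
   for (unsigned samples = 16; samples >= 1; samples--) {
      if (is_supported(samples))
         return samples;
   }
   return 0;
}

/* Old check: accepts a sample count of 1, so the probe reports 1x. */
static bool
supports_le_one(unsigned sample_count)
{
   return sample_count <= 1;
}

/* Fixed check: only the non-multisampled query (0) is accepted. */
static bool
supports_zero_only(unsigned sample_count)
{
   return sample_count == 0;
}
```

With the old check the probe returns 1, which advertises the feature; with the fixed check it falls through and returns 0.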
* panfrost: Set `initialized` in more cases | Alyssa Rosenzweig | 2019-07-22 | 2 | -10/+9
  Indirect linear writes were not being marked as initialized, causing the back blit to be dropped, breaking the listed tests.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/ci: Update expectations | Alyssa Rosenzweig | 2019-07-22 | 1 | -4/+0
  We've fixed some shader tests.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Implement register spilling | Alyssa Rosenzweig | 2019-07-22 | 1 | -1/+2
  Now that we run RA in a loop, before each iteration after a failed allocation we choose a spill node and spill it to Thread Local Storage using st_int4/ld_int4 instructions (for spills and fills respectively). This allows us to compile complex shaders that normally would not fit within the 16 work register limits, although it comes at a fairly steep performance penalty.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
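The allocate-then-spill loop described above can be sketched as a toy model. Register pressure is reduced here to a single count of live values, and the function names are hypothetical; the real allocator works on an interference graph and picks spill nodes by cost.

```c
#include <assert.h>
#include <stdbool.h>

#define WORK_REGISTERS 16

/* Toy model: allocation succeeds once the number of simultaneously live
 * values fits in the work registers. */
static bool
try_allocate(unsigned live_values)
{
   return live_values <= WORK_REGISTERS;
}

/* Retry allocation, spilling one value per failed attempt.
 * Returns the number of spills that were needed. */
static unsigned
allocate_with_spilling(unsigned live_values)
{
   unsigned spills = 0;

   while (!try_allocate(live_values)) {
      /* Choose a spill node and move it to thread local storage; in this
       * model that simply removes one value from the live set. */
      live_values--;
      spills++;
   }
   return spills;
}
```

Every spill in the real compiler costs extra store and load instructions, which is where the performance penalty mentioned in the commit message comes from.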
* v3d: fill logicop_func in the fragment shader key when precompiling shaders | Iago Toral Quiroga | 2019-07-22 | 1 | -0/+2
  Since logicop_func 0 is PIPE_LOGICOP_CLEAR, we were triggering lowering of logic ops on precompiled shaders, which we don't want to do. Also, this had the side effect of making shader-db crash, as during this lowering we would try to read the color format swizzle information from the fragment shader key that we don't populate in precompiled shaders because right now we only need it when logic operations are enabled.
  Reviewed-by: Eric Anholt <[email protected]>
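A minimal sketch of the zero-initialized key pitfall. The enum, struct, and function names are hypothetical; only the fact that the value 0 means CLEAR comes from the commit message, and treating COPY as "no logic op" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative subset; only "CLEAR is 0" is taken from the text above. */
enum logicop { LOGICOP_CLEAR = 0, LOGICOP_COPY };

struct fs_key {
   enum logicop logicop_func;
};

static bool
needs_logicop_lowering(const struct fs_key *key)
{
   /* Assumed convention: COPY means "no logic op", anything else has to
    * be lowered in the shader. */
   return key->logicop_func != LOGICOP_COPY;
}

static void
init_precompile_key(struct fs_key *key, bool fill_logicop)
{
   /* A zeroed key implicitly selects LOGICOP_CLEAR. */
   memset(key, 0, sizeof(*key));
   if (fill_logicop)
      key->logicop_func = LOGICOP_COPY;
}
```

Leaving the field at zero makes the precompiled key look as if a logic op were enabled; filling it explicitly avoids the unwanted lowering path.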
* virgl: fix a sync issue in virgl_buffer_transfer_extend | Chia-I Wu | 2019-07-19 | 1 | -62/+15
  In virgl_buffer_transfer_extend, when no flush is needed, it tries to extend a previously queued transfer instead if it can find one. Compared to virgl_resource_transfer_prepare, it fails to check if the resource is busy.
  The existence of a previously queued transfer normally implies that the resource is not busy, except perhaps when the transfer is PIPE_TRANSFER_UNSYNCHRONIZED. Rather than burdening us with a lengthy comment, and potential concerns over breaking it as the transfer code evolves, this commit makes the valid_buffer_range check the only condition to take the fast path.
  In the real world, we hit the fast path almost only because of the valid_buffer_range check. In micro benchmarks, the condition should always be true, otherwise the benchmarks are not very representative of meaningful workloads. I think this fix is justified.
  The recent change to PIPE_TRANSFER_MAP_DIRECTLY usage disables the fast path. This commit re-enables it as well.
  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Gurchetan Singh <[email protected]>
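One way to picture a range-based fast-path condition is sketched below. It assumes the check means "the write does not overlap any bytes already known to hold valid data"; the struct and function names are hypothetical and do not reproduce the virgl code.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open byte range [start, end) of data known to be initialized. */
struct valid_range {
   unsigned start;
   unsigned end;
};

static bool
range_intersects(const struct valid_range *r, unsigned start, unsigned end)
{
   return start < r->end && end > r->start;
}

/* Assumed fast-path rule: a write that touches no valid data cannot
 * conflict with earlier work on the buffer, so no synchronization is
 * needed before queuing it. */
static bool
can_take_fast_path(const struct valid_range *valid,
                   unsigned offset, unsigned size)
{
   return !range_intersects(valid, offset, offset + size);
}
```

Under this rule an append-style write just past the valid range qualifies, while a write into the middle of the valid range does not.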
* virgl: rework virgl_transfer_queue_extend | Chia-I Wu | 2019-07-19 | 3 | -25/+24
  Do not take a transfer and do the memcpy. Add a _buffer suffix to the function name to make it clear that it is only for buffers.
  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: fix virgl_buffer_transfer_extend | Chia-I Wu | 2019-07-19 | 1 | -0/+1
  Without setting hw_res, virgl_transfer_queue_extend never finds a match and always returns NULL.
  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Gurchetan Singh <[email protected]>
* radeonsi: initialize scissor registers etc. without clear state | Marek Olšák | 2019-07-19 | 1 | -1/+1
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: return success from vi_dcc_clear_level to simplify callers | Marek Olšák | 2019-07-19 | 3 | -28/+26
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: fix compute-based culling regression in 1ce52c1e373 | Marek Olšák | 2019-07-19 | 1 | -1/+1
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: fix VGT_PRIMITIVE_TYPE programming | Marek Olšák | 2019-07-19 | 1 | -1/+3
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: enable Wave32 for vertex, geometry, and tessellation shaders | Marek Olšák | 2019-07-19 | 1 | -0/+5
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: add debug options to enable/disable Wave32 | Marek Olšák | 2019-07-19 | 2 | -1/+35
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64 | Marek Olšák | 2019-07-19 | 4 | -15/+32
  Legacy GS has to use Wave64, so TES before GS has to use Wave64 too.
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: implement Wave32 | Marek Olšák | 2019-07-19 | 15 | -71/+144
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: use 32-bit wavemasks for Wave32 | Marek Olšák | 2019-07-19 | 1 | -8/+20
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
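As a rough illustration of why the mask width has to follow the wave size, the sketch below counts active lanes in a per-lane bitmask. It is generic C rather than the LLVM IR the driver emits, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with one bit set per lane of a wave of the given size. */
static inline uint64_t
full_wave_mask(unsigned wave_size)
{
   assert(wave_size == 32 || wave_size == 64);
   return wave_size == 32 ? UINT64_C(0xffffffff) : UINT64_MAX;
}

/* Number of active lanes in a per-lane bitmask, ignoring any bits above
 * the wave size. */
static inline unsigned
count_active_lanes(uint64_t lane_mask, unsigned wave_size)
{
   uint64_t mask = lane_mask & full_wave_mask(wave_size);
   unsigned count = 0;

   /* Clear the lowest set bit on each iteration. */
   for (; mask; mask &= mask - 1)
      count++;
   return count;
}
```

Treating a Wave32 mask as 64 bits wide would count lanes that do not exist, so code that builds or consumes such masks needs to know which wave size is in use.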
* ac: create the LLVM builder in ac_llvm_context_init | Marek Olšák | 2019-07-19 | 1 | -4/+3
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: create the LLVM module for Wave32 or Wave64 in ac_llvm_context_init | Marek Olšák | 2019-07-19 | 1 | -2/+2
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/rtld: add support for Wave32 | Marek Olšák | 2019-07-19 | 3 | -0/+5
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: initial Wave32 support in LLVM build helpers | Marek Olšák | 2019-07-19 | 1 | -1/+1
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: assume that selector != NULL for compute shaders | Marek Olšák | 2019-07-19 | 1 | -14/+6
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: remove what appears to be legacy compute code | Marek Olšák | 2019-07-19 | 1 | -35/+6
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: remove si_program::use_code_object_v2 | Marek Olšák | 2019-07-19 | 2 | -6/+3
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: add si_shader_selector into si_compute | Marek Olšák | 2019-07-19 | 4 | -81/+57
  Now we can assume that shader->selector is always set. This will simplify some code.
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: set threadgroup size to 0 for threadgroups with only 1 wave | Marek Olšák | 2019-07-19 | 1 | -3/+3
  This has no effect on Wave64.
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: set as_ngg for GS prolog | Marek Olšák | 2019-07-19 | 2 | -5/+9
  as_ngg is required by Wave32.
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: remove the disable_ngg option | Marek Olšák | 2019-07-19 | 3 | -6/+2
  because legacy VS hangs.
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: combine hw edgeflags with user edgeflags for correct behavior | Marek Olšák | 2019-07-19 | 4 | -19/+73
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: deduplicate code for esvert_lds_size | Marek Olšák | 2019-07-19 | 1 | -6/+16
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: simplify a streamout loop in gfx10_emit_ngg_epilogue | Marek Olšák | 2019-07-19 | 1 | -7/+6
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: don't use MALLOC for outputs | Marek Olšák | 2019-07-19 | 1 | -9/+2
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: clean up ESGS ring size computation | Marek Olšák | 2019-07-19 | 2 | -24/+11
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: fix unnecessary LDS overallocation for NGG GS | Marek Olšák | 2019-07-19 | 2 | -8/+2
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: don't compile the GS copy shader if it's 100% not needed | Marek Olšák | 2019-07-19 | 2 | -8/+12
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: set GE_CTNL.PACKET_TO_ONE_PA for NGG | Marek Olšák | 2019-07-19 | 3 | -27/+27
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: update a tunable max_es_verts_base for NGG | Marek Olšák | 2019-07-19 | 3 | -7/+12
  We have to fix the computation so as not to break quads.
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx10: implement ARB_post_depth_coverage | Marek Olšák | 2019-07-19 | 2 | -1/+6
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>