path: root/src/gallium/drivers/radeonsi
Commit message  Author  Age  Files  Lines
* radeonsi: always use compute rings for clover on CI and newer (v2)  Marek Olšák  2019-02-26  11  -75/+130
| initialize all non-compute context functions to NULL.
|
| v2: fix SI
* radeonsi: fix query buffer allocation  Timothy Arceri  2019-02-26  2  -25/+32
| Fix the logic for buffer full check on alloc.
|
| This patch just takes the fix Nicolai attached to the bug report
| and updates it to work on master.
|
| Fixes: e0f0d3675d4 ("radeonsi: factor si_query_buffer logic out of si_query_hw")
| Reviewed-by: Marek Olšák <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561
* nir, glsl: move pixel_center_integer/origin_upper_left to shader_info.fs  Alejandro Piñeiro  2019-02-21  1  -1/+1
| On GLSL that info is set as a layout qualifier when redeclaring
| gl_FragCoord, so somehow tied to a specific variable. But in practice,
| they behave as a global of the shader. On ARB programs they are set
| using a global OPTION (defined at ARB_fragment_coord_conventions), and
| on SPIR-V using ExecutionModes, that are also not tied specifically to
| the builtin.
|
| This patch moves that info from nir variable and ir variable to nir
| shader and gl_program shader_info respectively, so the map is more
| similar to SPIR-V, and ARB programs, instead of more similar to GLSL.
|
| FWIW, shader_info.fs already had pixel_center_integer, so this change
| also removes some redundancy. Also, as struct gl_program also includes
| a shader_info, we removed gl_program::OriginUpperLeft and
| PixelCenterInteger, as it would be superfluous.
|
| This change was needed because recently spirv_to_nir changed the order
| in which execution modes and variables are handled, so the variables
| didn't get the correct values. Now the info is set on the shader
| itself, and we don't need to go back to the builtin variable to set it.
|
| Fixes: e68871f6a ("spirv: Handle constants and types before execution modes")
|
| v2: (Jason)
|    * glsl_to_nir: get the info before glsl_to_nir, while all the rest
|      of the info gathering is happening
|    * prog_to_nir: gather the info on a general info-gathering pass,
|      not on variable setup.
|
| v3: (Jason)
|    * Squash with the patch that removes that info from ir variable
|    * anv: assert that OriginUpperLeft is true. It should be already
|      set by spirv_to_nir.
|    * blorp: set origin_upper_left on its core "compile fragment
|      shader", not just on some specific places (for this we added a
|      helper on a previous patch).
|    * prog_to_nir: no need to gather specifically these fragcoord modes
|      as the full gl_program shader_info is copied.
|    * spirv_to_nir: assert that we are a fragment shader when handling
|      these execution modes.
|
| v4: (reported by failing gitlab pipeline #18750)
|    * state_tracker: update too due to changes on ir.h/gl_program
|
| v5:
|    * blorp: minor change after change on previous patch
|    * radeonsi: update due to this change.
|
| v6: (Timothy Arceri)
|    * prog_to_nir: remove extra whitespace
|    * shader_info: don't use :1 on origin_upper_left
|    * glsl: program.fs.origin_upper_left/pixel_center_integer can be
|      moved out of the shader list loop
* radeonsi: use SDMA for uploading data through const_uploader  Marek Olšák  2019-02-20  5  -28/+143
| v2: use tc.stream_uploader in si buffer_transfer_map
|     if not called from the driver thread
|
| Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
| Tested-by: Dieter Nützel <[email protected]>
* radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow  Kenneth Graunke  2019-02-19  1  -1/+0
| ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL
| leaves it undefined). Performing fpow lowering in NIR would break this
| behavior, preventing us from using prog_to_nir.
|
| According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common
| expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>, which
| presumably does a zero-wins multiply. Lowering in NIR results in a
| non-legacy multiply, where:
|
|    pow(0, 0) = 2^(log2(0) * 0)
|              = 2^(-INF * 0)
|              = 2^(-NaN)
|              = -NaN
|
| which isn't the desired result.
|
| This reverts:
| - commit d6b75392067712908bdc372f1007e085439bf9f5
|   (ac/nir: remove emission of nir_op_fpow)
| - commit 22430224fec31591432d4a3e65c6f457ba1c1653
|   (radeonsi/nir: enable lowering of fpow)
|
| and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir
| after enabling prog_to_nir in st/mesa later in this series.
|
| Reviewed-by: Timothy Arceri <[email protected]>
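The multiply step is where the two lowerings in the fpow commit diverge. The following is a minimal sketch in C, assuming that "zero-wins" means a zero operand forces a zero product even when the other operand is infinite. The names `mul_ieee`, `mul_legacy`, and `pow_exponent` are illustrative only and are not Mesa or LLVM functions.

```c
#include <math.h>   /* INFINITY, isnan */

/* Ordinary IEEE multiply: -INF * 0 is NaN. */
static float mul_ieee(float a, float b)
{
    return a * b;
}

/* Assumed "zero-wins" (legacy) multiply: if either operand is zero,
 * the product is zero regardless of the other operand. */
static float mul_legacy(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}

/* Exponent handed to exp2 when pow(x, y) is evaluated as
 * 2^(log2(x) * y). log2(x) is passed in directly so the sketch
 * needs no libm call; log2(0) is -INFINITY. */
static float pow_exponent(float log2x, float y, int legacy)
{
    return legacy ? mul_legacy(log2x, y) : mul_ieee(log2x, y);
}
```

For pow(0, 0), `pow_exponent(-INFINITY, 0.0f, 0)` is NaN, so the result of exp2 is NaN, while `pow_exponent(-INFINITY, 0.0f, 1)` is 0 and 2^0 = 1, which is the value the ARB program extensions require.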
* radeonsi/nir: set shader_buffers_declared properly  Timothy Arceri  2019-02-20  1  -10/+22
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: set colors_read properly  Timothy Arceri  2019-02-20  1  -7/+10
| shader-db results for VEGA64:
|
| Totals from affected shaders:
| SGPRS: 1976 -> 1976 (0.00 %)
| VGPRS: 1240 -> 1144 (-7.74 %)
| Spilled SGPRs: 145 -> 145 (0.00 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 0 -> 0 (0.00 %) dwords per thread
| Code Size: 34632 -> 34604 (-0.08 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 261 -> 285 (9.20 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Reviewed-by: Marek Olšák <[email protected]>
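The percentages in the shader-db totals are plain relative changes. The following is a small C sketch of that arithmetic; `percent_change` is an illustrative helper, not part of shader-db, and returning 0 for a zero baseline is an assumption made here to match the "0 -> 0 (0.00 %)" rows.

```c
/* Relative change from `before` to `after`, in percent. */
static double percent_change(double before, double after)
{
    /* Assumption: with no baseline (0 -> 0 rows) report 0 %. */
    if (before == 0.0)
        return 0.0;
    return (after - before) / before * 100.0;
}
```

With the VGPRS row above, `percent_change(1240, 1144)` is roughly -7.74, and the Max Waves row, `percent_change(261, 285)`, is roughly 9.20.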
* radeonsi/nir: set input_usage_mask properly  Timothy Arceri  2019-02-20  1  -11/+36
| shader-db results for VEGA64:
|
| Totals from affected shaders:
| SGPRS: 791528 -> 792616 (0.14 %)
| VGPRS: 421624 -> 410784 (-2.57 %)
| Spilled SGPRs: 1639 -> 1674 (2.14 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 0 -> 0 (0.00 %) dwords per thread
| Code Size: 16103516 -> 16063696 (-0.25 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 136307 -> 137830 (1.12 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: Use uniform location when calculating const_file_max.  Timur Kristóf  2019-02-20  1  -6/+6
| The nine state tracker can produce NIR uniform variables whose
| location is explicitly set. radeonsi did not take that into account
| when calculating const_file_max, resulting in rendering glitches.
| This patch fixes that.
|
| Signed-Off-By: Timur Kristóf <[email protected]>
| Tested-by: Andre Heider <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
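The const_file_max commit message does not show the calculation itself. The following C sketch illustrates one reading of the problem, assuming const_file_max is the index of the highest constant slot in use. `struct uniform_var` and both functions are hypothetical stand-ins, not the NIR or radeonsi data structures.

```c
/* Hypothetical, simplified stand-in for a uniform variable. */
struct uniform_var {
    int location;        /* explicitly assigned first slot */
    unsigned num_slots;  /* number of constant slots it occupies */
};

/* Size-only estimate: assumes variables are packed densely from slot 0,
 * which ignores explicit locations. */
static int const_file_max_from_sizes(const struct uniform_var *vars,
                                     unsigned count)
{
    unsigned total = 0;
    for (unsigned i = 0; i < count; i++)
        total += vars[i].num_slots;
    return (int)total - 1;
}

/* Location-aware estimate: highest slot index actually touched,
 * or -1 if no variable occupies a slot. */
static int const_file_max_from_locations(const struct uniform_var *vars,
                                         unsigned count)
{
    int max_slot = -1;
    for (unsigned i = 0; i < count; i++) {
        if (vars[i].num_slots == 0)
            continue;
        int last = vars[i].location + (int)vars[i].num_slots - 1;
        if (last > max_slot)
            max_slot = last;
    }
    return max_slot;
}
```

For two variables at locations 0 (one slot) and 10 (two slots), the size-only estimate yields 2 while the location-aware one yields 11, so a driver relying on the former would under-size its constant range.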
* radeonsi: add driconf option radeonsi_enable_nir  Marek Olšák  2019-02-19  2  -1/+3
| Cc: 18.3 19.0 <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: Fix guardband computation for large render targets  Oscar Blumberg  2019-02-12  1  -2/+28
| Stop using 12.12 quantization for viewports that are not contained in
| the lower 4k corner of the render target as the hardware needs to keep
| both absolute and relative coordinates representable.
|
| Signed-off-by: Marek Olšák <[email protected]>
| Cc: 18.3 19.0 <[email protected]>
* radeonsi: use MEM instead of MEM_GRBM in COPY_DATA.DST_SEL  Marek Olšák  2019-02-12  1  -3/+3
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add AMD_DEBUG env var as an alternative to R600_DEBUG  Marek Olšák  2019-02-12  1  -1/+3
| Reviewed-by: Nicolai Hähnle <[email protected]>
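A driver can honor two environment variables for the same set of debug flags by parsing each and merging the results. The C sketch below shows that pattern. The commit message does not say how the two variables interact, so OR-ing them together is an assumption, and the flag names, bit values, and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical debug flag bits. */
enum {
    DBG_NIR   = 1 << 0,
    DBG_NODMA = 1 << 1
};

/* Parse a comma-separated list of option names into a flag mask.
 * Unknown names are ignored; NULL means "variable not set". */
static unsigned parse_debug_flags(const char *str)
{
    unsigned flags = 0;

    if (!str)
        return 0;

    while (*str) {
        size_t len = strcspn(str, ",");

        if (len == 3 && strncmp(str, "nir", 3) == 0)
            flags |= DBG_NIR;
        else if (len == 5 && strncmp(str, "nodma", 5) == 0)
            flags |= DBG_NODMA;

        str += len;
        if (*str == ',')
            str++;
    }
    return flags;
}

/* Assumption: flags from both variables are simply merged. */
static unsigned combine_debug_flags(const char *r600_debug,
                                    const char *amd_debug)
{
    return parse_debug_flags(r600_debug) | parse_debug_flags(amd_debug);
}

/* Read both variables from the process environment. */
static unsigned get_debug_flags(void)
{
    return combine_debug_flags(getenv("R600_DEBUG"), getenv("AMD_DEBUG"));
}
```

With this sketch, running with `AMD_DEBUG=nir` and `R600_DEBUG` unset produces the same mask as `R600_DEBUG=nir` alone.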
* radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0  Marek Olšák  2019-02-11  1  -2/+5
| Cc: 18.3 19.0 <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_MAX_VARYINGS  Karol Herbst  2019-02-07  1  -0/+3
| Some NVIDIA hardware can accept 128 fragment shader input components,
| but only have up to 124 varying-interpolated input components. We add a
| new cap to express this cleanly. For most drivers, this will have the
| same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader.
|
| Fixes KHR-GL45.limits.max_fragment_input_components
|
| Signed-off-by: Karol Herbst <[email protected]>
| [imirkin: rebased, improved docs/commit message]
| Signed-off-by: Ilia Mirkin <[email protected]>
| Acked-by: Rob Clark <[email protected]>
| Acked-by: Eric Anholt <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Cc: 19.0 <[email protected]>
* radeonsi: use local ws variable in si_need_dma_space  Marek Olšák  2019-02-06  1  -9/+10
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't leak an index buffer if draw_vbo fails  Marek Olšák  2019-02-06  1  -3/+5
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make allocator_zeroed_memory unmappable and use bigger buffers  Marek Olšák  2019-02-06  1  -1/+2
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clear allocator_zeroed_memory with SDMA  Marek Olšák  2019-02-06  4  -12/+9
| so that it can be used in parallel IBs. This also removes
| the SO_FILLED_SIZE hack.
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: initialize textures using DCC to black when possible  Marek Olšák  2019-02-06  3  -13/+63
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: release tokens after creating the shader program  Gert Wollny  2019-02-05  1  -0/+2
| ureg_get_tokens clears the reference to the tokens, and
| create_compute_state makes a copy, hence the tokens must be explicitly
| released.
|
| Fixes:
| Direct leak of 256 byte(s) in 1 object(s) allocated from:
|     #0 0x7ff729cf3c60 in realloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdbc60)
|     #1 0x7ff721b1240c in tokens_expand ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:234
|     #2 0x7ff721b1c9c0 in get_tokens ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:257
|     #3 0x7ff721b1c9c0 in copy_instructions ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2040
|     #4 0x7ff721b1c9c0 in ureg_finalize ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2090
|     #5 0x7ff721b1e919 in ureg_get_tokens ../../samba/mesa/src/gallium/auxiliary/tgsi/tgsi_ureg.c:2167
|     #6 0x7ff721f8b35a in si_create_dma_compute_shader ../../samba/mesa/src/gallium/drivers/radeonsi/si_shaderlib_tgsi.c:219
|     #7 0x7ff722043ed9 in si_compute_do_clear_or_copy ../../samba/mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:156
|     #8 0x7ff7220448d3 in si_clear_buffer ../../samba/mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:247
|     #9 0x7ff7220350e8 in vi_dcc_clear_level ../../samba/mesa/src/gallium/drivers/radeonsi/si_clear.c:274
|
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
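The leak described in the token-release commit follows a common ownership pattern: one call hands a heap array to the caller, a second call copies it, and the caller must then free the original. The C sketch below models that pattern with hypothetical types and functions (`struct builder`, `builder_get_tokens`, `create_state`, `create_shader`); it does not use the real ureg or Gallium API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical builder that owns a token array until it is handed out. */
struct builder {
    unsigned *tokens;
    size_t count;
};

/* Hands the token array to the caller and clears the builder's own
 * reference, so the caller now owns the memory. */
static unsigned *builder_get_tokens(struct builder *b, size_t *count)
{
    unsigned *tokens = b->tokens;

    *count = b->count;
    b->tokens = NULL;
    b->count = 0;
    return tokens;
}

/* Creates a state object that keeps its own private copy of the tokens. */
static unsigned *create_state(const unsigned *tokens, size_t count)
{
    unsigned *copy = malloc(count * sizeof(*copy));

    if (copy && count)
        memcpy(copy, tokens, count * sizeof(*copy));
    return copy;
}

/* Because create_state() copies, the array obtained from the builder
 * must be freed here; omitting the free() is the leak. */
static unsigned *create_shader(struct builder *b)
{
    size_t count;
    unsigned *tokens = builder_get_tokens(b, &count);
    unsigned *state = create_state(tokens, count);

    free(tokens);
    return state;
}
```

The caller of `create_shader` owns only the returned state; the intermediate array never escapes the function, which is the property a leak checker such as ASan verifies.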
* radeonsi: fix crashing performance counters (division by zero)  Marek Olšák  2019-02-04  1  -1/+1
| Fixes: e2b9329f17 "radeonsi: move remaining perfcounter code into si_perfcounter.c"
* radeonsi: handle render_condition_enable in si_compute_clear_render_target  Marek Olšák  2019-02-04  3  -3/+8
|
* radeonsi: use compute for clear_render_target when possible  Sonny Jiang  2019-02-04  5  -0/+184
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* ac/radv/radeonsi: add ac_get_num_physical_sgprs() helper  Timothy Arceri  2019-02-01  1  -4/+3
| Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: fix a comment typo in si_fine_fence_set  Marek Olšák  2019-01-30  1  -1/+1
|
* radeonsi: unify error paths in si_texture_create_object  Marek Olšák  2019-01-30  1  -9/+9
|
* radeonsi: merge & rename texture BO metadata functions  Marek Olšák  2019-01-30  1  -64/+53
|
* radeonsi: enable dithered alpha-to-coverage for better quality  Marek Olšák  2019-01-30  1  -4/+5
| same as AMDVLK.
|
| GL_NV_alpha_to_coverage_dither_control allows controlling this
| behavior. The default is implementation-dependent.
* radeonsi/nir: add missing piece for bindless image support  Timothy Arceri  2019-01-23  1  -0/+6
| This fixes some piglit tests and is what TGSI does.
|
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: rename rfence -> sfence  Marek Olšák  2019-01-22  1  -49/+49
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: rename rbo, rbuffer to buf or buffer  Marek Olšák  2019-01-22  5  -102/+102
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: rename rsrc -> ssrc, rdst -> sdst  Marek Olšák  2019-01-22  6  -51/+51
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: rename rquery -> squery  Marek Olšák  2019-01-22  3  -68/+68
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: rename r600_resource -> si_resource  Marek Olšák  2019-01-22  26  -224/+224
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: remove r600 from comments  Marek Olšák  2019-01-22  3  -3/+3
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: rename rview -> sview  Marek Olšák  2019-01-22  1  -3/+3
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: rename rscreen -> sscreen  Marek Olšák  2019-01-22  3  -5/+5
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: disable render cond & pipeline stats for internal compute dispatches  Marek Olšák  2019-01-22  1  -0/+18
|
* radeonsi: use compute for resource_copy_region when possible  Sonny Jiang  2019-01-22  5  -0/+215
| v2: marek: fix snorm8 blits
|
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: add compute_last_block to configure the partial block fields  Jiang, Sonny  2019-01-22  2  -5/+49
|
* gallium/util: add util_format_snorm8_to_sint8 (from radeonsi)  Marek Olšák  2019-01-22  1  -30/+2
|
* radeonsi: move PKT3_WRITE_DATA generation into a helper function  Marek Olšák  2019-01-22  6  -41/+43
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't use WRITE_DATA.DST_SEL == MEM_GRBM on >= CIK  Marek Olšák  2019-01-22  2  -2/+4
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: fix the top-of-pipe fence on SI  Marek Olšák  2019-01-22  1  -1/+2
| SI doesn't have MEM.
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: correct WRITE_DATA.DST_SEL definitions  Marek Olšák  2019-01-22  3  -3/+3
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: compile clear and copy buffer compute shaders on demand  Marek Olšák  2019-01-22  2  -8/+14
| same as all other shaders
* radeonsi: remove redundant call to emit_cache_flush in compute clear/copy  Marek Olšák  2019-01-22  1  -1/+0
| launch_grid calls it.
* radeonsi: use buffer_store_format_x & xy  Marek Olšák  2019-01-22  1  -8/+17
|
* radeonsi: fix rendering to tiny viewports where the viewport center is > 8K  Marek Olšák  2019-01-22  1  -3/+18
| This fixes an assertion failure with GL CTS when cts-runner is used
| (not a specific test).
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877
| Cc: 18.3 <[email protected]>