path: root/src
Commit message (Author, Age, Files, Lines)
* radv: add a workaround for Monster Hunter World and LLVM 7&8 (Samuel Pitoiset, 2019-05-17, 5 files, -3/+16)
| | | | | | | | | | | | | | The load/store optimizer pass doesn't handle WaW hazards correctly and this is the root cause of the reflection issue with Monster Hunter World. AFAIK, it's the only game that is affected by this issue. This is fixed with LLVM r361008, but we need a workaround for older LLVM versions unfortunately. Cc: "19.0" "19.1" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* svga: Add an environment variable to force coherent surface memory (Thomas Hellstrom, 2019-05-17, 9 files, -31/+82)
| | | | | | | | | | The vmwgfx driver supports emulated coherent surface memory as of version 2.16. Add an environment variable to enable this functionality for texture and buffer maps: SVGA_FORCE_COHERENT. This environment variable should be used for testing only. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]>
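A minimal sketch of how a boolean environment switch such as SVGA_FORCE_COHERENT could be read. The helper names and the set of values treated as "off" are assumptions for illustration; the driver's actual option parsing is not shown in this log.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not the driver's code): treats an unset or empty
// variable, "0" and "false" as disabled, and anything else as enabled.
static bool env_flag_enabled(const char *name)
{
    assert(name != nullptr);
    const char *value = std::getenv(name);
    if (value == nullptr || value[0] == '\0')
        return false;
    return std::strcmp(value, "0") != 0 &&
           std::strcmp(value, "false") != 0;
}

// Hypothetical wrapper for the variable named in the commit message.
static bool force_coherent_requested()
{
    return env_flag_enabled("SVGA_FORCE_COHERENT");
}
```

Reading the variable once at screen creation and caching the result would keep the lookup off the hot path.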
* pipebuffer, winsys/svga: Add functionality to update pb_validate_entry flags (Thomas Hellstrom, 2019-05-17, 3 files, -27/+33)
| | | | | | | | | | | In order to be able to add access modes to a pb_validate_entry, update the pb_validate_add_buffer function to take a pointer hash table and also to return whether the buffer was already on the validate list. Update the svga winsys accordingly. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
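The idea of a validate list that reports whether a buffer was already present can be sketched as follows. All types and names here are hypothetical stand-ins, with std::unordered_map playing the role of the pointer hash table; this is not the real pipebuffer API.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the buffer and validate-entry types.
struct Buffer { int id; };
struct ValidateEntry { const Buffer *buf; unsigned flags; };

struct ValidateList {
    std::vector<ValidateEntry> entries;
    // Maps a buffer pointer to its slot in `entries`.
    std::unordered_map<const Buffer *, std::size_t> index;
};

// Adds `buf` with the given access flags. If the buffer is already on the
// list, the new flags are OR-ed into the existing entry and true is
// returned; otherwise a new entry is appended and false is returned.
static bool validate_add_buffer(ValidateList &list, const Buffer *buf,
                                unsigned flags)
{
    assert(buf != nullptr);
    auto it = list.index.find(buf);
    if (it != list.index.end()) {
        list.entries[it->second].flags |= flags;
        return true;
    }
    list.index.emplace(buf, list.entries.size());
    list.entries.push_back({buf, flags});
    return false;
}
```

Returning the "already present" state lets a caller decide whether to take a new reference or only widen the access mode of an existing entry.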
* svga: Set the rendered-to flag for dma transfers to surfaces (Thomas Hellstrom, 2019-05-17, 1 file, -0/+4)
| | | | | | | | | | The rendered-to flag indicates that the HW surface content is more recent than the content of the mob. That's the case after a SurfaceDMA transfer to the surface. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* winsys/svga: Fix RELOC_INTERNAL mob GPU access (Thomas Hellstrom, 2019-05-17, 1 file, -1/+9)
| | | | | | | | | | | | | | | SVGA_RELOC_INTERNAL indicates a transfer between surface and backing mob. This means that if the GPU, for example, reads from the surface, it writes to the backing mob. But since the buffer mapping code allows for simultaneous GPU and CPU read access, a read from the surface to the mob will not synchronize a subsequent map to the readback. Fix this by inverting the mob access mode in a surface relocation with SVGA_RELOC_INTERNAL set. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
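The inversion described above can be sketched like this. The flag names are hypothetical and do not correspond to the winsys's real relocation flags; the point is only that a surface read becomes a mob write and vice versa.

```cpp
#include <cassert>

// Hypothetical access flags, for illustration only.
enum AccessFlags : unsigned {
    ACCESS_GPU_READ  = 1u << 0,
    ACCESS_GPU_WRITE = 1u << 1,
};

// For an internal surface<->mob transfer, the mob is accessed in the
// opposite direction to the surface: reading the surface writes the mob,
// and writing the surface reads the mob.
static unsigned mob_access_for_internal_reloc(unsigned surface_access)
{
    assert((surface_access & ~(ACCESS_GPU_READ | ACCESS_GPU_WRITE)) == 0);
    unsigned mob_access = 0;
    if (surface_access & ACCESS_GPU_READ)
        mob_access |= ACCESS_GPU_WRITE;
    if (surface_access & ACCESS_GPU_WRITE)
        mob_access |= ACCESS_GPU_READ;
    return mob_access;
}
```

With the mob marked as GPU-written after a readback, a later CPU map has something to synchronize against.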
* svga: Remove the surface_invalidate winsys function (Thomas Hellstrom, 2019-05-17, 6 files, -31/+12)
| | | | | | | | | | | Instead unconditionally call SVGA3D_InvalidateGBSurface() since it's needed also for Linux for dirty buffers and operation without SurfaceDMA. For non-guest-backed operation, remove the surface cache surface invalidation altogether. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* Revert "softpipe/buffer: load only as many components as the buffer resource type provides" (Gert Wollny, 2019-05-17, 1 file, -5/+2)
| | | | | | | | | | | This reverts commit 865b9ddae4874186182e529b5fd154ab04a61f79. The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only one component would be supported. The original issue is still relevant, but the fix should be different. Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* glsl/nir: init non-static class member. (Dave Airlie, 2019-05-17, 1 file, -0/+1)
| | | | | | | | glsl_to_nir.cpp:276: uninit_member: Non-static class member "sig" is not initialized in this constructor nor in any functions that it calls. Reported by coverity Acked-by: Ilia Mirkin <[email protected]>
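The class of warning quoted above can be illustrated with a small sketch. The class and its second member are hypothetical; only the member name "sig" comes from the report. Listing every member in the constructor's initializer list gives each one a defined value before first use.

```cpp
#include <cassert>

// Hypothetical visitor-like class, for illustration only.
class Visitor {
public:
    // Every non-static member is initialized here, so no constructor path
    // leaves `sig` or `depth` indeterminate.
    Visitor() : sig(nullptr), depth(0) {}

    const void *sig;  // pointer member of the kind flagged in the report
    int depth;        // hypothetical second member
};
```

Default member initializers (for example `const void *sig = nullptr;`) are an alternative in C++11 and later when a class has several constructors.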
* imgui: fix undefined behaviour bitshift. (Dave Airlie, 2019-05-17, 1 file, -1/+1)
| | | | | | | | imgui_draw.cpp:1781: error[shiftTooManyBitsSigned]: Shifting signed 32-bit value by 31 bits is undefined behaviour Reported by coverity Acked-by: Ilia Mirkin <[email protected]>
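A minimal sketch of the usual way to avoid the flagged shift: make the shifted operand unsigned, so that setting bit 31 is well defined regardless of how a given language standard treats the signed case. The helper name is hypothetical, and this is not necessarily the exact change made in imgui_draw.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Builds a single-bit mask in a 32-bit unsigned value. Shifting an
// unsigned 1 by up to 31 bits is well defined, unlike `1 << 31` on a
// 32-bit signed int, which the checker reports.
static std::uint32_t bit_mask(unsigned bit)
{
    assert(bit < 32);
    return std::uint32_t{1} << bit;
}
```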
* glsl: init non-static class member in link uniforms. (v2) (Dave Airlie, 2019-05-17, 1 file, -1/+2)
| | | | | | | | | | link_uniforms.cpp:477: uninit_member: Non-static class member "shader_storage_blocks_write_access" is not initialized in this constructor nor in any functions that it calls. Reported by coverity. v2: fix 9->0 typo (Ilia) Acked-by: Ilia Mirkin <[email protected]>
* glsl: init packed in more constructors. (Dave Airlie, 2019-05-17, 1 file, -6/+6)
| | | | | | | | | | src/compiler/glsl_types.cpp:577: uninit_member: Non-static class member "packed" is not initialized in this constructor nor in any functions that it calls. from Coverity. Fixes: 659f333b3a4 (glsl: add packed for struct types) Acked-by: Ilia Mirkin <[email protected]>
* panfrost: Cleanup leak todos (Alyssa Rosenzweig, 2019-05-17, 3 files, -16/+9)
| | | | | | | Many of these are now patched; one of them we patch here. Regardless, this is one less thing to worry about in the code, I suppose. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: assert(0) -> unreachable for some switch (Alyssa Rosenzweig, 2019-05-16, 2 files, -31/+18)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* anv: Fix some depth buffer sampling cases on ICL+ (Nanley Chery, 2019-05-16, 1 file, -1/+7)
| | | | | | | | | | | Don't attempt sampling with HiZ if the sampler lacks support for it. On ICL, the HW docs state that sampling with HiZ is not supported and that instances of AUX_HIZ in the RENDER_SURFACE_STATE object will be interpreted as AUX_NONE. Cc: <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* nir: Only convert SSA values to regs when needed (Caio Marcelo de Oliveira Filho, 2019-05-16, 1 file, -6/+22)
| | | | | | | | | | | | | | | If the SSA def produced by this instruction is only in the block in which it is defined and is not used by ifs or phis, then we don't have a reason to convert it to a register in nir_lower_ssa_defs_to_regs_block(). The special case for derefs is covered by the general case, so can be removed: at this point all derefs in the block are materialized (i.e. the whole deref chain is in the block) and derefs are not used in phis. v2: Fix wrong check for if_uses. If there's such a use, the def is not "local_to_block". (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
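The "local to block" test described above can be sketched with simplified data structures. These structs are hypothetical and far simpler than NIR's real def and use lists; the sketch only shows the three conditions: no if-uses, no phi uses, and every use in the defining block.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of a use of an SSA value.
struct Use {
    int block;    // index of the block containing the using instruction
    bool in_phi;  // true if the using instruction is a phi
};

// Hypothetical, simplified model of an SSA definition.
struct SsaDef {
    int block;              // index of the defining block
    std::vector<Use> uses;  // uses by instructions
    bool has_if_uses;       // true if any `if` condition uses this value
};

// A def can stay in SSA form when nothing outside its own block, no phi
// and no `if` condition depends on it.
static bool is_local_to_block(const SsaDef &def)
{
    if (def.has_if_uses)
        return false;
    for (const Use &use : def.uses) {
        if (use.in_phi || use.block != def.block)
            return false;
    }
    return true;
}
```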
* st/mesa: Record samplers for extra planes in info->textures_used. (Kenneth Graunke, 2019-05-16, 1 file, -0/+5)
| | | | | | | | | | | | | | Normally gl_nir_lower_samplers_as_deref records info->textures_used for us, but this pass runs after that, attempting to assign samplers in the same order as st_atom_texture's external_samplers_used loop so the stars align and we get the same locations. Since we're adding textures late, we need to amend info->textures_used. iris uses info->textures_used to set up texture bindings; this fixes Piglit's ext_image_dma_buf_import-sample-{nv12,yuv420,yvu420} there. Reviewed-by: Rob Clark <[email protected]>
* nir: Fix nir_opt_idiv_const when negatives are involved (Caio Marcelo de Oliveira Filho, 2019-05-16, 1 file, -3/+5)
| | | | | | | | | | | | | | | | | First, allow the case for negative powers of two. Then ensure that we use the absolute value of the non-constant value to calculate the quotient -- this was hinted in the code by the name 'uq'. This fixes an issue when 'd' is positive and 'n' is negative. The ishr will propagate the negative sign and we'll use nir_ineg() again, incorrectly. v2: First version used only ishr, but that isn't sufficient, since it never can produce a zero as a result. (Jason) Allow negative powers of two. (Caio) Fixes: 74492ebad94 "nir: Add a pass for lowering integer division by constants" Reviewed-by: Jason Ekstrand <[email protected]>
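The approach described above, computing an unsigned quotient 'uq' from the absolute value and negating once based on the operand signs, can be sketched in plain C++. This is an illustration of the arithmetic, not the NIR pass itself; the function name is hypothetical, and the final conversion assumes two's complement wraparound.

```cpp
#include <cassert>
#include <cstdint>

// Signed division by a (possibly negative) power of two, truncating toward
// zero. An arithmetic shift of a negative n would round toward negative
// infinity and could never yield zero, so the shift is applied to |n|.
static std::int32_t idiv_pow2(std::int32_t n, std::int32_t d)
{
    // Absolute values computed in unsigned arithmetic to avoid overflow.
    const std::uint32_t abs_d =
        d < 0 ? 0u - static_cast<std::uint32_t>(d) : static_cast<std::uint32_t>(d);
    const std::uint32_t abs_n =
        n < 0 ? 0u - static_cast<std::uint32_t>(n) : static_cast<std::uint32_t>(n);
    assert(abs_d != 0 && (abs_d & (abs_d - 1)) == 0);  // |d| must be a power of two

    unsigned shift = 0;
    while ((abs_d >> shift) != 1u)
        ++shift;

    const std::uint32_t uq = abs_n >> shift;  // unsigned quotient
    const bool negate = (n < 0) != (d < 0);   // result sign
    // Negate in unsigned arithmetic, then convert. INT32_MIN / -1 is not
    // representable and simply wraps here.
    return static_cast<std::int32_t>(negate ? 0u - uq : uq);
}
```

For example, `idiv_pow2(-1, 4)` yields 0, whereas an arithmetic shift of -1 by two bits would yield -1.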
* freedreno: Log the number of loops in the shader for shader-db. (Eric Anholt, 2019-05-16, 3 files, -2/+4)
| | | | | | | | shader-db's report.py will use this to see when we've changed loop unrolling behavior on a shader and exclude other stats like instruction count from consideration for that shader, since they won't be useful as a proxy for real world performance in that case. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Output the same shader-db format as v3d and intel. (Eric Anholt, 2019-05-16, 1 file, -15/+4)
| | | | | | | | This lets us reuse their report.py, at the expense of fd-report.py no longer working. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Remove the ir3_tgsi_to_nir() helper function. (Eric Anholt, 2019-05-16, 3 files, -20/+6)
| | | | | | | | It was more of a hindrance, as it pretended that we could compile in the driver with a missing screen. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Fix assertion failures in context setup in shader-db mode. (Eric Anholt, 2019-05-16, 4 files, -0/+4)
| | | | | | | | The TTN path needs access to the screen to make the right decisions about lowering, but we didn't have pctx->screen set up at fdN_prog_init time. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* ac: match radeonsi code in ac_shader_binary_read_config (Marek Olšák, 2019-05-16, 1 file, -3/+3)
|
* r600+radeonsi: use ctx_query_reset_status on radeon (Marek Olšák, 2019-05-16, 10 files, -55/+5)
| | | | This allows a nice cleanup, because the winsys always handles it.
* winsys/radeon: implement ctx_query_reset_status by copying radeonsi (Marek Olšák, 2019-05-16, 4 files, -6/+43)
| | | | | To make it behave like amdgpu. I'm just trying to move this out of radeonsi. The radeonsi code will be removed in the next commit.
* winsys/amdgpu: report a CS rejection as a reset only if there's no GPU reset (Marek Olšák, 2019-05-16, 1 file, -6/+5)
|
* radeonsi: update buffer descriptors in all contexts after buffer invalidation (Marek Olšák, 2019-05-16, 3 files, -33/+72)
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824 Cc: 19.1 <[email protected]>
* radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets (Marek Olšák, 2019-05-16, 3 files, -40/+25)
| | | | | | This is a prerequisite for the next commit. Cc: 19.1 <[email protected]>
* radeonsi: compute culling - flush CS to remove write references to buffers (Marek Olšák, 2019-05-16, 1 file, -5/+16)
| | | | | | Only read-only buffers can use compute culling. Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: invalidate caches at the beginning of the prim discard compute IB (Marek Olšák, 2019-05-16, 3 files, -9/+23)
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable primitive restart for triangles for DiRT Rally (Marek Olšák, 2019-05-16, 5 files, -14/+28)
| | | | | | It may decrease performance and it prevents compute-based primitive culling. Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add primitive culling stats to the HUD (Marek Olšák, 2019-05-16, 4 files, -4/+44)
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: cull primitives with async compute for large draw calls (Marek Olšák, 2019-05-16, 18 files, -28/+2124)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: add REWIND emulation via INDIRECT_BUFFER into cs_check_space (Marek Olšák, 2019-05-16, 9 files, -15/+26)
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add si_vs_prolog_bits::unpack_instance_id_from_vertex_id:1 (Marek Olšák, 2019-05-16, 2 files, -2/+24)
| | | | | | | The prim discard compute shader bakes InstanceID into the output index buffer. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make some functions non-static (Marek Olšák, 2019-05-16, 3 files, -18/+25)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow si_shader_select_with_key to return an optimized shader or fail (Marek Olšák, 2019-05-16, 2 files, -12/+32)
| | | | | | If a prim discard compute shader hasn't finished compilation, we don't want to use any shader. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: use pipe_draw_info::instance_count indirectly (Marek Olšák, 2019-05-16, 1 file, -14/+22)
| | | | | | | It will be modified by compute shader culling. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: use pipe_draw_info::prim and primitive_restart indirectly (Marek Olšák, 2019-05-16, 1 file, -31/+40)
| | | | | | | so that the fields can be changed by the driver. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make functions for creating LLVM functions non-static (Marek Olšák, 2019-05-16, 2 files, -23/+32)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: add a parallel compute IB coupled with a gfx IB (Marek Olšák, 2019-05-16, 8 files, -10/+204)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* ac: add LLVM code for triangle culling (Marek Olšák, 2019-05-16, 4 files, -0/+338)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a cs parameter into si_cp_copy_data (Marek Olšák, 2019-05-16, 5 files, -9/+8)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a cs parameter into si_cp_release_mem (Marek Olšák, 2019-05-16, 5 files, -10/+9)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add threadgroups_per_cu param into si_get_compute_resource_limits (Marek Olšák, 2019-05-16, 2 files, -4/+8)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_*_descriptors_idx functions into si_state.h (Marek Olšák, 2019-05-16, 2 files, -14/+14)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make si_initialize_compute reusable (Marek Olšák, 2019-05-16, 2 files, -7/+8)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: extract COMPUTE_RESOURCE_LIMITS code into a helper (Marek Olšák, 2019-05-16, 2 files, -12/+23)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: return the last part's return value from @wrapper (Marek Olšák, 2019-05-16, 1 file, -3/+26)
| | | | | | | The primitive discard compute shader will get the position output this way. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: always set NO_CPU_ACCESS and NO_SUBALLOC on GDS resources (Marek Olšák, 2019-05-16, 1 file, -2/+5)
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* swr: clean up supported OGL4.0/4.1 extensions list (Jan Zielinski, 2019-05-16, 1 file, -4/+5)
| | | | | | | | | | | | This commit adjusts the capabilities returned by the SWR driver and the documentation to correctly report the following extensions: GL_ARB_texture_query_lod, GL_ARB_texture_cube_map_array, GL_ARB_gpu_shader_fp64, GL_ARB_texture_gather, GL_ARB_vertex_attrib_64bit. Reviewed-by: Alok Hota <[email protected]>