path: root/src/gallium
Commit message  Author  Age  Files  Lines
* meson: expose glapi through osmesaEric Engestrom2019-05-181-2/+2
| | | | | | | | | | | Suggested-by: Pierre Guillou <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109659 Fixes: f121a669c7d94d2ff672 "meson: build gallium based osmesa" Fixes: cbbd5bb889a2c271a504 "meson: build classic osmesa" Cc: Brian Paul <[email protected]> Cc: Dylan Baker <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Tested-by: Chuck Atkins <[email protected]>
* draw: fix memory leak introduced 7720ce32aNeha Bhende2019-05-171-1/+3
| | | | | | | | | | | | | We need to free the PrimitiveOffsets memory allocation in draw_gs_destroy(). This fixes a memory leak found while running piglit on Windows. Fixes: 7720ce32a ("draw: add support to tgsi paths for geometry streams. (v2)") Tested with piglit Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* panfrost/midgard: TypofixAlyssa Rosenzweig2019-05-171-1/+1
| | | | | Reported-by: Ryan Houdek <[email protected]> Signed-off-by: Alyssa Rosenzweig <[email protected]>
* svga: Add an environment variable to force coherent surface memoryThomas Hellstrom2019-05-179-31/+82
| | | | | | | | | | The vmwgfx driver supports emulated coherent surface memory as of version 2.16. Add an environment variable to enable this functionality for texture and buffer maps: SVGA_FORCE_COHERENT. This environment variable should be used for testing only. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]>
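A minimal sketch of how a driver might read a testing-only switch such as SVGA_FORCE_COHERENT. The helper names and the accepted values ("0" and "false" mean off) are assumptions for illustration and are not taken from the svga driver.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: interpret the text of an environment variable as a
 * boolean. Unset, empty, "0" and "false" count as disabled; anything else
 * counts as enabled. The real driver may parse the value differently. */
static bool
flag_value_enabled(const char *value)
{
   if (value == NULL || value[0] == '\0')
      return false;
   return strcmp(value, "0") != 0 && strcmp(value, "false") != 0;
}

/* Hypothetical wrapper that looks the variable up in the environment. */
static bool
force_coherent_requested(void)
{
   return flag_value_enabled(getenv("SVGA_FORCE_COHERENT"));
}
```

Under these assumptions, launching an application with SVGA_FORCE_COHERENT=1 in its environment would turn the option on for that process only.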
* pipebuffer, winsys/svga: Add functionality to update pb_validate_entry flagsThomas Hellstrom2019-05-173-27/+33
| | | | | | | | | | | In order to be able to add access modes to a pb_validate_entry, update the pb_validate_add_buffer function to take a pointer hash table and also to return whether the buffer was already on the validate list. Update the svga winsys accordingly. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
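The idea of an add function that also reports whether the buffer was already on the validate list can be sketched as follows. This is not the pipebuffer API: the types and names are hypothetical, and a linear search stands in for the pointer hash table the commit mentions.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 16

/* Hypothetical stand-ins for a validate entry and list. */
struct validate_entry {
   const void *buf;
   unsigned flags;
};

struct validate_list {
   struct validate_entry entries[MAX_ENTRIES];
   size_t count;
};

/* Add buf with the given access flags. If buf is already listed, merge the
 * flags into its entry and set *already_present to true. Returns 0 on
 * success and -1 when the list is full. A real implementation would use a
 * hash table keyed on the buffer pointer instead of this linear scan. */
static int
validate_add(struct validate_list *list, const void *buf, unsigned flags,
             bool *already_present)
{
   for (size_t i = 0; i < list->count; i++) {
      if (list->entries[i].buf == buf) {
         list->entries[i].flags |= flags;
         *already_present = true;
         return 0;
      }
   }

   if (list->count == MAX_ENTRIES)
      return -1;

   list->entries[list->count].buf = buf;
   list->entries[list->count].flags = flags;
   list->count++;
   *already_present = false;
   return 0;
}
```

Returning the "already present" state lets the caller decide whether it is adding a new entry or only widening the access mode of an existing one.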
* svga: Set the rendered-to flag for dma transfers to surfacesThomas Hellstrom2019-05-171-0/+4
| | | | | | | | | | The rendered-to flag indicates that the HW surface content is more recent than the content of the mob. That's the case after a SurfaceDMA transfer to the surface. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* winsys/svga: Fix RELOC_INTERNAL mob GPU accessThomas Hellstrom2019-05-171-1/+9
| | | | | | | | | | | | | | | SVGA_RELOC_INTERNAL indicates a transfer between surface and backing mob. This means that if the GPU for example reads from the surface it writes to the backing mob. But since the buffer mapping code allows for simultaneous gpu- and cpu read access, a read from the surface to the mob will not synchronize a subsequent map to the readback. Fix this by inverting the mob access mode in a surface relocation with SVGA_RELOC_INTERNAL set. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
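The inversion described above can be sketched as a small mapping from the surface's access mode to the backing mob's access mode. The flag names and the function are hypothetical and only illustrate the reasoning; the winsys uses its own relocation flags.

```c
#include <stdbool.h>

/* Hypothetical access flags for illustration only. */
enum access_flags {
   ACCESS_GPU_READ  = 1 << 0,
   ACCESS_GPU_WRITE = 1 << 1,
};

/* For an internal surface<->mob transfer, a GPU read of the surface is a
 * write to the mob and vice versa, so the mob's access mode is the
 * surface's access mode with read and write swapped. */
static unsigned
mob_access_for_surface_reloc(unsigned surface_access, bool reloc_internal)
{
   unsigned mob_access = 0;

   if (!reloc_internal)
      return surface_access;

   if (surface_access & ACCESS_GPU_READ)
      mob_access |= ACCESS_GPU_WRITE;
   if (surface_access & ACCESS_GPU_WRITE)
      mob_access |= ACCESS_GPU_READ;

   return mob_access;
}
```

With the mob marked as GPU-written during a readback, a later CPU map of the mob has something to synchronize against.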
* svga: Remove the surface_invalidate winsys functionThomas Hellstrom2019-05-176-31/+12
| | | | | | | | | | | Instead unconditionally call SVGA3D_InvalidateGBSurface() since it's needed also for Linux for dirty buffers and operation without SurfaceDMA. For non-guest-backed operation, remove the surface cache surface invalidation altogether. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* Revert "softpipe/buffer: load only as many components as the the buffer ↵Gert Wollny2019-05-171-5/+2
| | | | | | | | | | | | | resource type provides" This reverts commit 865b9ddae4874186182e529b5fd154ab04a61f79. The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only one component would be supported. The original issue is still relevant, but the fix should be different. Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* panfrost: Cleanup leak todosAlyssa Rosenzweig2019-05-173-16/+9
| | | | | | | Many of these are now patched; one of them we patch here. Regardless, this is one less thing to worry about in the code, I suppose. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: assert(0) -> unreachable for some switchAlyssa Rosenzweig2019-05-162-31/+18
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
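As a rough illustration of the assert(0) to unreachable change, the sketch below defines a stand-in macro and uses it after a switch that handles every enum value. The macro, enum, and function are hypothetical, and `__builtin_unreachable()` assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-in for an unreachable() macro: assert in debug builds,
 * and tell the compiler this point cannot be reached in release builds. */
#define UNREACHABLE(msg)            \
   do {                             \
      assert(!msg);                 \
      __builtin_unreachable();      \
   } while (0)

/* Hypothetical opcode set for illustration. */
enum alu_op {
   ALU_OP_ADD,
   ALU_OP_SUB,
};

static int
apply_alu_op(enum alu_op op, int a, int b)
{
   switch (op) {
   case ALU_OP_ADD:
      return a + b;
   case ALU_OP_SUB:
      return a - b;
   }

   /* Every valid enum value returns above. */
   UNREACHABLE("invalid ALU op");
}
```

Compared with a bare assert(0), this form also silences "control reaches end of non-void function" warnings when assertions are compiled out.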
* freedreno: Log the number of loops in the shader for shader-db.Eric Anholt2019-05-161-2/+2
| | | | | | | | | | shader-db's report.py will use this to detect when loop unrolling behavior has changed on a shader and to exclude other stats, such as instruction count, for that shader, since they won't be a useful proxy for real-world performance in that case. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Output the same shader-db format as v3d and intel.Eric Anholt2019-05-161-15/+4
| | | | | | | | This lets us reuse their report.py, at the expense of fd-report.py no longer working. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Remove the ir3_tgsi_to_nir() helper function.Eric Anholt2019-05-163-20/+6
| | | | | | | | It was more of a hindrance, as it pretended that we could compile in the driver with a missing screen. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Fix assertion failures in context setup in shader-db mode.Eric Anholt2019-05-164-0/+4
| | | | | | | | The TTN path needs access to the screen to make the right decisions about lowering, but we didn't have pctx->screen set up at fdN_prog_init time. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* r600+radeonsi: use ctx_query_reset_status on radeonMarek Olšák2019-05-168-52/+5
| | | | This allows a nice cleanup, because the winsys always handles it.
* winsys/radeon: implement ctx_query_reset_status by copying radeonsiMarek Olšák2019-05-164-6/+43
| | | | | To make it behave like amdgpu. I'm just trying to move this out of radeonsi. The radeonsi code will be removed in the next commit.
* winsys/amdgpu: report a CS rejection as a reset only if there's no GPU resetMarek Olšák2019-05-161-6/+5
* radeonsi: update buffer descriptors in all contexts after buffer invalidationMarek Olšák2019-05-163-33/+72
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824 Cc: 19.1 <[email protected]>
* radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsetsMarek Olšák2019-05-163-40/+25
| | | | | | This is a prerequisite for the next commit. Cc: 19.1 <[email protected]>
* radeonsi: compute culling - flush CS to remove write references to buffersMarek Olšák2019-05-161-5/+16
| | | | | | Only read-only buffers can use compute culling. Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: invalidate caches at the beginning of the prim discard compute IBMarek Olšák2019-05-163-9/+23
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable primitive restart for triangles for DiRT RallyMarek Olšák2019-05-164-14/+25
| | | | | | It may decrease performance and it prevents compute-based primitive culling. Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add primitive culling stats to the HUDMarek Olšák2019-05-164-4/+44
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: cull primitives with async compute for large draw callsMarek Olšák2019-05-1618-28/+2124
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: add REWIND emulation via INDIRECT_BUFFER into cs_check_spaceMarek Olšák2019-05-169-15/+26
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add si_vs_prolog_bits::unpack_instance_id_from_vertex_id:1Marek Olšák2019-05-162-2/+24
| | | | | | | The prim discard compute shader bakes InstanceID into the output index buffer. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make some functions non-staticMarek Olšák2019-05-163-18/+25
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow si_shader_select_with_key to return an optimized shader or failMarek Olšák2019-05-162-12/+32
| | | | | | If a prim discard compute shader hasn't finished compilation, we don't want to use any shader. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: use pipe_draw_info::instance_count indirectlyMarek Olšák2019-05-161-14/+22
| | | | | | | It will be modified by compute shader culling. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: use pipe_draw_info::prim and primitive_restart indirectlyMarek Olšák2019-05-161-31/+40
| | | | | | | so that the fields can be changed by the driver. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make functions for creating LLVM functions non-staticMarek Olšák2019-05-162-23/+32
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: add a parallel compute IB coupled with a gfx IBMarek Olšák2019-05-166-10/+195
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a cs parameter into si_cp_copy_dataMarek Olšák2019-05-165-9/+8
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a cs parameter into si_cp_release_memMarek Olšák2019-05-165-10/+9
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add threadgroups_per_cu param into si_get_compute_resource_limitsMarek Olšák2019-05-162-4/+8
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_*_descriptors_idx functions into si_state.hMarek Olšák2019-05-162-14/+14
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make si_initialize_compute reusableMarek Olšák2019-05-162-7/+8
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: extract COMPUTE_RESOURCE_LIMITS code into a helperMarek Olšák2019-05-162-12/+23
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: return the last part's return value from @wrapperMarek Olšák2019-05-161-3/+26
| | | | | | | The primitive discard compute shader will get the position output this way. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: always set NO_CPU_ACCESS and NO_SUBALLOC on GDS resourcesMarek Olšák2019-05-161-2/+5
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* swr: clean up supported OGL4.0/4.1 extensions listJan Zielinski2019-05-161-4/+5
| | | | | | | | | | | | This commit adjusts the capabilities returned by the SWR driver and the documentation to correctly report the following extensions: GL_ARB_texture_query_lod, GL_ARB_texture_cube_map_array, GL_ARB_gpu_shader_fp64, GL_ARB_texture_gather, GL_ARB_vertex_attrib_64bit. Reviewed-by: Alok Hota <[email protected]>
* vl/dri3: set back buffer from output to NULL with front buffer caseLeo Liu2019-05-161-0/+1
| | | | | The output optimization is only used in the back buffer case. Signed-off-by: Leo Liu <[email protected]> Acked-by: Alex Deucher <[email protected]>
* auxiliary/draw: fix crash with zero-stride draw autoRoland Scheidegger2019-05-161-1/+2
| | | | | | | | | | | | transform feedback draws get the number of vertices from the transform feedback object. In draw, we'll figure this out with the number of bytes written divided by the stride. However, it is apparently possible we end up with a stride of 0 there (not entirely sure it could happen with GL). Probably when nothing was actually ever written (so we don't actually have a stride set). Just avoid the division by zero by setting the count to 0. Reviewed-by: Jose Fonseca <[email protected]>
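The guard described above amounts to checking the stride before dividing. A minimal sketch, with a hypothetical function name and parameters rather than the draw module's actual code:

```c
#include <stdint.h>

/* Derive the vertex count for a transform feedback draw from the number of
 * bytes written and the stride. A stride of 0 (for example when nothing
 * was ever written) yields a count of 0 instead of dividing by zero. */
static uint32_t
draw_auto_vertex_count(uint64_t bytes_written, uint32_t stride)
{
   if (stride == 0)
      return 0;

   return (uint32_t)(bytes_written / stride);
}
```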
* iris: Dodge more GLSL IR loweringKenneth Graunke2019-05-151-2/+3
| | | | This avoids some lower_instructions bits in st.
* panfrost/midgard: Add load/store opcodesAlyssa Rosenzweig2019-05-164-52/+131
| | | | | | | | | This commit adds a bunch of new load/store opcodes, largely related to OpenCL, as well as adjusting the name of existing opcodes to be more uniform. The immediate effect is compute shaders are substantially easier to interpret now. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Enable integer constant inliningAlyssa Rosenzweig2019-05-161-4/+0
| | | | | | | | | | Midgard ALU features two types of constants: embedded constants (128-bit chunk, zero/one per schedule bundle) and inline constants (16-bit splattered into the op, second source if present). Inline constants are much more efficient from a space and scheduling freedom standpoint, so it's desirable to inline when possible. Now that integer ops are well understood and in use, we enable inlining of integer constants in addition to floats (which have been inlined since forever). Signed-off-by: Alyssa Rosenzweig <[email protected]>
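Since an inline constant has only 16 bits, a compiler has to decide whether a given integer can be inlined or must stay an embedded constant. The sketch below assumes the rule is simply that the 32-bit value survives truncation to 16 bits unchanged; the actual hardware encoding and the compiler's check may differ, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed rule for illustration: a 32-bit integer constant can be inlined
 * only if truncating it to 16 bits loses no information. On success the
 * 16-bit payload is stored in *inline_value. */
static bool
int_fits_inline_constant(uint32_t value, uint16_t *inline_value)
{
   uint16_t truncated = (uint16_t)value;

   if ((uint32_t)truncated != value)
      return false;

   *inline_value = truncated;
   return true;
}
```

Constants that fail the check would remain embedded constants, at the cost of the space and scheduling freedom the commit message describes.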
* panfrost/midgard: Remove imov workaroundAlyssa Rosenzweig2019-05-161-26/+0
| | | | | | The previous commit fixes the issue this patched around. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Set int outmod for ops writing integersAlyssa Rosenzweig2019-05-162-7/+23
| | | | | | | | | | | | | | | | | | | | | By default, the "normal" output modifier is set on ALU ops. This is the correct default for float outputs -- for floats, it preserves the semantic value. Unfortunately, when used with integers, it does not preserve the bitstream encoding, causing misbehaviour. (It's an open question what happens when `normal` is used with integers -- does it apply some other transformation? or does it do floating point normalization/etc on the ints as if they were floats?). Instead, we default to the "clamp to integer" output modifier for ops writing integers. Semantically, this makes sense (clamping an integer to the nearest integer is the identity function). In the hardware with an integer opcode, this is the actual "normal". This fixes numerous sporadic and sometimes bizarre bugs relating to integers, especially integer moves. With this in place, we no longer care about the types involved; it's just bits on the wire again. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Set custom stride for textures when necessaryAlyssa Rosenzweig2019-05-161-0/+25
| | | | | | | | | | | | | From Gallium (and our) perspective, the stride of a BO is arbitrary. For internal buffers, we can make it something nice, but for imported linear buffers (e.g. EGL clients), we don't always have that luxury. To cope, we calculate the expected stride of a texture, compare it to the BO's actual reported stride, and if they differ, set the latter as a custom stride. Fixes rendering of windows not on tile boundaries (noticeable in Weston with es2gears_wayland, for instance). Also, this should fix stride issues with buffer reloading. Signed-off-by: Alyssa Rosenzweig <[email protected]>
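The compare-and-override step can be sketched as below. The function names are hypothetical, and the expected stride is assumed to be the tightly packed width times bytes per pixel; the driver's real expected stride may include alignment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed expected stride of a linear texture: tightly packed rows. */
static uint32_t
expected_linear_stride(uint32_t width, uint32_t bytes_per_pixel)
{
   return width * bytes_per_pixel;
}

/* Compare the expected stride with the stride the BO reports. If they
 * differ, store the BO's stride in *custom_stride and return true so the
 * caller can program it as a custom stride. */
static bool
needs_custom_stride(uint32_t width, uint32_t bytes_per_pixel,
                    uint32_t bo_stride, uint32_t *custom_stride)
{
   if (expected_linear_stride(width, bytes_per_pixel) == bo_stride)
      return false;

   *custom_stride = bo_stride;
   return true;
}
```

For example, a 60-pixel-wide RGBA8 surface imported with a 256-byte row pitch would need the custom stride, while a 64-pixel-wide one would not.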