path: root/src/gallium
Commit message (Author, Age, Files, Lines)
* panfrost/midgard: Split up midgard_compile.c (RA) (Alyssa Rosenzweig, 2019-05-19, 11 files, -928/+1149)
| | | | | | | | | | | | This commit moves the register allocator out of midgard_compile.c and into its own midgard_ra.c file. In doing so, a number of dependencies are identified and moved into their own files in turn. midgard_compile.c is still fairly monolithic, but this should help. Code churn, but no functional changes should be introduced by this commit. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Improve fixed-function blending (Alyssa Rosenzweig, 2019-05-19, 2 files, -978/+34)
| | | | | | | | This fixes a few miscellaneous issues with the fixed-function blending programming, though it is far from complete. For cases known to be buggy, we force a fallback to blend shaders. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Wire up nir_lower_blend (Alyssa Rosenzweig, 2019-05-19, 1 file, -14/+33)
| | | | | | | | This implements blend shaders via nir_lower_blend, by creating dummy fragment shaders simply passing through the source color and using the new lowering pass to inject blendability. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Route new blending intrinsics (Alyssa Rosenzweig, 2019-05-19, 1 file, -106/+117)
| | | | | | | To prepare for the new nir_lower_blend pass, we wire up the intrinsics for tilebuffer reads and constant colour loading. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/nir: Add nir_lower_blend pass (Alyssa Rosenzweig, 2019-05-19, 4 files, -1/+305)
| This new lowering pass implements the OpenGL ES blend pipeline in shaders, applicable to hardware lacking full-featured blending hardware (including Midgard/Bifrost and vc4).
|
| This pass is run on a fragment shader, rewriting the store to a blended version, loading in the framebuffer destination color and constant color via intrinsics as necessary. This pass is sufficient for OpenGL ES 2.0 and is verified to pass dEQP's blend tests. MIN/MAX modes are included and tested as well. That said, at present it has the following limitations:
|
| - MRT is not supported (ES3).
| - sRGB support is missing (ES3).
| - Extended blending is not yet ported from GLSL IR lowering (ES3.2)
| - Dual-source blending is not supported. (N/A)
| - Logic ops are not supported. (N/A)
|
| v2: Fix code conventions (per Ian Romanick's feedback). Implement color masks.
|
| This pass should be in common nir/ space, but due to non-technical reasons, for now it's in Panfrost space. In the future, depending if other drivers need some of the functionality, we can move this back to src/compiler/nir space.
|
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
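As a rough illustration of what a blend lowering has to compute per color channel, the following C sketch evaluates the OpenGL ES blend equation on the CPU. It is not taken from the pass, which rewrites NIR rather than operating on floats. The enum names and the `blend_channel` and `apply_color_mask` helpers are hypothetical, only a handful of blend factors are shown, and clamping to [0, 1] assumes a normalized fixed-point render target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; the real pass works on NIR. */
enum blend_func { BLEND_ADD, BLEND_SUBTRACT, BLEND_REVERSE_SUBTRACT, BLEND_MIN, BLEND_MAX };
enum blend_factor { FACTOR_ZERO, FACTOR_ONE, FACTOR_SRC_ALPHA,
                    FACTOR_ONE_MINUS_SRC_ALPHA, FACTOR_CONSTANT };

static float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Scalar weight for one channel; `constant` is that channel's blend constant. */
static float factor_value(enum blend_factor f, float src_alpha, float constant)
{
    switch (f) {
    case FACTOR_ZERO:                return 0.0f;
    case FACTOR_ONE:                 return 1.0f;
    case FACTOR_SRC_ALPHA:           return src_alpha;
    case FACTOR_ONE_MINUS_SRC_ALPHA: return 1.0f - src_alpha;
    case FACTOR_CONSTANT:            return constant;
    }
    assert(!"invalid blend factor");
    return 0.0f;
}

/* Blend one channel of the shader output (src) with the framebuffer (dst). */
static float blend_channel(enum blend_func func,
                           enum blend_factor src_factor, enum blend_factor dst_factor,
                           float src, float dst, float src_alpha, float constant)
{
    /* MIN and MAX ignore the blend factors entirely. */
    if (func == BLEND_MIN) return src < dst ? src : dst;
    if (func == BLEND_MAX) return src > dst ? src : dst;

    float s = src * factor_value(src_factor, src_alpha, constant);
    float d = dst * factor_value(dst_factor, src_alpha, constant);

    switch (func) {
    case BLEND_SUBTRACT:         return clamp01(s - d);
    case BLEND_REVERSE_SUBTRACT: return clamp01(d - s);
    default:                     return clamp01(s + d);
    }
}

/* A masked-off channel keeps the existing framebuffer value. */
static float apply_color_mask(bool channel_enabled, float blended, float dst)
{
    return channel_enabled ? blended : dst;
}
```

With `BLEND_ADD`, `FACTOR_SRC_ALPHA` and `FACTOR_ONE_MINUS_SRC_ALPHA` this reduces to ordinary alpha blending. A shader-based version would emit the same arithmetic as instructions, reading `dst` and `constant` through the intrinsics the message mentions.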
* panfrost: Fix Bifrost-specific padding (Alyssa Rosenzweig, 2019-05-19, 1 file, -7/+1)
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost: Cleanup panfrost_job comments (Alyssa Rosenzweig, 2019-05-19, 1 file, -6/+3)
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/decode: Decode blend constant (Alyssa Rosenzweig, 2019-05-19, 2 files, -2/+31)
| | | | | | | | | | | | | This adds a forgotten decode line on Midgard and adds the field of a blend constant on Bifrost. The Bifrost encoding is fairly weird; whereas Midgard is just a regular 32-bit float, Bifrost uses a fancy fixed-point-esque encoding. The decode logic here is experimentally correct. The encode logic is a sort of "guesstimate", assuming that the high byte is just int(f / 255.0) and then solving algebraically for the low byte. This might be slightly off in some cases. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost: Hoist blend constant into Midgard-specific struct (Alyssa Rosenzweig, 2019-05-19, 6 files, -16/+21)
| | | | | | | | | This eliminates one major source of #ifdef parity between Midgard and Bifrost, better representing how the struct acts on Midgard and allowing proper decodes on Bifrost. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/decode: Disassemble Bifrost shaders (Alyssa Rosenzweig, 2019-05-19, 2 files, -8/+10)
| | | | | | | | | We already have the Bifrost disassembler in-tree, so now that panwrap is able to dump Bifrost command streams, hook up the disassembler to pandecode. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* meson: expose glapi through osmesa (Eric Engestrom, 2019-05-18, 1 file, -2/+2)
| | | | | | | | | | | Suggested-by: Pierre Guillou <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109659 Fixes: f121a669c7d94d2ff672 "meson: build gallium based osmesa" Fixes: cbbd5bb889a2c271a504 "meson: build classic osmesa" Cc: Brian Paul <[email protected]> Cc: Dylan Baker <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Tested-by: Chuck Atkins <[email protected]>
* draw: fix memory leak introduced 7720ce32a (Neha Bhende, 2019-05-17, 1 file, -1/+3)
| | | | | | | | | | | We need to free the PrimitiveOffsets allocation in draw_gs_destroy(). This fixes a memory leak found while running piglit on Windows. Fixes: 7720ce32a ("draw: add support to tgsi paths for geometry streams. (v2)") Tested with piglit Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* panfrost/midgard: Typofix (Alyssa Rosenzweig, 2019-05-17, 1 file, -1/+1)
| | | | | Reported-by: Ryan Houdek <[email protected]> Signed-off-by: Alyssa Rosenzweig <[email protected]>
* svga: Add an environment variable to force coherent surface memory (Thomas Hellstrom, 2019-05-17, 9 files, -31/+82)
| | | | | | | | | | The vmwgfx driver supports emulated coherent surface memory as of version 2.16. Add an environment variable to enable this functionality for texture- and buffer maps: SVGA_FORCE_COHERENT. This environment variable should be used for testing only. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* pipebuffer, winsys/svga: Add functionality to update pb_validate_entry flags (Thomas Hellstrom, 2019-05-17, 3 files, -27/+33)
| | | | | | | | | | | In order to be able to add access modes to a pb_validate_entry, update the pb_validate_add_buffer function to take a pointer hash table and also to return whether the buffer was already on the validate list. Update the svga winsys accordingly. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
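The behavior described above, adding a buffer to a validate list while reporting whether it was already there so that access flags can be merged into the existing entry, can be sketched roughly as follows. This is not the pipebuffer code: the struct layout, the `validate_add_buffer` name and signature, and the fixed-size array are assumptions made for illustration, and a linear search stands in for the pointer hash table the commit mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 16  /* arbitrary capacity for this sketch */

/* Hypothetical stand-ins for a validate entry and list. */
struct validate_entry {
    const void *buf;   /* buffer identity */
    unsigned flags;    /* accumulated access flags */
};

struct validate_list {
    struct validate_entry entries[MAX_ENTRIES];
    size_t used;
};

/*
 * Add `buf` with `flags`. If the buffer is already listed, OR the new
 * flags into its entry and set *already_present to true. Returns 0 on
 * success and -1 if the list is full.
 */
static int validate_add_buffer(struct validate_list *vl, const void *buf,
                               unsigned flags, bool *already_present)
{
    assert(vl && buf && already_present);

    /* Linear lookup; a hash table keyed on the pointer would replace this. */
    for (size_t i = 0; i < vl->used; i++) {
        if (vl->entries[i].buf == buf) {
            vl->entries[i].flags |= flags;
            *already_present = true;
            return 0;
        }
    }

    if (vl->used == MAX_ENTRIES)
        return -1;

    vl->entries[vl->used].buf = buf;
    vl->entries[vl->used].flags = flags;
    vl->used++;
    *already_present = false;
    return 0;
}
```

Returning the "already present" state lets a caller tell a first reference, which may need extra bookkeeping, from a flag update on an existing entry.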
* svga: Set the rendered-to flag for dma transfers to surfaces (Thomas Hellstrom, 2019-05-17, 1 file, -0/+4)
| | | | | | | | | | The rendered-to flag indicates that the HW surface content is more recent than the content of the mob. That's the case after a SurfaceDMA transfer to the surface. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* winsys/svga: Fix RELOC_INTERNAL mob GPU access (Thomas Hellstrom, 2019-05-17, 1 file, -1/+9)
| | | | | | | | | | | | | | | SVGA_RELOC_INTERNAL indicates a transfer between surface and backing mob. This means that if the GPU for example reads from the surface it writes to the backing mob. But since the buffer mapping code allows for simultaneous gpu- and cpu read access, a read from the surface to the mob will not synchronize a subsequent map to the readback. Fix this by inverting the mob access mode in a surface relocation with SVGA_RELOC_INTERNAL set. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
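The inversion described above can be sketched as a small flag translation. The `RELOC_*` constants and `mob_access_flags` are hypothetical stand-ins rather than the winsys's real identifiers. The sketch only shows that an internal surface-to-mob transfer swaps read and write from the mob's point of view.

```c
#include <assert.h>

/* Hypothetical flag values for illustration. */
#define RELOC_READ     (1u << 0)
#define RELOC_WRITE    (1u << 1)
#define RELOC_INTERNAL (1u << 2)

/*
 * Translate the access mode of a surface relocation into the access
 * mode applied to the backing mob. For an internal transfer, a GPU
 * read of the surface is a write to the mob and vice versa.
 */
static unsigned mob_access_flags(unsigned surface_flags)
{
    /* Unknown bits are treated as a caller bug in this sketch. */
    assert((surface_flags & ~(RELOC_READ | RELOC_WRITE | RELOC_INTERNAL)) == 0);

    unsigned access = surface_flags & (RELOC_READ | RELOC_WRITE);

    if (!(surface_flags & RELOC_INTERNAL))
        return access;

    unsigned inverted = 0;
    if (access & RELOC_READ)
        inverted |= RELOC_WRITE;
    if (access & RELOC_WRITE)
        inverted |= RELOC_READ;
    return inverted;
}
```

With the mob recorded as GPU-written after a readback, a later CPU map for reading has something to synchronize against, which is the problem the message describes.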
* svga: Remove the surface_invalidate winsys function (Thomas Hellstrom, 2019-05-17, 6 files, -31/+12)
| | | | | | | | | | | Instead unconditionally call SVGA3D_InvalidateGBSurface() since it's needed also for Linux for dirty buffers and operation without SurfaceDMA. For non-guest-backed operation, remove the surface cache surface invalidation altogether. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* Revert "softpipe/buffer: load only as many components as the the buffer resource type provides" (Gert Wollny, 2019-05-17, 1 file, -5/+2)
| | | | | | | | | | | This reverts commit 865b9ddae4874186182e529b5fd154ab04a61f79. The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only one component would be supported. The original issue is still relevant, but the fix should be different. Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* panfrost: Cleanup leak todos (Alyssa Rosenzweig, 2019-05-17, 3 files, -16/+9)
| | | | | | | Many of these are now patched; one of them we patch here. Regardless, this is one less thing to worry about in the code, I suppose. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: assert(0) -> unreachable for some switch (Alyssa Rosenzweig, 2019-05-16, 2 files, -31/+18)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
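A minimal sketch of the pattern named in this commit's title, assuming a compiler that provides `__builtin_unreachable` (GCC or Clang). The `UNREACHABLE` macro, the `wrap_mode` enum, and the returned codes are stand-ins defined here for illustration; they are not Panfrost's definitions or real hardware values.

```c
#include <assert.h>

/*
 * Stand-in macro: debug builds stop on the assert with the message in
 * the failure text; release builds (NDEBUG) tell the compiler the path
 * cannot be taken, so no dummy return value is needed afterwards.
 */
#define UNREACHABLE(msg) \
    do { assert(!msg); __builtin_unreachable(); } while (0)

/* Hypothetical enum and made-up hardware codes. */
enum wrap_mode { WRAP_REPEAT, WRAP_CLAMP, WRAP_MIRROR };

static int translate_wrap_mode(enum wrap_mode mode)
{
    switch (mode) {
    case WRAP_REPEAT: return 8;
    case WRAP_CLAMP:  return 9;
    case WRAP_MIRROR: return 12;
    default:
        /* Previously: assert(0); return 0; */
        UNREACHABLE("Invalid wrap mode");
    }
}
```

Compared with `assert(0)`, this form documents why the branch is impossible and avoids a placeholder `return` that exists only to silence compiler warnings.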
* freedreno: Log the number of loops in the shader for shader-db. (Eric Anholt, 2019-05-16, 1 file, -2/+2)
| | | | | | | | | | shader-db's report.py will use this to see when we've changed loop unrolling behavior on a shader and exclude other stats such as instruction count from consideration for that shader, since they won't be useful as a proxy for real-world performance in that case. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Output the same shader-db format as v3d and intel. (Eric Anholt, 2019-05-16, 1 file, -15/+4)
| | | | | | | | This lets us reuse their report.py, at the expense of fd-report.py no longer working. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Remove the ir3_tgsi_to_nir() helper function. (Eric Anholt, 2019-05-16, 3 files, -20/+6)
| | | | | | | | It was more of a hindrance, as it pretended that we could compile in the driver with a missing screen. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* freedreno: Fix assertion failures in context setup in shader-db mode. (Eric Anholt, 2019-05-16, 4 files, -0/+4)
| | | | | | | | The TTN path needs access to the screen to make the right decisions about lowering, but we didn't have pctx->screen set up at fdN_prog_init time. Reviewed-by: Rob Clark <[email protected]> Tested-by: Eduardo Lima Mitev <[email protected]>
* r600+radeonsi: use ctx_query_reset_status on radeon (Marek Olšák, 2019-05-16, 8 files, -52/+5)
| | | | This allows a nice cleanup, because the winsys always handles it.
* winsys/radeon: implement ctx_query_reset_status by copying radeonsi (Marek Olšák, 2019-05-16, 4 files, -6/+43)
| | | | | To make it behave like amdgpu. I'm just trying to move this out of radeonsi. The radeonsi code will be removed in the next commit.
* winsys/amdgpu: report a CS rejection as a reset only if there's no GPU reset (Marek Olšák, 2019-05-16, 1 file, -6/+5)
|
* radeonsi: update buffer descriptors in all contexts after buffer invalidation (Marek Olšák, 2019-05-16, 3 files, -33/+72)
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824 Cc: 19.1 <[email protected]>
* radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets (Marek Olšák, 2019-05-16, 3 files, -40/+25)
| | | | | | This is a prerequisite for the next commit. Cc: 19.1 <[email protected]>
* radeonsi: compute culling - flush CS to remove write references to buffers (Marek Olšák, 2019-05-16, 1 file, -5/+16)
| | | | | | Only read-only buffers can use compute culling. Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: invalidate caches at the beginning of the prim discard compute IB (Marek Olšák, 2019-05-16, 3 files, -9/+23)
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable primitive restart for triangles for DiRT Rally (Marek Olšák, 2019-05-16, 4 files, -14/+25)
| | | | | | It may decrease performance and it prevents compute-based primitive culling. Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add primitive culling stats to the HUD (Marek Olšák, 2019-05-16, 4 files, -4/+44)
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: cull primitives with async compute for large draw calls (Marek Olšák, 2019-05-16, 18 files, -28/+2124)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: add REWIND emulation via INDIRECT_BUFFER into cs_check_space (Marek Olšák, 2019-05-16, 9 files, -15/+26)
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add si_vs_prolog_bits::unpack_instance_id_from_vertex_id:1 (Marek Olšák, 2019-05-16, 2 files, -2/+24)
| | | | | | | The prim discard compute shader bakes InstanceID into the output index buffer. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make some functions non-static (Marek Olšák, 2019-05-16, 3 files, -18/+25)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow si_shader_select_with_key to return an optimized shader or fail (Marek Olšák, 2019-05-16, 2 files, -12/+32)
| | | | | | | | If a prim discard compute shader hasn't finished compilation, we don't want to use any shader. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: use pipe_draw_info::instance_count indirectly (Marek Olšák, 2019-05-16, 1 file, -14/+22)
| | | | | | | It will be modified by compute shader culling. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: use pipe_draw_info::prim and primitive_restart indirectly (Marek Olšák, 2019-05-16, 1 file, -31/+40)
| | | | | | | so that the fields can be changed by the driver. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make functions for creating LLVM functions non-static (Marek Olšák, 2019-05-16, 2 files, -23/+32)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: add a parallel compute IB coupled with a gfx IB (Marek Olšák, 2019-05-16, 6 files, -10/+195)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a cs parameter into si_cp_copy_data (Marek Olšák, 2019-05-16, 5 files, -9/+8)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a cs parameter into si_cp_release_mem (Marek Olšák, 2019-05-16, 5 files, -10/+9)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add threadgroups_per_cu param into si_get_compute_resource_limits (Marek Olšák, 2019-05-16, 2 files, -4/+8)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_*_descriptors_idx functions into si_state.h (Marek Olšák, 2019-05-16, 2 files, -14/+14)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make si_initialize_compute reusable (Marek Olšák, 2019-05-16, 2 files, -7/+8)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: extract COMPUTE_RESOURCE_LIMITS code into a helper (Marek Olšák, 2019-05-16, 2 files, -12/+23)
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: return the last part's return value from @wrapper (Marek Olšák, 2019-05-16, 1 file, -3/+26)
| | | | | | | The primitive discard compute shader will get the position output this way. Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>