path: root/src/gallium
Commit message | Author | Age | Files | Lines
* gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote | Ilia Mirkin | 2016-06-06 | 3 | -1/+26
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* nv50/ir: use round toward 0 when converting doubles to integers | Samuel Pitoiset | 2016-06-06 | 1 | -1/+3
    As with floats, double-to-integer conversions should use the round-toward-zero mode instead of round-to-nearest (which is the default).
    This fixes all arb_gpu_shader_fp64 piglits that convert doubles to integers (16 tests).
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: "11.2 12.0" <[email protected]>
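To make the difference between the two rounding modes concrete, here is a minimal CPU-side sketch in C. It is not the nv50/ir code; the function names are hypothetical, and the "nearest" variant rounds halves away from zero as a simple stand-in for the hardware's round-to-nearest mode.

```c
/* Round toward zero: a C cast from double to an integer type drops the
 * fractional part, which is the behavior the commit asks for. */
static long f64_to_int_rz(double x)
{
    return (long)x;
}

/* Hypothetical contrast case: round to the nearest integer, with halves
 * rounded away from zero. This is only a stand-in for a "nearest" mode. */
static long f64_to_int_nearest(double x)
{
    return (long)(x < 0.0 ? x - 0.5 : x + 0.5);
}
```

With these helpers, 2.7 converts to 2 under round-toward-zero but to 3 under the nearest-style rounding, which is the kind of mismatch a conversion test would catch.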
* gallium/radeon: don't re-set BO metadata after CMASK deallocation | Marek Olšák | 2016-06-06 | 1 | -1/+0
    CMASK has no effect on metadata, because it's not sharable.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a performance tweak for 4 SE parts | Marek Olšák | 2016-06-06 | 1 | -0/+11
    Ported from Vulkan.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: simplify PRIMGROUP_SIZE computation for tessellation | Marek Olšák | 2016-06-06 | 1 | -9/+1
    Ported from Vulkan.
    v2: keep the comment
    Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: use hw MSAA resolve for non-trivial resolves | Marek Olšák | 2016-06-06 | 1 | -9/+53
    This improves MSAA resolve performance.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use hw MSAA resolve for non-trivial resolves | Marek Olšák | 2016-06-06 | 1 | -10/+54
    This improves MSAA resolve performance.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set descriptor dirty mask on shader buffer unbind | Nicolai Hähnle | 2016-06-06 | 1 | -0/+1
    Found randomly while skimming the code. This might have caused VM faults in robustness tests.
    Cc: 12.0 <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* tgsi: fix mixed data type comparison in tgsi_point_sprite.c | Charmaine Lee | 2016-06-06 | 1 | -3/+3
    Cast the unsigned semantic index to an integer datatype before comparing it to max_generic. Otherwise max_generic, which is initialized to -1, will be converted to unsigned int before the comparison, causing a wrong semantic index to be assigned to a shader output.
    Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265)
    Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord.
    v2: use the original max_generic variable but add the (int) cast to the semantic index, as suggested by Brian.
    Reviewed-by: Brian Paul <[email protected]>
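The conversion problem described above follows from C's usual arithmetic conversions. The sketch below is not the code from tgsi_point_sprite.c; the function names are hypothetical, and the buggy variant spells out the implicit conversion explicitly so that its effect is visible.

```c
#include <stdbool.h>

/* Buggy form: in "unsigned > int" the int operand is converted to unsigned.
 * The explicit cast here mirrors what the compiler does implicitly, so a
 * max_generic of -1 becomes UINT_MAX and nothing ever compares greater. */
static bool index_exceeds_max_buggy(unsigned semantic_index, int max_generic)
{
    return semantic_index > (unsigned)max_generic;
}

/* Fixed form: cast the index so the comparison is done on signed ints. */
static bool index_exceeds_max_fixed(unsigned semantic_index, int max_generic)
{
    return (int)semantic_index > max_generic;
}
```

With max_generic at its initial value of -1, the buggy form reports that index 3 does not exceed the maximum, while the fixed form reports that it does.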
* svga: print shader linkage info when tgsi debug bit is on | Charmaine Lee | 2016-06-06 | 1 | -2/+5
    When the TGSI debug flag is enabled, print the shader linkage info as well.
    Tested with mesa demos with SVGA_DEBUG=tgsi
    Reviewed-by: Brian Paul <[email protected]>
* tgsi: use truncf in micro_trunc | Lars Hamre | 2016-06-06 | 1 | -4/+4
    Switches to using truncf in micro_trunc.
    Fixes the following piglit tests (for softpipe):
    /spec/glsl-1.30/execution/built-in-functions/...
      fs-trunc-float
      fs-trunc-vec2
      fs-trunc-vec3
      fs-trunc-vec4
      vs-trunc-float
      vs-trunc-vec2
      vs-trunc-vec3
      vs-trunc-vec4
    /spec/glsl-1.50/execution/built-in-functions/...
      gs-trunc-float
      gs-trunc-vec2
      gs-trunc-vec3
      gs-trunc-vec4
    Signed-off-by: Lars Hamre <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* nv50,nvc0: fix BGR10_A2UI vertex format | Ilia Mirkin | 2016-06-05 | 1 | -1/+1
    This is mostly academic as this is not reachable from GL, which only has the packed RGB10_A2UI vertex format.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* nvc0: do not clear surfaces bins in the validate function | Samuel Pitoiset | 2016-06-05 | 2 | -5/+2
    We should not call nouveau_bufctx_reset() inside a validate function. This only affects Fermi where images are aliased between 3D and CP.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* nvc0: re-validate images after launching a grid on Fermi | Samuel Pitoiset | 2016-06-05 | 1 | -0/+3
    Image invalidation is a bit weird on Fermi, and there is already a hack that forces invalidating all images when launching a compute shader to help fix 3D<->CP interaction. However, we need to re-validate images for compute because nvc0_compute_invalidate_surfaces() will destroy the previous binding.
    This is not great for performance, but it might be improved later.
    This fixes the following piglits:
    - spec/arb_compute_shader/execution/basic-uniform-access
    - spec/arb_compute_shader/execution/mutiple-texture-reading
    - spec/arb_compute_shader/execution/multiple-workgroups
    - spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* radeonsi: fix images with level > 0 | Marek Olšák | 2016-06-05 | 1 | -1/+1
    This should fix spec@arb_shader_image_load_store@level.
    Broken by:
    Commit: 95c5bbae66af3ca1f805d94f6fe8d8e4ba2c9c43
    radeonsi: set some image descriptor fields at bind time
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nvc0: reduce overhead from always marking images dirty | Ilia Mirkin | 2016-06-04 | 1 | -9/+36
    We would revalidate images when anything was touched at all, which is unfortunate since the state tracker does not use CSOs to reduce the workload. Instead, implement a protocol to ensure that something has changed before revalidating all the images.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* nvc0: reduce overhead from always marking buffers dirty | Ilia Mirkin | 2016-06-04 | 1 | -6/+20
    We would revalidate buffers when anything was touched at all, which is unfortunate since the state tracker does not use CSOs to reduce the workload. Instead, implement a protocol to ensure that something has changed before revalidating all the SSBOs.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
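A change-detection protocol of the kind described here can be sketched as follows. This is only an illustration under stated assumptions, not the nvc0 implementation: the struct layout, the slot count, and the function name `bind_buffers` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS 8 /* hypothetical number of binding slots */

struct buf_binding {
    const void *resource;
    uint32_t offset, size;
};

struct bind_ctx {
    struct buf_binding bound[MAX_SLOTS];
    uint32_t valid_mask; /* one bit per slot that holds a binding */
    bool dirty;          /* set only when a binding really changed */
};

/* Bind (or unbind, when bufs is NULL) a run of slots. The dirty flag is
 * raised only if some slot ends up different from its previous state. */
static bool bind_buffers(struct bind_ctx *c, unsigned start, unsigned count,
                         const struct buf_binding *bufs)
{
    bool changed = false;

    assert(start + count <= MAX_SLOTS);
    for (unsigned i = 0; i < count; i++) {
        unsigned slot = start + i;
        uint32_t bit = UINT32_C(1) << slot;
        struct buf_binding *dst = &c->bound[slot];

        if (bufs && bufs[i].resource) {
            if ((c->valid_mask & bit) &&
                dst->resource == bufs[i].resource &&
                dst->offset == bufs[i].offset &&
                dst->size == bufs[i].size)
                continue; /* identical rebind: nothing to revalidate */
            *dst = bufs[i];
            c->valid_mask |= bit;
        } else {
            if (!(c->valid_mask & bit))
                continue; /* already unbound */
            memset(dst, 0, sizeof(*dst));
            c->valid_mask &= ~bit;
        }
        changed = true;
    }
    if (changed)
        c->dirty = true;
    return changed;
}
```

The point of the design is that a state tracker which rebinds the same buffers on every draw no longer forces a revalidation pass; only a real change sets the flag.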
* nvc0: fix memory barrier flag handling | Ilia Mirkin | 2016-06-04 | 1 | -9/+16
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* nvc0: mark bound buffer range valid | Ilia Mirkin | 2016-06-04 | 3 | -0/+9
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* gallium/radeon: don't use the DMA ring for pipelined buffer uploads | Marek Olšák | 2016-06-04 | 1 | -5/+4
    Submitting a DMA IB flushes the GFX IB and all GPU caches.
    Vedran Miletić said:
    "On Tonga 380X, this improves The Talos Principle from 8.3 fps to 28.3 fps (all graphics settings Ultra, 4xAA, 1080p resolution with downsampling from 1200p)."
    Some anonymous dude said:
    R9 390 results:
      Tomb Raider (normal settings): 80 -> 88 FPS
      Talos Principle (custom settings): 23 -> 56 FPS
      Metro Last Light Redux (default benchmark settings): 39 -> 40 FPS
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Vedran Miletić <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* r600g: don't flush caches when binding shader resources | Marek Olšák | 2016-06-04 | 4 | -31/+26
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* r600g: only do necessary cache flushes in cp_dma_copy_buffer | Marek Olšák | 2016-06-04 | 1 | -14/+1
    The main impact is that {upload, draw, upload, draw, ..} doesn't flush framebuffer caches before every upload.
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* r600g: only do necessary cache flushes in cp_dma_clear_buffer | Marek Olšák | 2016-06-04 | 2 | -14/+18
    The main impact is that fast color clear doesn't flush TC, CONST, DB.
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* r600g: remove a CP DMA workaround that's not needed anymore | Marek Olšák | 2016-06-04 | 1 | -6/+0
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* r600g: fix CP DMA hazard with index buffer fetches (v3) | Marek Olšák | 2016-06-04 | 7 | -7/+93
    v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel, otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* r600g: properly sync CP with CP DMA on R6xx | Marek Olšák | 2016-06-04 | 1 | -1/+8
    This will allow removing useless cache & IB flushes.
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* r600g: write WAIT_UNTIL in the correct place | Marek Olšák | 2016-06-04 | 1 | -8/+11
    This has been wrong all along. Fixing this will allow removing useless cache flushes.
    Cc: 11.1 11.2 12.0 <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* gallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memory | Marek Olšák | 2016-06-04 | 3 | -6/+6
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* gallium/u_suballoc: allow different alignment for each allocation | Marek Olšák | 2016-06-04 | 8 | -21/+20
    Just move the alignment parameter from u_suballocator_create to u_suballocator_alloc.
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
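The effect of taking the alignment per allocation rather than per allocator can be illustrated with a toy bump allocator. This is a sketch only; it does not reproduce the u_suballocator API, and the type and function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bump suballocator over a buffer of `size` bytes. */
struct toy_suballoc {
    uint32_t size;   /* total bytes available */
    uint32_t offset; /* first free byte */
};

static uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* The alignment is supplied with each request instead of being fixed when
 * the allocator is created. Returns the offset, or UINT32_MAX on failure. */
static uint32_t toy_suballoc_alloc(struct toy_suballoc *s, uint32_t size,
                                   uint32_t alignment)
{
    uint32_t start;

    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    start = align_up(s->offset, alignment);
    if (start > s->size || size > s->size - start)
        return UINT32_MAX;
    s->offset = start + size;
    return start;
}
```

One allocator can then serve, say, a 4-byte-aligned query result and a 256-byte-aligned constant buffer without over-aligning every request to the strictest requirement.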
* freedreno/ir3: do idiv lowering after main opt loop | Rob Clark | 2016-06-03 | 1 | -16/+27
    Give the algebraic-opt pass a chance to catch udiv by a constant power of two before running the lower-idiv pass.
    Signed-off-by: Rob Clark <[email protected]>
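The reason the ordering matters is that an unsigned divide by a constant power of two reduces to a single shift, which is far cheaper than a generic lowered division. The sketch below shows that identity in plain C; it is not the NIR pass, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool is_pow2(uint32_t d)
{
    return d != 0 && (d & (d - 1)) == 0;
}

/* log2 of a value already known to be a power of two. */
static unsigned log2_pow2(uint32_t d)
{
    unsigned shift = 0;

    while (d > 1) {
        d >>= 1;
        shift++;
    }
    return shift;
}

/* Unsigned divide by a known divisor: use a shift when the divisor is a
 * power of two, and fall back to a real division otherwise. */
static uint32_t udiv_const(uint32_t x, uint32_t d)
{
    assert(d != 0);
    if (is_pow2(d))
        return x >> log2_pow2(d);
    return x / d;
}
```

If the generic integer-division lowering ran first, the divide would already have been expanded into a longer instruction sequence and the simple shift form could no longer be matched.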
* radeonsi: mark buffer texture range valid for shader images | Nicolai Hähnle | 2016-06-03 | 1 | -0/+23
    When a shader image view into a buffer texture can be written to, the buffer's valid range must be updated, or subsequent transfers may incorrectly skip synchronization.
    This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels, reported by Michel Dänzer.
    Cc: Michel Dänzer <[email protected]>
    Cc: 12.0 <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
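The role of a buffer's valid range can be sketched with a small range tracker. This is an illustration of the idea only, with hypothetical names; it is not the radeonsi code, and it assumes a transfer may skip synchronization exactly when the mapped range does not overlap the range known to hold valid data.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical tracker for the byte range of a buffer that may contain
 * data written by the GPU. The range is empty while start >= end. */
struct valid_range {
    unsigned start, end;
};

static void range_init(struct valid_range *r)
{
    r->start = UINT_MAX;
    r->end = 0;
}

/* Grow the range to cover [start, end). A writable image view would call
 * this for the region of the buffer it exposes to the shader. */
static void range_add(struct valid_range *r, unsigned start, unsigned end)
{
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

/* A transfer touching [start, end) needs synchronization only if it
 * overlaps the valid range. */
static bool range_intersects(const struct valid_range *r,
                             unsigned start, unsigned end)
{
    return start < r->end && end > r->start;
}
```

If a shader writes through an image view and the range is never extended, a later map of that region would see no overlap and could read stale data, which matches the symptom described in the commit.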
* nvc0: mark buffer texture range valid for shader images | Samuel Pitoiset | 2016-06-03 | 3 | -0/+31
    Loosely based on radeonsi (Thanks to Nicolai).
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: 12.0 <[email protected]>
* svga: allow copy box in svga_transfer_dma_band() | Charmaine Lee | 2016-06-02 | 1 | -13/+20
    Instead of only allowing a rectangle copy in svga_transfer_dma_band(), this patch allows it to copy a box, so a 3D texture can be copied in one transfer.
    Fixes a black screen when running Heaven after commit fb9fe35. (Bug 1663282)
    Tested with Heaven, glretrace, piglit.
    Reviewed-by: Sinclair Yeh <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* freedreno: fix bad bitshift warnings | Rob Clark | 2016-06-02 | 1 | -0/+2
    Coverity doesn't realize idx will never be negative. Throw in some assert()s to help it out.
    (Hopefully assert() isn't getting compiled out for the coverity build, but there seems to be just one way to find out. We might have to change these to assume())
    Fixes CID 1362442, 1362443
    Signed-off-by: Rob Clark <[email protected]>
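The pattern described here, asserting the range of an index before using it as a shift count, looks roughly like the sketch below. The function name is hypothetical and this is not the freedreno code; it only shows how an assert() documents the invariant for both readers and static analyzers.

```c
#include <assert.h>
#include <stdint.h>

/* Build a one-bit mask for a slot index. Shifting by a negative count is
 * undefined behavior in C, so the assert states the invariant that the
 * analyzer cannot infer on its own. */
static uint32_t slot_bit(int idx)
{
    assert(idx >= 0 && idx < 32);
    return UINT32_C(1) << idx;
}
```

As the commit message notes, this only helps if assert() is not compiled out: with NDEBUG defined the check disappears and the analyzer sees the unguarded shift again.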
* freedreno: assume builtin shaders do compile | Rob Clark | 2016-06-02 | 1 | -1/+2
    Maybe we should switch to ureg to build the builtin shaders. But at any rate, if they fail to compile it is because someone messed them up (or changed TGSI syntax?).
    CID 1362444
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: silence coverity warning | Rob Clark | 2016-06-02 | 1 | -0/+6
    CID 1362451
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx+a4xx: fix potential null ptr deref | Rob Clark | 2016-06-02 | 2 | -2/+4
    Coverity spotted the a3xx case (not sure why not the a4xx).
    CID 1362452
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix coverity warning | Rob Clark | 2016-06-02 | 1 | -1/+3
    CID 1362453
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use nir_shader_get_entrypoint() helper | Rob Clark | 2016-06-02 | 1 | -10/+1
    Should also fix coverity warning:
    CID 1362454
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix incorrect enum type | Rob Clark | 2016-06-02 | 1 | -1/+1
    a4xx has its own enum, different from a2xx/a3xx.
    Spotted by coverity: CID 1362458, 1362459
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix coverity negative array index warning | Rob Clark | 2016-06-02 | 1 | -0/+2
    This can never happen, since the query would not have been created in the first place if pidx(query_type) returned a negative value. Let's let coverity realize this.
    CID 1362460
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix dereference before null check | Rob Clark | 2016-06-02 | 1 | -2/+1
    ptr can actually never be null so just drop the check.
    CID 1362464 (#1 of 1): Dereference before null check (REVERSE_INULL)
    check_after_deref: Null-checking ptr suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
    Signed-off-by: Rob Clark <[email protected]>
* gallium/util: remove u_staging | Rob Clark | 2016-06-02 | 3 | -205/+0
    Unused, and fixes a couple of coverity warnings: CID 1362171, 1362170
    Signed-off-by: Rob Clark <[email protected]>
    Acked-by: Marek Olšák <[email protected]>
* freedreno/a3xx: only update/emit bordercolor state when needed | Rob Clark | 2016-06-02 | 3 | -17/+27
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: only update/emit bordercolor state when needed | Rob Clark | 2016-06-02 | 3 | -17/+26
    I noticed in stk that it was contributing to a lot of overhead.
    Signed-off-by: Rob Clark <[email protected]>
* st/osmesa: remove double-write (overwriting) | Eric Engestrom | 2016-06-02 | 1 | -1/+0
    These two lines have been here since the file was created. I'm guessing the second one was just for testing during dev, so it's the one that's going away.
    CoverityID: 1296205
    Signed-off-by: Eric Engestrom <[email protected]>
    Cc: [email protected]
    Reviewed-by: Brian Paul <[email protected]>
* st/vdpau: check for null pointer in get/put bits. | Nayan Deshmukh | 2016-06-02 | 2 | -0/+12
    Check for a null pointer before accessing arrays in get/put bits native/YCbCr/Indexed in VdpOutputSurface and VdpVideoSurface.
    Signed-off-by: Nayan Deshmukh <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* radeon/uvd: fix the H264 level for Tonga v2 | Christian König | 2016-06-02 | 1 | -1/+1
    We have supported 5.2 for a while now.
    v2: we even support 5.2 for H264, 5.1 is for HEVC.
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Cc: <[email protected]>
* winsys/amdgpu: decay max_ib_size over time | Nicolai Hähnle | 2016-06-01 | 1 | -0/+2
    So that memory use will eventually decrease again after a temporary peak.
    Reviewed-by: Marek Olšák <[email protected]>
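One way such a decay could work is sketched below. The decay factor of 1/32 per submission, the struct, and the function name are assumptions made for illustration; the commit summary does not state the actual formula used in the winsys.

```c
/* Hypothetical tracker for the largest IB size seen recently. */
struct ib_size_tracker {
    unsigned max_ib_size;
};

/* Called once per submission with the size that was actually used.
 * The remembered maximum first shrinks by an assumed 1/32, then is
 * raised again if the current submission needed more. */
static void ib_size_update(struct ib_size_tracker *t, unsigned used_size)
{
    t->max_ib_size -= t->max_ib_size / 32;
    if (used_size > t->max_ib_size)
        t->max_ib_size = used_size;
}
```

After a single large submission, a run of small ones lets the remembered maximum drift back down, so later buffers are no longer sized for the old peak.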
* winsys/amdgpu: implement IB chaining on the gfx ring | Nicolai Hähnle | 2016-06-01 | 2 | -18/+109
    As a consequence, CE IB size never triggers a flush anymore.
    Reviewed-by: Marek Olšák <[email protected]>