summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: fix behavior of GLSL findLSB(0)Marek Olšák2016-10-291-4/+13
| | | | | | | 12.0 and older need the same fix but elsewhere. Cc: 13.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set VGT_GS_ONCHIP_CNTL on CIK and laterMarek Olšák2016-10-291-0/+8
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Cc: 11.2 12.0 13.0 <[email protected]>
* nvc0/ir: fix emission of IMAD with NEG modifiersSamuel Pitoiset2016-10-272-2/+2
| | | | | | | | | The emitter tried to emit sub instead of subr when src0 has actually a NEG modifier. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.0 12.0 13.0" <[email protected]>
* nvc0/ir: fix emission of SHLADD with NEG modifiersSamuel Pitoiset2016-10-262-2/+2
| | | | | | | | | | | | | This affects GF100:GK110 chipsets, but not GM107+ where the logic is a bit different. The emitters tried to emit sub instead of subr when src0 has a NEG modifier. This fixes the following piglit tests glsl-fs-loop-nested and glsl-vs-loop-nested. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Cc: "13.0" <[email protected]>
* radeonsi: remove si_resource_create_customMarek Olšák2016-10-265-20/+11
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: stop using PIPE_BIND_CUSTOMMarek Olšák2016-10-2611-24/+18
| | | | | | it has no effect whatsoever Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: remove a redundant buffer_create helperMarek Olšák2016-10-261-23/+8
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove unused r600_cmask_info membersMarek Olšák2016-10-262-16/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't force the same tiling parameters for FMASKMarek Olšák2016-10-261-8/+10
| | | | | | | | | GCN can use a completely different tile mode for FMASK. FMASK allocation now skips one unrelated amdgpu_surface_init codepath as hinted by the assertion. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: print tiling index when printing texture infoMarek Olšák2016-10-261-4/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't do (fmask.size && cmask.size)Marek Olšák2016-10-263-3/+3
| | | | | | fmask implies that cmask is present too. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: re-order radeon_surf::dcc and htile membersMarek Olšák2016-10-261-5/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: rename bo_size -> surf_size, bo_alignment -> surf_alignmentMarek Olšák2016-10-265-11/+11
| | | | | | these names were misleading. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove flags specific to libdrm_radeon from winsys interfaceMarek Olšák2016-10-262-14/+3
| | | | | | | These just say whether libdrm can assume that the latest radeon_surface definition is used by Mesa. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove r600_htile_infoMarek Olšák2016-10-263-38/+21
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove unnecessary fields from radeon_surf_levelMarek Olšák2016-10-265-22/+16
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: decrease the size of radeon_surfMarek Olšák2016-10-262-32/+34
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: pass pipe_resource and other params to surface_init directlyMarek Olšák2016-10-262-102/+51
| | | | | | | | | This removes input-only parameters from the radeon_surf structure. Some of the translation logic from pipe_resource to radeon_surf is moved to winsys/radeon. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon/vce: use nblk_y instead of npix_yMarek Olšák2016-10-264-7/+7
| | | | | | | | | | | | npix_y will be removed. level[0].npix_y will be removed too. nblk_y should be the same as npix_y if the block height == 1. However, nblk_y is aligned to the tile size, so it can be greater than npix_y. If that's a problem, we'll have to save the input height of surface_init and use that. Reviewed-by: Christian König <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: define RADEON_SURF_MODE_* as enumsMarek Olšák2016-10-262-9/+14
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: stop using some input fields from radeon_surfaceMarek Olšák2016-10-264-20/+20
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: fold r600_setup_surface into r600_init_surfaceMarek Olšák2016-10-261-38/+24
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: fold radeon_winsys::surface_best into radeon/winsysMarek Olšák2016-10-262-13/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use r600_gfx_write_event_eop everywhereMarek Olšák2016-10-263-23/+10
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: make r600_gfx_write_fence more genericMarek Olšák2016-10-264-14/+34
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: fix a ZPASS comment, EVENT_WRITE_EOP fixupsMarek Olšák2016-10-262-4/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable SDMA on Carrizo and all CIK chips againMarek Olšák2016-10-261-10/+0
| | | | | | | | SDMA might be fixed by: "winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures" Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: make sure the address of separate CMASK is aligned properlyMarek Olšák2016-10-261-2/+3
| | | | | | | | This should fix random GPU hangs on Hawaii and Fiji. Cc: 11.2 12.0 13.0 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: fix incorrect bpe use in si_set_optimal_micro_tile_modeMarek Olšák2016-10-261-7/+7
| | | | | | | | Oh my god, I wonder what catastrophic issues this was causing on SI. Cc: 13.0 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nir/i965/anv/radv/gallium: make shader info a pointerTimothy Arceri2016-10-263-5/+5
| | | | | | | | | | When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <[email protected]>
* nv50/ir: start LocalCSE with getFirst to merge PHI instructionsKarol Herbst2016-10-251-1/+1
| | | | | | | | | | | | | | | total instructions in shared programs : 3499888 -> 3499445 (-0.01%) total gprs used in shared programs : 453866 -> 453803 (-0.01%) total local used in shared programs : 21621 -> 21621 (0.00%) total bytes used in shared programs : 32078952 -> 32074936 (-0.01%) local gpr inst bytes helped 0 39 119 119 hurt 0 0 0 0 Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: use correct bufctx when invalidating CP texturesSamuel Pitoiset2016-10-251-1/+1
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "12.0 13.0" <[email protected]>
* nv50/ir: do not perform global membar for shared memorySamuel Pitoiset2016-10-241-1/+4
| | | | | | | | | | Shared memory is local to CTA, thus we should only wait for prior memory writes which are visible to other threads in the same CTA, and not at global level. This should speedup compute shaders which use shared memory. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: display OP_BAR subops in debug modeSamuel Pitoiset2016-10-241-0/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: it appears that OP_DISCARD can't take a join modifierIlia Mirkin2016-10-221-0/+1
| | | | | | | nvdisasm does not print a .S even though the bit is set. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: use levelZero for non-frag tex/txp opsIlia Mirkin2016-10-221-0/+5
| | | | | | | | | radeonsi also does the same thing. I suspect that this is likely to be a no-op in reality, but it brings nouveau code closer to what the blob produces. Plus it makes sense to not try to do auto-derivatives on this. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium: add PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERSIlia Mirkin2016-10-2215-0/+15
| | | | | | | | | | | | | | This allows the driver to signal that it can't handle random interleaving of attributes across buffers. This is required for ARB_transform_feedback3, and it's initialized to whatever the previous value of PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME was except for nv50 where it is disabled. Note that the proprietary drivers never expose ARB_transform_feedback3 on any GT21x's (where nouveau previously did), and after some effort I was unable to get it to work. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0/ir: remove outdated comment about SHLADDSamuel Pitoiset2016-10-222-2/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Avoid making temporaries for assignments to NIR registers.Eric Anholt2016-10-211-35/+79
| | | | | | | | | | | | | | | | | | | | | | | Getting stores to NIR regs to not generate new MOVs is tricky, since the result we're trying to store into the NIR reg may have been from a conditional update of a temp, or a series of packed writes. The easiest solution seems to be to require that nir_store_dest()'s arg comes from an SSA temp. This causes us to put in a few more temporary MOVs in the NIR SSA dest case, but copy propagation successfully cleans those up. The shader-db change is modest: total instructions in shared programs: 93774 -> 93598 (-0.19%) instructions in affected programs: 14760 -> 14584 (-1.19%) total estimated cycles in shared programs: 212135 -> 211946 (-0.09%) estimated cycles in affected programs: 27005 -> 26816 (-0.70%) but I was seeing patterns in some register-allocation failures in DEQP tests that looked like the extra MOVs would increase maximum register pressure in loops. Some debug code indicates that that's not the case, though I'm still a bit confused by that result.
* vc4: Add a comment with discussion of how simulation works.Eric Anholt2016-10-211-0/+25
|
* vc4: Move simulator winsys mapping and tracking to the simulator.Eric Anholt2016-10-213-20/+56
| | | | | One tiny hack is left in vc4_bufmgr.c for what kind of mapping we got so that we can free it.
* vc4: Move simulator memory management to a u_mm.h heap.Eric Anholt2016-10-215-41/+208
| | | | | | Now we aren't limited to 256MB total allocated across a driver instance, just 256MB at one time. We're still copying in and out, which should get fixed.
* vc4: Move simulator globals into a struct.Eric Anholt2016-10-212-34/+29
| | | | | I would like to put a couple more things in here, so it's time to package it up.
* vc4: Restructure the simulator mode.Eric Anholt2016-10-215-84/+182
| | | | | | | | | | | | | Rather than having simulator mode changes scattered around vc4_bufmgr.c and vc4_screen.c, make vc4_bufmgr.c just call a vc4_simulator_ioctl, which then dispatches to a corresponding implementation. This will give the simulator support a centralized place to do tricks like storing most BOs directly in simulator memory rather than copying in and out. This leaves special casing of mmaping BOs and execution, because of the winsys mapping.
* vc4: Fix termination of the initial scan for branch targets.Eric Anholt2016-10-211-11/+8
| | | | | | | | | | | | | The loop is scanning until the original max_ip (size of the BO), but we want to not examine any code after the PROG_END's delay slots. There was a block trying to do that, except that we had some early continue statements if the signal wasn't a PROG_END or a BRANCH. The failure mode would be that a valid shader is rejected because some undefined memory after the PROG_END slots is parsed as a branch and the rest of its setup is illegal. I haven't seen this in the wild, but valgrind was complaining and the new userland simulator code started triggering it.
* radeonsi: fix a regression in si_eliminate_const_outputNicolai Hähnle2016-10-211-4/+3
| | | | | | | | | | A constant value of float type is not necessarily a ConstantFP: it could also be a constant expression that for some reason hasn't been folded. This fixes a regression in GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 that was introduced by commit 3ec9975555d1cc5365413ad9062f412904f944a3. Reviewed-by: Marek Olšák <[email protected]>
* nv50,nvc0: don't keep track of whether fb rt0 is integer-onlyIlia Mirkin2016-10-216-44/+22
| | | | | | | | | | This reverts commits 1af0641db345209c076e9b1ba4dca7524541671a and a6ad49cbbd599aec054d0a3163fff5ad724f2b18. st/mesa adjusts the rasterizer state for us now. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: do not break 3D state by pushing MS coordinates on FermiSamuel Pitoiset2016-10-201-43/+44
| | | | | | | | | | | | | | | | | | Long story short, 3D and CP are aliased on Fermi and initializing compute after pushing the MS sample coordinate offsets seems to corrupt 3D state for weird reasons. I still don't have the faintest clue what is going on, but this seems to only affect Fermi generation. A possible fix could be to use two different channels, one for 3D and one for CP. This fixes a bunch of regressions pinpointed by piglit. Fixes: "nvc0: fix up image support for allowing multiple samples" Cc: "13.0" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: translate compute shaders at program creationSamuel Pitoiset2016-10-201-0/+4
| | | | | | | This makes shader-db reports results for compute shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi: fix build of si_eliminate_const_vs_outputs on LLVM <= 3.8Marek Olšák2016-10-201-3/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>