summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* r600: handle image size support.Dave Airlie2017-11-173-9/+101
| | | | | | | | This adds support for the RESQ opcode with the workaround required due to hw bugs for buffers and cube arrays. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/sb: disable SB for images.Dave Airlie2017-11-171-0/+1
| | | | | | | Until we can work further on sb, disable it for images for now. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/shader: add support for load/store/atomic ops on images.Dave Airlie2017-11-171-4/+315
| | | | | | | | This adds support to the shader assembler for load/store/atomic ops on images which are handled via the RAT operations. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: add core pieces of image support.Dave Airlie2017-11-176-3/+428
| | | | | | | | | This adds the atoms and gallium api implementations, along with support for compress/decompress paths for shader images. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/shader: implement getting thread id.Dave Airlie2017-11-171-0/+74
| | | | | | | | We need the thread id to use the immediate buffer readback mechanism, so add support for calculating it. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/shader: add flag to denote if shader uses imagesDave Airlie2017-11-172-0/+2
| | | | | Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: implement basic memory barrier.Dave Airlie2017-11-172-5/+24
| | | | | | | | This isn't 100% perfect (fglrx also fails a bunch of those tests) but implement the start of a memory barrier for image support. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: allocate immed buffer resource for images.Dave Airlie2017-11-173-0/+21
| | | | | | | | | In order to image readback we have to execute a MEM_RAT instruction that needs a buffer to transfer the result into until the shader can fetch it. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: handle writes_memory properlyDave Airlie2017-11-172-3/+13
| | | | | | | This implements proper handling for shaders with side effects. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* freedreno: also mark images used by draw/gridRob Clark2017-11-161-0/+18
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: mark SSBOs written at draw timeRob Clark2017-11-161-1/+1
| | | | | | Comment was right, implementation was wrong ;-) Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: ARB_framebuffer_no_attachments supportRob Clark2017-11-163-1/+11
| | | | Signed-off-by: Rob Clark <[email protected]>
* radeonsi: copy some nir gs infoTimothy Arceri2017-11-161-0/+7
| | | | | | v2: copy input primitive Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: add gs_{prim,invocation}_id to the abiTimothy Arceri2017-11-162-10/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: gather stream info in nir pathTimothy Arceri2017-11-161-0/+37
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* svga: s/unsigned/enum tgsi_texture_type/Brian Paul2017-11-151-5/+5
| | | | Reviewed-by: Charmaine Lee <[email protected]>
* svga: issue debug warning for unsupported two-sided stencil stateBrian Paul2017-11-151-0/+15
| | | | | | | | | We only have a single stencil read mask and write mask. Issue a warning if different front/back values are used. The Piglit gl-2.0-two-sided-stencil test hits this. Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* etnaviv: Add sampler TS supportWladimir J. van der Laan2017-11-153-6/+99
| | | | | | | | | | | | | Sampler TS is an hardware optimization that can be used when rendering to textures. After rendering to a resource with TS enabled, the texture unit can use this to bypass lookups to empty tiles. This also means a resolve-in-place can be avoided to flush the TS. This commit is also an optimization when not using sampler TS, as resolve-in-place will now be skipped if a resource has no (valid) TS. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: Flush TS cache before changing TS configurationWladimir J. van der Laan2017-11-151-0/+5
| | | | | | | | | This is to make sure that the TS is properly flushed to memory before rendering to a new surface starts. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: Add TS_SAMPLER formats to etnaviv_formatWladimir J. van der Laan2017-11-152-74/+91
| | | | | | | | | | Sampler TS introduces yet another format enumeration for renderable+textureable formats. Introduce it into the etnaviv_format table as another column. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: Check that resource has a valid TS in etna_resource_needs_flushWladimir J. van der Laan2017-11-152-1/+18
| | | | | | | | | Resources only need a resolve-to-itself if their TS is valid for any level, not just if it happens to be allocated. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: rnndb updateWladimir J. van der Laan2017-11-156-9/+20
| | | | | Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* r600: set the number type correctly for float rts in cb setupRoland Scheidegger2017-11-152-2/+15
| | | | | | | | | | | | Float rts were always set as unorm instead of float. Not sure of the consequences, but at least it looks like the blend clamp would have been enabled, which is against the rules (only eg really bothered to even attempt to specify this correctly, r600 always used clamp anyway). Albeit r600 (not r700) setup still looks bugged to me due to never setting BLEND_FLOAT32 which must be set according to docs... Not sure if the hw really cares, no piglit change (on eg/juniper). Reviewed-by: Dave Airlie <[email protected]>
* r600: use ieee version of rsqRoland Scheidegger2017-11-151-5/+1
| | | | | | | | | | | | Both r600 and evergreen used the clamped version, whereas cayman used the ieee one. I don't think there's a valid reason for this discrepancy, so let's switch to the ieee version for r600 and evergreen too, since we generally want to stick to ieee arithmetic. With this, behavior for both rcp and rsq should now be the same for all of r600, eg, cm, all using ieee versions (albeit note rsq retains the abs behavior for everybody, which may not be a good idea ultimately). Reviewed-by: Dave Airlie <[email protected]>
* r600: use ieee version of rcpRoland Scheidegger2017-11-151-6/+2
| | | | | | | | | | | | | r600 used the clamped version for rcp, whereas both evergreen and cayman used the ieee version. I don't know why that discrepancy exists (it does so since day 1) but there does not seem to be a valid reason for this, so make it consistent. This seems now safer than before the previous commit (using the dx10 clamp bit). Note that rsq still uses clamped version (as before even though the table may have suggested otherwise for evergreen) for r600/eg, but not for cayman. Will be changed separately for better regression tracking... Reviewed-by: Dave Airlie <[email protected]>
* r600: use DX10_CLAMP bit in shader setupRoland Scheidegger2017-11-152-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The docs are not very concise in what this really does, however both Alex Deucher and Nicolai Hähnle suggested this only really affects instructions using the CLAMP output modifier, and I've confirmed that with the newly changed piglit isinf_and_isnan test. So, with this bit set, if an instruction has the CLAMP modifier bit (which clamps to [0,1]) set, then NaNs will be converted to zero, otherwise the result will be NaN. D3D10 would require this, glsl doesn't have modifiers (with mesa clamp(x,0,1) would get converted to such a modifier) coupled with a whatever-floats-your-boat specified NaN behavior, but the clamp behavior should probably always be used (this also matches what a decomposition into min(1.0, max(x, 0.0)) would do, if min/max also adhere to the ieee spec of picking the non-nan result). Some apps may in fact rely on this, as this prevents misrenderings in This War of Mine since using ieee muls (ce7a045feeef8cad155f1c9aa07f166e146e3d00), without having to use clamped rcp opcode, which would also fix this bug there. radeonsi also seems to set this bit nowadays if I see that righ (albeit the llvm amdgpu code comment now says "Make clamp modifier on NaN input returns 0" instead of "Do not clamp NAN to 0" since it was changed, which also looks a bit misleading). v2: set it in all shader stages. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103544 Reviewed-by: Dave Airlie <[email protected]>
* r600: use min_dx10/max_dx10 instead of min/maxRoland Scheidegger2017-11-152-6/+9
| | | | | | | | | | | | | | I believe this is the safe thing to do, especially ever since the driver actually generates NaNs for muls too. The ISA docs are not very helpful here, however the dx10 versions will pick a non-nan result over a NaN one (this is also the ieee754 behavior), whereas the non-dx10 ones will pick the NaN (verified by newly changed piglit isinf-and-isnan test). Other "modern" drivers will most likely do the same. This was shown to make some difference for bug 103544, albeit it is not required to fix it. Reviewed-by: Dave Airlie <[email protected]>
* r600: fix cubemap arraysDave Airlie2017-11-151-9/+17
| | | | | | | | | | | | | | | | A lot of cubemap array piglits fail, port the texture type picking code from radeonsi which seems to fix most of them. For images I will port the rest of the code. Fixes: getteximage-depth gl_texture_cube_map_array-* fbo-generatemipmap-cubemap array getteximage-targets cube_array amongst others. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* freedreno/a5xx: small comment fixRob Clark2017-11-141-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: indirect draw supportRob Clark2017-11-142-1/+37
| | | | | | | | | A couple failures in piglit tests w/ TF or gl_VertexID + indirect draws. OTOH all the deqp tests (although they don't test those combinations). I suspect this could be fixed by a firmware update, but I don't think there is much we can do in mesa for that. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: split out helper for pipeline stallsRob Clark2017-11-142-6/+13
| | | | | | We need a similar thing for indirect draws. Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2017-11-146-16/+135
| | | | Signed-off-by: Rob Clark <[email protected]>
* gallium/radeon: disable the cache when nir backend enabledTimothy Arceri2017-11-151-0/+4
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* broadcom/vc4: fix indentation in vc4_screen.cAndres Rodriguez2017-11-141-8/+8
| | | | | | | Stumbled into this when adding a new PIPE_CAP. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* swr/rast: Faster emulated simd16 permuteTim Rowley2017-11-141-23/+11
| | | | | | | | Speed up simd16 frontend (default) on avx/avx2 platforms; fixes performance regression caused by switch to simdlib. Reviewed-by: Bruce Cherniak <[email protected]> Cc: [email protected]
* swr/rast: Use gather instruction for i32gather_ps on simd16/avx512Tim Rowley2017-11-141-11/+1
| | | | | | | | Speed up avx512 platforms; fixes performance regression caused by swithc to simdlib. Reviewed-by: Bruce Cherniak <[email protected]> Cc: [email protected]
* radeonsi: remove has_cp_dma, has_streamout flags (v2)Marek Olšák2017-11-143-20/+2
| | | | v2: remove r600_can_dma_copy_buffer
* meson: don't use build_by_default for specific gallium driversDylan Baker2017-11-137-7/+0
| | | | | | | | | | | | | | | | | | | Using build_by_default : false is convenient for dependencies that can be pulled in by various diverse components of the build system, the gallium hardware/software drivers and state trackers do not fit that description. Instead, these should be guarded using the variable that tracks whether that driver should be enabled. This leaves a few helper libraries: trace, rbug, etc, and the generic winsys bits as `build_by_default : false` because there are a large number of gallium components that pull them in. v2: - remove build_by_default from winsys convenience libs as well. v3: - Always put drivers before winsys for consistency Signed-off-by: Dylan Baker <[email protected]> Tested-by: Lionel Landwerlin <[email protected]> (v1) Reviewed-by: Eric Anholt <[email protected]>
* r600/shader: handle bitfield extract semantics properly.Dave Airlie2017-11-141-4/+53
| | | | | | | Fixes: tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldExtract.shader_test Signed-off-by: Dave Airlie <[email protected]>
* r600: handle bitfieldInsert corner case.Dave Airlie2017-11-141-1/+39
| | | | | | | | | This handles the bits >= 32 corner case in bitfieldInsert. Fixes: tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldInsert.shader_test. Signed-off-by: Dave Airlie <[email protected]>
* r600: add gs tri strip adjacency fix.Dave Airlie2017-11-144-5/+62
| | | | | | | | | | | | | | Like radeonsi: generate GS prolog to (partially) fix triangle strip adjacency rotation evergreen hw suffers from the same problem, so rotate the geometry inputs to fix this. This fixes: ./bin/glsl-1.50-geometry-primitive-types GL_TRIANGLE_STRIP_ADJACENCY on evergreen. Signed-off-by: Dave Airlie <[email protected]>
* r600: fix isoline tess factor component swapping.Dave Airlie2017-11-141-0/+7
| | | | | | | | | As per radeonsi, the tess factor components for isolines are reversed. Fixes: tests/spec/arb_tessellation_shader/execution/isoline.shader_test Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/shader: reserve first register of vertex shader.Dave Airlie2017-11-141-2/+4
| | | | | | | | | | r0 in input into vertex shaders contains things like vertexid, we need to reserve it even if we have no inputs. This fixes a bunch of tessellation piglits. Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: don't emit atomic save if we have no atomic counters.Dave Airlie2017-11-141-0/+3
| | | | | | | Otherwise we end up emitting the fence. Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* etnaviv: automake,meson: include common_3d.xml.h in the sources listsJuan A. Suarez Romero2017-11-132-0/+2
| | | | | | | v2: include the file also in the meson.build (Eric Engestrom). Fixes: f1e1c60ff6 ("etnaviv: Update from rnndb") Reviewed-by: Eric Engestrom <[email protected]>
* freedreno/a5xx: fix SSBO emit for non-zero offsetRob Clark2017-11-121-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: remove obsolete commentRob Clark2017-11-121-4/+0
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: don't create split/fo if only writing .xRob Clark2017-11-121-0/+6
| | | | | | | In case an instruction only writes one register, and it is .x, we can skip the extra level of fanout indirection. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: indirect gridsRob Clark2017-11-123-20/+86
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: add global size compute capRob Clark2017-11-121-0/+5
| | | | Signed-off-by: Rob Clark <[email protected]>