path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* radeonsi: use a clever alignment for index buffer uploads | Marek Olšák | 2017-02-18 | 1 | -4/+7
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a clever alignment for descriptor uploads | Marek Olšák | 2017-02-18 | 1 | -4/+7
  Non-VBO descriptors won't be smaller than the cache line, so simply use the cache line size.
  Reviewed-by: Nicolai Hähnle <[email protected]>
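The commit messages above do not spell out what the "clever alignment" is. One plausible reading, sketched here purely as an assumption, is that an upload smaller than a cache line is aligned to its own size rounded up to a power of two (so it never straddles a line), while anything larger is simply aligned to the cache line. The helper name `upload_alignment` and the 64-byte line size in the usage note are hypothetical and not taken from the driver.

```c
#include <assert.h>

/* Hypothetical helper, not the driver's code.
 * Returns the alignment to use for an upload of upload_size bytes,
 * given a power-of-two cache line size. */
static unsigned upload_alignment(unsigned upload_size, unsigned cache_line_size)
{
    assert(upload_size > 0);
    assert(cache_line_size != 0 &&
           (cache_line_size & (cache_line_size - 1)) == 0);

    /* Uploads at least one cache line long: align to the cache line. */
    if (upload_size >= cache_line_size)
        return cache_line_size;

    /* Smaller uploads: round the size up to the next power of two, so the
     * whole upload fits inside a single cache line. */
    unsigned alignment = 1;
    while (alignment < upload_size)
        alignment <<= 1;
    return alignment;
}
```

With an assumed 64-byte line, a 20-byte upload would be aligned to 32 bytes and a 200-byte upload to 64 bytes, which matches the remark that descriptors no smaller than the cache line just use the cache line size.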
* radeonsi: use a clever alignment for constant buffer uploads | Marek Olšák | 2017-02-18 | 3 | -1/+19
  This results in a very tiny decrease in lgkm wait cycles.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move index buffer flushing into a non-upload indexed case | Marek Olšák | 2017-02-18 | 1 | -7/+6
  The other codepaths don't need this.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use SI_MAX_ATTRIBS where it should be used | Marek Olšák | 2017-02-18 | 4 | -5/+5
  for consistency; no change in behavior
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: sort members of si_shader_key::part | Marek Olšák | 2017-02-18 | 1 | -6/+6
  and improve some comments
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: have separate LS and ES main shader parts in the shader selector | Marek Olšák | 2017-02-18 | 3 | -5/+49
  This might reduce the on-demand compilation if the initial VS/LS/ES determination is wrong.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't compile pure monolithic shaders asynchronously | Marek Olšák | 2017-02-18 | 1 | -2/+6
  there is no point, we have to wait anyway.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI | Marek Olšák | 2017-02-18 | 1 | -3/+9
  So that we can disable u_vbuf for GL core profiles.
  This is a v2 of the previous VI-only patch. It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove the fix_size3 workaround | Marek Olšák | 2017-02-18 | 3 | -36/+0
  not needed with the shader fallback
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a workaround for clamping unaligned RGB 8 & 16-bit vertex loads | Marek Olšák | 2017-02-18 | 5 | -6/+60
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make fix_fetch an array of uint8_t | Marek Olšák | 2017-02-18 | 5 | -23/+25
  so that we can add 3-component fallbacks.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_suballoc: allow setting pipe_resource::flags | Marek Olšák | 2017-02-18 | 3 | -4/+5
  Reviewed-by: Nicolai Hähnle <[email protected]>
* swr: remove unneeded extern "C" | George Kyriazis | 2017-02-16 | 1 | -3/+0
  the guards have been added to the header files that needed them.
  Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi: use shared emit_umsb helper. | Dave Airlie | 2017-02-16 | 1 | -22/+2
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: use shared emit imsb code. | Dave Airlie | 2017-02-16 | 1 | -25/+3
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* gallium/radeon: add a HUD query for monitoring the CS thread activity | Marek Olšák | 2017-02-15 | 3 | -1/+25
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement uploading zero-stride vertex attribs | Marek Olšák | 2017-02-14 | 1 | -8/+23
  This is the only kind of user buffer we can get with the GL core profile.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: include SDMA in the GPU load query | Marek Olšák | 2017-02-14 | 2 | -1/+12
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add an assertion to texture_transfer_map for app bugs | Marek Olšák | 2017-02-14 | 1 | -0/+1
  Tested-by: Kai Wasserbäch <[email protected]>
  Reviewed-by: Kai Wasserbäch <[email protected]>
* radeonsi: implement legacy GL_DOUBLE vertex formats | Marek Olšák | 2017-02-14 | 3 | -21/+117
  so that we can disable u_vbuf for GL core profiles.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clean up si_get_param | Marek Olšák | 2017-02-14 | 1 | -19/+11
  has_streamout is always true
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove the internal u_upload_mgr pointer | Marek Olšák | 2017-02-14 | 7 | -26/+33
  also remove the BIND flags
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Tested-by: Edmondo Tommasina <[email protected]>
  Tested-by: Charmaine Lee <[email protected]>
* gallium: set pipe_context uploaders in drivers (v3) | Marek Olšák | 2017-02-14 | 18 | -6/+131
  Notes:
  - make sure the default size is large enough to handle all state trackers
  - pipe wrappers don't receive transfer calls from stream_uploader, because pipe_context::stream_uploader points directly to the underlying driver's stream_uploader (to keep it simple for now)
  v2: add error handling to nv50, nvc0, noop
  v3: set const_uploader
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Tested-by: Edmondo Tommasina <[email protected]> (v1)
  Tested-by: Charmaine Lee <[email protected]>
* nvc0: disable linked tsc mode in compute launch descriptor | Ilia Mirkin | 2017-02-13 | 2 | -2/+6
  Empirically, this makes things work. Presumably this was originally copied from the blob, which does make use of linked tsc mode.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99532
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Cc: [email protected]
* radeonsi: use common sendmsg emission function. | Dave Airlie | 2017-02-14 | 1 | -26/+6
  This just ports radeonsi to use the sendmsg common code.
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* nv50,nvc0: use alternate samplers for stencil | Ilia Mirkin | 2017-02-12 | 1 | -3/+3
  The blob uses these, and it fixes a bunch of dEQP stencil sampling tests involving border colors. Probably the Z-based samplers work somehow differently wrt border colors when using the stencil swizzle.
  Signed-off-by: Ilia Mirkin <[email protected]>
* etnaviv: Set shader instruction area correctly for GC3000 | Wladimir J. van der Laan | 2017-02-12 | 1 | -5/+21
  - Use the same instruction area on GC3000 as the Vivante driver. This allows the same number of instructions on GC3000 as GC2000 instead of half.
  - Makes sure that the "PE to FE" stall before updating the shader code or constants is hit (which is conditional on vs_offset > 0x4000). This is necessary on GC3000 too, it increases stability.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: Update hw header files | Wladimir J. van der Laan | 2017-02-12 | 5 | -48/+160
  Update from etnaviv repository rnndb. This adds some newly discovered state for GC3000 (and some GC2000) features.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Acked-by: Christian Gmeiner <[email protected]>
* nvc0: set the render condition in the compute object | Ilia Mirkin | 2017-02-11 | 1 | -2/+10
  Fixes GL45-CTS.compute_shader.conditional-dispatching
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* gm107/ir: fix address offset bitfield for ATOMS | Ilia Mirkin | 2017-02-11 | 1 | -1/+1
  Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* nv50/ir: convert an ATOM.EXCH without a destination into a store | Ilia Mirkin | 2017-02-11 | 1 | -0/+5
  On SM35 there does not appear to be a way to emit an ATOM.EXCH with a null destination. This should be functionally equivalent to a plain store however, so just do that.
  Fixes GL45-CTS.compute_shader.atomic-case2 on SM35.
  Signed-off-by: Ilia Mirkin <[email protected]>
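The equivalence claimed in the entry above can be illustrated on the CPU with C11 atomics. This is only an analogy under stated assumptions, not the nv50/ir lowering itself: an exchange whose returned old value is thrown away leaves memory in the same state as a store of the same value. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Atomic exchange whose previous value is discarded ("null destination"). */
static void exchange_discarding_result(_Atomic uint32_t *addr, uint32_t value)
{
    assert(addr != NULL);
    (void)atomic_exchange(addr, value);
}

/* Atomic store of the same value. */
static void plain_store(_Atomic uint32_t *addr, uint32_t value)
{
    assert(addr != NULL);
    atomic_store(addr, value);
}

/* Run both variants from the same starting value and check that each
 * leaves the new value in memory. */
static bool same_final_memory(uint32_t initial, uint32_t value)
{
    _Atomic uint32_t a;
    _Atomic uint32_t b;
    atomic_init(&a, initial);
    atomic_init(&b, initial);

    exchange_discarding_result(&a, value);
    plain_store(&b, value);

    return atomic_load(&a) == value && atomic_load(&b) == value;
}
```

The only observable difference between the two operations is the returned old value, so when nothing consumes it the store is a valid replacement.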
* nvc0: fix 64-bit integer query buffer writes | Ilia Mirkin | 2017-02-11 | 3 | -20/+37
  The former logic just plain didn't work at all. We need to write the subsequent dword to the next buffer location.
  Signed-off-by: Ilia Mirkin <[email protected]>
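The point about writing "the subsequent dword to the next buffer location" can be sketched on the CPU as follows. This is a simplified illustration, not the nvc0 code, which emits GPU commands rather than plain stores. The helper names and the low-dword-first order are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 64-bit result as two 32-bit dwords in
 * consecutive buffer slots (low dword first, an assumed order). */
static void write_u64_result(uint32_t *buf, size_t dword_index, uint64_t value)
{
    assert(buf != NULL);
    buf[dword_index]     = (uint32_t)(value & 0xffffffffu); /* low half */
    buf[dword_index + 1] = (uint32_t)(value >> 32);         /* high half, next slot */
}

/* Reassemble the 64-bit value from the two consecutive dwords. */
static uint64_t read_u64_result(const uint32_t *buf, size_t dword_index)
{
    assert(buf != NULL);
    return (uint64_t)buf[dword_index] |
           ((uint64_t)buf[dword_index + 1] << 32);
}

/* Write a value into the middle of a small buffer and read it back,
 * checking that the neighbouring slots are left untouched. */
static uint64_t roundtrip_u64(uint64_t value)
{
    uint32_t buf[4] = {0, 0, 0, 0};
    write_u64_result(buf, 1, value);
    assert(buf[0] == 0 && buf[3] == 0);
    return read_u64_result(buf, 1);
}
```

Writing both halves to the same slot, or skipping the second slot, would lose the high dword, which is the kind of failure the commit message describes.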
* nv50/ir: return a register when retrieving thread id sysval | Ilia Mirkin | 2017-02-11 | 1 | -1/+1
  We have logic to short-circuit such retrievals to zero. However "zero" was an immediate, and some logic expected to get registers (to later be propagated). Fix this by using loadImm.
  Fixes GL45-CTS.gpu_shader5.images_array_indexing
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: add missing break after DSSG | Ilia Mirkin | 2017-02-11 | 1 | -0/+1
  Recently broken during int64 addition.
  Signed-off-by: Ilia Mirkin <[email protected]>
* etnaviv: shader-db traces | Christian Gmeiner | 2017-02-11 | 4 | -1/+47
  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-By: Wladimir J. van der Laan <[email protected]>
* etnaviv: keep track of emitted loops | Christian Gmeiner | 2017-02-11 | 2 | -0/+7
  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-by: Lucas Stach <[email protected]>
  Reviewed-by: Wladimir J. van der Laan <[email protected]>
* etnaviv: wire up core pipe_debug_callback | Christian Gmeiner | 2017-02-11 | 2 | -0/+15
  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-by: Lucas Stach <[email protected]>
* vc4: Enable glSampleMask() even when !rasterizer->multisample. | Eric Anholt | 2017-02-10 | 1 | -2/+1
  gallium's blitter expects that it can set the sample mask even when the rasterizer doesn't have the flag on. Between this and the previous test, 10 new ext_framebuffer_multisample tests start passing.
* vc4: Respect glSampleMask() even when we're not writing color. | Eric Anholt | 2017-02-10 | 1 | -3/+13
  gallium's quad-based blitter for copying MSAA depth textures expects to be able to do 4 passes updating a sample at a time using glSampleMask, and there's no color buffer bound when it's doing that.
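The "4 passes updating a sample at a time" pattern in the entry above amounts to enabling exactly one sample bit per pass. The sketch below shows only the mask arithmetic. It is not the blitter's code, and `NUM_SAMPLES` and both function names are hypothetical.

```c
#include <assert.h>

enum { NUM_SAMPLES = 4 }; /* assumption: 4x MSAA, matching the "4 passes" */

/* Sample mask for one pass: only the bit of the sample being updated is set. */
static unsigned sample_mask_for_pass(unsigned pass)
{
    assert(pass < NUM_SAMPLES);
    return 1u << pass;
}

/* Union of the masks used across all passes: every sample gets written once. */
static unsigned covered_samples_after_all_passes(void)
{
    unsigned covered = 0;
    for (unsigned pass = 0; pass < NUM_SAMPLES; pass++)
        covered |= sample_mask_for_pass(pass);
    return covered;
}
```

If the driver ignored the mask whenever no color buffer was bound, each pass would write all four samples and the last pass would overwrite the others, which is why the mask has to be respected for depth-only rendering too.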
* vc4: Use the nir_builder helper for loading sample mask. | Eric Anholt | 2017-02-10 | 1 | -10/+1
* vc4: Use accurate 1/w in coordinate shader as well as vert shader. | Eric Anholt | 2017-02-10 | 1 | -1/+1
  We probably shouldn't be emitting different scaled viewport coordinates between vertex and coord.
* vc4: Drop VS inputs to 8. | Eric Anholt | 2017-02-10 | 1 | -4/+1
  In the hardware we only get to declare 8 vertex elements (GLES2's minimum), so we should be exposing that number here. Fixes an assertion failure in piglit texrect-many, at the expense of various GL 2.0-ish minmax tests now complaining that our count is too low.
* vc4: Avoid emitting small immediates for UBO indirect load address guards. | Eric Anholt | 2017-02-10 | 5 | -4/+20
  The kernel will reject our shader if we emit one here, and having 4, 8, or 12 as the top end of our UBO clamp is rare enough that it's not worth making the kernel let us.
  Fixes piglit fs-const-array-of-struct and fs-const-array-of-struct-of-array since recent GLSL linking changes made us get this as an indirect load of a uniform, instead of a temporary.
  Cc: "13.0 17.0" <[email protected]>
* gallium/radeon: use staging for texture read mappings from GTT WC | Marek Olšák | 2017-02-10 | 1 | -4/+5
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: ignore the level parameter in buffer_transfer_map | Marek Olšák | 2017-02-10 | 1 | -5/+4
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: fix performance of buffer readbacks | Marek Olšák | 2017-02-10 | 1 | -8/+9
  We want cached GTT for all non-persistent read mappings. Set level = 0 on purpose. Use dma_copy, because resource_copy_region causes a failure in the PBO read of piglit/getteximage-luminance.
  If Rocket League used the READ flag, it should get cached GTT.
  v2: mask out UNSYNCHRONIZED
  Cc: 13.0 17.0 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: align vertex buffer descriptor list size for optimal prefetch | Marek Olšák | 2017-02-10 | 4 | -2/+7
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: align shader binaries to CP DMA alignment for optimal prefetch | Marek Olšák | 2017-02-10 | 1 | -1/+2
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move CP_DMA_ALIGNMENT definition | Marek Olšák | 2017-02-10 | 2 | -10/+10
  Reviewed-by: Nicolai Hähnle <[email protected]>