path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* etnaviv: move pctx initialisation to avoid a null dereference | Christian Gmeiner | 2017-03-01 | 1 | -5/+5
  In case ctx->stream == NULL the fail label gets executed where pctx gets dereferenced - too bad pctx is NULL in that case. Caught by Coverity, reported to me by imirkin.
  Cc: "17.0" <[email protected]>
  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  (cherry picked from commit a0b16a08905d68da07668a42eeb464b4f30bf3e5)
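The ordering problem described in this commit can be illustrated with a minimal sketch. All type and function names below (`pipe_like`, `ctx_like`, `ctx_create`) are hypothetical stand-ins, not the etnaviv code; the only point is that the pointer used by the `fail` label is assigned before the first `goto fail`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a base context and a driver context. */
struct pipe_like {
    void (*destroy)(struct pipe_like *);
};

struct ctx_like {
    struct pipe_like base;   /* must stay the first member for the cast below */
    void *stream;
};

static void ctx_destroy(struct pipe_like *p)
{
    assert(p != NULL);
    struct ctx_like *ctx = (struct ctx_like *)p;
    free(ctx->stream);       /* free(NULL) is a no-op */
    free(ctx);
}

/* stream_ok simulates whether creating the command stream succeeds. */
struct pipe_like *ctx_create(int stream_ok)
{
    struct ctx_like *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;

    /* Initialise pctx before anything that can jump to fail. */
    struct pipe_like *pctx = &ctx->base;
    pctx->destroy = ctx_destroy;

    ctx->stream = stream_ok ? malloc(16) : NULL;
    if (!ctx->stream)
        goto fail;

    return pctx;

fail:
    pctx->destroy(pctx);     /* safe: pctx is already valid here */
    return NULL;
}
```

If the `pctx` assignment were moved below the stream check, the `fail` path would call through an uninitialised pointer, which is the class of bug the commit message describes.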
* radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIK | Nicolai Hähnle | 2017-03-01 | 6 | -19/+43
  The same PS epilog workaround as for 8-bit integer formats is required, since the CB doesn't do clamping.
  Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.
  Cc: [email protected]
  Reviewed-by: Marek Olšák <[email protected]>
  (cherry picked from commit 066a117be77fdc2b29c8eafabb4e2c2fa902a18e)
* radeonsi: handle MultiDrawIndirect in si_get_draw_start_count | Nicolai Hähnle | 2017-03-01 | 1 | -7/+53
  Also handle the GL_ARB_indirect_parameters case where the count itself is in a buffer.
  Use transfers rather than mapping the buffers directly. This anticipates the possibility that the buffers are sparse (once ARB_sparse_buffer is implemented), in which case they cannot be mapped directly.
  Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type on <= CIK.
  v2:
  - unmap the indirect buffer correctly
  - handle the corner case where we have indirect draws, but all of them have count 0.
  Cc: [email protected]
  Reviewed-by: Marek Olšák <[email protected]>
  Acked-by: Edward O'Callaghan <[email protected]>
  (cherry picked from commit 6a1d9684f4ec1e1eed49bc14749be7b7784277ec)
* gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally | Marek Olšák | 2017-03-01 | 3 | -3/+5
  It's OK for r300g (because r300g can't write to buffers via the GPU), but not later hardware.
  This issue was spotted randomly.
  Cc: [email protected]
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit c8ef5123980f9f538c79e626b0092660a2256ae6)
* radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2) | Marek Olšák | 2017-03-01 | 1 | -2/+2
  start can only be non-zero with MultiDrawElements, which is unlikely to occur with UNSIGNED_BYTE indices.
  v2: Also fix the util_shorten_ubyte_elts_to_userptr call. Tested with the new piglit.
  Cc: [email protected]
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit a264fee6245856340fab9024e1a428626e966335)
  [Emil Velikov: resolve trivial conflicts]
  Signed-off-by: Emil Velikov <[email protected]>
  Conflicts: src/gallium/drivers/radeonsi/si_state_draw.c
* nvc0: disable linked tsc mode in compute launch descriptor | Ilia Mirkin | 2017-02-23 | 2 | -2/+6
  Empirically, this makes things work. Presumably this was originally copied from the blob, which does make use of linked tsc mode.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99532
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 956556b3c30ce3d38d0af795f9383df3bc2cf8a2)
* nvc0: set the render condition in the compute object | Ilia Mirkin | 2017-02-23 | 1 | -2/+10
  Fixes GL45-CTS.compute_shader.conditional-dispatching
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 48f04862c1d74844db9534b32ef73e5a2bc0ae74)
* gm107/ir: fix address offset bitfield for ATOMS | Ilia Mirkin | 2017-02-23 | 1 | -1/+1
  Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 7e75f0913ab545be14feb233d1ed74dc48116fb8)
* vc4: Avoid emitting small immediates for UBO indirect load address guards. | Eric Anholt | 2017-02-23 | 5 | -4/+20
  The kernel will reject our shader if we emit one here, and having 4, 8, or 12 as the top end of our UBO clamp is rare enough that it's not worth making the kernel let us.
  Fixes piglit fs-const-array-of-struct and fs-const-array-of-struct-of-array since recent GLSL linking changes made us get this as an indirect load of a uniform, instead of a temporary.
  Cc: "13.0 17.0" <[email protected]>
  (cherry picked from commit b2309393039b2ec0cc00a8e6fd828c60c4ef1e11)
* gallium/radeon: fix performance of buffer readbacks | Marek Olšák | 2017-02-10 | 1 | -8/+9
  We want cached GTT for all non-persistent read mappings. Set level = 0 on purpose. Use dma_copy, because resource_copy_region causes a failure in the PBO read of piglit/getteximage-luminance.
  If Rocket League used the READ flag, it should get cached GTT.
  v2: mask out UNSYNCHRONIZED
  Cc: 13.0 17.0 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit d86099df0af7c22c8acfd48b38ad446d9c8df6bd)
* nvc0/ir: fix ubo max clamp, reset file index | Ilia Mirkin | 2017-02-10 | 1 | -1/+3
  We just increased the max UBO, so we should also increase the clamp that we do for robustness. Similarly, as we're including the fileIndex in the new indirect value, we should reset fileIndex to 0 so that it is not added in a second time.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Cc: [email protected]
  (cherry picked from commit c95f821cb4286f8163bfdf341be2b0940011585a)
* nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute | Ilia Mirkin | 2017-02-10 | 1 | -25/+22
  Kepler and up unfortunately only support up to 8 constbufs. We work around this by loading from constbufs as if they were storage buffers. However we were not consistently applying limits to loads from these buffers. Make sure to do the same thing we do for storage buffers.
  Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 1acdd62847cf0da8a8e9c7915d698208d73a5be8)
* nvc0: increase number of ubo binding points | Ilia Mirkin | 2017-02-10 | 1 | -3/+2
  Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but it's unclear what it refers to). We need to expose an extra binding point for the "program parameters", which means this must be 15. Remove the last vestige of the "use c14 for immediates" idea.
  Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 59ca352fc573a37f9f70c1f6217e85dd3e31d38e)
* swr: [rasterizer core] Remove dead code Clipper::ClipScalar() | Bruce Cherniak | 2017-02-10 | 1 | -39/+0
  Clipper::ClipScalar() is dead code and should be removed. It is causing an error with gcc-7 because it references a now defunct member.
  v2: includes bugzilla reference, same code change
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99633
  CC: "13.0 17.0" <[email protected]>
  Tested-by: Vinson Lee <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Tim Rowley <[email protected]>
  (cherry picked from commit bf29495dcdb290c8b15cacd2001603b8ae5d36c8)
* Revert "radeonsi: decrease the number of texture slots to 24" | Marek Olšák | 2017-02-08 | 1 | -1/+1
  This reverts commit bdd860e3076655519d45bd66936ef7be9b7dda63.
  Requested by a game developer.
  Cc: 17.0 <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit dfe111368d11aaffae7f8738c858c335cdec1e9d)
* etnaviv: force vertex buffers through the MMU | Lucas Stach | 2017-02-03 | 1 | -1/+4
  This fixes a vertex data corruption issue if some of the vertex streams go through the MMU and some don't.
  Signed-off-by: Lucas Stach <[email protected]>
  Tested-by: Philipp Zabel <[email protected]>
  Acked-by: Christian Gmeiner <[email protected]>
  (cherry picked from commit e158b7497103f145a9236a70183e07c37a9e13f7)
  Nominated-by: Christian Gmeiner <[email protected]>
* etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers | Wladimir J. van der Laan | 2017-02-03 | 3 | -20/+52
  This fixes rendering of full-screen quads (and other screen-filling geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op on other hardware.
  - It looks like SE_CLIP registers were not set at all. I'm amazed that rendering worked without them. Emit them to avoid issues on gc3000.
  - Define constants
      ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119)
      ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111)
      ETNA_SE_CLIP_MARGIN_RIGHT (0xffff)
      ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff)
    These demarcate the margin (fixp16) between the computed sizes and the value sent to the chip. I have set these to the numbers used by the Vivante driver for gc2000. I am not sure whether any old hardware was relying on the old numbers, or whether those were just a guess. But if so, these need to be moved to the _specs structure.
  CC: <[email protected]>
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Acked-by: Christian Gmeiner <[email protected]>
  (cherry picked from commit 56314f5bafdfeb514adf8401c52f216bd430bbb2)
* etnaviv: Generate new sin/cos instructions on GC3000 | Wladimir J. van der Laan | 2017-02-03 | 3 | -1/+40
  Shaders using sin/cos instructions were not working on GC3000. The reason for this turns out to be that these chips implement sin/cos in a different way (but using the same opcodes):
  - Need their input scaled by 1/pi instead of 2/pi.
  - Output an x and y component, which need to be multiplied to get the result.
  - tex_amode needs to be set to 1.
  Add a new bit to the compiler specs and generate these instructions as necessary.
  CC: <[email protected]>
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Acked-by: Christian Gmeiner <[email protected]>
  (cherry picked from commit fe3bb8cdb519a01e6315ce6f142827aece3d4a41)
* etnaviv: Cannot render to rb-swapped formats | Wladimir J. van der Laan | 2017-02-03 | 1 | -2/+5
  Exposing rb swapped (or other swizzled) formats for rendering would involve swizzling in the pixel shader. This is not the case at the moment, so reject requests for creating such surfaces.
  (GPUs that need an extra resolve step anyway due to multiple pixel pipes, such as gc2000, might also do this swap in the resolve operation. But this would be tricky to keep track of.)
  CC: <[email protected]>
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Acked-by: Christian Gmeiner <[email protected]>
  (cherry picked from commit 658568941d5e232d690e1ffbcddbd6ea9685693a)
* etnaviv: Avoid infinite loop in find_frame() | Christian Gmeiner | 2017-02-03 | 1 | -1/+1
  Use of unsigned loop control variable with '>= 0' would lead to infinite loop.
  Reported by clang:
  etnaviv_compiler.c:1024:39: warning: comparison of unsigned expression >= 0 is always true [-Wtautological-compare]
     for (unsigned sp = c->frame_sp; sp >= 0; sp--)
                                     ~~ ^  ~
  v2: Simply use the same datatype as c->frame_sp is using.
  CC: <[email protected]>
  Reported-by: Rhys Kidd <[email protected]>
  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-by: Rhys Kidd <[email protected]>
  (cherry picked from commit 82fe240a9912d78bc2eec513c1139c918c5f189f)
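A minimal sketch of why the loop variable's signedness matters here. The `compiler` and `frame` structs and the `find_frame_index` function are hypothetical simplifications, and the sketch assumes the stack pointer is a plain `int`; the commit message only says the fix uses the same datatype as `c->frame_sp`.

```c
#include <assert.h>

enum { MAX_FRAMES = 8 };            /* hypothetical stack depth */

struct frame { int type; };

struct compiler {
    struct frame frames[MAX_FRAMES];
    int frame_sp;                   /* assumed signed; index of the top frame */
};

/* Walk the frame stack from the top down and return the index of the
 * nearest frame with the given type, or -1 if there is none. */
int find_frame_index(const struct compiler *c, int type)
{
    assert(c->frame_sp < MAX_FRAMES);

    /* With a signed sp, decrementing past 0 yields -1 and the loop ends.
     * With "unsigned sp", 0 - 1 wraps to UINT_MAX, so "sp >= 0" is always
     * true and the loop never terminates on a miss. */
    for (int sp = c->frame_sp; sp >= 0; sp--) {
        if (c->frames[sp].type == type)
            return sp;
    }
    return -1;
}
```

Searching for a type that is not on the stack is the case that exposes the bug: the signed version returns -1, while the unsigned version would run past the bottom of the array.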
* radeonsi: don't invoke DCC decompression in update_all_texture_descriptors | Marek Olšák | 2017-02-03 | 1 | -5/+6
  This fixes a bug uncovered by the 17-part patch series, specifically: "gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"
  If dirty_tex_counter has been updated and set_shader_image invokes DCC decompression, the DCC decompression itself checks the counter and updates descriptors, which in turn invokes the same DCC decompression. The blitter can't handle the recursion and the driver eventually crashes.
  Cc: 17.0 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit a0740d59aa97a08d89998cb57138e8217a331af6)
* r600: Fix stack overflow | Bartosz Tomczyk | 2017-02-03 | 1 | -1/+1
  Commit 7b5878ee0491e7a93914389a8369cd6752b9757d increased the number of outputs to 64, but left the output array intact. This caused a stack overflow when the number of outputs is greater than 32. Found by ASAN.
  Cc: "12.0 13.0 17.0" <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit a41f2527ae8ae5432b99c88863fbdf2f0b5f04ad)
* freedreno: automake: correctly set MKDIR_GEN | Emil Velikov | 2017-02-03 | 1 | -0/+1
  Analogous to previous commit.
  Fixes: 4610e5ef28e "freedreno/ir3: fix sin/cos"
  Cc: "12.0 13.0" <[email protected]>
  Cc: Rob Clark <[email protected]>
  Cc: Nicolas Dechesne <[email protected]>
  Reported-by: Nicolas Dechesne <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Tested-by: Nicolas Dechesne <[email protected]>
  (cherry picked from commit a922c821255bfac22cf705244e5bd303a626bb55)
* radeonsi: handle first_non_void correctly in si_create_vertex_elements | Marek Olšák | 2017-02-03 | 1 | -3/+3
  This fixes R11G11B10_FLOAT, because it's in the category of "OTHER", meaning that it doesn't have any channel description.
  Cc: 17.0 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit eac7df43ca05abd9992b305e078e88fe7b7f8c91)
* radeonsi: always set the TCL1_ACTION_ENA when invalidating L2 | Marek Olšák | 2017-01-24 | 1 | -1/+2
  Some CIK-VI docs say this is the default behavior on SI. That doesn't answer whether it's also the default behavior on CIK-VI.
  Cc: 17.0 13.0 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit 573bf0940a08e18a511e338de478f30fd95a1590)
* swr: Align query results allocation | George Kyriazis | 2017-01-24 | 2 | -4/+5
  Some query results struct contents are declared as cache line aligned. Use aligned malloc, and align the whole struct, to be safe.
  Fixes crash when compiling with clang.
  CC: <[email protected]>
  Reviewed-by: Bruce Cherniak <[email protected]>
  (cherry picked from commit 00847e4f14dd237dfcdb2c3d15be1325a08ccf5a)
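The general idea of pairing an over-aligned struct with an aligned allocator can be sketched in plain C11. This is not the swr code (which is C++ and uses its own allocation helpers); `query_result`, `CACHE_LINE` and `query_result_alloc` are hypothetical names, and the 64-byte cache line size is an assumption.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64               /* assumed cache line size */

/* Aligning the first member to a cache line raises the alignment
 * requirement of the whole struct to CACHE_LINE. */
struct query_result {
    alignas(CACHE_LINE) uint64_t counters[4];
};

/* Plain malloc only guarantees alignment suitable for max_align_t, which is
 * typically smaller than a cache line, so request the alignment explicitly. */
struct query_result *query_result_alloc(void)
{
    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    static_assert(sizeof(struct query_result) % CACHE_LINE == 0,
                  "struct size must be a multiple of its alignment");
    return aligned_alloc(CACHE_LINE, sizeof(struct query_result));
}
```

Memory obtained from `aligned_alloc` is released with the ordinary `free`. A compiler is entitled to assume the declared alignment when it generates loads and stores, which is one plausible way an under-aligned allocation turns into a crash with one compiler and not another.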
* swr: Prune empty nodes in CalculateProcessorTopology. | Bruce Cherniak | 2017-01-24 | 1 | -0/+9
  CalculateProcessorTopology tries to figure out system topology by parsing /proc/cpuinfo to determine the number of threads, cores, and NUMA nodes. There are some architectures where the "physical id" begins with 1 rather than 0, which was creating an empty "0" node and causing a crash in CreateThreadPool.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
  Reviewed-By: George Kyriazis <[email protected]>
  CC: <[email protected]>
  (cherry picked from commit b829206b0739925501bcc68233437d6d03b79795)
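Pruning empty entries from a topology list can be sketched as an in-place compaction. The swr code is C++ and works on its own node containers; `numa_node` and `prune_empty_nodes` below are hypothetical, and the sketch reduces a node to just its core count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node record: only the number of cores matters. */
struct numa_node {
    unsigned num_cores;
};

/* Remove nodes that have no cores, keeping the remaining nodes in order.
 * Returns the number of nodes left at the front of the array. */
size_t prune_empty_nodes(struct numa_node *nodes, size_t count)
{
    assert(nodes != NULL || count == 0);

    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (nodes[i].num_cores > 0)
            nodes[kept++] = nodes[i];
    }
    return kept;
}
```

For a system whose "physical id" values start at 1, a list indexed by id would look like `{0 cores, 4 cores, 4 cores}`; after pruning, only the two populated nodes remain, so later code never sees a node with nothing to schedule on.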
* r600: implement DDIV | Nicolai Hähnle | 2017-01-24 | 1 | -0/+59
  Tested-by: Glenn Kennard <[email protected]>
  Tested-by: James Harvey <[email protected]>
  Cc: 17.0 <[email protected]>
  (cherry picked from commit e4f8f9a638c1ffb9b76840b088290f11f0f91813)
* r600: factor out cayman_emit_unary_double_raw | Nicolai Hähnle | 2017-01-24 | 1 | -20/+42
  We will use it for DDIV.
  Tested-by: Glenn Kennard <[email protected]>
  Tested-by: James Harvey <[email protected]>
  Cc: 17.0 <[email protected]>
  (cherry picked from commit 488560cfe6ee2206f7a7f894694ebc43b419be61)
* r600: double multiply can handle only one multiply at a time | Nicolai Hähnle | 2017-01-24 | 1 | -17/+19
  It seems clear that trying to multiply two pairs of doubles would result in the temporary register getting overwritten by the second pair. So make the code more explicit.
  Tested-by: Glenn Kennard <[email protected]>
  Tested-by: James Harvey <[email protected]>
  Cc: 17.0 <[email protected]>
  (cherry picked from commit 76b02d2fe1df5351f67f53d07b37952043f0a84c)
* freedreno/a5xx: set frag shader threadsize | Rob Clark | 2017-01-24 | 1 | -2/+7
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 31daeb5bf14334bc0d39f28c9102cd15d834abfc)
* freedreno/a5xx: set fragcoordxy properly | Rob Clark | 2017-01-24 | 1 | -1/+1
  What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into bary.f. We were incorrectly setting both this and gl_FragCoord.xy to the same register resulting in all sorts of hilarity.
  Fixes stk, vdrift, 0ad, probably a bunch others.
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 8d6af93e76bb9e592293b632b22b2b756cc0cae8)
* freedreno/a5xx: fix psize | Rob Clark | 2017-01-24 | 2 | -8/+5
  Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on a5xx.
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 6cc93bedc15d09395ab6a92a0a129d06a8cd8ae8)
* freedreno/a5xx: srgb fix | Rob Clark | 2017-01-24 | 1 | -1/+2
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 141a4f86d6b9c0c4dbde511b741576a103f8f7ff)
* freedreno/a5xx: fix int vbos | Rob Clark | 2017-01-24 | 1 | -1/+3
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 69fbb458cf59fbab5f6675ad256a266b04d54700)
* freedreno/a5xx: fix clear for uint/sint formats | Rob Clark | 2017-01-24 | 1 | -19/+28
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 16671e970444f154ffa60d2aaadee4d065eb6103)
* freedreno/a5xx: fix cull state | Rob Clark | 2017-01-24 | 1 | -5/+5
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 4d9aa4f67d6316feea93901bf29b76a68c4333cd)
* freedreno: update generated headers | Rob Clark | 2017-01-24 | 6 | -13/+36
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 4c39458460075f6c1ea9e4607769513b96c6dd82)
* radeonsi: don't forget to add HTILE to the buffer list for texturing | Marek Olšák | 2017-01-20 | 1 | -6/+13
  This fixes VM faults. Discovered by Samuel Pitoiset.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450
  Cc: 17.0 13.0 <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  (cherry picked from commit e490b7812cae778c61004971d86dc8299b6cd240)
* radeonsi: fix texture gather on stencil textures | Nicolai Hähnle | 2017-01-20 | 1 | -2/+16
  At least on VI, texture gather doesn't work with a 24_8 data format, so use 8_8_8_8 and a modified swizzle instead.
  A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select the X24S8 pipe format because we don't support stencil-only render targets properly. With mip-mapping this can lead to a setup where the tiling is incompatible with stencil texturing, and a flushed stencil texture is used. For the flushed stencil, a literal X24S8 is used because there were issues with an 8bpp DB->CB copy.
  Longer term, it would be good if we could get away from these workarounds, i.e. properly support an S8 format for stencil-only rendering and flushed stencil. Since stencil texturing is somewhat rare, it's not a high priority.
  Fixes GL45-CTS.texture_cube_map_array.sampling.
  Cc: 17.0 <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Acked-by: Edward O'Callaghan <[email protected]>
  (cherry picked from commit 3cd092c41508dde2e6259f09df1736911a828548)
* radeonsi: Always leave poly_offset in a valid state | Zachary Michaels | 2017-01-20 | 1 | -1/+3
  This commit makes si_update_poly_offset set poly_offset to NULL if uses_poly_offset is false. This way poly_offset either points into the currently queued rasterizer, or it is NULL.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
  Cc: "13.0 17.0" <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit d7d32b3bfe86bd89d94d59393907bce1cb9dab7c)
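The invariant stated in this commit (the cached pointer either points into the currently queued rasterizer or is NULL) can be sketched as follows. The structs and `update_poly_offset` are hypothetical simplifications and do not mirror the radeonsi data structures.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_ZS_FORMATS = 3 };        /* hypothetical number of offset variants */

struct poly_offset_state { float scale; float units; };

struct rasterizer_state {
    int uses_poly_offset;
    struct poly_offset_state offsets[NUM_ZS_FORMATS];
};

struct draw_context {
    struct rasterizer_state *queued_rs;     /* currently queued rasterizer */
    struct poly_offset_state *poly_offset;  /* points into queued_rs, or NULL */
};

/* Re-derive the cached pointer on every update so it can never keep pointing
 * into a rasterizer that is no longer queued. */
void update_poly_offset(struct draw_context *ctx, unsigned zs_index)
{
    struct rasterizer_state *rs = ctx->queued_rs;

    if (!rs || !rs->uses_poly_offset) {
        ctx->poly_offset = NULL;    /* explicit reset instead of an early return */
        return;
    }

    assert(zs_index < NUM_ZS_FORMATS);
    ctx->poly_offset = &rs->offsets[zs_index];
}
```

Without the explicit reset, switching from a rasterizer that uses polygon offset to one that does not would leave the old pointer in place, and it could dangle once the first rasterizer is destroyed.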
* radeonsi: determine in advance which VBOs should be added to the buffer list | Marek Olšák | 2017-01-18 | 3 | -4/+11
  v2: now it should be correct
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use fewer pointer dereferences in upload_vertex_buffer_descriptors | Marek Olšák | 2017-01-18 | 1 | -8/+9
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: reject invalid vertex buffer indices at state creation | Marek Olšák | 2017-01-18 | 2 | -5/+6
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a global dirty mask for shader pointers | Marek Olšák | 2017-01-18 | 4 | -41/+51
  Only vertex buffers use a separate bool flag.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a bitmask-based loop in si_decompress_textures | Marek Olšák | 2017-01-18 | 3 | -7/+31
  Reviewed-by: Nicolai Hähnle <[email protected]>
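For readers unfamiliar with the pattern named in this commit title, a bitmask-based loop visits only the entries whose bit is set instead of testing every slot. The sketch below is generic and hypothetical: `pop_lowest_bit` and `visit_marked_slots` are illustrative names, and `__builtin_ctz` is a GCC/Clang builtin, so the code assumes one of those compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the index of the lowest set bit and clear it.
 * __builtin_ctz is undefined for 0, hence the assertion. */
static int pop_lowest_bit(uint32_t *mask)
{
    assert(*mask != 0);
    int bit = __builtin_ctz(*mask);
    *mask &= *mask - 1;             /* clear the lowest set bit */
    return bit;
}

/* Visit only the slots whose bit is set in needs_work. The visited indices
 * are written to out[] and their count is returned. */
unsigned visit_marked_slots(uint32_t needs_work, int *out, unsigned capacity)
{
    unsigned n = 0;
    while (needs_work && n < capacity)
        out[n++] = pop_lowest_bit(&needs_work);
    return n;
}
```

The loop runs once per set bit, so a mask of zero costs a single comparison regardless of how many slots exist.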
* radeonsi: skip an unnecessary mutex lock for L2 prefetches | Marek Olšák | 2017-01-18 | 1 | -5/+7
  The mutex lock is inside util_range_add.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: si_cp_dma_prepare is a no-op for L2 prefetches | Marek Olšák | 2017-01-18 | 2 | -5/+12
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add SI_CPDMA_SKIP_BO_LIST_UPDATE | Marek Olšák | 2017-01-18 | 2 | -10/+15
  The next commit will use it in a clever way, because the CP DMA prefetch doesn't need this.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use the correct target machine when building shader variants | Marek Olšák | 2017-01-18 | 2 | -14/+29
  If the shader selector is created with a different context than the shader variant, we should use the calling context's target machine for the shader variant.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419
  Reviewed-by: Nicolai Hähnle <[email protected]>