path: root/src/gallium/drivers/r600
Commit message | Author | Age | Files | Lines
* Fix setting indent-tabs-mode in the Emacs .dir-locals.el files | Neil Roberts | 2018-10-17 | 1 | -1/+1
    Some of the .dir-locals.el had the wrong name for the truthy value so it wasn’t setting indent-tabs-mode.
    Reviewed-by: Ilia Mirkin <[email protected]>
* r600/sb: Fix constant-logical-operand warning. | Vinson Lee | 2018-10-12 | 1 | -1/+1
    sb/sb_bc_parser.cpp:620:27: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
        if (cf->bc.op_ptr->flags && FF_GDS)
                                 ^  ~~~~~~
    sb/sb_bc_parser.cpp:620:27: note: use '&' for a bitwise operation
        if (cf->bc.op_ptr->flags && FF_GDS)
                                 ^~
                                 &
    sb/sb_bc_parser.cpp:620:27: note: remove constant to silence this warning
        if (cf->bc.op_ptr->flags && FF_GDS)
                                ~^~~~~~~~~
    Fixes: da977ad90747 ("r600/sb: start adding GDS support")
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
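The warning above comes from testing a flag with `&&` instead of `&`. A minimal sketch of the difference, assuming made-up flag values (`FF_GDS` and `FF_OTHER` here are placeholders, not the real r600/sb definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; not the real r600/sb values. */
#define FF_OTHER (1u << 0)
#define FF_GDS   (1u << 3)

/* Logical AND: FF_GDS is a nonzero constant, so this is true for ANY nonzero flags. */
static int is_gds_logical(uint32_t flags)
{
    return flags && FF_GDS;
}

/* Bitwise AND: true only when the FF_GDS bit itself is set. */
static int is_gds_bitwise(uint32_t flags)
{
    return (flags & FF_GDS) != 0;
}

/* Self-check: a flags word without the GDS bit is misclassified by the logical form. */
static void check_gds_tests(void)
{
    assert(is_gds_logical(FF_OTHER) == 1);   /* false positive */
    assert(is_gds_bitwise(FF_OTHER) == 0);   /* correct */
}
```

Compilers that implement -Wconstant-logical-operand may warn on `is_gds_logical`, which is the point of the sketch.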
* r600: use build-id when available for disk cache | Timothy Arceri | 2018-10-03 | 1 | -7/+7
    Reviewed-by: Marek Olšák <[email protected]>
* r600/sb: use safe math optimizations when TGSI contains precise operations | Gert Wollny | 2018-09-15 | 3 | -1/+5
    Fixes:
      dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3
      dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3
      dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* r600: fix HTILE for NPOT textures with mipmapping | Marek Olšák | 2018-09-10 | 1 | -2/+2
    Cc: 18.1 18.2 <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_MAX_TEXTURE_UPLOAD_MEMORY_BUDGET | Marek Olšák | 2018-09-07 | 1 | -0/+4
* gallium: enable GL_AMD_depth_clamp_separate on r600, radeonsi | Marek Olšák | 2018-09-06 | 3 | -2/+3
* gallium: split depth_clip into depth_clip_near & depth_clip_far | Marek Olšák | 2018-09-06 | 2 | -4/+4
    for AMD_depth_clamp_separate.
* gallium: add PIPE_CAP_RASTERIZER_SUBPIXEL_BITS | Marek Olšák | 2018-09-06 | 1 | -0/+1
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add PIPE_CAP_MAX_COMBINED_HW_ATOMIC_COUNTER{S,_BUFFERS} | Erik Faye-Lund | 2018-09-05 | 1 | -0/+10
    This moves the evergreen-specific max-sizes out as a driver-cap, so other drivers with less strict requirements also can use hw-atomics.
    Remove ssbo_atomic as it's no longer needed.
    We should now be able to use hw-atomics for some stages and not for others, if needed.
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Gurchetan Singh <[email protected]>
* gallium: add PIPE_CAP_MAX_COMBINED_SHADER_BUFFERS | Erik Faye-Lund | 2018-09-05 | 1 | -0/+4
    This gets rid of a r600 specific hack in the state-tracker, and prepares for other drivers to be able to use hw-atomics.
    While we're at it, clean up some indentation in the various drivers.
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Gurchetan Singh <[email protected]>
* gallium: Add a helper for implementing PIPE_CAP_* default values. | Eric Anholt | 2018-09-04 | 1 | -1/+3
    One of the pains of implementing a gallium driver is filling in a million pipe caps you don't know about yet when you're just starting out. One of the pains of working on gallium is copy-and-pasting your new PIPE_CAP into each driver. We can fix both of these by having each driver call into the default helper from their default case, so that both sides can ignore each other until they need to.
    v2: fix i915g build, revert swr change to avoid breaking scons build (https://travis-ci.org/anholt/mesa/jobs/419739857)
    v3: Rebase on 3 new gallium caps.
    Reviewed-by: Marek Olšák <[email protected]> (v1)
    Cc: Bruce Cherniak <[email protected]>
    Cc: George Kyriazis <[email protected]>
    Cc: Kenneth Graunke <[email protected]>
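The "call the default helper from the default case" pattern described above can be sketched as follows. All names and values here (`demo_cap`, `demo_cap_default`, `demo_driver_get_param`, the numbers returned) are hypothetical stand-ins, not the real PIPE_CAP_* enum or gallium helper:

```c
#include <assert.h>

/* Hypothetical cap names for illustration; not the real PIPE_CAP_* enum. */
enum demo_cap {
    DEMO_CAP_MAX_GS_INVOCATIONS,
    DEMO_CAP_RASTERIZER_SUBPIXEL_BITS,
    DEMO_CAP_NEW_AND_UNKNOWN,
    DEMO_CAP_COUNT
};

/* Shared helper: the one place that knows a default for every cap (values made up). */
static int demo_cap_default(enum demo_cap cap)
{
    assert(cap < DEMO_CAP_COUNT);
    switch (cap) {
    case DEMO_CAP_MAX_GS_INVOCATIONS:       return 32;
    case DEMO_CAP_RASTERIZER_SUBPIXEL_BITS: return 4;
    default:                                return 0;
    }
}

/* Driver side: override what it cares about, defer everything else to the helper. */
static int demo_driver_get_param(enum demo_cap cap)
{
    switch (cap) {
    case DEMO_CAP_RASTERIZER_SUBPIXEL_BITS:
        return 8; /* driver-specific value */
    default:
        return demo_cap_default(cap);
    }
}
```

With this shape, adding a new cap touches only the enum and the helper; a driver that does not know the cap still returns a sane default.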
* gallium: Split out PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE. | Kenneth Graunke | 2018-08-24 | 1 | -0/+1
    Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER. Drivers for such hardware would like to advertise support for ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp.
    This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit, changes the extension enable to be based on that, and enables it in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP (so they continue supporting this mode).
* Revert "configure: allow building with python3" | Emil Velikov | 2018-08-24 | 1 | -1/+1
    This reverts commit ae7898dfdbe5c8dab7d11c71862353f1ae43feb0.
    Turns out the python scripts are _not_ fully python 3 compatible. As Ilia reported using get_xmlpool.py with LANG=C produces some weird output - see the link for details.
    Even though the issue was spotted with the autoconf build, it exposes a genuine problem with the script (and lack of lang handling of the meson build.)
    https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
* gallium: add PIPE_CAP_MAX_SHADER_BUFFER_SIZE | Marek Olšák | 2018-08-23 | 1 | -0/+2
    Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_MAX_GS_INVOCATIONS | Marek Olšák | 2018-08-23 | 1 | -0/+3
    Tested-by: Dieter Nützel <[email protected]>
* configure: allow building with python3 | Emil Velikov | 2018-08-23 | 1 | -1/+1
    Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs.
    Note:
     - python3.4 is used as it's the earliest supported version
     - python3 chosen prior to python2
    Signed-off-by: Emil Velikov <[email protected]>
    Acked-by: Eric Engestrom <[email protected]>
* r600/eg: rework atomic counter emission with flushes | Dave Airlie | 2018-08-21 | 6 | -31/+54
    With the current code, we didn't do the space checks prior to atomic counter setup emission, but we also didn't add atomic counters to the space check so we could get a flush later as well.
    These flushes would be bad, and lead to problems with parallel tests. We have to ensure the atomic counter copy in, draw emits and counter copy out are kept in the same command submission unit.
    This reworks the code to drop some useless masks, make the counting separate to the emits, and make the space checker handle atomic counter space.
    [airlied: want this in 18.2]
    Fixes: 06993e4ee (r600: add support for hw atomic counters. (v3))
* meson: Build with Python 3 | Mathieu Bridon | 2018-08-10 | 1 | -1/+1
    Now that all the build scripts are compatible with both Python 2 and 3, we can flip the switch and tell Meson to use the latter.
    Since Meson already depends on Python 3 anyway, this means we don't need two different Python stacks to build Mesa.
    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* python: Use the unicode_escape codec | Mathieu Bridon | 2018-08-01 | 1 | -1/+1
    Python 2 had string_escape and unicode_escape codecs. Python 3 only has the latter. These work the same as far as we're concerned, so let's use the future-proof one.
    However, the rest of the code expects unicode strings, so we need to decode them again.
    Signed-off-by: Mathieu Bridon <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* radeonsi: use storage_samples instead of color_samples in most places | Marek Olšák | 2018-07-31 | 1 | -2/+2
    and use pipe_resource::nr_storage_samples instead of r600_texture::num_color_samples.
    Tested-by: Dieter Nützel <[email protected]>
* gallium: add storage_sample_count parameter into is_format_supported | Marek Olšák | 2018-07-31 | 4 | -1/+11
    Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_FRAMEBUFFER_MSAA_CONSTRAINTS | Marek Olšák | 2018-07-31 | 1 | -0/+1
    Tested-by: Dieter Nützel <[email protected]>
* r600: reduce num compute threads to 1024. | Dave Airlie | 2018-07-31 | 1 | -1/+1
    I copied this value from radeonsi, but it was wrong; 1024 seems to be the correct answer from looking at gpuinfo.
    This should fix a few compute shader related hangs (at least in CTS).
    Cc: <[email protected]>
    (airlied: pushed because it avoids hangs)
* r600: Scale integer valued texture border colors to float (v2) | Gert Wollny | 2018-07-25 | 1 | -1/+44
    It seems the hardware always expects floating point border color values [0,1] for unsigned, and [-1,1] for signed texture component, regardless of pixel type, but the border colors are passed according to texture component type. Hence, before submitting the border color, convert and scale it to these ranges accordingly.
    This doesn't seem to work for textures with 32 bit integer components though; here, it seems that the border color is always set to zero, regardless of the BORDER_COLOR_TYPE state set in Q_TEX_SAMPLER_WORD0_0.
    v2: Simplify logic as suggested by Roland Scheidegger
    Fixes:
      dEQP-GLES31.functional.texture.border_clamp.formats.compressed*
      dEQP-GLES31.functional.texture.border_clamp.formats.r* (non 32 bit integer)
      dEQP-GLES31.functional.texture.border_clamp.per_axis_wrap_mode.texture_2d*
    and a number of piglits out of
      piglit run gpu -t texture -t gather -t formats
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
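A minimal sketch of the kind of scaling the message describes, mapping an integer border value of a given bit width into [0,1] or [-1,1]. The function names and the exact divisor choice are assumptions for illustration; the driver's actual conversion may differ, and 32-bit components are excluded as the message notes:

```c
#include <assert.h>
#include <stdint.h>

/* Map an unsigned integer border value of `bits` width to [0, 1]. */
static float scale_unorm(uint32_t v, unsigned bits)
{
    assert(bits >= 1 && bits < 32);
    const uint32_t max = (1u << bits) - 1u;
    if (v > max)
        v = max;                       /* clamp out-of-range input */
    return (float)v / (float)max;
}

/* Map a signed integer border value of `bits` width to [-1, 1]. */
static float scale_snorm(int32_t v, unsigned bits)
{
    assert(bits >= 2 && bits < 32);
    const int32_t max = (int32_t)((1u << (bits - 1)) - 1u);
    if (v > max)
        v = max;
    if (v < -max)
        v = -max;                      /* e.g. -128 maps to -1.0 for 8 bits */
    return (float)v / (float)max;
}
```

For an 8-bit unsigned component, 255 maps to 1.0 and 0 to 0.0; for an 8-bit signed component, both -128 and -127 map to -1.0.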
* r600: enable tess_input_info for TES | Dave Airlie | 2018-07-23 | 1 | -14/+6
    There might be a nicer way to do this, but this is at least correct.
    This fixes:
      KHR-GL44.tessellation_shader.single.max_patch_vertices
      KHR-GL44.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn
    Reviewed-By: Gert Wollny <[email protected]>
    Cc: [email protected]
* r600: Correct evaluation of cube array index and face | Gert Wollny | 2018-07-20 | 1 | -1/+33
    The array index needs to be corrected and it must be ensured that it is rounded and its value is non-negative before it is combined with the face id.
    v5: Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin)
    v6: Fix type (Roland Scheidegger)
    Fixes 182 from android/cts/master/gles31-master.txt:
      dEQP-GLES31.functional.texture.filtering.cube_array.formats.*
      dEQP-GLES31.functional.texture.filtering.cube_array.sizes.*
      dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_*
      dEQP-GLES31.functional.texture.filtering.cube_array.combinations.linear_mipmap_*
      dEQP-GLES31.functional.texture.filtering.cube_array.no_edges_visible.*
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* r600: correct texture offset for array index lookup | Gert Wollny | 2018-07-20 | 1 | -5/+37
    Correct the array index for TEXTURE_*1D_ARRAY and TEXTURE_*2D_ARRAY.
    The standard says the array index is evaluated according to floor(z + 0.5), but RNDNE is sufficient also for the test cases where z is close to 1.5 and it is likely to hit 1.5, the corner case where RNDNE gives a result different from above formula.
    v5: - Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin)
        - update commit message
    Fixes 325 tests from android/cts/master/gles3-master.txt:
      dEQP-GLES3.functional.shaders.texture_functions.texture.*sampler2darray*
      dEQP-GLES3.functional.shaders.texture_functions.textureoffset.*sampler2darray*
      dEQP-GLES3.functional.shaders.texture_functions.texturelod.sampler2darray*
      dEQP-GLES3.functional.shaders.texture_functions.texturelodoffset.*sampler2darray*
      dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*sampler2darray*
      dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.*sampler2darray*
      dEQP-GLES3.functional.texture.filtering.2d_array.formats.*
      dEQP-GLES3.functional.texture.filtering.2d_array.sizes.*
      dEQP-GLES3.functional.texture.filtering.2d_array.combinations.*
      dEQP-GLES3.functional.texture.shadow.2d_array.*
      dEQP-GLES3.functional.texture.vertex.2d_array.*
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
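To compare the two rounding rules mentioned above, here is a small sketch in plain C for non-negative inputs. It models floor(z + 0.5) and round-to-nearest-even by hand; the function names are made up, and this is only a model of what an RNDNE-style instruction computes, not driver code:

```c
#include <assert.h>

/* floor(z + 0.5) for z >= 0: exact ties always round up. */
static int layer_floor_half_up(float z)
{
    assert(z >= 0.0f);
    return (int)(z + 0.5f);   /* truncation equals floor for non-negative values */
}

/* Round to nearest, ties to even, for z >= 0. */
static int layer_round_nearest_even(float z)
{
    assert(z >= 0.0f);
    int whole = (int)z;
    float frac = z - (float)whole;
    if (frac > 0.5f)
        return whole + 1;
    if (frac < 0.5f)
        return whole;
    return (whole & 1) ? whole + 1 : whole;   /* exact tie: pick the even neighbour */
}
```

In this sketch both functions return 2 for z = 1.5; they differ only on exact ties whose lower neighbour is even, such as 0.5 (1 versus 0) and 2.5 (3 versus 2).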
* r600: Delay emission of texture gradients and lookup offsets | Gert Wollny | 2018-07-20 | 1 | -44/+48
    Gradients used in texture lookups and the offsets must reside in the same fetch clause (the first is imposed by the hardware and the second is expected by sb). In order to ensure that no ALU clause is inserted between emission and use of these, delay the emission of these instructions until the texture instruction using them is also emitted.
    This is needed in preparation for the correction of the texture array indices.
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* r600: silence the signed overflow warning like radeonsi | Marek Olšák | 2018-07-18 | 1 | -1/+1
    r600_gpu_load.c: In function ‘r600_gpu_load_thread’:
    ../../../../src/util/os_time.h:82:7: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow]
        if (start <= end)
* r600: fix warnings when unref'ing pool->bo | Marek Olšák | 2018-07-17 | 1 | -3/+3
* r600g: some -Wsign-compare fixes | Konstantin Kharlamov | 2018-07-17 | 6 | -14/+13
    Signed-off-by: Konstantin Kharlamov <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* r600g: constify some variables | Konstantin Kharlamov | 2018-07-17 | 5 | -10/+10
    Just a nice hint for both people and compilers.
    Signed-off-by: Konstantin Kharlamov <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* r600g: do not use "fast-clear" for small textures (v3) | Konstantin Kharlamov | 2018-07-17 | 1 | -0/+10
    Ported from radeonsi. Improves windowed glxgears run as
      vblank_mode=0 glxgears -info -geometry 0+0+512+512
    from ≈2270 FPS to ≈2360 FPS. Tested with AMD TURKS.
    v2: turned out glxgears ignores the option above; the correct way would be "512x512+0+0". Now it can be seen that 512x512 actually loses 30 FPS. 300×300 however wins around a hundred FPS, and to leave some room in case results differ for other cards, I don't want to nitpick in search of an optimum but simply leave 300×300 in the code.
    v3: remove redundant braces, and try harder for the mail to stick to the rest of the series.
    Signed-off-by: Konstantin Kharlamov <[email protected]>
    Reviewed-by: Gert Wollny <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
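The size heuristic described above could look roughly like this. It is a sketch only: the macro and function names are invented, and whether the real check uses "less than" or "less than or equal" at exactly 300×300 is an assumption:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed threshold, taken from the 300x300 figure in the commit message. */
#define FAST_CLEAR_MIN_PIXELS (300u * 300u)

/* Return true when the surface is large enough for fast clear to be worthwhile. */
static bool want_fast_clear(unsigned width, unsigned height)
{
    assert(width > 0 && height > 0);
    return width * height > FAST_CLEAR_MIN_PIXELS;
}
```

Under these assumptions a 512×512 surface would still use fast clear, while a 300×300 or smaller one would fall back to a regular clear.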
* r600: fix build after the removal of RADEON_PRIO_* flags | Marek Olšák | 2018-07-16 | 7 | -21/+12
* radeonsi: merge DCC/CMASK/HTILE priority flags | Marek Olšák | 2018-07-16 | 2 | -3/+3
    For a later simplification.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* r600: Add spill output to group only if register or target index changes | Gert Wollny | 2018-07-13 | 1 | -24/+45
    The current spill code checks in each instruction of an instruction group whether spilling is needed. If so, it adds spilling for each component as a separate instruction and allocates a new temporary for each component. Since it takes the write mask from the TGSI representation, all components might be written each time, and as a result already written components might be overwritten with garbage like:
      ...
      y: MOV R9.y, [0x42140000 37].x
      t: MOV R8.x, [0x42040000 33].y
      ...
      MEM_SCRATCH WRITE_IND_ACK 0 R9.xy__, @R4.x ES:3
      MEM_SCRATCH WRITE_IND_ACK 0 R8.xy__, @R4.x ES:3
      ...
    To resolve this issue, accumulate spills to the same memory location so that only one memory write instruction is emitted for an instruction group that writes up to all four components.
    This fixes updated piglits (see https://patchwork.freedesktop.org/series/46064/):
      spec/glsl-1.30/execution
        fs-large-local-array-vec2.shader_test
        fs-large-local-array-vec3.shader_test
        fs-large-local-array-vec4.shader_test
    v2: fix some typos and add comment about piglits (Roland Scheidegger)
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]> (v1)
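The accumulation idea can be illustrated with a simplified model: consecutive writes that target the same scratch location are merged into one write whose component mask is the union of the individual masks. The struct and function below are hypothetical and much simpler than the driver's real spill handling:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of one spilled write in an instruction group. */
struct spill_write {
    unsigned mem_index;   /* target location in scratch memory */
    unsigned comp_mask;   /* bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w */
};

/*
 * Merge consecutive writes to the same mem_index into one entry whose mask is
 * the union of the component masks. Returns the number of entries written to
 * `out`, which must have room for at least `n` entries.
 */
static size_t merge_spills(const struct spill_write *in, size_t n,
                           struct spill_write *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        assert(in[i].comp_mask != 0 && in[i].comp_mask <= 0xFu);
        if (count > 0 && out[count - 1].mem_index == in[i].mem_index)
            out[count - 1].comp_mask |= in[i].comp_mask;  /* same target: accumulate */
        else
            out[count++] = in[i];                         /* new target: new write */
    }
    return count;
}
```

For example, an x write and a y write to location 3 followed by a z write to location 5 collapse into two memory writes, with masks xy and z.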
* r600: report incorrect max-vertex-attrib for GL 4.4 | Erik Faye-Lund | 2018-07-09 | 1 | -1/+2
    OpenGL 4.4 requires a max vertex attrib of 2048 or higher, but r600 only supports 2047. Technically, this makes it an GL4.3 GPU, but it's currently exposing GL4.4.
    To avoid regressing the GL version supported in the following patches, let's just lie and pretend like we support 2048. Any applications using 2048 are already broken anyway.
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* r600/sb: fix crash in fold_alu_op3 | Roland Scheidegger | 2018-07-09 | 1 | -0/+2
    fold_assoc() called from fold_alu_op3() can lower the number of src to 2, which then leads to an invalid access to n.src[2]->gvalue(). This didn't seem to have caused much harm in the past, but on Fedora 28 it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was needed to show the issue).
    An alternative fix would be to instead call fold_alu_op2() from within fold_assoc() when the number of src is reduced and return always TRUE from fold_assoc() in this case, with the only actual difference being the return value from fold_alu_op3() then. I'm not sure what the return value actually should be in this case (or whether it even can make a difference).
    https://bugs.freedesktop.org/show_bug.cgi?id=106928
    Cc: [email protected]
    Reviewed-by: Dave Airlie <[email protected]>
* python: Use the print function | Mathieu Bridon | 2018-07-06 | 1 | -25/+26
    In Python 2, `print` was a statement, but it became a function in Python 3.
    Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3.
    Signed-off-by: Mathieu Bridon <[email protected]>
    Acked-by: Eric Engestrom <[email protected]>
    Acked-by: Dylan Baker <[email protected]>
* r600: compare structure elements instead of doing a memcmp | Gert Wollny | 2018-07-05 | 1 | -1/+4
    Structures might be padded by the compiler and these padding bytes remain un-initialized which in turn makes memcmp return a difference where from the logical point of view there is none.
    Fixes valgrind:
      Conditional jump or move depends on uninitialised value(s)
        at 0x4C32CBA: __memcmp_sse4_1 (vg_replace_strmem.c:1099)
        by 0xB8D2537: r600_set_vertex_buffers (r600_state_common.c:573)
        by 0xB71D44A: u_vbuf_set_driver_vertex_buffers (u_vbuf.c:1129)
        by 0xB71F7BB: u_vbuf_draw_vbo (u_vbuf.c:1153)
        by 0xB3B92CB: st_draw_vbo (st_draw.c:235)
        by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391)
        by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550)
        by 0x10A989: piglit_display (textureSize.c:157)
        by 0x4F8F174: run_test (piglit_fbo_framework.c:52)
        by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229)
        by 0x10A60A: main (textureSize.c:71)
      Uninitialised value was created by a stack allocation
        at 0xB3948FD: st_update_array (st_atom_array.c:388)
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
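The padding problem can be reproduced in isolation. In this sketch `struct vb_state` and its fields are invented for illustration (they are not the real vertex buffer struct); the helper deliberately writes different bytes into the padding between the two members, which on typical ABIs is three bytes wide:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct; most ABIs insert padding between the two members. */
struct vb_state {
    uint8_t  is_user;
    uint32_t offset;
};

/* Number of padding bytes between the members on this platform. */
static size_t padding_bytes(void)
{
    return offsetof(struct vb_state, offset) - sizeof(uint8_t);
}

/* Simulate uninitialised padding by filling it with an arbitrary byte. */
static void scribble_padding(struct vb_state *s, unsigned char fill)
{
    assert(s != NULL);
    unsigned char *bytes = (unsigned char *)s;
    for (size_t i = sizeof(uint8_t); i < offsetof(struct vb_state, offset); i++)
        bytes[i] = fill;
}

/* Member-wise comparison: ignores padding. */
static int equal_by_fields(const struct vb_state *a, const struct vb_state *b)
{
    return a->is_user == b->is_user && a->offset == b->offset;
}

/* Byte-wise comparison: also compares whatever happens to be in the padding. */
static int equal_by_memcmp(const struct vb_state *a, const struct vb_state *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

Two objects with identical members but different padding contents compare equal field by field and unequal under memcmp, which is the false difference the commit removes.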
* r600: Add R4G4B4A4 and A1B5G5R5 to supported vertex formats | Gert Wollny | 2018-07-05 | 1 | -0/+15
    The tests below would fail with an error message "Vertex format (R4G4B4A4|R5G5B5A1) not supported." Add the formats to the translation routine to enable them.
    Fixes:
      dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_2d
      dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_cube
      dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_2d
      dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_cube
      dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_2d
      dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_cube
      dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_2d
      dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_cube
      dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_2d_array
      dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_3d
      dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_2d_array
      dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_3d
      dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_2d_array
      dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_3d
      dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_2d_array
      dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_3d
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* r600: force LOD range to be only one value when mip.min filter is NONE | Gert Wollny | 2018-07-05 | 1 | -1/+9
    For a texture that has only one LOD defined, but for which GL_TEXTURE_MAX_LEVEL is the default (1000) and GL_TEXTURE_MIN_LOD != GL_TEXTURE_MAX_LOD, reading from the texture does not properly resolve the LOD level and the texture lookup might fail. Hence, when no mipmap filter is given (indicating that no mip-mapping takes place), force the LOD range to contain only one value.
    Fixes:
      dEQP-GLES3.functional.shaders.texture_functions.texture*.(i|u)sampler2d*
      dEQP-GLES3.functional.texture.format.sized.cube.rgb*
    out of VK_GL_CTS/android/cts/master/gles3-master.txt
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* r600/sb: cleanup if_conversion iterator to be legal C++ | Dave Airlie | 2018-07-04 | 1 | -7/+4
    The current code causes:
      /usr/include/c++/8/debug/safe_iterator.h:207: Error: attempt to copy from a singular iterator.
    This is due to the iterators getting invalidated, fix the reverse iterator to use the return value from erase, and cast it properly.
    (used Mathias suggestion)
    Cc: <[email protected]>
    Reviewed-by: Mathias Fröhlich <[email protected]>
* gallium/util: remove dummy function util_format_is_supported | Marek Olšák | 2018-06-29 | 2 | -6/+0
    Reviewed-by: Eric Engestrom <[email protected]>
* r600/sb: give the scheduler more margin to find valid instructions groups | Gert Wollny | 2018-06-25 | 1 | -3/+10
    For instruction sequences that change the address register with every load the current limit to bail out of the scheduler and reject the optimisation was too tight, i.e. it was expected that at least one pending instruction would be scheduled each time.
    Give the scheduler more margin to sort out these load sequences by allowing a number of rounds where no instruction is scheduled.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106163
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
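A toy model of the "allow some idle rounds" policy, assuming a callback that reports how many instructions a round managed to place. Every name here is hypothetical, and whether the real scheduler resets its counter after progress is an assumption of this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: runs one scheduling round, returns instructions placed. */
typedef unsigned (*schedule_round_fn)(void *ctx);

/*
 * Run rounds until `pending` instructions are placed. Give up only after more
 * than `max_idle_rounds` consecutive rounds that place nothing; with
 * max_idle_rounds == 0 this is the strict "must progress every round" policy.
 */
static bool schedule_with_margin(schedule_round_fn round, void *ctx,
                                 unsigned pending, unsigned max_idle_rounds)
{
    unsigned idle = 0;
    assert(round != NULL);
    while (pending > 0) {
        unsigned placed = round(ctx);
        if (placed == 0) {
            if (++idle > max_idle_rounds)
                return false;          /* no progress for too long: reject */
            continue;
        }
        idle = 0;                      /* progress made: reset the margin */
        pending = placed >= pending ? 0 : pending - placed;
    }
    return true;
}

/* Demo round: idles for the first *ctx calls, then places one instruction per call. */
static unsigned demo_round(void *ctx)
{
    unsigned *idle_left = ctx;
    if (*idle_left > 0) {
        (*idle_left)--;
        return 0;
    }
    return 1;
}
```

With a margin of two idle rounds, a sequence that stalls twice before making progress is still accepted, while one that stalls three times in a row is rejected.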
* r600/sb: fix rotated register in while loop | Gert Wollny | 2018-06-25 | 1 | -4/+8
    This patch is based on https://lists.freedesktop.org/archives/mesa-dev/2018-February/185805.html
    Dave Airlie:
    "A bunch of CTS tests led me to write tests/shaders/ssa/fs-while-loop-rotate-value.shader_test which r600/sb always fell over on.
    GCM seems to move some of the copies into other basic blocks, if we don't allow this to happen then it doesn't seem to schedule them badly.
    Everything I've read on SSA/phi copies say they have to happen in parallel, so keeping them in the same basic block seems like a good way to keep some of that property."
    This patch differs from the one proposed by Dave in that it only adds the NF_DONT_MOVE flag to copy_move instructions that are created by split_phi* and that are located in loops.
    Fixes piglit: tests/shaders/ssa/fs-while-loop-rotate-value.shader_test (no regressions in the shader set).
    It also fixes all failing tests from dEQP-GLES3.functional.shaders.loops.*
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* r600: fix copy/paste bug for sampleMaskIn workaround | Roland Scheidegger | 2018-06-21 | 1 | -1/+1
    The sampleMaskIn workaround (b936f4d1ca0d2ab1e828ff6a6e617f12469687fa) tries to figure out if the shader is running at per-sample frequency, but there's a typo bug so it will only recognize per-sample linear inputs, not per-sample perspective ones.
    Spotted by Eric Engestrom <[email protected]>
    Fixes: b936f4d1ca0d2ab1e828a "r600: partly fix sampleMaskIn value"
* gallium: add scalar isa shader cap | Christian Gmeiner | 2018-06-20 | 1 | -0/+2
    v1 -> v2:
     - nv30 is _NOT_ scalar as suggested by Ilia Mirkin.
     - Change from a screen cap to a shader cap as suggested by Eric Anholt.
     - radeonsi is scalar as suggested by Marek Olšák.
     - Change missing ones to be scalar.
    v2 -> v3:
     - r600 prefers vec4 as suggested by Marek Olšák.
    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf | Marek Olšák | 2018-06-19 | 18 | -111/+111
    Acked-by: Bas Nieuwenhuizen <[email protected]>