summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* swr: fix index buffers with non-zero indicesGeorge Kyriazis2017-02-235-6/+62
| | | | | | | | | | | | | Fix issue with index buffers that do not contain a 0 index. 0 index can be a non-valid index if the (copied) vertex buffers are a subset of the user's (which happens because we only copy the range between min & max). Core will use an index passed in from the driver to replace invalid indices. Only do this for calls that contain non-zero indices, to minimize performance Reviewed-by: Bruce Cherniak <[email protected]> cost.
* swr: add fetch shader cacheGeorge Kyriazis2017-02-236-15/+50
| | | | | | | | | For now, the cache key is all of FETCH_COMPILE_STATE. Use new/delete for swr_vertex_element_state, since we have to call the constructors/destructors of the struct elements. Reviewed-by: Bruce Cherniak <[email protected]>
* radeon: fix r600 builds when old version of llvm is presentTimothy Arceri2017-02-231-2/+2
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* r600/radeonsi: enable glsl/tgsi on-disk cacheTimothy Arceri2017-02-232-0/+46
| | | | | | | | | | For gpu generations that use LLVM we create a timestamp string containing both the LLVM and Mesa build times, otherwise we just use the Mesa build time. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug/rbug/trace: add get_disk_shader_cache() to pass-throughsTimothy Arceri2017-02-233-0/+39
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: isolate util_queue_fence implementationMarek Olšák2017-02-222-4/+4
| | | | | | it's cleaner this way. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix issues with monolithic shadersMarek Olšák2017-02-211-1/+2
| | | | | | | | | | | | | | | | R600_DEBUG=mono has had no effect since: commit 1fabb297177069e95ec1bb7053acb32f8ec3e092 Author: Marek Olšák <[email protected]> Date: Tue Feb 14 22:08:32 2017 +0100 radeonsi: have separate LS and ES main shader parts in the shader selector Also, this assertion was failing: si_state_shaders.c:1307: si_shader_select_with_key: Assertion `!shader->is_optimized' failed. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set no-signed-zeros-fp-mathMarek Olšák2017-02-212-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recommended by Matt Arsenault. 46757 shaders in 28742 tests Totals: SGPRS: 2068851 -> 2066907 (-0.09 %) VGPRS: 1604056 -> 1602676 (-0.09 %) Spilled SGPRs: 1402 -> 1382 (-1.43 %) Spilled VGPRs: 113 -> 113 (0.00 %) Private memory VGPRs: 1332 -> 1332 (0.00 %) Scratch size: 3224 -> 3188 (-1.12 %) dwords per thread Code Size: 58815520 -> 58716788 (-0.17 %) bytes LDS: 1162 -> 1162 (0.00 %) blocks Max Waves: 354616 -> 354905 (0.08 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 786452 -> 784508 (-0.25 %) VGPRS: 530000 -> 528620 (-0.26 %) Spilled SGPRs: 958 -> 938 (-2.09 %) Spilled VGPRs: 85 -> 85 (0.00 %) Private memory VGPRs: 636 -> 636 (0.00 %) Scratch size: 1880 -> 1844 (-1.91 %) dwords per thread Code Size: 26349936 -> 26251204 (-0.37 %) bytes LDS: 304 -> 304 (0.00 %) blocks Max Waves: 108962 -> 109251 (0.27 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Nicolai Hähnle <[email protected]>
* gallivm: add no-signed-zeros-fp-math option to lp_create_builder (v2)Marek Olšák2017-02-211-1/+5
| | | | | | v2: define lp_float_mode Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: skip TESSINNER/OUTER offchip stores if TES doesn't read themMarek Olšák2017-02-213-15/+77
| | | | | | | | | | We were unconditionally storing these outputs, sometimes even one component at a time, but apps never read them in TES. Move the TESSINNER/OUTER buffer stores into the TCS epilog where we can easily disable them on demand. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: skip LDS stores in TCS if there are no LDS output readsMarek Olšák2017-02-211-1/+16
| | | | | | | | | | | This removes a lot of useless LDS stores. A few games read TESSINNER/OUTER, but not any other outputs. Most games don't read any outputs. The only app doing LDS output reads is UE4 Lightsroom Interior. Reviewed-by: Nicolai Hähnle <[email protected]>
* etnaviv: remove number of pixel pipes validationChristian Gmeiner2017-02-211-10/+0
| | | | | | | | | | | | | This validation was added before the etnaviv drm driver landed in the linux kernel. Due some pre-merge API changes we had to fix-up this value but with a mainline kernel this is not a problem anymore. Lets remove that validation which also gets rid of problem caught by Coverity, reported to me by imirkin. Cc: "17.0" <[email protected]> Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* etnaviv: move pctx initialisation to avoid a null dereferenceChristian Gmeiner2017-02-211-6/+6
| | | | | | | | | | | In case ctx->stream == NULL the fail label gets executed where pctx gets dereferenced - too bad pctx is NULL in that case. Caught by Coverity, reported to me by imirkin. Cc: "17.0" <[email protected]> Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* etnaviv: add missing fallthrough annotationChristian Gmeiner2017-02-211-0/+1
| | | | | | | Caught by Coverity, reported to me by imirkin. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIKNicolai Hähnle2017-02-216-19/+43
| | | | | | | | | | The same PS epilog workaround as for 8-bit integer formats is required, since the CB doesn't do clamping. Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*. Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: handle MultiDrawIndirect in si_get_draw_start_countNicolai Hähnle2017-02-211-7/+53
| | | | | | | | | | | | | | | | | | | | | Also handle the GL_ARB_indirect_parameters case where the count itself is in a buffer. Use transfers rather than mapping the buffers directly. This anticipates the possibility that the buffers are sparse (once ARB_sparse_buffer is implemented), in which case they cannot be mapped directly. Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type on <= CIK. v2: - unmap the indirect buffer correctly - handle the corner case where we have indirect draws, but all of them have count 0. Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* nvc0: use PascalB for most Pascal boardsBen Skeggs2017-02-212-1/+9
| | | | Signed-off-by: Ben Skeggs <[email protected]>
* r300g: only allow byteswapped formats on big endianGrazvydas Ignotas2017-02-211-0/+5
| | | | | | | | | They cause regressions on little endian. Fixes: 172bfdaa9e ("r300g: add support for PIPE_FORMAT_x8R8G8B8_*") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98869 Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* android: radeonsi: fix sid_table.h generated header include pathMauro Rossi2017-02-201-1/+3
| | | | | | | | | | | | | | | generated-sources-dir-for macro replaces intermediates-dir-for and LOCAL_MODULE_CLASS is defined as required by new macro, in order to avoid the following building error: external/mesa/src/gallium/drivers/radeonsi/si_debug.c:29:10: fatal error: 'sid_tables.h' file not found ^ 1 error generated. Fixes: 730574c58e8 ("android: ac/debug: move sid_tables.h generation and IB decode to amd/common") Acked-by: Nicolai Hähnle <[email protected]> Acked-by: Emil Velikov <[email protected]>
* gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionallyMarek Olšák2017-02-193-3/+5
| | | | | | | | It's OK for r300g (because r300g can't write to buffers via the GPU), but not later hardware. This issue was spotted randomly. Cc: [email protected] Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)Marek Olšák2017-02-191-2/+2
| | | | | | | | | | | start can only be non-zero with MultiDrawElements, which is unlikely to occur with UNSIGNED_BYTE indices. v2: Also fix the util_shorten_ubyte_elts_to_userptr call. Tested with the new piglit. Cc: [email protected] Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: remove TGSI_OPCODE_CLAMPMarek Olšák2017-02-185-18/+3
| | | | | | | Not used and not widely supported. Use MIN+MAX instead. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: stop using TGSI_OPCODE_CLAMP by moving it amd/commonMarek Olšák2017-02-183-24/+6
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use R600_RESOURCE_FLAG_UNMAPPABLE where it's desirableMarek Olšák2017-02-185-26/+50
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add R600_RESOURCE_FLAG_UNMAPPABLEMarek Olšák2017-02-182-2/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: change r600_aligned_buffer_create to take flags, not bindMarek Olšák2017-02-182-4/+4
| | | | | | All call sites set bind = 0. The next commit will use this. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: upload constants into VRAM instead of GTTMarek Olšák2017-02-184-10/+18
| | | | | | | | This lowers lgkm wait cycles by 30% on VI and normal conditions. The might be a measurable improvement when CE is disabled (radeon) or under L2 thrashing. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use TCC line size as alignment in other placesMarek Olšák2017-02-184-5/+9
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a clever alignment for index buffer uploadsMarek Olšák2017-02-181-4/+7
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a clever alignment for descriptor uploadsMarek Olšák2017-02-181-4/+7
| | | | | | | Non-VBO descriptors won't be smaller than the cache line, so simply use the cache line size. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a clever alignment for constant buffer uploadsMarek Olšák2017-02-183-1/+19
| | | | | | This results in a very tiny decrease in lgkm wait cycles. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move index buffer flushing into a non-upload indexed caseMarek Olšák2017-02-181-7/+6
| | | | | | The other codepaths don't need this. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use SI_MAX_ATTRIBS where it should be usedMarek Olšák2017-02-184-5/+5
| | | | | | for consistency; no change in behavior Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: sort members of si_shader_key::partMarek Olšák2017-02-181-6/+6
| | | | | | and improve some comments Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: have separate LS and ES main shader parts in the shader selectorMarek Olšák2017-02-183-5/+49
| | | | | | | This might reduce the on-demand compilation if the initial VS/LS/ES determination is wrong. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't compile pure monolithic shaders asynchronouslyMarek Olšák2017-02-181-2/+6
| | | | | | there is no point, we have to wait anyway. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VIMarek Olšák2017-02-181-3/+9
| | | | | | | | | So that we can disable u_vbuf for GL core profiles. This is a v2 of the previous VI-only patch. It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove the fix_size3 workaroundMarek Olšák2017-02-183-36/+0
| | | | | | not needed with the shader fallback Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a workaround for clamping unaligned RGB 8 & 16-bit vertex loadsMarek Olšák2017-02-185-6/+60
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make fix_fetch an array of uint8_tMarek Olšák2017-02-185-23/+25
| | | | | | so that we can add 3-component fallbacks. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_suballoc: allow setting pipe_resource::flagsMarek Olšák2017-02-183-4/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* swr: remove unneeded extern "C"George Kyriazis2017-02-161-3/+0
| | | | | | the guards have been added to the header files that needed them. Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi: use shared emit_umsb helper.Dave Airlie2017-02-161-22/+2
| | | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: use shared emit imsb code.Dave Airlie2017-02-161-25/+3
| | | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/radeon: add a HUD query for monitoring the CS thread activityMarek Olšák2017-02-153-1/+25
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement uploading zero-stride vertex attribsMarek Olšák2017-02-141-8/+23
| | | | | | This is the only kind of user buffer we can get with the GL core profile. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: include SDMA in the GPU load queryMarek Olšák2017-02-142-1/+12
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add an assertion to texture_transfer_map for app bugsMarek Olšák2017-02-141-0/+1
| | | | | Tested-by: Kai Wasserbäch <[email protected]> Reviewed-by: Kai Wasserbäch <[email protected]>
* radeonsi: implement legacy GL_DOUBLE vertex formatsMarek Olšák2017-02-143-21/+117
| | | | | | so that we can disable u_vbuf for GL core profiles. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clean up si_get_paramMarek Olšák2017-02-141-19/+11
| | | | | | has_streamout is always true Reviewed-by: Nicolai Hähnle <[email protected]>