Commit message | Author | Age | Files | Lines
* r600: add support for ARB_shader_clock. | Dave Airlie | 2018-01-18 | 4 | -6/+30
    Reviewed-by: Gert Wollny <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ws: get rid of useless return value | Dave Airlie | 2018-01-18 | 1 | -3/+2
    This also used boolean, so nice to kill that.
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: Initialize DCC on transition from preinitialized. | Bas Nieuwenhuizen | 2018-01-18 | 1 | -1/+3
    Looks like the decompress does not handle invalid encodings well, which happens with random memory. Of course apps should not use it with random memory, but they are allowed to ....
    Fixes: 44fcf58744 "radv: Disable DCC for GENERAL layout and compute transfer dest."
    Reviewed-by: Dave Airlie <[email protected]>
* ac: fix buffer overflow bug in 64bit SSBO loads | Timothy Arceri | 2018-01-18 | 1 | -1/+4
    Fixes: 441ee1e65b04 "radv/ac: Implement Float64 SSBO loads"
    Reviewed-by: Marek Olšák <[email protected]>
* ac: fix nir_intrinsic_get_buffer_size for radeonsi | Timothy Arceri | 2018-01-18 | 1 | -2/+2
    Reviewed-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: Pass brw_growing_bo to grow_buffer(). | Kenneth Graunke | 2018-01-17 | 1 | -11/+9
    Cleaner.
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Make a helper for recreating growing buffers. | Kenneth Graunke | 2018-01-17 | 1 | -13/+17
    Now that we have two of these, we're duplicating a bunch of this logic. The next commit will add more logic, which would make the duplication seem worse.
    This ends up setting EXEC_OBJECT_CAPTURE on the batch, which isn't necessary (it's already captured), but it should be harmless.
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Replace cpu_map pointers with a "use_shadow_copy" boolean. | Kenneth Graunke | 2018-01-17 | 2 | -21/+20
    Having a boolean for "we're using malloc'd shadow copies for all buffers" is cleaner than having a cpu_map pointer for each. It was okay when we had one buffer, but this is more obvious.
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Optimize and simplify the copy propagation dataflow logic. | Francisco Jerez | 2018-01-17 | 1 | -24/+11
    Previously the dataflow propagation algorithm would calculate the ACP live-in and -out sets in a two-pass fixed-point algorithm. The first pass would update the live-out sets of all basic blocks of the program based on their live-in sets, while the second pass would update the live-in sets based on the live-out sets. This is incredibly inefficient in the typical case where the CFG of the program is approximately acyclic, because it can take up to 2*n passes for an ACP entry introduced at the top of the program to reach the bottom (where n is the number of basic blocks in the program), until which point the algorithm won't be able to reach a fixed point.

    The same effect can be achieved in a single pass by computing the live-in and -out sets in lock-step, because that makes sure that processing of any basic block will pick up the updated live-out sets of the lexically preceding blocks. This gives the dataflow propagation algorithm effectively O(n) run-time instead of O(n^2) in the acyclic case.

    The time spent in dataflow propagation is reduced by 30x in the GLES31.functional.ssbo.layout.random.all_shared_buffer.5 dEQP test-case on my CHV system (the improvement is likely to be of the same order of magnitude on other platforms). This more than reverses an apparent run-time regression in this test-case from my previous copy-propagation undefined-value handling patch, which was ultimately caused by the additional work introduced in that commit to account for undefined values being multiplied by a huge quadratic factor.

    According to Chad this test was failing on CHV due to a 30s time-out imposed by the Android CTS (this was the case regardless of my undefined-value handling patch, even though my patch substantially exacerbated the issue). On my CHV system this patch reduces the overall run-time of the test by approximately 12x, getting us to around 13s, well below the time-out.

    v2: Initialize live-out set to the universal set to avoid rather pessimistic dataflow estimation in shaders with cycles (Addresses performance regression reported by Eero in GpuTest Piano). Performance numbers given above still apply. No shader-db changes with respect to master.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104271
    Reported-by: Chad Versace <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
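The difference between the two-pass and lock-step schedules described in this commit message can be illustrated with a small sketch. It is not the Mesa code: it assumes a straight-line CFG where block b has block b - 1 as its only predecessor, uses 32-bit masks instead of the real bitsets, and starts from empty sets rather than the universal live-out set mentioned in the v2 note. All names (`struct block`, `solve_two_pass`, `solve_lock_step`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified model: a straight-line CFG in which the only
 * predecessor of block b is block b - 1.  Sets are 32-bit masks, one bit
 * per ACP entry.  None of these names come from the Mesa sources. */
struct block {
   uint32_t gen;     /* entries made available inside the block */
   uint32_t kill;    /* entries invalidated inside the block */
   uint32_t livein;
   uint32_t liveout;
};

/* Build a chain of n blocks in which block 0 introduces entry 0. */
static void init_chain(struct block *blocks, int n)
{
   assert(n > 0);
   memset(blocks, 0, (size_t)n * sizeof(blocks[0]));
   blocks[0].gen = 1u;
}

/* Store value into *set and report whether the set changed. */
static bool update(uint32_t *set, uint32_t value)
{
   if (*set == value)
      return false;
   *set = value;
   return true;
}

/* Live-out is what the block generates plus what survives from live-in. */
static uint32_t transfer(const struct block *b)
{
   return b->gen | (b->livein & ~b->kill);
}

/* Two-pass schedule: update every live-out, then every live-in.
 * Returns the number of outer iterations until nothing changes. */
static int solve_two_pass(struct block *blocks, int n)
{
   int iterations = 0;
   bool progress;
   do {
      progress = false;
      iterations++;
      for (int b = 0; b < n; b++)
         progress |= update(&blocks[b].liveout, transfer(&blocks[b]));
      for (int b = 1; b < n; b++)
         progress |= update(&blocks[b].livein, blocks[b - 1].liveout);
   } while (progress);
   return iterations;
}

/* Lock-step schedule: each block refreshes its live-in from the already
 * updated predecessor and then immediately recomputes its live-out. */
static int solve_lock_step(struct block *blocks, int n)
{
   int iterations = 0;
   bool progress;
   do {
      progress = false;
      iterations++;
      for (int b = 0; b < n; b++) {
         if (b > 0)
            progress |= update(&blocks[b].livein, blocks[b - 1].liveout);
         progress |= update(&blocks[b].liveout, transfer(&blocks[b]));
      }
   } while (progress);
   return iterations;
}
```

On an 8-block chain built with `init_chain`, the two-pass solver in this sketch needs 9 outer iterations (one per block plus a final one that detects the fixed point), while the lock-step solver needs 2 regardless of chain length. Each outer iteration walks all blocks, which is where the O(n^2) versus O(n) behaviour in the acyclic case comes from.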
* gallium: remove PIPE_CAP_USER_CONSTANT_BUFFERS | Marek Olšák | 2018-01-17 | 18 | -26/+0
    Reviewed-by: Roland Scheidegger <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* st/mesa: assume that user constant buffers are always supported | Marek Olšák | 2018-01-17 | 4 | -34/+6
    Reviewed-by: Roland Scheidegger <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* nine: assume that user constant buffers are always supported | Marek Olšák | 2018-01-17 | 4 | -156/+4
    Tested-by: Dieter Nützel <[email protected]>
* gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP | Marek Olšák | 2018-01-17 | 19 | -27/+3
    Reviewed-by: Roland Scheidegger <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* st/mesa: expose ARB_sync unconditionally | Marek Olšák | 2018-01-17 | 1 | -5/+2
    All drivers support it.
    Reviewed-by: Roland Scheidegger <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* gallium: remove PIPE_CAP_TWO_SIDED_STENCIL | Marek Olšák | 2018-01-17 | 20 | -27/+3
    Reviewed-by: Roland Scheidegger <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* glsl: remove unneeded extern "C" {} bracketing around Mesa includes | Brian Paul | 2018-01-17 | 1 | -4/+2
    The two headers already have the right extern "C" annotations.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: move gl_external_samplers() to program.[ch] | Brian Paul | 2018-01-17 | 3 | -17/+22
    The function is only called from a couple places. It doesn't make sense to have it in mtypes.h
    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: include util/bitscan.h in st_glsl_to_tgsi_temprename.cpp | Brian Paul | 2018-01-17 | 1 | -5/+6
    And use "" instead of <> for including Mesa headers, as we do elsewhere.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: include util/bitscan.h in serialize.cpp | Brian Paul | 2018-01-17 | 1 | -0/+1
    Instead of relying on indirect inclusion of the header.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* util: include string.h in u_dynarray.h | Brian Paul | 2018-01-17 | 1 | -0/+1
    To get memset() prototype.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: remove unneeded #includes of main/compiler.h | Brian Paul | 2018-01-17 | 16 | -16/+0
    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: remove unneeded #includes of main/compiler.h | Brian Paul | 2018-01-17 | 13 | -19/+0
    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: include main/compiler.h in st_cb_queryobj.c | Brian Paul | 2018-01-17 | 1 | -0/+1
    To get CPU_TO_LE32() macro.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: include util/macros.h in format_fallback.c | Brian Paul | 2018-01-17 | 1 | -0/+1
    To get definition of unreachable() macro.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: include compiler.h in disk_cache.c | Brian Paul | 2018-01-17 | 1 | -0/+1
    Instead of indirect inclusion to get CPU_TO_LE32() macro.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/program: change validate_inputs() local var 'inputs' to GLbitfield64 | Brian Paul | 2018-01-17 | 1 | -1/+1
    Both state->prog->info.inputs_read and state->InputsBound are GLbitfield64 so it seems that the OR of those values should be of the same type. I'm not sure this fixes any actual issues though.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
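A minimal sketch of the kind of truncation the type change guards against, assuming the narrower type is 32 bits wide. The typedefs and function names below are stand-ins, not the Mesa definitions, and the commit message itself notes that no concrete bug is known to be fixed.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the GL bitfield typedefs (assumed widths: 32 and 64 bits). */
typedef uint32_t bitfield32_sketch;
typedef uint64_t bitfield64_sketch;

/* Storing the OR of two 64-bit masks in a 32-bit variable silently drops
 * every bit at position 32 and above. */
static bitfield32_sketch combine_narrow(bitfield64_sketch inputs_read,
                                        bitfield64_sketch inputs_bound)
{
   return (bitfield32_sketch)(inputs_read | inputs_bound);
}

/* Keeping the result 64 bits wide preserves all bits. */
static bitfield64_sketch combine_wide(bitfield64_sketch inputs_read,
                                      bitfield64_sketch inputs_bound)
{
   return inputs_read | inputs_bound;
}
```

With `inputs_read` set to bit 35 and `inputs_bound` set to bit 0, `combine_narrow` keeps only bit 0, while `combine_wide` keeps both.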
* vbo: reindent vbo_attrib.h to use 3 spaces | Brian Paul | 2018-01-17 | 1 | -50/+50
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: whitespace, formatting fixes in vbo_exec_api.c | Brian Paul | 2018-01-17 | 1 | -99/+98
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: add assertions, comments in vbo_exec_api.c | Brian Paul | 2018-01-17 | 1 | -1/+7
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: whitespace, formatting fixes in vbo_exec_draw.c | Brian Paul | 2018-01-17 | 1 | -64/+64
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: use inputs_read var to simplify code | Brian Paul | 2018-01-17 | 2 | -8/+8
    v2: add some const qualifiers, per Ian.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: whitespace, formatting fixes in vbo_split_copy.c | Brian Paul | 2018-01-17 | 1 | -160/+144
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: use a new local 'array' variable in bind_vertex_list() loop | Brian Paul | 2018-01-17 | 1 | -12/+13
    Make the code a bit more concise.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: remove unneeded #includes in vbo_context.c | Brian Paul | 2018-01-17 | 1 | -2/+0
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: whitespace, formatting fixes in vbo_context.c | Brian Paul | 2018-01-17 | 1 | -24/+35
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: change vbo_context attribute map arrays to GLubyte | Brian Paul | 2018-01-17 | 4 | -5/+8
    The values will never be larger than VBO_ATTRIB_MAX (currently 44).
    v2: add STATIC_ASSERT to be sure VBO_ATTRIB_MAX can fit in ubyte, per Emil.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
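The compile-time guard mentioned in the v2 note can be sketched as follows. This uses C11 `static_assert` as a stand-in for Mesa's STATIC_ASSERT macro. The value 44 is the one quoted in the commit message, while the `GLubyte` typedef, the struct and the helper are illustrative assumptions rather than the real vbo_context layout.

```c
#include <assert.h>
#include <limits.h>

/* Value quoted in the commit message; defined here only for illustration. */
#define VBO_ATTRIB_MAX 44

/* Stand-in for the GL typedef (assumed to be an 8-bit unsigned type). */
typedef unsigned char GLubyte;

/* Fails the build if the attribute count ever outgrows one byte. */
static_assert(VBO_ATTRIB_MAX <= UCHAR_MAX,
              "VBO_ATTRIB_MAX must fit in a GLubyte");

/* Hypothetical map array: one byte per attribute instead of a wider int. */
struct attr_map_sketch {
   GLubyte map[VBO_ATTRIB_MAX];
};

/* Fill the map with the identity mapping; every index fits in a byte. */
static void init_identity_map(struct attr_map_sketch *m)
{
   for (unsigned i = 0; i < VBO_ATTRIB_MAX; i++)
      m->map[i] = (GLubyte)i;
}
```

The assertion costs nothing at run time. It only documents and enforces the assumption that makes the narrower array type safe.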
* vbo: lift common code out of switch cases | Brian Paul | 2018-01-17 | 2 | -18/+12
    Both switch cases began with the same code.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vbo: optimize some display list drawing (v2) | Brian Paul | 2018-01-17 | 3 | -0/+40
    The vbo_save_vertex_list structure records one or more glBegin/End primitives which all have the same vertex format. To draw these primitives, we setup the vertex array state, then issue the drawing command.

    Before, the 'start' vertex was typically zero and we used the vertex array pointer to indicate where the vertex data starts. This patch checks if the vertex buffer offset is an exact multiple of the vertex size. If so, that means we can use zero-based vertex array pointers and use the draw's start value to indicate where the vertex data starts.

    This means a series of display list drawing commands may have identical vertex array state. This will get filtered out by the Gallium CSO module so we can issue a tight series of drawing commands without state changes to the device.

    Note that this also works for a series of glCallList commands (not just one list that contains multiple glBegin/End pairs).

    No Piglit or conform changes.

    v2: minor fixes suggested by Ian.

    Reviewed-by: Ian Romanick <[email protected]>
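The offset check described in this commit message can be sketched as below. This is an illustration under assumptions, not the code added by the patch: `struct draw_setup_sketch` and `choose_draw_setup` are hypothetical names, and both the buffer offset and the vertex size are taken to be in bytes.

```c
#include <assert.h>

/* Hypothetical result: where the vertex arrays point and which vertex
 * index the draw call starts from. */
struct draw_setup_sketch {
   unsigned array_offset;  /* byte offset baked into the array pointers */
   unsigned start;         /* first vertex passed to the draw command */
};

/* If the list's data begins on a whole-vertex boundary, keep the array
 * pointers zero-based and express the position through 'start'.
 * Otherwise fall back to pointing the arrays at the data directly. */
static struct draw_setup_sketch
choose_draw_setup(unsigned buffer_offset, unsigned vertex_size)
{
   struct draw_setup_sketch s;

   assert(vertex_size > 0);

   if (buffer_offset % vertex_size == 0) {
      s.array_offset = 0;
      s.start = buffer_offset / vertex_size;
   } else {
      s.array_offset = buffer_offset;
      s.start = 0;
   }
   return s;
}
```

For a 24-byte vertex, a list stored at byte offset 96 yields zero-based arrays with a start of 4, so consecutive lists with the same format share identical array state. A list at offset 100 takes the fallback path.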
* vbo: rewrite some code in playback_copy_to_current() | Brian Paul | 2018-01-17 | 1 | -8/+6
    I think this is a little easier to understand.
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: add some comments in vbo_save_api.c | Brian Paul | 2018-01-17 | 1 | -0/+17
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: rename some functions in vbo_save_api.c | Brian Paul | 2018-01-17 | 1 | -37/+37
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: rename some functions in vbo_save_draw.c | Brian Paul | 2018-01-17 | 1 | -9/+9
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: add comment that vbo_save_vertex_list::buffer_offset is in bytes | Brian Paul | 2018-01-17 | 1 | -1/+1
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: minor code simplification in _save_compile_vertex_list() | Brian Paul | 2018-01-17 | 1 | -4/+5
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: rename prim to prims | Brian Paul | 2018-01-17 | 3 | -47/+47
    Using a plural name makes it easier to see that this is an array and not a pointer to a single object.
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: removed unused ctx parameter for alloc_prim_store() | Brian Paul | 2018-01-17 | 1 | -4/+3
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: rename vbo_save_context::buffer to buffer_map | Brian Paul | 2018-01-17 | 2 | -9/+9
    And move the field and improve comments.
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: remove unused vbo_save_context::count field | Brian Paul | 2018-01-17 | 1 | -1/+0
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: s/GLuint/GLbitfield/ for vbo_save_context::replay_flags | Brian Paul | 2018-01-17 | 1 | -1/+1
    Reviewed-by: Ian Romanick <[email protected]>
* vbo: rename vbo_save_vertex_list::count to vertex_count | Brian Paul | 2018-01-17 | 3 | -12/+13
    Reviewed-by: Ian Romanick <[email protected]>