path: root/src/gallium
Commit message | Author | Age | Files | Lines
* freedreno: cap cleanups | Rob Clark | 2015-08-12 | 2 files | -16/+16
  Move a few things around to group stuff that is common to a3xx/a4xx together. Also, introduce is_ir3() for things that are more specific to the compiler / shader-ISA than to the GPU generation.
  Signed-off-by: Rob Clark <[email protected]>
* gallium/radeon: fix r600g build if LLVM is disabled | Marek Olšák | 2015-08-11 | 1 file | -4/+5
  MESA_LLVM_VERSION_PATCH is undefined.
  Reviewed-by: Edward O'Callaghan <eocallaghan at alterapraxis.com>
  Tested-by: Benjamin Bellec <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g: use a bitfield to track dirty atoms | Grazvydas Ignotas | 2015-08-11 | 4 files | -10/+56
  r600 currently has 73 atoms, and looping through their dirty flags has become costly because checking each flag requires a pointer dereference before the read. To avoid that, add an additional bitfield which can be checked very quickly thanks to the tzcnt instruction.
  An id field was added to struct r600_atom, but that doesn't affect memory usage on either 32-bit or 64-bit CPUs because it was stuffed into padding.
  The performance improvement is ~2% for benchmarks that can have FPS in the thousands but is hardly measurable in "real" programs.
  Signed-off-by: Marek Olšák <[email protected]>
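The dirty-bitfield idea described in this commit can be sketched roughly as follows. This is not the r600g code: the struct names, the `emitted` counter, and the two-word mask layout are assumptions made for illustration, and `__builtin_ctzll` is a GCC/Clang builtin standing in for the tzcnt-style bit scan.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ATOMS 73                      /* atom count quoted in the commit message */
#define NUM_WORDS ((NUM_ATOMS + 63) / 64) /* 64 dirty bits per mask word */

struct atom {
    unsigned id;      /* index of this atom's bit in the dirty mask */
    unsigned emitted; /* hypothetical counter standing in for real emit work */
};

struct atom_context {
    struct atom *atoms[NUM_ATOMS];
    uint64_t dirty_mask[NUM_WORDS];
};

/* Marking an atom dirty only sets one bit; no per-atom flag is touched. */
void mark_atom_dirty(struct atom_context *ctx, const struct atom *a)
{
    assert(a->id < NUM_ATOMS);
    ctx->dirty_mask[a->id / 64] |= UINT64_C(1) << (a->id % 64);
}

/* Visit only the set bits instead of dereferencing every atom.
 * Returns the number of atoms emitted. */
unsigned emit_dirty_atoms(struct atom_context *ctx)
{
    unsigned count = 0;
    for (unsigned w = 0; w < NUM_WORDS; w++) {
        uint64_t mask = ctx->dirty_mask[w];
        while (mask) {
            unsigned bit = (unsigned)__builtin_ctzll(mask); /* lowest set bit */
            mask &= mask - 1;                               /* clear that bit */
            ctx->atoms[w * 64 + bit]->emitted++;
            count++;
        }
        ctx->dirty_mask[w] = 0; /* everything in this word is clean again */
    }
    return count;
}
```

The cost of the emit loop then scales with the number of dirty atoms rather than with the total number of atoms, which is where a saving of the kind the commit reports would come from.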
* r600g: don't mark unused atom dirty | Grazvydas Ignotas | 2015-08-11 | 1 file | -1/+3
  On evergreen config_state is not used, so don't mark it dirty.
  Signed-off-by: Marek Olšák <[email protected]>
* r600g: use a helper to add an initialized atom | Grazvydas Ignotas | 2015-08-11 | 4 files | -8/+16
  Instead of writing to rctx->atoms directly, use a helper to take advantage of assert checks.
  Signed-off-by: Marek Olšák <[email protected]>
* gallium/radeon: use helper functions to mark atoms dirty | Grazvydas Ignotas | 2015-08-11 | 19 files | -145/+182
  This is analogous to r300_mark_atom_dirty() used by r300, and will be used by later patches.
  For common radeon code, the appropriate helper is called through a function pointer.
  No functional changes.
  Signed-off-by: Marek Olšák <[email protected]>
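A minimal sketch of how shared code can mark atoms dirty through a function pointer, so that each driver supplies its own helper. All names here (`common_context`, `driver_a_set_atom_dirty`, and so on) are hypothetical and do not reproduce the actual radeon structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct common_atom {
    bool dirty;
};

struct common_context {
    /* Installed once by whichever driver owns the context. */
    void (*set_atom_dirty)(struct common_context *ctx,
                           struct common_atom *atom, bool dirty);
    unsigned num_dirty; /* hypothetical bookkeeping used only by driver B */
};

/* Hypothetical driver A: only flips the per-atom flag. */
void driver_a_set_atom_dirty(struct common_context *ctx,
                             struct common_atom *atom, bool dirty)
{
    (void)ctx;
    atom->dirty = dirty;
}

/* Hypothetical driver B: also keeps a running count of dirty atoms. */
void driver_b_set_atom_dirty(struct common_context *ctx,
                             struct common_atom *atom, bool dirty)
{
    if (atom->dirty == dirty)
        return; /* no state change, so the count stays as it is */
    if (dirty)
        ctx->num_dirty++;
    else
        ctx->num_dirty--;
    atom->dirty = dirty;
}

/* Shared code never writes driver state directly; it goes through the hook. */
void common_invalidate(struct common_context *ctx, struct common_atom *atom)
{
    assert(ctx->set_atom_dirty != NULL);
    ctx->set_atom_dirty(ctx, atom, true);
}
```

Routing every write through one helper per driver is what lets a later change, such as adding a dirty bitmask, be made in a single place.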
* gallium/radeon: add a debug flag not to use write combining (v2) | Marek Olšák | 2015-08-10 | 3 files | -0/+5
  v2: just clear the flag before the allocation
  Reviewed-by: Michel Dänzer <[email protected]>
* freedreno/a4xx: add s8/z32/z32_s8x24 support | Rob Clark | 2015-08-10 | 4 files | -37/+151
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2015-08-10 | 5 files | -5/+183
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix vpsrepl for blit shaders | Rob Clark | 2015-08-10 | 1 file | -5/+14
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: clear cached fp when switching blit prog | Rob Clark | 2015-08-10 | 1 file | -0/+2
  For gmem restore (mem2gmem), we swap blit programs, in order to have a different frag shader for depth vs color restore. But we weren't actually clearing the cached fp, so it would not actually change the frag shader as expected.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: clear cached fp when switching blit prog | Rob Clark | 2015-08-10 | 1 file | -0/+2
  For gmem restore (mem2gmem), we swap blit programs, in order to have a different frag shader for depth vs color restore. But we weren't actually clearing the cached fp, so it would not actually change the frag shader as expected.
  Signed-off-by: Rob Clark <[email protected]>
* gallium: GCC 4.9 allows to include tmmintrin.h without -msse3. | Jose Fonseca | 2015-08-09 | 1 file | -2/+2
  Fixes the MinGW x86_64 build with GCC 4.9, which failed due to conflicting definitions of _mm_shuffle_epi8 in u_sse.h and the system headers.
  Trivial.
* vc4: add missing nir include, to fix the build | Emil Velikov | 2015-08-07 | 1 file | -0/+1
  Cc: 10.6 <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* vc4: automake: remove unused include | Emil Velikov | 2015-08-07 | 1 file | -1/+0
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* clover: Stub missing CL 1.2 functions. | Serge Martin (EdB) | 2015-08-07 | 6 files | -8/+65
  As suggested by Tom a long time ago, and in order to be able to create Piglit tests.
  v2: replace NOT_SUPPORTED_BY_CL_1_1 macro with an inline function;
      remove extra space in clLinkProgram arg
  v3: use __func__
  v4: back to a macro, it makes more sense to use it with __func__
  [ Francisco Jerez: Rename to CLOVER_NOT_SUPPORTED_UNTIL and pass the minimum API version required by the entry point so the error messages don't become stale when support for additional CL versions is introduced. ]
  Reviewed-by: Francisco Jerez <[email protected]>
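The v3/v4 notes turn on a property of `__func__`: inside a macro it expands in the calling function, whereas inside an inline helper it would always name the helper. A rough C sketch of that pattern follows. The macro name, message text, and stub are illustrative assumptions, not the clover implementation (which is C++).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats the diagnostic into buf. The function name is passed in because
 * __func__ evaluated here would always read "format_not_supported". */
int format_not_supported(char *buf, size_t len,
                         const char *func, const char *min_version)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s() requires OpenCL version %s or greater.",
                    func, min_version);
}

/* Hypothetical macro: __func__ expands at the call site, so the message
 * names the stubbed entry point, and the minimum version is a parameter
 * rather than being baked into the macro name. */
#define NOT_SUPPORTED_UNTIL(buf, len, min_version) \
    format_not_supported((buf), (len), __func__, (min_version))

/* Hypothetical stub for an entry point that is not implemented yet. */
int stub_link_program(char *msg, size_t len)
{
    NOT_SUPPORTED_UNTIL(msg, len, "1.2");
    return -1; /* placeholder error code, not a real CL error value */
}
```

Passing the minimum version as an argument mirrors the rename noted in the commit: the message stays accurate when support for further CL versions is added.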
* winsys/radeon: add a specific error message for cs_submit -> -ENOMEM | Marek Olšák | 2015-08-07 | 1 file | -4/+8
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* winsys/radeon: add an interface for contexts | Marek Olšák | 2015-08-07 | 10 files | -14/+55
  Same idea as in libdrm_amdgpu. A command stream can only be created for a specific context and it's always submitted to that context.
  This will mainly be used by amdgpu and it's required by the GPU reset status query too. (radeon only has a basic version of the query and thus doesn't need this)
  Reviewed-by: Christian König <[email protected]>
* gallium/radeon: unify buffer_wait and buffer_is_busy in the winsys interface | Marek Olšák | 2015-08-07 | 9 files | -57/+46
  The timeout parameter covers both cases.
  Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: rename enable_s3tc -> enable_compressed_formats | Marek Olšák | 2015-08-06 | 1 file | -5/+4
  Reviewed-by: Alex Deucher <[email protected]>
* gallium/radeon: add DRM and LLVM version to the renderer string | Marek Olšák | 2015-08-06 | 2 files | -4/+24
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: always flush framebuffer caches at the beginning of IBs | Marek Olšák | 2015-08-06 | 1 file | -1/+2
  better safe than sorry
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* radeonsi: don't count the exact needed CS space if the CS is large enough | Marek Olšák | 2015-08-06 | 1 file | -2/+11
  Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: don't crash when cleaning up after an incomplete context | Marek Olšák | 2015-08-06 | 1 file | -7/+11
  Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: add a HUD query showing the number of shaders created | Marek Olšák | 2015-08-06 | 4 files | -0/+17
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add a HUD query showing the number of compiler invocations | Marek Olšák | 2015-08-06 | 4 files | -1/+19
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: display cumulative results for some driver queries | Marek Olšák | 2015-08-06 | 1 file | -2/+4
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: switch the buffer-wait-time query to microseconds | Marek Olšák | 2015-08-06 | 2 files | -3/+3
  This displays the units in the HUD.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: change some driver query types to Hz | Marek Olšák | 2015-08-06 | 1 file | -2/+2
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/hud: automatically print % if max_value == 100 | Marek Olšák | 2015-08-06 | 1 file | -6/+11
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/hud: fix printing % next to panes | Marek Olšák | 2015-08-06 | 1 file | -1/+1
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/hud: replace assertions with clamping the unit index | Marek Olšák | 2015-08-06 | 1 file | -19/+23
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium,hud: allow displaying cumulative values instead of average | Marek Olšák | 2015-08-06 | 4 files | -8/+36
  The cumulative value is useful for queries like the number of shader compilations.
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/hud: fix printing byte units | Marek Olšák | 2015-08-06 | 1 file | -1/+1
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium,hud: add support for Hz units in driver queries | Marek Olšák | 2015-08-06 | 2 files | -0/+8
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: before storing tess levels, load them from LDS instead of temporary | Marek Olšák | 2015-08-06 | 1 file | -79/+57
  Also use only one store if stride <= 4.
  All the fetches from and stores to temporaries can be removed now.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91461
  Reviewed-by: Michel Dänzer <[email protected]>
* winsys/radeon: loosen up the requirements for how much memory IBs can use | Marek Olšák | 2015-08-06 | 1 file | -4/+9
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: always use the llvm. prefix in intrinsic names | Marek Olšák | 2015-08-06 | 1 file | -6/+16
  Acked-by: Michel Dänzer <[email protected]>
  Reviewed-by: Tom Stellard <[email protected]>
* radeon/winsys: increase the IB size for VM | Marek Olšák | 2015-08-06 | 4 files | -6/+17
  Luckily, there is a kernel query, so use the size from that. It currently returns 256KB. It can be increased in the kernel.
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: allow the winsys to choose the IB size | Marek Olšák | 2015-08-06 | 11 files | -18/+18
  Picked from the amdgpu branch.
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: suspend timer queries between IBs | Marek Olšák | 2015-08-06 | 5 files | -25/+66
  When we are measuring the time spent in a draw call, an unexpected flush can distort the result.
  Reviewed-by: Michel Dänzer <[email protected]>
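The distortion described here comes from counting the gap between two IBs as part of the measured interval. A minimal sketch of suspending and resuming a timer query around a flush is shown below, using simulated timestamps. The `timer_query` struct and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct timer_query {
    uint64_t begin_ns; /* timestamp of the most recent resume */
    uint64_t total_ns; /* time accumulated while the query was running */
    bool running;
};

/* Called at query begin and again at the start of each new IB. */
void timer_query_resume(struct timer_query *q, uint64_t now_ns)
{
    assert(!q->running);
    q->begin_ns = now_ns;
    q->running = true;
}

/* Called at query end and just before each IB is flushed, so the gap
 * between IBs is never added to the result. */
void timer_query_suspend(struct timer_query *q, uint64_t now_ns)
{
    assert(q->running && now_ns >= q->begin_ns);
    q->total_ns += now_ns - q->begin_ns;
    q->running = false;
}
```

With a resume at 100, a flush at 150, a new IB at 400, and the end at 420, the accumulated time is 70 rather than the 320 that a single begin/end pair would report.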
* vc4: Use nir_lower_load_const_to_scalar(). | Eric Anholt | 2015-08-04 | 1 file | -0/+1
* vc4: Don't bother de-SSAing values that aren't part of phi webs. | Eric Anholt | 2015-08-04 | 1 file | -15/+44
  We can just support them the same way we do load_const's SSA values.
* vc4: Don't bother saturating the dst color for blending. | Eric Anholt | 2015-08-04 | 1 file | -8/+2
  Since we just pulled it out of the destination as 8-bit unorm, we know it's in [0, 1] already.
  shader-db:
  total instructions in shared programs: 100040 -> 98208 (-1.83%)
  instructions in affected programs: 14084 -> 12252 (-13.01%)
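The reasoning can be checked exhaustively, since an 8-bit unorm channel has only 256 possible values. The sketch below assumes the usual unorm decode of v / 255; the function names are made up for the example and are not vc4 code.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 8-bit unorm decode: 0 maps to 0.0 and 255 maps to 1.0. */
float unorm8_to_float(uint8_t v)
{
    return (float)v / 255.0f;
}

/* Clamp to [0, 1], i.e. the saturate the commit found unnecessary. */
float saturate(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Counts how many of the 256 decoded values a saturate would alter. */
unsigned count_values_changed_by_saturate(void)
{
    unsigned changed = 0;
    for (unsigned v = 0; v <= 255; v++) {
        float f = unorm8_to_float((uint8_t)v);
        assert(f >= 0.0f && f <= 1.0f); /* the invariant the commit relies on */
        if (saturate(f) != f)
            changed++;
    }
    return changed;
}
```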
* vc4: Make r4-writes implicitly move to a temp, and allocate temps to r4. | Eric Anholt | 2015-08-04 | 8 files | -107/+106
  Previously, SFU values always moved to a temporary, and TLB color reads and texture reads always lived in r4. Instead, we can have these results just be normal temporaries, and the register allocator can leave the values in r4 when they don't interfere with anything else using r4.
  shader-db results:
  total instructions in shared programs: 100809 -> 100040 (-0.76%)
  instructions in affected programs: 42383 -> 41614 (-1.81%)
* vc4: Drop a dead prototype. | Eric Anholt | 2015-08-04 | 1 file | -8/+0
* freedreno/a4xx: add independent blend function support | Rob Clark | 2015-08-04 | 2 files | -8/+10
  needed for MRT
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: MRT support | Rob Clark | 2015-08-04 | 12 files | -132/+212
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: move the half-precision logic into core | Rob Clark | 2015-08-04 | 4 files | -31/+38
  Both a3xx and a4xx need the same logic to decide if half-precision can be used for blit shaders. So move it to core and simplify things a bit with a helper that considers all render targets.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: simplify/cleanup resource status tracking | Rob Clark | 2015-08-04 | 4 files | -48/+71
  Collapse dirty/reading bools into status bitmask (and drop writing which should really be the same as dirty). And use 'used_resources' list for all tracking, including zsbuf/cbufs, rather than special casing the color and depth/stencil buffers.
  Signed-off-by: Rob Clark <[email protected]>
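A rough sketch of the two ideas in this commit: separate bools folded into one status bitmask, and a single list holding every resource touched since the last flush. The names (`RES_PENDING_WRITE`, `res_context`, `resource_used`) and the fixed-size array are assumptions for illustration and differ from the freedreno code.

```c
#include <assert.h>
#include <stddef.h>

/* One bitmask replaces the separate dirty/reading flags. */
enum resource_status {
    RES_PENDING_WRITE = 1 << 0,
    RES_PENDING_READ  = 1 << 1,
};

#define MAX_USED 16 /* hypothetical fixed capacity to keep the sketch short */

struct resource {
    unsigned status; /* zero means untouched since the last flush */
};

struct res_context {
    struct resource *used[MAX_USED]; /* every resource touched since the last flush */
    size_t num_used;
};

/* Single entry point for color, depth/stencil, and texture resources alike. */
void resource_used(struct res_context *ctx, struct resource *rsc, unsigned status)
{
    assert(status != 0);
    if (rsc->status == 0) { /* first use since the last flush: track it once */
        assert(ctx->num_used < MAX_USED);
        ctx->used[ctx->num_used++] = rsc;
    }
    rsc->status |= status;
}

/* On flush, every tracked resource is reset in one pass over the list. */
void context_flush(struct res_context *ctx)
{
    for (size_t i = 0; i < ctx->num_used; i++)
        ctx->used[i]->status = 0;
    ctx->num_used = 0;
}
```

Because list membership is derived from the status being zero or not, no resource type needs a special case at flush time.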