summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* i965/query: Cache whether the batch references the query BO.Kenneth Graunke2014-12-162-4/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chris Wilson noted that repeated calls to CheckQuery() would call drm_intel_bo_references(brw->batch.bo, query->bo) on each invocation, which is expensive. Once we've flushed, we know that future batches won't reference query->bo, so there's no point in asking more than once. This patch adds a brw_query_object::flushed flag, which is a conservative estimate of whether the batch has been flushed. On the first call to CheckQuery() or WaitQuery(), we check if the batch references query->bo. If not, it must have been flushed for some reason (such as being full). We record that it was flushed. If it does reference query->bo, we explicitly flush, and record that we did so. Any subsequent checks will simply see that query->flushed is set, and skip the drm_intel_bo_references() call. Inspired by a patch from Chris Wilson. According to Eero, this does not affect the performance of Witcher 2 on Haswell, but approximately halves the userspace CPU usage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/query: Use brw_bo_map to handle stall warnings.Kenneth Graunke2014-12-161-7/+1
| | | | | | | | | | This is less code and also measures the duration of the stall for us. Our old code predates the existance of brw_bo_map(). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/query: Remove redundant drm_intel_bo_references call in CheckQuery.Kenneth Graunke2014-12-161-7/+8
| | | | | | | | | | | | | | | | | | | | | | CheckQuery calls drm_intel_bo_references to see if the batch references the query BO, and if so, flushes. It then checks if the query BO is busy, and if not, calls gen6_queryobj_get_results(). Stupidly, gen6_queryobj_get_results() immediately did a second redundant drm_intel_bo_references check, even though we know the buffer is not referenced and in fact idle. This patch moves the batch-flush check out of gen6_queryobj_get_results and into WaitQuery() (the other caller). That way, both callers do a single batch-flush check. This should only be a minor improvement, since it would only affect the first CheckQuery call where the result is actually available. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/query: Add query->bo == NULL early return in CheckQuery hook.Kenneth Graunke2014-12-161-2/+8
| | | | | | | | | | | If query->bo == NULL, this is a redundant CheckQuery call, and we should simply return. We didn't do anything anyway - we skipped the batch flushing block, and although we called get_results(), it has an early return and does nothing. Why bother? Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/query: Set Ready flag in gen6_queryobj_get_results().Kenneth Graunke2014-12-161-2/+2
| | | | | | | | | | | | | | | q->Ready means that the results are in, and core Mesa is free to return them to the application. gen6_queryobj_get_results() is a natural place to set that flag; doing so means callers don't have to. The older non-hardware-context aware code couldn't do this, because we had to call brw_queryobj_get_results() to gather intermediate results when we ran out of space for snapshots in the query buffer. We only gather complete results in the Gen6+ code, however. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* vc4: Add support for turning add-based MOVs to muls for pairing.Eric Anholt2014-12-161-2/+49
| | | | | total instructions in shared programs: 43053 -> 40795 (-5.24%) instructions in affected programs: 37996 -> 35738 (-5.94%)
* vc4: Add a helper for changing a field in an instruction.Eric Anholt2014-12-162-11/+12
|
* vc4: Fix the name of qpu_waddr_ignores_ws().Eric Anholt2014-12-161-5/+5
| | | | We're deciding about the WS bit, not PM.
* util: remove support for GCC older than 4.1.0Timothy Arceri2014-12-171-1/+1
| | | | | | | Signed-off-by: Timothy Arceri <[email protected]> Reviewed-By: Jose Fonseca <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: remove support for GCC older than 4.1.0Timothy Arceri2014-12-171-1/+1
| | | | | | | Signed-off-by: Timothy Arceri <[email protected]> Reviewed-By: Jose Fonseca <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* gbm: remove support for GCC older than 4.1.0Timothy Arceri2014-12-171-1/+1
| | | | | | | Signed-off-by: Timothy Arceri <[email protected]> Reviewed-By: Jose Fonseca <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* gallium: remove support for GCC older than 4.1.0Timothy Arceri2014-12-172-5/+5
| | | | | | | Signed-off-by: Timothy Arceri <[email protected]> Reviewed-By: Jose Fonseca <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* egl: remove support for GCC older than 4.1.0Timothy Arceri2014-12-171-1/+1
| | | | | | | Signed-off-by: Timothy Arceri <[email protected]> Reviewed-By: Jose Fonseca <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: remove support for GCC older than 3.3.0Timothy Arceri2014-12-171-1/+1
| | | | | | | | | GCC >=3.3 has been required since 9aa3aa71386394725ce88df463d6183f62777ee5 Signed-off-by: Timothy Arceri <[email protected]> Reviewed-By: Jose Fonseca <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Add a comment explaining what saturate propagation does.Matt Turner2014-12-161-0/+14
|
* vc4: Add support for enabling early Z discards.Eric Anholt2014-12-161-0/+18
| | | | This is the same basic logic from the original Broadcom driver.
* st/mesa: remove extern "C" around #includes in st_glsl_to_tgsi.cppBrian Paul2014-12-161-4/+2
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* program: remove extern "C" usage in sampler.cppBrian Paul2014-12-161-5/+4
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* program: remove extern "C" around #includesBrian Paul2014-12-161-4/+3
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* glsl: remove extern "C" around #includesBrian Paul2014-12-163-7/+2
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: add extern "C" to st_context.hBrian Paul2014-12-161-0/+10
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: add extern "C" to st_program.hBrian Paul2014-12-161-0/+9
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* main: remove extern C around #includes in ff_fragment_shader.cppBrian Paul2014-12-161-5/+3
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* mesa: move #include of mtypes.h outside __cplusplus checkBrian Paul2014-12-161-2/+1
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* program: add #ifndef SAMPLER_H wrapperBrian Paul2014-12-161-0/+7
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* mesa: put extern "C" in src/mesa/program/*h header filesBrian Paul2014-12-164-0/+43
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* mesa: put extern "C" in header filesBrian Paul2014-12-164-0/+41
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* mapi: add glapi-test and shared-glapi-test to .gitignoreJuha-Pekka Heikkila2014-12-163-3/+2
| | | | | | | | On the same go remove src/mapi/shared-glapi/tests/.gitignore and src/mapi/glapi/tests/.gitignore as useless. Signed-off-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* util: add u_atomic_test to .gitignoreJuha-Pekka Heikkila2014-12-161-0/+1
| | | | | Signed-off-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glx: remove __glXstrdup()Juha-Pekka Heikkila2014-12-162-20/+0
| | | | | | | I didn't find this being used anywhere Signed-off-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: add test_vf_float_conversions to .gitignoreJuha-Pekka Heikkila2014-12-161-0/+1
| | | | | Signed-off-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Make validate_reg tables constantJuha-Pekka Heikkila2014-12-161-4/+4
| | | | | | | | Declare local tables constant. Signed-off-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* glsl: remove commented out codeTimothy Arceri2014-12-161-2/+0
| | | | | | | MaxGeometryOutputComponents is used as the value for gl_MaxGeometryVaryingComponents Acked-by: Matt Turner <[email protected]>
* i965: remove commented out codeTimothy Arceri2014-12-161-2/+0
| | | | Acked-by: Matt Turner <[email protected]>
* nvc0: add missed PIPE_CAP_VERTEXID_NOBASEIlia Mirkin2014-12-151-0/+1
| | | | | | Commit ade8b26bf missed adding this cap to nvc0. Signed-off-by: Ilia Mirkin <[email protected]>
* st/mesa: use vertex id lowering according to pipe cap bit.Roland Scheidegger2014-12-162-2/+10
| | | | | Tested with llvmpipe by setting the cap bit temporarily, seems to work, though no driver requests it for now.
* draw: implement support for the VERTEXID_NOBASE and BASEVERTEX semantics.Roland Scheidegger2014-12-164-19/+47
| | | | | | This fixes 4 vertexid related piglit tests with llvmpipe due to switching behavior of vertexid to the one gl expects. (Won't fix non-llvm draw path since we don't get the basevertex currently.)
* gallium: add TGSI_SEMANTIC_VERTEXID_NOBASE and TGSI_SEMANTIC_BASEVERTEXRoland Scheidegger2014-12-1619-6/+84
| | | | | | | | | | | | | | | | | | | Plus a new PIPE_CAP_VERTEXID_NOBASE query. The idea is that drivers not supporting vertex ids with base vertex offset applied (so, only support d3d10-style vertex ids) will get such a d3d10-style vertex id instead - with the caveat they'll also need to handle the basevertex system value too (this follows what core mesa already does). Additionally, this is also useful for other state trackers (for instance llvmpipe / draw right now implement the d3d10 behavior on purpose, but with different semantics it can just do both). Doesn't do anything yet. And fix up the docs wrt similar values. v2: incorporate feedback from Brian and others, better names, better docs. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* r600g/sb: implement r600 gpr index workaround. (v3.1)Dave Airlie2014-12-164-9/+57
| | | | | | | | | | | | | | | | | | | | | | | r600, rv610 and rv630 all have a bug in their GPR indexing and how the hw inserts access to PV. If the base index for the src is the same as the dst gpr in a previous group, then it will use PV instead of using the indexed gpr correctly. The workaround is to insert a NOP when you detect this. v2: add second part of fix detecting DST rel writes followed by same src base index reads. v3: forget adding stuff to structs, just iterate over the previous node group again, makes it more obvious. v3.1: drop local_nop. Fixes ~200 piglit regressions on rv635 since SB was introduced. Reviewed-By: Glenn Kennard <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g/sb: fix issues with loops created for switchVadim Girlin2014-12-165-4/+16
| | | | Signed-off-by: Dave Airlie <[email protected]>
* Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"Dave Airlie2014-12-161-38/+12
| | | | | | This reverts commit 7b0067d23a6f64cf83c42e7f11b2cd4100c569fe. Vadim's patch fixes this a lot better.
* vc4: Add support for 32-bit signed norm/scaled vertex attrs.Eric Anholt2014-12-152-0/+18
| | | | | 32-bit unsigned would require some adjustments to handle values >= 0x80000000.
* vc4: Add support for 16-bit signed/unsigned norm/scaled vertex attrs.Eric Anholt2014-12-156-6/+94
|
* vc4: Rename the 16-bit unpack #define.Eric Anholt2014-12-152-4/+4
| | | | | It's only an f16 conversion if you're doing a float operation, otherwise it's 16 bit signed to 32-bit signed.
* vc4: Add support for 8-bit unnormalized vertex attrs.Eric Anholt2014-12-156-11/+69
|
* vc4: Refactor vertex attribute conversions a bit.Eric Anholt2014-12-151-25/+40
| | | | There was just way too much indentation.
* vc4: Fix use of r3 as a temp in 8-bit unpacking.Eric Anholt2014-12-151-10/+7
| | | | | We're actually allocating out of r3 now, and I missed it because I'd typed this one as qpu_rn(3) instead of qpu_r3().
* vc4: Rename UNPACK_8* to UNPACK_8*_F.Eric Anholt2014-12-155-20/+20
| | | | | There is an equivalent unpack function without conversion to float if you use an integer operation instead.
* vc4: Add support for UMAD.Eric Anholt2014-12-151-0/+9
|
* vc4: 0-initialize the screen again.Eric Anholt2014-12-151-1/+1
| | | | I typoed this when rebasing the memory leak fixes.