path: root/src/gallium
Commit message | Author | Age | Files | Lines
* vc4: Only render tiles where the scissor ever intersected them. | Eric Anholt | 2014-12-30 | 4 | -10/+52
  This gives a 2.7x improvement in x11perf -rect100, since we only end up load/storing the x11perf window, not the whole screen.
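One way to picture the idea above is to keep a bounding box of every scissor rectangle used by a draw, then convert that box to a range of tiles at flush time. The sketch below is only an illustration: the struct, the function names, and the 64-pixel tile size are assumptions, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tile edge in pixels; the real value depends on the hardware setup. */
#define TILE_SIZE 64u

/* Pixel bounding box of every scissor used by draws since the last flush. */
struct draw_bounds {
    uint32_t min_x, min_y; /* inclusive */
    uint32_t max_x, max_y; /* exclusive */
};

/* Start with an empty box so that the first draw defines it. */
static void bounds_reset(struct draw_bounds *b)
{
    b->min_x = b->min_y = UINT32_MAX;
    b->max_x = b->max_y = 0;
}

/* Grow the box to cover one draw's scissor rectangle. */
static void bounds_add_scissor(struct draw_bounds *b,
                               uint32_t x0, uint32_t y0,
                               uint32_t x1, uint32_t y1)
{
    assert(x0 < x1 && y0 < y1);
    if (x0 < b->min_x) b->min_x = x0;
    if (y0 < b->min_y) b->min_y = y0;
    if (x1 > b->max_x) b->max_x = x1;
    if (y1 > b->max_y) b->max_y = y1;
}

/* Convert the pixel box to an inclusive range of tile indices.
 * Returns false when nothing was drawn, so no tile needs load/store. */
static bool bounds_to_tiles(const struct draw_bounds *b,
                            uint32_t *tx0, uint32_t *ty0,
                            uint32_t *tx1, uint32_t *ty1)
{
    if (b->min_x >= b->max_x || b->min_y >= b->max_y)
        return false;
    *tx0 = b->min_x / TILE_SIZE;
    *ty0 = b->min_y / TILE_SIZE;
    *tx1 = (b->max_x - 1) / TILE_SIZE;
    *ty1 = (b->max_y - 1) / TILE_SIZE;
    return true;
}
```

With this shape, a window that covers a small part of the screen only touches the tiles under it, which is where the load/store saving would come from.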
* vc4: Move draw call reset handling to a helper function. | Eric Anholt | 2014-12-30 | 1 | -23/+31
  This will be more important in the next commit, when there's more state to reset to nonzero values, and I want an early exit from the submit function.
* vc4: Drop the content of vc4_flush_resource(). | Eric Anholt | 2014-12-30 | 1 | -4/+4
  The callers all follow it with a flush of the context, and the flush of the context gives us more information about how things are being flushed.
* gallium/target: Drop no longer needed Haiku viewport override | Alexander von Gluck IV | 2014-12-27 | 1 | -30/+1
  * Drop no longer needed mesa headers
  * Haiku LLVM pipe working with LLVM 3.5.0 on x86_64
* gallium/st: Clean up Haiku depth mapping, fix colorspace errors | Alexander von Gluck IV | 2014-12-27 | 1 | -29/+19
* vc4: Handle unaligned accesses in CL emits. | Eric Anholt | 2014-12-25 | 2 | -26/+78
  As of 229bf4475ff0a5dbeb9bc95250f7a40a983c2e28 we started getting SIGBUS from unaligned accesses on the hardware, for reasons I haven't figured out. However, we should be avoiding unaligned accesses anyway, and our CL setup certainly would have produced them.
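A common way to avoid unaligned stores when packing mixed-size fields into a byte stream is to copy each value with memcpy instead of writing through a cast pointer. The sketch below shows that general technique only; the `cl_sketch` struct and the `emit_*` names are hypothetical and are not taken from the vc4 source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command-list cursor: a byte pointer into a buffer. */
struct cl_sketch {
    uint8_t *next; /* where the next byte will be written */
    uint8_t *end;  /* one past the last usable byte */
};

/* Single bytes can never be misaligned, so a plain store is fine. */
static void emit_u8(struct cl_sketch *cl, uint8_t v)
{
    assert(cl->next < cl->end);
    *cl->next++ = v;
}

/* Store a 32-bit value at an arbitrary byte offset. Writing through a
 * uint32_t pointer here could fault on hardware that rejects unaligned
 * accesses; memcpy lets the compiler pick a legal access sequence. */
static void emit_u32(struct cl_sketch *cl, uint32_t v)
{
    assert((size_t)(cl->end - cl->next) >= sizeof(v));
    memcpy(cl->next, &v, sizeof(v));
    cl->next += sizeof(v);
}
```

After an `emit_u8()`, the following `emit_u32()` lands on an odd address, which is exactly the case a cast-and-store version would get wrong on strict-alignment hardware.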
* vc4: Don't bother zero-initializing the shader reloc indices. | Eric Anholt | 2014-12-25 | 1 | -2/+2
  They should all be set to real values by the time they're read, and ideally if you used valgrind you'd see uninitialized value uses.
* vc4: Fix the argument type for cl_u16(). | Eric Anholt | 2014-12-25 | 1 | -1/+1
  It doesn't matter, since it just got truncated to 16 inside, anyway.
* radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0 | Michel Dänzer | 2014-12-25 | 1 | -2/+4
  E.g. this could happen on older kernels which don't support the RADEON_INFO_SI_BACKEND_ENABLED_MASK query yet. The code in si_write_harvested_raster_configs() doesn't deal with this correctly and would probably mangle the value badly.

  Cc: "10.4 10.3" <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Tom Stellard <[email protected]>
* vc4: Optimize CL emits by doing size checks up front. | Eric Anholt | 2014-12-24 | 5 | -16/+66
  The optimizer obviously doesn't have the ability to rewrite these to skip the size checks per call, so we have to do it manually.

  Improves a norast benchmark on simulation by 0.779706% +/- 0.405838% (n=6087).
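The pattern described above can be sketched as one reservation call per packet followed by emits that no longer grow the buffer. This is a generic illustration under assumed names (`cl_buf`, `cl_reserve`, `cl_emit_u8`, `emit_packet`), not the driver's actual API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable command list. */
struct cl_buf {
    uint8_t *base;
    size_t used;
    size_t size;
};

/* Make room for `bytes` more bytes with a single check.
 * Returns 0 on success, -1 if the allocation fails. */
static int cl_reserve(struct cl_buf *cl, size_t bytes)
{
    if (cl->size - cl->used >= bytes)
        return 0;
    size_t new_size = cl->size ? cl->size : 64;
    while (new_size - cl->used < bytes)
        new_size *= 2;
    uint8_t *p = realloc(cl->base, new_size);
    if (!p)
        return -1;
    cl->base = p;
    cl->size = new_size;
    return 0;
}

/* Emit without a growth check: the caller must have reserved space.
 * The assert documents that contract and compiles out with NDEBUG. */
static void cl_emit_u8(struct cl_buf *cl, uint8_t v)
{
    assert(cl->used < cl->size);
    cl->base[cl->used++] = v;
}

/* A three-byte packet: one size check, then three plain stores. */
static int emit_packet(struct cl_buf *cl, uint8_t op, uint8_t a, uint8_t b)
{
    if (cl_reserve(cl, 3) != 0)
        return -1;
    cl_emit_u8(cl, op);
    cl_emit_u8(cl, a);
    cl_emit_u8(cl, b);
    return 0;
}
```

The saving comes from moving the branch and possible reallocation out of every field write and into one call per packet.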
* vc4: Avoid repeated hindex lookups in the loop over tiles. | Eric Anholt | 2014-12-24 | 2 | -15/+24
  Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673% (n=20).
* freedreno/ir3: split out legalize pass | Rob Clark | 2014-12-23 | 5 | -154/+214
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: ra debug | Rob Clark | 2014-12-23 | 3 | -17/+61
  Some compile time RA debug

  Signed-off-by: Rob Clark <[email protected]>
* egl: Add Haiku code and support | Alexander von Gluck IV | 2014-12-23 | 1 | -1/+1
  * This is the cleaned up work of the Haiku GCI student Adrián Arroyo Calle [email protected]
  * Several patches were consolidated to prevent unnecessary touching of non-related code
* radeonsi: force NaNs to 0 | Marek Olšák | 2014-12-21 | 1 | -4/+8
  This fixes incorrect rendering in Unreal Engine demos.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83510
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* st/nine: fix DBG typo (trivial) | David Heidelberg | 2014-12-21 | 1 | -1/+1
  Signed-off-by: David Heidelberg <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
* r300g: implement ARR opcode | David Heidelberg | 2014-12-21 | 4 | -4/+16
  Same as ARL, just has extra rounding. Useful for st/nine.

  Tested-by: Pavel Ondračka <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: David Heidelberg <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
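For readers unfamiliar with the two opcodes: ARL loads the address register with the floor of its source, while ARR rounds the source first. The scalar sketch below illustrates the difference; the function names are made up, and the round-half-up tie behaviour is an assumption of this sketch rather than something the commit states.

```c
#include <assert.h>

/* ARL-style address load: floor of the source value. */
static int arl(float x)
{
    int i = (int)x;                  /* truncates toward zero */
    return (x < (float)i) ? i - 1 : i; /* fix up negative fractions */
}

/* ARR-style address load: round to nearest, then load.
 * Ties round up here; that choice is an assumption of the sketch. */
static int arr(float x)
{
    return arl(x + 0.5f);
}

/* Relative constant fetch, as a shader would do with the address register.
 * `consts` and `count` are hypothetical stand-ins for a constant buffer. */
static float fetch_relative(const float *consts, int count, int base, int addr)
{
    assert(base + addr >= 0 && base + addr < count);
    return consts[base + addr];
}
```

For an input of 2.7, `arl()` selects element 2 and `arr()` selects element 3, which is the "extra rounding" the commit message refers to.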
* freedreno/a4xx: blend-color | Rob Clark | 2014-12-20 | 1 | -0/+13
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: alpha-test | Rob Clark | 2014-12-20 | 1 | -0/+2
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2014-12-20 | 6 | -61/+151
* freedreno/ir3: trans_kill cleanup | Rob Clark | 2014-12-20 | 1 | -12/+7
  trans_kill() only handles the single opcode. Drop the remnant of a time when both KILL and KILL_IF were handled by the same fxn.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: hack for standalone compiler | Rob Clark | 2014-12-20 | 1 | -1/+5
  Standalone compiler doesn't have screen or context. We need to come up with a better way to control the target arch (ie. something that we can control from cmdline w/ standalone compiler) but for now this hack keeps it from segfault'ing.

  Signed-off-by: Rob Clark <[email protected]>
* vc4: Coalesce MOVs into VPM with the instructions generating the values. | Eric Anholt | 2014-12-18 | 4 | -15/+143
  total instructions in shared programs: 41168 -> 40976 (-0.47%)
  instructions in affected programs: 18156 -> 17964 (-1.06%)
* vc4: Redefine VPM writes as a (destination) QIR register file. | Eric Anholt | 2014-12-17 | 3 | -7/+19
  This will let me coalesce the VPM writes into the instructions generating the values.
* gallium: remove support for GCC older than 4.2.0 | Timothy Arceri | 2014-12-18 | 1 | -1/+1
  Signed-off-by: Timothy Arceri <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* vc4: Add support for turning constant uniforms into small immediates. | Eric Anholt | 2014-12-17 | 13 | -46/+283
  Small immediates have the downside of taking over the raddr B field, so you might have less chance to pack instructions together thanks to raddr B conflicts. However, it also reduces some register pressure since it lets you load 2 "uniform" values in one instruction (avoiding a previous load of the constant value to a register), and increases some pairing for the same reason.

  total uniforms in shared programs: 16231 -> 13374 (-17.60%)
  uniforms in affected programs: 10280 -> 7423 (-27.79%)
  total instructions in shared programs: 40795 -> 41168 (0.91%)
  instructions in affected programs: 25551 -> 25924 (1.46%)

  In a previous version of this patch I had a reduction in instruction count by forcing the other args alongside a SMALL_IMM to be in the A file or accumulators, but that increases register pressure and had a bug in handling FRAG_Z. In this patch I just use raddr conflict resolution, which is more expensive. I think I'd rather tweak allocation to have some way to slightly prefer good choices for files in general, rather than risk failing to register allocate by forcing things into register classes.
* vc4: Move follow_movs() to common QIR code. | Eric Anholt | 2014-12-17 | 3 | -11/+12
  I want this from other passes.
* vc4: Fix missing newline for load immediate instruction disasm. | Eric Anholt | 2014-12-17 | 1 | -4/+4
* mesa: Remove unnecessary -f from $(RM). | Matt Turner | 2014-12-17 | 3 | -5/+5
  $(RM) includes -f.
* gallium: Add egl and gbm to distribution. | Matt Turner | 2014-12-17 | 1 | -0/+4
* targets/xvmc: Add uninstall hooks to handle megadriver hardlinks. | Matt Turner | 2014-12-17 | 1 | -0/+5
* targets/vdpau: Add uninstall hooks to handle megadriver hardlinks. | Matt Turner | 2014-12-17 | 1 | -0/+5
* targets/vdpau: Add clean-local rule to remove .lib links. | Matt Turner | 2014-12-17 | 1 | -0/+6
* vc4: Add a userspace BO cache. | Eric Anholt | 2014-12-17 | 4 | -4/+175
  Since our kernel BOs require CMA allocation, and the use of them requires new mmaps, it's pretty expensive and we should avoid it if possible. Copying my original design for Intel, make a userspace cache that reuses BOs that haven't been shared to other processes but frees BOs that have sat in the cache for over a second.

  Improves glxgears framerate on RPi by around 30%.
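The caching policy described above (reuse a freed buffer of the right size, drop buffers that have been idle for more than a second) can be sketched roughly as follows. Everything here is a simplification with assumed names: `malloc` stands in for the expensive kernel allocation, the cache is a small fixed array, the clock is passed in by the caller, and the rule about not caching shared buffers is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SLOTS 16      /* hypothetical cache capacity */
#define MAX_IDLE_MS 1000u   /* "over a second" from the commit message */

struct bo {
    size_t size;
    uint64_t free_time_ms;  /* when the buffer entered the cache */
};

struct bo_cache {
    struct bo *slots[CACHE_SLOTS];
};

/* Number of buffers currently parked in the cache. */
static int bo_cache_count(const struct bo_cache *c)
{
    int n = 0;
    for (int i = 0; i < CACHE_SLOTS; i++)
        n += c->slots[i] != NULL;
    return n;
}

/* Reuse a cached buffer of exactly this size, else allocate a new one. */
static struct bo *bo_alloc(struct bo_cache *c, size_t size)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct bo *b = c->slots[i];
        if (b && b->size == size) {
            c->slots[i] = NULL;
            return b;
        }
    }
    struct bo *b = malloc(sizeof(*b)); /* stand-in for the kernel path */
    if (b) {
        b->size = size;
        b->free_time_ms = 0;
    }
    return b;
}

/* Park a buffer in the cache, first freeing entries idle for too long.
 * Assumes now_ms never goes backwards. */
static void bo_release(struct bo_cache *c, struct bo *b, uint64_t now_ms)
{
    assert(b != NULL);
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct bo *old = c->slots[i];
        if (old && now_ms - old->free_time_ms > MAX_IDLE_MS) {
            free(old);
            c->slots[i] = NULL;
        }
    }
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (!c->slots[i]) {
            b->free_time_ms = now_ms;
            c->slots[i] = b;
            return;
        }
    }
    free(b); /* cache full: fall back to really freeing */
}
```

Passing the timestamp in keeps the sketch deterministic; a real implementation would read a monotonic clock and would likely bucket buffers by size.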
* vc4: Add dmabuf support. | Eric Anholt | 2014-12-17 | 4 | -24/+78
  This gets DRI3 working on modesetting with glamor. It's not enabled under simulation, because it looks like handing our dumb-allocated buffers off to the server doesn't actually work for the server's rendering.
* vc4: Drop a weird argument in the BOs-from-handles API. | Eric Anholt | 2014-12-17 | 3 | -7/+5
* draw: revert using correct order for prim decomposition. | Roland Scheidegger | 2014-12-17 | 1 | -1/+3
  This reverts db3dfcfe90a3d27e6020e0d3642f8ab0330e57be.

  The commit was correct but we've got some precision problems later in llvmpipe (or possibly in draw clip) due to the vertices coming in in different order, causing some internal test failures. So revert for now. (Will only affect drivers which actually support constant-interpolated attributes and not just flatshading.)
* vc4: Add support for turning add-based MOVs to muls for pairing. | Eric Anholt | 2014-12-16 | 1 | -2/+49
  total instructions in shared programs: 43053 -> 40795 (-5.24%)
  instructions in affected programs: 37996 -> 35738 (-5.94%)
* vc4: Add a helper for changing a field in an instruction. | Eric Anholt | 2014-12-16 | 2 | -11/+12
* vc4: Fix the name of qpu_waddr_ignores_ws(). | Eric Anholt | 2014-12-16 | 1 | -5/+5
  We're deciding about the WS bit, not PM.
* gallium: remove support for GCC older than 4.1.0 | Timothy Arceri | 2014-12-17 | 2 | -5/+5
  Signed-off-by: Timothy Arceri <[email protected]>
  Reviewed-By: Jose Fonseca <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* vc4: Add support for enabling early Z discards. | Eric Anholt | 2014-12-16 | 1 | -0/+18
  This is the same basic logic from the original Broadcom driver.
* nvc0: add missed PIPE_CAP_VERTEXID_NOBASE | Ilia Mirkin | 2014-12-15 | 1 | -0/+1
  Commit ade8b26bf missed adding this cap to nvc0.

  Signed-off-by: Ilia Mirkin <[email protected]>
* draw: implement support for the VERTEXID_NOBASE and BASEVERTEX semantics. | Roland Scheidegger | 2014-12-16 | 4 | -19/+47
  This fixes 4 vertexid related piglit tests with llvmpipe due to switching behavior of vertexid to the one gl expects. (Won't fix non-llvm draw path since we don't get the basevertex currently.)
* gallium: add TGSI_SEMANTIC_VERTEXID_NOBASE and TGSI_SEMANTIC_BASEVERTEX | Roland Scheidegger | 2014-12-16 | 19 | -6/+84
  Plus a new PIPE_CAP_VERTEXID_NOBASE query. The idea is that drivers not supporting vertex ids with base vertex offset applied (so, only support d3d10-style vertex ids) will get such a d3d10-style vertex id instead - with the caveat they'll also need to handle the basevertex system value too (this follows what core mesa already does).

  Additionally, this is also useful for other state trackers (for instance llvmpipe / draw right now implement the d3d10 behavior on purpose, but with different semantics it can just do both).

  Doesn't do anything yet. And fix up the docs wrt similar values.

  v2: incorporate feedback from Brian and others, better names, better docs.

  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
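The relationship between the three system values for an indexed draw can be shown with a few lines of arithmetic: the GL-style vertex id is the d3d10-style id plus the base vertex. The struct and function below are hypothetical names used only to illustrate that relationship.

```c
#include <assert.h>
#include <stdint.h>

/* The three per-vertex system values discussed in the commit message. */
struct vertex_sysvals {
    int32_t vertexid;        /* GL-style: base vertex offset applied */
    int32_t vertexid_nobase; /* d3d10-style: no base vertex offset */
    int32_t basevertex;      /* the draw's base vertex */
};

/* `index` is the value fetched from the index buffer for this vertex. */
static struct vertex_sysvals sysvals_for(int32_t index, int32_t basevertex)
{
    struct vertex_sysvals s;
    s.vertexid_nobase = index;
    s.basevertex = basevertex;
    s.vertexid = index + basevertex;
    /* A driver that only has the d3d10-style id can rebuild the GL one. */
    assert(s.vertexid_nobase + s.basevertex == s.vertexid);
    return s;
}
```

This is why a driver exposing only the NOBASE variant also has to provide the base vertex: the GL-visible value is recovered by adding the two in the shader.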
* r600g/sb: implement r600 gpr index workaround. (v3.1) | Dave Airlie | 2014-12-16 | 4 | -9/+57
  r600, rv610 and rv630 all have a bug in their GPR indexing and how the hw inserts access to PV.

  If the base index for the src is the same as the dst gpr in a previous group, then it will use PV instead of using the indexed gpr correctly.

  The workaround is to insert a NOP when you detect this.

  v2: add second part of fix detecting DST rel writes followed by same src base index reads.
  v3: forget adding stuff to structs, just iterate over the previous node group again, makes it more obvious.
  v3.1: drop local_nop.

  Fixes ~200 piglit regressions on rv635 since SB was introduced.

  Reviewed-By: Glenn Kennard <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600g/sb: fix issues with loops created for switch | Vadim Girlin | 2014-12-16 | 5 | -4/+16
  Signed-off-by: Dave Airlie <[email protected]>
* Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch" | Dave Airlie | 2014-12-16 | 1 | -38/+12
  This reverts commit 7b0067d23a6f64cf83c42e7f11b2cd4100c569fe.

  Vadim's patch fixes this a lot better.
* vc4: Add support for 32-bit signed norm/scaled vertex attrs. | Eric Anholt | 2014-12-15 | 2 | -0/+18
  32-bit unsigned would require some adjustments to handle values >= 0x80000000.
* vc4: Add support for 16-bit signed/unsigned norm/scaled vertex attrs. | Eric Anholt | 2014-12-15 | 6 | -6/+94