Commit message | Author | Age | Files | Lines
* radeonsi: force NaNs to 0 | Marek Olšák | 2014-12-21 | 1 | -4/+8
| This fixes incorrect rendering in Unreal Engine demos.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83510
| Reviewed-by: Alex Deucher <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* st/nine: fix DBG typo (trivial) | David Heidelberg | 2014-12-21 | 1 | -1/+1
| Signed-off-by: David Heidelberg <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
* r300g: implement ARR opcode | David Heidelberg | 2014-12-21 | 4 | -4/+16
| Same as ARL, just has extra rounding. Useful for st/nine.
| Tested-by: Pavel Ondračka <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Signed-off-by: David Heidelberg <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
* freedreno/a4xx: blend-color | Rob Clark | 2014-12-20 | 1 | -0/+13
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: alpha-test | Rob Clark | 2014-12-20 | 1 | -0/+2
| Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2014-12-20 | 6 | -61/+151
|
* freedreno/ir3: trans_kill cleanup | Rob Clark | 2014-12-20 | 1 | -12/+7
| trans_kill() only handles the single opcode. Drop the remnant of a time when both KILL and KILL_IF were handled by the same fxn.
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: hack for standalone compiler | Rob Clark | 2014-12-20 | 1 | -1/+5
| Standalone compiler doesn't have screen or context. We need to come up with a better way to control the target arch (ie. something that we can control from cmdline w/ standalone compiler) but for now this hack keeps it from segfault'ing.
| Signed-off-by: Rob Clark <[email protected]>
* i965/fs: Add missing const qualifier. | Matt Turner | 2014-12-19 | 1 | -1/+1
|
* vc4: Coalesce MOVs into VPM with the instructions generating the values. | Eric Anholt | 2014-12-18 | 4 | -15/+143
| total instructions in shared programs: 41168 -> 40976 (-0.47%)
| instructions in affected programs: 18156 -> 17964 (-1.06%)
* vc4: Redefine VPM writes as a (destination) QIR register file. | Eric Anholt | 2014-12-17 | 3 | -7/+19
| This will let me coalesce the VPM writes into the instructions generating the values.
* docs: note change in minimum GCC version to 4.2.0 | Timothy Arceri | 2014-12-18 | 1 | -1/+1
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Acked-by: Matt Turner <[email protected]>
* gallium: remove support for GCC older than 4.2.0 | Timothy Arceri | 2014-12-18 | 1 | -1/+1
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* mesa: bump required GCC version to 4.2.0 | Timothy Arceri | 2014-12-18 | 1 | -3/+3
| It turns out Mesa hasn't compiled on less than 4.2 for a while, so update conf to reflect this.
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* vc4: Add support for turning constant uniforms into small immediates. | Eric Anholt | 2014-12-17 | 13 | -46/+283
| Small immediates have the downside of taking over the raddr B field, so you might have less chance to pack instructions together thanks to raddr B conflicts. However, it also reduces some register pressure since it lets you load 2 "uniform" values in one instruction (avoiding a previous load of the constant value to a register), and increases some pairing for the same reason.
| total uniforms in shared programs: 16231 -> 13374 (-17.60%)
| uniforms in affected programs: 10280 -> 7423 (-27.79%)
| total instructions in shared programs: 40795 -> 41168 (0.91%)
| instructions in affected programs: 25551 -> 25924 (1.46%)
| In a previous version of this patch I had a reduction in instruction count by forcing the other args alongside a SMALL_IMM to be in the A file or accumulators, but that increases register pressure and had a bug in handling FRAG_Z. In this patch I just use raddr conflict resolution, which is more expensive. I think I'd rather tweak allocation to have some way to slightly prefer good choices for files in general, rather than risk failing to register allocate by forcing things into register classes.
* vc4: Move follow_movs() to common QIR code. | Eric Anholt | 2014-12-17 | 3 | -11/+12
| I want this from other passes.
* vc4: Fix missing newline for load immediate instruction disasm. | Eric Anholt | 2014-12-17 | 1 | -4/+4
|
* mesa: Remove unnecessary -f from $(RM). | Matt Turner | 2014-12-17 | 4 | -8/+8
| $(RM) includes -f.
* mesa: Remove tarballs/checksum rules. | Matt Turner | 2014-12-17 | 1 | -75/+0
|
* gallium: Add egl and gbm to distribution. | Matt Turner | 2014-12-17 | 1 | -0/+4
|
* mesa: Set DISTCHECK_CONFIGURE_FLAGS. | Matt Turner | 2014-12-17 | 1 | -0/+13
| Enable some non-default options that distros are likely to use.
* targets/xvmc: Add uninstall hooks to handle megadriver hardlinks. | Matt Turner | 2014-12-17 | 1 | -0/+5
|
* targets/vdpau: Add uninstall hooks to handle megadriver hardlinks. | Matt Turner | 2014-12-17 | 1 | -0/+5
|
* targets/vdpau: Add clean-local rule to remove .lib links. | Matt Turner | 2014-12-17 | 1 | -0/+6
|
* vc4: Add a userspace BO cache. | Eric Anholt | 2014-12-17 | 4 | -4/+175
| Since our kernel BOs require CMA allocation, and the use of them requires new mmaps, it's pretty expensive and we should avoid it if possible. Copying my original design for Intel, make a userspace cache that reuses BOs that haven't been shared to other processes but frees BOs that have sat in the cache for over a second.
| Improves glxgears framerate on RPi by around 30%.
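The caching policy described in this commit message (reuse only unshared BOs, drop entries idle for more than a second) can be illustrated with a small sketch. Everything here is hypothetical: the `sketch_` names, the linked-list layout, the exact-size match, and the millisecond timestamps are assumptions for illustration, not the vc4 driver's actual structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical buffer object; not the real vc4 BO structure. */
struct sketch_bo {
    uint32_t size;
    bool shared;            /* exported to another process */
    int64_t free_time_ms;   /* time at which it entered the cache */
    struct sketch_bo *next;
};

struct sketch_cache {
    struct sketch_bo *head; /* singly linked list of idle BOs */
};

#define SKETCH_MAX_AGE_MS 1000 /* "over a second", per the commit message */

/* Cache an idle BO. Shared BOs are refused so the caller frees them. */
static bool sketch_cache_put(struct sketch_cache *cache,
                             struct sketch_bo *bo, int64_t now_ms)
{
    assert(cache != NULL && bo != NULL);
    if (bo->shared)
        return false;
    bo->free_time_ms = now_ms;
    bo->next = cache->head;
    cache->head = bo;
    return true;
}

/* Take a cached BO of exactly the requested size, or NULL if none. */
static struct sketch_bo *sketch_cache_get(struct sketch_cache *cache,
                                          uint32_t size)
{
    assert(cache != NULL);
    for (struct sketch_bo **link = &cache->head; *link != NULL;
         link = &(*link)->next) {
        if ((*link)->size == size) {
            struct sketch_bo *bo = *link;
            *link = bo->next;
            bo->next = NULL;
            return bo;
        }
    }
    return NULL;
}

/* Free every BO that has been idle longer than the age limit. */
static unsigned sketch_cache_expire(struct sketch_cache *cache, int64_t now_ms)
{
    unsigned evicted = 0;
    struct sketch_bo **link = &cache->head;

    assert(cache != NULL);
    while (*link != NULL) {
        struct sketch_bo *bo = *link;
        if (now_ms - bo->free_time_ms > SKETCH_MAX_AGE_MS) {
            *link = bo->next;
            free(bo);
            evicted++;
        } else {
            link = &bo->next;
        }
    }
    return evicted;
}
```

A real cache would release kernel handles and mappings rather than call `free()`, and would likely bucket entries by size; the sketch only shows the reuse and expiry decisions.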
* vc4: Add dmabuf support. | Eric Anholt | 2014-12-17 | 4 | -24/+78
| This gets DRI3 working on modesetting with glamor. It's not enabled under simulation, because it looks like handing our dumb-allocated buffers off to the server doesn't actually work for the server's rendering.
* vc4: Drop a weird argument in the BOs-from-handles API. | Eric Anholt | 2014-12-17 | 3 | -7/+5
|
* draw: revert using correct order for prim decomposition. | Roland Scheidegger | 2014-12-17 | 1 | -1/+3
| This reverts db3dfcfe90a3d27e6020e0d3642f8ab0330e57be.
| The commit was correct but we've got some precision problems later in llvmpipe (or possibly in draw clip) due to the vertices coming in in different order, causing some internal test failures. So revert for now. (Will only affect drivers which actually support constant-interpolated attributes and not just flatshading.)
* util: Silence signed-unsigned comparison warnings | Jan Vesely | 2014-12-17 | 1 | -6/+6
| Signed-off-by: Jan Vesely <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* i965: Require pixel alignment for GPU copy blit | Cody Northrop | 2014-12-16 | 2 | -4/+6
| The blitter will start at a pixel's natural alignment. For PBOs, if the provided offset is not aligned, bits will get dropped.
| This change adds an offset alignment check for src and dst, kicking back if the requirements are not met.
| The change is based on the following verbiage from BSPEC:
| Color pixel sizes supported are 8, 16, and 32 bits per pixel (bpp). All pixels are naturally aligned.
| Found in the following locations:
| page 35 of intel-gfx-prm-osrc-hsw-blitter.pdf
| page 29 of ivb_ihd_os_vol1_part4.pdf
| page 29 of snb_ihd_os_vol1_part5.pdf
| This behavior was observed with Steam Big Picture rendering incorrect icon colors. The fix has been tested on Ubuntu and SteamOS on Haswell.
| Signed-off-by: Cody Northrop <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83908
| Reviewed-by: Neil Roberts <[email protected]>
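The natural-alignment requirement quoted from BSPEC amounts to a divisibility test on the byte offsets. A minimal sketch, assuming byte offsets and a bytes-per-pixel value of 1, 2, or 4 (the 8, 16, and 32 bpp sizes named above); the function name and parameters are hypothetical and are not the i965 driver's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when both byte offsets sit on a
 * pixel boundary, so a caller could fall back to another copy path
 * (the "kicking back" in the commit message) when it returns false.
 */
static bool sketch_blit_offsets_aligned(uint32_t src_offset,
                                        uint32_t dst_offset,
                                        uint32_t cpp)
{
    /* Only 8, 16, and 32 bpp are supported by the blitter. */
    assert(cpp == 1 || cpp == 2 || cpp == 4);
    return (src_offset % cpp) == 0 && (dst_offset % cpp) == 0;
}
```

For example, with 32 bpp data (`cpp` of 4), a PBO offset of 2 bytes would be rejected, while the same offset is acceptable for 16 bpp data.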
* i965: remove includes of sampler.h from extern "C" blocks | Mark Janes | 2014-12-16 | 4 | -5/+4
| C linkage was removed from functions in program/sampler.cpp. However, some cpp files include program/sampler.h within extern "C" blocks, causing link errors for test_vec4_copy_propagation.
| Reviewed-by: Brian Paul <[email protected]>
| Tested-by: Ian Romanick <[email protected]>
* i965/query: Cache whether the batch references the query BO. | Kenneth Graunke | 2014-12-16 | 2 | -4/+26
| Chris Wilson noted that repeated calls to CheckQuery() would call drm_intel_bo_references(brw->batch.bo, query->bo) on each invocation, which is expensive. Once we've flushed, we know that future batches won't reference query->bo, so there's no point in asking more than once.
| This patch adds a brw_query_object::flushed flag, which is a conservative estimate of whether the batch has been flushed.
| On the first call to CheckQuery() or WaitQuery(), we check if the batch references query->bo. If not, it must have been flushed for some reason (such as being full). We record that it was flushed. If it does reference query->bo, we explicitly flush, and record that we did so.
| Any subsequent checks will simply see that query->flushed is set, and skip the drm_intel_bo_references() call.
| Inspired by a patch from Chris Wilson.
| According to Eero, this does not affect the performance of Witcher 2 on Haswell, but approximately halves the userspace CPU usage.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ben Widawsky <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
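The flag-caching logic described in this commit message can be sketched with stand-in types. This is an illustration only: the `sketch_` structures and functions are hypothetical substitutes for the driver's batch, `brw_query_object`, and `drm_intel_bo_references()`, and the counters exist solely to show how often the expensive check runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the batch; counters are for illustration. */
struct sketch_batch {
    bool references_query_bo;
    unsigned reference_checks; /* how often the expensive check ran */
    unsigned flushes;
};

/* Hypothetical stand-in for the query object and its flushed flag. */
struct sketch_query {
    bool flushed;
};

/* Stand-in for the expensive "does the batch reference this BO" call. */
static bool sketch_batch_references(struct sketch_batch *batch)
{
    batch->reference_checks++;
    return batch->references_query_bo;
}

static void sketch_batch_flush(struct sketch_batch *batch)
{
    batch->flushes++;
    batch->references_query_bo = false;
}

/* Ask about references at most once per query, then trust the flag. */
static void sketch_flush_batch_if_needed(struct sketch_batch *batch,
                                         struct sketch_query *query)
{
    assert(batch != NULL && query != NULL);
    if (query->flushed)
        return; /* already known to be flushed: skip the check */
    if (sketch_batch_references(batch))
        sketch_batch_flush(batch);
    /* Either it was flushed earlier or we just flushed it. */
    query->flushed = true;
}
```

Calling `sketch_flush_batch_if_needed()` repeatedly on the same query performs the reference check once and flushes at most once, which is the saving the commit message attributes to the new flag.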
* i965/query: Use brw_bo_map to handle stall warnings. | Kenneth Graunke | 2014-12-16 | 1 | -7/+1
| This is less code and also measures the duration of the stall for us. Our old code predates the existence of brw_bo_map().
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ben Widawsky <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* i965/query: Remove redundant drm_intel_bo_references call in CheckQuery. | Kenneth Graunke | 2014-12-16 | 1 | -7/+8
| CheckQuery calls drm_intel_bo_references to see if the batch references the query BO, and if so, flushes. It then checks if the query BO is busy, and if not, calls gen6_queryobj_get_results().
| Stupidly, gen6_queryobj_get_results() immediately did a second redundant drm_intel_bo_references check, even though we know the buffer is not referenced and in fact idle.
| This patch moves the batch-flush check out of gen6_queryobj_get_results and into WaitQuery() (the other caller). That way, both callers do a single batch-flush check.
| This should only be a minor improvement, since it would only affect the first CheckQuery call where the result is actually available.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ben Widawsky <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* i965/query: Add query->bo == NULL early return in CheckQuery hook. | Kenneth Graunke | 2014-12-16 | 1 | -2/+8
| If query->bo == NULL, this is a redundant CheckQuery call, and we should simply return. We didn't do anything anyway - we skipped the batch flushing block, and although we called get_results(), it has an early return and does nothing. Why bother?
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ben Widawsky <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* i965/query: Set Ready flag in gen6_queryobj_get_results(). | Kenneth Graunke | 2014-12-16 | 1 | -2/+2
| q->Ready means that the results are in, and core Mesa is free to return them to the application. gen6_queryobj_get_results() is a natural place to set that flag; doing so means callers don't have to.
| The older non-hardware-context aware code couldn't do this, because we had to call brw_queryobj_get_results() to gather intermediate results when we ran out of space for snapshots in the query buffer. We only gather complete results in the Gen6+ code, however.
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ben Widawsky <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* vc4: Add support for turning add-based MOVs to muls for pairing. | Eric Anholt | 2014-12-16 | 1 | -2/+49
| total instructions in shared programs: 43053 -> 40795 (-5.24%)
| instructions in affected programs: 37996 -> 35738 (-5.94%)
* vc4: Add a helper for changing a field in an instruction. | Eric Anholt | 2014-12-16 | 2 | -11/+12
|
* vc4: Fix the name of qpu_waddr_ignores_ws(). | Eric Anholt | 2014-12-16 | 1 | -5/+5
| We're deciding about the WS bit, not PM.
* docs: note change in minimum GCC version to 4.1.0 | Timothy Arceri | 2014-12-17 | 1 | -0/+1
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* util: remove support for GCC older than 4.1.0 | Timothy Arceri | 2014-12-17 | 1 | -1/+1
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-By: Jose Fonseca <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* mesa: remove support for GCC older than 4.1.0 | Timothy Arceri | 2014-12-17 | 1 | -1/+1
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-By: Jose Fonseca <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* gbm: remove support for GCC older than 4.1.0 | Timothy Arceri | 2014-12-17 | 1 | -1/+1
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-By: Jose Fonseca <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* gallium: remove support for GCC older than 4.1.0 | Timothy Arceri | 2014-12-17 | 2 | -5/+5
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-By: Jose Fonseca <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* egl: remove support for GCC older than 4.1.0 | Timothy Arceri | 2014-12-17 | 1 | -1/+1
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-By: Jose Fonseca <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* mesa: bump required GCC version to 4.1.0 | Timothy Arceri | 2014-12-17 | 1 | -3/+3
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-By: Jose Fonseca <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* mesa: remove support for GCC older than 3.3.0 | Timothy Arceri | 2014-12-17 | 3 | -8/+3
| GCC >=3.3 has been required since 9aa3aa71386394725ce88df463d6183f62777ee5
| Signed-off-by: Timothy Arceri <[email protected]>
| Reviewed-By: Jose Fonseca <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Add a comment explaining what saturate propagation does. | Matt Turner | 2014-12-16 | 1 | -0/+14
|
* vc4: Add support for enabling early Z discards. | Eric Anholt | 2014-12-16 | 1 | -0/+18
| This is the same basic logic from the original Broadcom driver.
* st/mesa: remove extern "C" around #includes in st_glsl_to_tgsi.cpp | Brian Paul | 2014-12-16 | 1 | -4/+2
| Reviewed-by: Anuj Phogat <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>