Commit message | Author | Age | Files | Lines
* i965: Use unsynchronized maps for the program cache on LLC platforms. | Kenneth Graunke | 2014-10-13 | 1 | -7/+28
  There's no reason to stall on pwrite - the CPU always appends to the buffer and never modifies existing contents, and the GPU never writes it. Further, the CPU always appends new data before submitting a batch that requires it.
  This code predates the unsynchronized mapping feature, so we simply didn't have the option when it was written.
  Ideally, we would do this for non-LLC platforms too, but unsynchronized mapping support only exists for LLC systems.
  Saves a bunch of stall avoidance copies when uploading shaders.
  v2: Rebase on changes to previous patch.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]> [v1]
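The append-only argument in the message above can be illustrated with a small sketch. It is not the i965 code: `struct append_cache` and `cache_append` are hypothetical names, and a plain byte array stands in for the mapped buffer. The point is only that bytes below the `next` offset are never rewritten, so a writer never needs to wait for a reader of older data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical append-only cache; a stand-in for a mapped buffer object. */
struct append_cache {
    uint8_t *map;   /* CPU mapping of the buffer */
    size_t   size;  /* total capacity in bytes */
    size_t   next;  /* first unused byte; everything below is immutable */
};

/* Appends len bytes and returns their offset, or (size_t)-1 if they do not
 * fit. Existing contents are never touched, which is the property that makes
 * an unsynchronized write safe in this model. */
static size_t cache_append(struct append_cache *cache, const void *data,
                           size_t len)
{
    if (len > cache->size - cache->next)
        return (size_t)-1;

    size_t offset = cache->next;
    memcpy(cache->map + offset, data, len);
    cache->next += len;
    return offset;
}
```

In this model a consumer may only be handed offsets that `cache_append` has already returned, mirroring the statement that new data is appended before the batch that needs it is submitted.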
* i965: Issue performance warnings when copying the program cache BO. | Kenneth Graunke | 2014-10-13 | 1 | -0/+3
  We don't really want unnecessary buffer copying, so it'd be nice to know when it's happening.
  v2: Drop stall warnings when doing a read-only CPU mapping of the cache BO. The GPU also uses it in a read-only fashion, so there won't be any stalls, even though the buffer is busy. (Thanks to Chris Wilson for catching this mistake.)
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]> [v1]
* i965: Issue performance warnings on MapBufferRange stalls. | Kenneth Graunke | 2014-10-13 | 1 | -3/+4
  This is easy: we just need to use brw_map_bo instead of mapping it directly.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
* vc4: Match VS outputs to FS inputs. | Eric Anholt | 2014-10-13 | 3 | -18/+135
  If the VS doesn't output a value that the FS needs, we still need to read the right contents for the remaining FS inputs, by emitting padding. And if the VS outputs something the FS doesn't need, we shouldn't put it in the VPM at all (so the code producing it can get DCEed).
  Fixes 77 piglit tests.
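The matching rule described above can be sketched as follows. This is an illustration under assumptions, not the vc4 implementation: semantics are modelled as plain integers, and `SLOT_PAD` and `match_vs_to_fs` are hypothetical names. The output is driven by the FS input list, so a missing VS output becomes padding and an unneeded VS output is simply never referenced.

```c
#include <stddef.h>

/* Hypothetical marker meaning "the VS writes nothing here, emit padding". */
#define SLOT_PAD (-1)

/* For each FS input, store the index of the VS output with the same semantic,
 * or SLOT_PAD when the VS does not produce it. VS outputs that no FS input
 * asks for never appear in the result, so nothing keeps them alive. */
static void match_vs_to_fs(const int *vs_outputs, size_t n_vs,
                           const int *fs_inputs, size_t n_fs,
                           int *vs_index_for_fs)
{
    for (size_t i = 0; i < n_fs; i++) {
        vs_index_for_fs[i] = SLOT_PAD;
        for (size_t j = 0; j < n_vs; j++) {
            if (vs_outputs[j] == fs_inputs[i]) {
                vs_index_for_fs[i] = (int)j;
                break;
            }
        }
    }
}
```

Iterating over the consumer's inputs rather than the producer's outputs keeps the slot order fixed by the FS, which is what lets the remaining inputs read the right contents.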
* configure: use $libdir/dri as default for VA-API | Christian König | 2014-10-13 | 1 | -2/+2
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* configure: remove superfluous VA-API line from configure.ac | Christian König | 2014-10-13 | 1 | -1/+0
  We don't have GALLIUM_STATE_TRACKERS_DIRS any more.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* configure: respect $libdir for the OMX installation dir | Christian König | 2014-10-13 | 1 | -5/+2
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* configure: Revert "ask vdpau.pc for the default location of the vdpau drivers" | Christian König | 2014-10-13 | 1 | -8/+3
  This reverts commit bbe6f7f865cd4316b5f885507ee0b128a20686eb.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Add support for the CEIL opcode. | Eric Anholt | 2014-10-13 | 1 | -0/+22
  Not as big of a deal as SSG, but still +9 piglit tests.
* vc4: Add support for the SSG opcode. | Eric Anholt | 2014-10-13 | 1 | -0/+12
* docs: add news item and link release notes | Emil Velikov | 2014-10-13 | 2 | -0/+14
  Signed-off-by: Emil Velikov <[email protected]>
* docs: Add sha256 sums for the 10.3.1 release | Emil Velikov | 2014-10-13 | 1 | -1/+3
  Signed-off-by: Emil Velikov <[email protected]>
  (cherry picked from commit fa98c74692634de4f87694a40a299b59c4716ee5)
* Add release notes for the 10.3.1 release | Emil Velikov | 2014-10-13 | 1 | -0/+156
  Signed-off-by: Emil Velikov <[email protected]>
  (cherry picked from commit 088d3501786a2ff0833de45951b63acbe6560a0f)
* docs: Add sha256 sums for the 10.2.9 release | Emil Velikov | 2014-10-13 | 1 | -1/+3
  Signed-off-by: Emil Velikov <[email protected]>
  (cherry picked from commit 52bd154980e306b8bc9b9d2edc0e728a9f8f3bf6)
* Add release notes for the 10.2.9 release | Emil Velikov | 2014-10-13 | 1 | -0/+99
  Signed-off-by: Emil Velikov <[email protected]>
  (cherry picked from commit 9f1149876f2d010c871751a53d02d4d2b6aef1fe)
* r600g: Implement GL_ARB_sample_shading | Glenn Kennard | 2014-10-12 | 12 | -120/+385
  Also fixes two sided lighting which was broken at least on pre-evergreen by commit b1eb00.
  Signed-off-by: Glenn Kennard <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: use tgsi_shader_info in si_llvm_emit_fs_epilogue | Marek Olšák | 2014-10-12 | 1 | -71/+61
  This is the last use of tgsi_parse_token in radeonsi.
  It looks ugly because the code was re-indented, but there is really no change in behavior.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove si_shader_output_values::index | Marek Olšák | 2014-10-12 | 1 | -17/+6
  It's redundant now. It led to a simplification in si_llvm_emit_streamout, because outidx == reg.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use tgsi_shader_info in si_llvm_emit_vs_epilogue | Marek Olšák | 2014-10-12 | 1 | -26/+13
  That code was really ugly.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove shader->input[] and output[] arrays and dependencies | Marek Olšák | 2014-10-12 | 3 | -89/+2
  They were reinventing tgsi_shader_info. They are unused now.
  radeon_llvm_context::load_input can be NULL if input fetching is implemented in some other way.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move param_offset out of shader->input[] and output[] | Marek Olšák | 2014-10-12 | 3 | -7/+10
  Those are going away.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use tgsi_shader_info to get a list of GS outputs | Marek Olšák | 2014-10-12 | 2 | -14/+12
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use tgsi_shader_info in si_update_spi_map | Marek Olšák | 2014-10-12 | 1 | -9/+13
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: simplify dereferences in si_update_spi_map | Marek Olšák | 2014-10-12 | 1 | -2/+2
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use tgsi_shader_info in si_shader_vs | Marek Olšák | 2014-10-12 | 1 | -2/+3
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use tgsi_shader_info in si_shader_ps | Marek Olšák | 2014-10-12 | 3 | -5/+5
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use tgsi_shader_info in fetch_input_gs | Marek Olšák | 2014-10-12 | 1 | -4/+5
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't rely on shader->output in si_llvm_emit_fs_epilogue | Marek Olšák | 2014-10-12 | 1 | -1/+1
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use tgsi_shader_info in si_llvm_emit_es_epilogue | Marek Olšák | 2014-10-12 | 1 | -17/+5
  tgsi_shader_info contains everything we need.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't recompile shaders when changing nr_cbufs from 0 to 1 | Marek Olšák | 2014-10-12 | 3 | -4/+4
  Both cases are equivalent.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove vs.ucps_enabled from the shader key | Marek Olšák | 2014-10-12 | 3 | -15/+0
  Written CLIPDIST outputs are simply disabled in PA_CL_VS_OUT_CNTL.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: assume ClipDistance usage mask is always 0xf | Marek Olšák | 2014-10-12 | 2 | -8/+2
  No code in Mesa sets the usage mask to any other value. The final mask is AND'ed with enable bits from the rasterizer state anyway.
  If somebody implements setting usage masks in st/mesa, we can use tgsi_shader_info to get it more easily.
  This is a prerequisite for the following commit.
  Reviewed-by: Michel Dänzer <[email protected]>
* clover: Fix unintended fall-through in kernel::argument::bind. | Francisco Jerez | 2014-10-12 | 1 | -0/+3
* clover: Append implicit arguments to the kernel argument list. | Jan Vesely | 2014-10-12 | 1 | -13/+29
  [ Francisco Jerez: Split off from a larger patch, and take a slightly different approach for passing the implicit arguments around. ]
  Reviewed-by: Francisco Jerez <[email protected]>
* clover: Pass execution dimensions and offset to the kernel as implicit arguments. | Francisco Jerez | 2014-10-12 | 2 | -25/+70
  Reviewed-by: Jan Vesely <[email protected]>
* clover: Add semantic information to module::argument for implicit parameter passing. | Francisco Jerez | 2014-10-12 | 1 | -4/+12
  Reviewed-by: Jan Vesely <[email protected]>
* clover: Use unreachable() from util/macros.h instead of assert(0). | Francisco Jerez | 2014-10-11 | 3 | -4/+4
  Reviewed-by: Francisco Jerez <[email protected]>
* gallium: Add tokens for DragonFly BSD. | Vinson Lee | 2014-10-10 | 1 | -0/+6
  Signed-off-by: Vinson Lee <[email protected]>
  Acked-by: Brian Paul <[email protected]>
* ilo: disassemble compacted instructions | Chia-I Wu | 2014-10-11 | 4 | -2/+453
  Signed-off-by: Chia-I Wu <[email protected]>
* glsl: improve accuracy of atan() | Erik Faye-Lund | 2014-10-10 | 1 | -10/+55
  Our current atan()-approximation is pretty inaccurate at 1.0, so let's try to improve the situation by doing a direct approximation without going through atan.
  This new implementation uses an 11th degree polynomial to approximate atan in the [-1..1] range, and the following identity to reduce the entire range to [-1..1]:
    atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)
  This range-reduction idea is taken from the paper "Fast computation of Arctangent Functions for Embedded Applications: A Comparative Analysis" (Ukil et al. 2011).
  The polynomial that approximates atan(x) is:
    x    * 0.9999793128310355 - x^3  * 0.3326756418091246 +
    x^5  * 0.1938924977115610 - x^7  * 0.1173503194786851 +
    x^9  * 0.0536813784310406 - x^11 * 0.0121323213173444
  This polynomial was found with the following GNU Octave script:
    x = linspace(0, 1);
    y = atan(x);
    n = [1, 3, 5, 7, 9, 11];
    format long;
    polyfitc(x, y, n)
  The polyfitc function is not built-in, but too long to include here. It can be downloaded from the following URL:
    http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m
  This fixes the following piglit test:
    shaders/glsl-const-folding-01
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
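As a rough illustration of the scheme in this message, the polynomial and the range reduction can be written in scalar C. This is only a sketch: the commit itself changes the GLSL built-in, and the function names `atan_poly` and `atan_approx`, the Horner-form evaluation, and the choice to apply the identity only when |x| > 1 are assumptions made for the example. The coefficients are the ones quoted above.

```c
/* Polynomial from the commit message, evaluated in Horner form on x^2.
 * Intended for |x| <= 1. */
static float atan_poly(float x)
{
    float x2 = x * x;
    float p = -0.0121323213173444f;
    p = p * x2 + 0.0536813784310406f;
    p = p * x2 - 0.1173503194786851f;
    p = p * x2 + 0.1938924977115610f;
    p = p * x2 - 0.3326756418091246f;
    p = p * x2 + 0.9999793128310355f;
    return p * x;
}

/* Range reduction: atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x),
 * used here only for |x| > 1 so the polynomial argument stays in [-1, 1]. */
static float atan_approx(float x)
{
    const float half_pi = 1.5707963267948966f;
    float abs_x = x < 0.0f ? -x : x;

    if (abs_x <= 1.0f)
        return atan_poly(x);

    return (x < 0.0f ? -half_pi : half_pi) - atan_poly(1.0f / x);
}
```

Summing the six coefficients gives about 0.7853949 at x = 1, against pi/4 ≈ 0.7853982, so the error at the point the message singles out is a few parts in a million.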
* vc4: Use the fnv1 hash function instead of gallium util's crc32. | Eric Anholt | 2014-10-10 | 1 | -2/+3
  Improves simulated norast performance on a little benchmark by 13.4012% +/- 2.08459% (n=13).
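For readers unfamiliar with the hash named in the subject, a minimal 32-bit FNV-1 looks like the sketch below. It uses the standard offset basis and prime; the function name `fnv1_32` is chosen for this example, and whether vc4 uses exactly this variant or width is an assumption, since the message only says "fnv1".

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1: multiply by the FNV prime, then XOR in the next byte.
 * (FNV-1a performs the same two steps in the opposite order.) */
static uint32_t fnv1_32(const void *data, size_t len)
{
    const uint8_t *bytes = data;
    uint32_t hash = 2166136261u;        /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        hash *= 16777619u;              /* FNV prime */
        hash ^= bytes[i];
    }
    return hash;
}
```

The loop does one multiply and one XOR per byte with no lookup table, which is one plausible reason it is cheap for small hash-table keys.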
* vc4: Don't look up the compiled shaders unless state has changed. | Eric Anholt | 2014-10-10 | 3 | -0/+28
  Improves simulated norast performance on a little benchmark by 38.0965% +/- 3.27534% (n=11).
* vc4: Actually clear the context's dirty flags. | Eric Anholt | 2014-10-10 | 1 | -0/+1
  I was trying to skip state updates when !dirty, and suspiciously everything was always dirty.
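The two vc4 commits above fit a common dirty-flag pattern, sketched here with hypothetical names (`struct ctx`, `DIRTY_VS`, `DIRTY_FS`, `update_shaders`, `draw`); none of this is the vc4 code. A counter stands in for the expensive shader lookup so that the effect of clearing the flags is observable.

```c
#include <stdint.h>

/* Hypothetical dirty bits for this sketch. */
enum {
    DIRTY_VS = 1u << 0,
    DIRTY_FS = 1u << 1,
};

struct ctx {
    uint32_t dirty;    /* state changed since the last draw */
    unsigned lookups;  /* counts the expensive shader lookups */
};

/* Only redo the lookup when shader-related state has changed. */
static void update_shaders(struct ctx *ctx)
{
    if (ctx->dirty & (DIRTY_VS | DIRTY_FS))
        ctx->lookups++;
}

static void draw(struct ctx *ctx)
{
    update_shaders(ctx);
    /* Without this reset every draw would still see all state as dirty,
     * and the check in update_shaders() would never skip anything. */
    ctx->dirty = 0;
}
```

With the reset in place, two consecutive draws with no state change in between perform a single lookup.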
* vc4: Optimize the other case of SEL_X_Y with a 0 -> SEL_X_0(a). | Eric Anholt | 2014-10-10 | 1 | -1/+23
  Cleans up some output to be more obvious in a piglit test I'm looking at.
* mesa: fix error reported on glTexSubImage2D when level not valid | Tapani Pälli | 2014-10-10 | 1 | -1/+1
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* i965: Fix register write checks. | Kenneth Graunke | 2014-10-10 | 1 | -0/+2
  When mapping the buffer a second time, we need to use the new pointer, not the one from the previous mapping. Otherwise, we will most likely crash.
  Apparently, we've just been getting lucky and getting the same bo->virtual pointer in both cases. libdrm probably has a hand in that.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
  Cc: [email protected]
* vc4: Optimize out adds of 0. | Eric Anholt | 2014-10-09 | 1 | -0/+26
* vc4: Optimize fmul(x, 0) and fmul(x, 1). | Eric Anholt | 2014-10-09 | 1 | -0/+45
  This was being generated frequently by matrix multiplies of 2 and 3-channel vertex attributes (which have the 0 or 1 loaded in the shader).
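The algebraic rewrites named in these subjects can be sketched on a toy instruction format. Everything here is hypothetical (`struct inst`, `struct operand`, `turn_into_mov`, and this `opt_algebraic` are illustrative, not the vc4 QIR types), and the sketch assumes, as shader compilers often do, that folding x * 0 to 0 is acceptable even though it ignores NaN and infinity.

```c
#include <stdbool.h>

enum opcode { OP_MOV, OP_FMUL, OP_FADD };

/* Toy operand: either an immediate constant or a register number. */
struct operand {
    bool  is_const;
    float value;  /* valid when is_const */
    int   reg;    /* valid when !is_const */
};

struct inst {
    enum opcode    op;
    struct operand src[2];
};

static bool is_const_value(const struct operand *o, float v)
{
    return o->is_const && o->value == v;
}

/* Shared helper: rewrite the instruction as a MOV of one operand. */
static void turn_into_mov(struct inst *inst, struct operand src)
{
    inst->op = OP_MOV;
    inst->src[0] = src;
    inst->src[1] = (struct operand){0};
}

/* Returns true if the instruction was simplified. */
static bool opt_algebraic(struct inst *inst)
{
    for (int i = 0; i < 2; i++) {
        struct operand self = inst->src[i];
        struct operand other = inst->src[1 - i];

        /* fmul(x, 0) -> mov 0 */
        if (inst->op == OP_FMUL && is_const_value(&self, 0.0f)) {
            turn_into_mov(inst, self);
            return true;
        }
        /* fmul(x, 1) -> mov x, fadd(x, 0) -> mov x */
        if ((inst->op == OP_FMUL && is_const_value(&self, 1.0f)) ||
            (inst->op == OP_FADD && is_const_value(&self, 0.0f))) {
            turn_into_mov(inst, other);
            return true;
        }
    }
    return false;
}
```

Checking both operand positions in one loop covers the commutative cases, and routing every rewrite through a single helper keeps the individual rules to a line each.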
* vc4: Factor out the turn-it-into-a-mov in opt_algebraic. | Eric Anholt | 2014-10-09 | 1 | -10/+12
  This will be used more in the next commits.
* vc4: Eliminate unused texture instructions. | Eric Anholt | 2014-10-09 | 1 | -1/+21