| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no reason to stall on pwrite - the CPU always appends to the
buffer and never modifies existing contents, and the GPU never writes
it. Further, the CPU always appends new data before submitting a batch
that requires it.
This code predates the unsynchronized mapping feature, so we simply
didn't have the option when it was written.
Ideally, we would do this for non-LLC platforms too, but unsynchronized
mapping support only exists for LLC systems.
Saves a bunch of stall avoidance copies when uploading shaders.
v2: Rebase on changes to previous patch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]> [v1]
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't really want unnecessary buffer copying, so it'd be nice to know
when it's happening.
v2: Drop stall warnings when doing a read-only CPU mapping of the cache
BO. The GPU also uses it in a read-only fashion, so there won't be
any stalls, even though the buffer is busy. (Thanks to Chris Wilson
for catching this mistake.)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]> [v1]
|
|
|
|
|
|
|
|
| |
This is easy: we just need to use brw_map_bo instead of mapping it
directly.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If the VS doesn't output a value that the FS needs, we still need to read
the right contents for the remaining FS inputs, by emitting padding. And
if the VS outputs something the FS doesn't need, we shouldn't put it in
the VPM at all (so the code producing it can get DCEed).
Fixes 77 piglit tests.
|
|
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
We don't have GALLIUM_STATE_TRACKERS_DIRS any more.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
This reverts commit bbe6f7f865cd4316b5f885507ee0b128a20686eb.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Not as big of a deal as SSG, but still +9 piglit tests.
|
| |
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit fa98c74692634de4f87694a40a299b59c4716ee5)
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 088d3501786a2ff0833de45951b63acbe6560a0f)
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 52bd154980e306b8bc9b9d2edc0e728a9f8f3bf6)
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 9f1149876f2d010c871751a53d02d4d2b6aef1fe)
|
|
|
|
|
|
|
|
| |
Also fixes two sided lighting which was broken at least
on pre-evergreen by commit b1eb00.
Signed-off-by: Glenn Kennard <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is the last use tgsi_parse_token in radeonsi.
It looks ugly because the code was re-indented, but there is really no change
in behavior.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
| |
It's redundant now.
It led to a simplification in si_llvm_emit_streamout, because outidx == reg.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
That code was really ugly.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
They were reinventing tgsi_shader_info. They are unused now.
radeon_llvm_context::load_input can be NULL if input fetching is implemented
in some other way.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Those are going away.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
tgsi_shader_info contains everything we need.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Both cases are equivalent.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Written CLIPDIST outputs are simply disabled in PA_CL_VS_OUT_CNTL.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
No code in Mesa sets the usage mask to any other value.
The final mask is AND'ed with enable bits from the rasterizer state anyway.
If somebody implements setting usage masks in st/mesa, we can use
tgsi_shader_info to get it more easily.
This is a prerequisite for the following commit.
Reviewed-by: Michel Dänzer <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
[ Francisco Jerez: Split off from a larger patch, and take a slightly
different approach for passing the implicit arguments around. ]
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
| |
arguments.
Reviewed-by: Jan Vesely <[email protected]>
|
|
|
|
|
|
| |
passing.
Reviewed-by: Jan Vesely <[email protected]>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Vinson Lee <[email protected]>
Acked-by: Brian Paul <[email protected]>
|
|
|
|
| |
Signed-off-by: Chia-I Wu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our current atan()-approximation is pretty inaccurate at 1.0, so
let's try to improve the situation by doing a direct approximation
without going through atan.
This new implementation uses an 11th degree polynomial to approximate
atan in the [-1..1] range, and the following identitiy to reduce the
entire range to [-1..1]:
atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)
This range-reduction idea is taken from the paper "Fast computation
of Arctangent Functions for Embedded Applications: A Comparative
Analysis" (Ukil et al. 2011).
The polynomial that approximates atan(x) is:
x * 0.9999793128310355 - x^3 * 0.3326756418091246 +
x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 +
x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444
This polynomial was found with the following GNU Octave script:
x = linspace(0, 1);
y = atan(x);
n = [1, 3, 5, 7, 9, 11];
format long;
polyfitc(x, y, n)
The polyfitc function is not built-in, but too long to include here.
It can be downloaded from the following URL:
http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m
This fixes the following piglit test:
shaders/glsl-const-folding-01
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Improves simulated norast performance on a little benchmark by 13.4012%
+/- 2.08459% (n=13).
|
|
|
|
|
| |
Improves simulated norast performance on a little benchmark by 38.0965%
+/- 3.27534% (n=11).
|
|
|
|
|
| |
I was trying to skip state updates when !dirty, and suspiciously
everything was always dirty.
|
|
|
|
| |
Cleans up some output to be more obvious in a piglit test I'm looking at.
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Juha-Pekka Heikkila <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When mapping the buffer a second time, we need to use the new pointer,
not the one from the previous mapping. Otherwise, we will most likely
crash.
Apparently, we've just been getting lucky and getting the same
bo->virtual pointer in both cases. libdrm probably has a hand in that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: [email protected]
|
| |
|
|
|
|
|
| |
This was being generated frequently by matrix multiplies of 2 and
3-channel vertex attributes (which have the 0 or 1 loaded in the shader).
|
|
|
|
| |
This will be used more in the next commits.
|
| |
|