| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This required the following changes:
- WM setup now makes the appropriate set of barycentric coordinates
(perspective vs. noperspective) available to the fragment shader,
based on whether the shader requires perspective interpolation,
noperspective interpolation, both, or neither.
- The fragment shader backend now uses the appropriate set of
barycentric coordiantes when interpolating, based on the
interpolation mode returned by
ir_variable::determine_interpolation_mode().
- SF setup now uses gl_fragment_program::InterpQualifier to determine
which attributes are to be flat shaded (as opposed to the old logic,
which only flat shaded colors).
- CLIP setup now ensures that the clipper outputs non-perspective
barycentric coordinates when they are needed by the fragment shader.
Fixes the remaining piglit tests of interpolation qualifiers that were
failing:
- interpolation-flat-*-smooth-none
- interpolation-flat-other-flat-none
- interpolation-noperspective-*
- interpolation-smooth-gl_*Color-flat-*
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes how fs_visitor::emit_general_interpolation()
decides what kind of interpolation to do. Previously, it used the
shade model to determine how to interpolate colors, and used smooth
interpolation on everything else. Now it uses
ir_variable::determine_interpolation_mode(), so that it respects GLSL
1.30 interpolation qualifiers.
Fixes piglit tests interpolation-flat-*-smooth-{distance,fixed,vertex}
and interpolation-flat-other-flat-{distance,fixed,vertex}.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch modifies the fragment shader back-end so that instead of
using a single delta_x/delta_y register pair to store barycentric
coordinates, it uses an array of such register pairs, one for each
possible intepolation mode.
When setting up the WM, we intstruct it to only provide the
barycentric coordinates that are actually needed by the fragment
shader--that is computed by brw_compute_barycentric_interp_modes().
Currently this function returns just
BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC, because this is the only
interpolation mode we support. However, that will change in a later
patch.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch modifies the special case in
fs_visitor::split_virtual_grfs() that prevents splitting from being
applied to the delta_x/delta_y register pair (this register pair needs
to remain contiguous so that it can be used by the PLN instruction).
When gen>=6, this register pair is in a fixed location, not a virtual
register, so it was in no danger of being split. And
split_virtual_grfs' attempt not to split it was preventing some other
unrelated register from being split.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The condmod instruction ends up generating garbage condition codes,
because apparently the comparison happens on the accumulator value (33
bits for UD), not the truncated value that would be written.
Fixes fs-op-neg-*
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done
Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.
Finally, I cleaned up some whitespace issues introduced by the change.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chad Versace <[email protected]>
Acked-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When updating a register reference to reflect the fact that we were
taking its absolute value, the fragment shader back-end failed to
clear the negate flag, resulting in abs(-x) getting computed as
-abs(x).
I also found (and fixed) a similar problem in brw_eu.h, but I'm not
aware of an actual manifestation of that problem.
Fixes piglit test glsl-fs-abs-neg-with-intermediate.
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Apparently on Gen4 and 5, the denominator comes first.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Ian Romanick <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Ian Romanick <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, this conversion was duplicated in several places in the
i965 driver. This patch moves it to a common location in mtypes.h,
near the declaration of gl_vert_result and gl_frag_attrib.
I've also added comments to remind us that we may need to revisit the
conversion code when adding elements to gl_vert_result and
gl_frag_attrib.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Replace each occurence of
#include "../glsl/*.h"
with
#include "glsl/*.h"
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ctx->Const.NativeIntegers is set, Core Mesa loads integer/boolean
uniforms directly, rather than loading the floating point equivalent.
So, when that's set, we don't need to perform any conversions.
Unfortunately, we can't properly support native integers with the old
vertex shader backend, so this patch leaves them disabled for now.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Link failure is something that shouldn't happen, but we sometimes want
it during development. The precompile also allows analysis of shader
codegen with shader-db.
|
|
|
|
| |
I want to just use the same enums in the VS.
|
|
|
|
|
| |
This should make gdbing more pleasant, and it might be used in sharing
part of the codegen between the VS and FS backends.
|
|
|
|
|
|
|
| |
"reg" was set in only one case, virtual GRFs pre register allocation,
and would be unset and have hw_reg set after allocation. Since we
never bothered with looking at virtual GRF number after allocation
anyway, just use the same storage and avoid confusion.
|
|
|
|
|
|
|
| |
This was a debugging aid at one point -- virtual grf 0 should never be
allocated, and it would be used if undefined register access occurred
in codegen. However, it made the confusing register allocation code
even more confusing by indexing things off of 1 all over.
|
|
|
|
|
|
|
|
|
| |
This saves both register space and upload bandwidth for unused values.
Note that previously we were relying on the visitor not initially
generating references to different sets of uniforms between the 8-wide
and 16-wide code generation, and now we're relying on them dead-code
eliminating the same stuff, too.
|
|\
| |
| |
| |
| |
| | |
Conflicts:
src/mesa/state_tracker/st_atom_pixeltransfer.c
src/mesa/state_tracker/st_program.c
|
| | |
|
| |
| |
| |
| | |
Removes 0.8% of the fragment shader instructions on Unigine Tropics.
|
| |
| |
| |
| |
| | |
This appears in our instruction stream as a result of the
brw_vs_constval.c handling.
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
This is part of fixing a ~1% performance regression in OpenArena when
changing the fixed function fragment shader to using the new backend.
Right now this just avoids the LINTERP of the projector, not the math
using it.
|
| |
| |
| |
| |
| | |
The old style has gone out of favor in the project, but I kept copy
and pasting from existing iterator code.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Our hardware doesn't have a sample_d_c message, so we have to do a
regular sample_d and emit instructions to manually perform the
comparison.
This requires a state dependent recompile whenever the sampler's compare
mode or function change. This adds the per-sampler comparison functions
to brw_wm_prog_key, but only sets them when the sampler's compare mode
is GL_COMPARE_R_TO_TEXTURE (i.e. only for shadow sampling).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Prior to this patch, it would attempt to optimize and allocate registers
for the program even if it failed to compile. This seems wasteful.
More importantly, the "message length > 11" failure seems to choke the
instruction scheduler, making it somehow use an undefined value and
segmentation fault.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|/
|
|
|
|
|
|
|
|
| |
There will be a little bit of thrashing of the program cache BO as the
cache warms up, but once the application is in steady state, this
reduces relocations on gen5 and later.
On my T420 laptop, cairogl firefox-talos-gfx performance improves 2.6%
+/- 1.3% (n=6). No statistically significant performance difference
on nexuiz (n=5).
|
|
|
|
|
|
|
| |
A trivial fix for error: format not a string literal and no format
arguments with compiling with -Werror=format-security flags.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids the extra CMP and the predication on SEL, so in addition
to one less instruction, it makes scheduling less constrained.
Improves glbenchmark Egypt performance 0.6% +/- 0.2% (n=3). Reduces
FS instruction count across affected shaders in shader-db by 1.3%
without regressing any.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
At glLinkShaders time, a fail() call in FS compile in 8-wide (the one
that's required to succeed, though we may relax that at some point for
pre-Ironlake performance) will now report out as a link error.
|
|
|
|
|
|
|
|
|
| |
We now have:
brw_fs.cpp handles calling out to everything and optimization.
brw_fs_visitor.cpp handles translating to our LIR.
brw_fs_emit.cpp handles emitting from our LIR to native code.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This is all separate from the visitor and the optimization passes
which feed into it.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
These will be used by the VS backend as well.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
These only existed in brw_fs.cpp because it was the only .cpp file in
the area when I wrote them.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
No statistically significant difference measured in 3dbenchmark
egypt/pro. It does reduce fragment shader instructions across
shader-db by 0.3%.
|
|
|
|
|
|
|
| |
No net code size change, but unit update is down 0.8% code size
pre-gen6.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's an assumption here that fixed GRFs will never intersect with
the allocated GRFs. That's true today, though it might change some
day if we decide to register-allocate the regs containing push
constants once they're dead.
This fixes a regression in 0f7325b89038937bd428f7c89ed9859189a0ab0b in
Lightsmark from the texture instructions now containing g0 references
instead of having that be implied. Performance is improved 15.2% +/-
3.6% (n=3).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34968
|
|
|
|
|
|
|
|
|
|
| |
From the GL 2.1 spec:
"Required perspective-correct interpolation for all fragment
attributes except depth in sections 3.4.1 and 3.5.1, effectively
making GL PERSPECTIVE CORRECT HINT a no-op."
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
I was using undefined values to create an unused value. Go me.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37366
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes a regression since 90e922267a89fa9bef254bb257405531ceff7356.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The coordinate offsets set in the m1 header are for textureOffset;
they have nothing to do with textureGrad (TXD).
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Improves glbenchmark egypt performance 0.6% +/- 0.4% (n=6).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
In the 16-wide rework, I missed that we were setting some things to be
SIMD16 mode (corresponding to their setup in emit_texture_gen4()).
Reviewed-by: Kenneth Graunke <[email protected]>
|