| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Different platforms require the offset to be in different units. However,
the generator fixes all of this up for us and only requires an offset in
bytes. Previously, we were getting this wrong all over the place. Some
computed/used it correctly as bytes while others treated the offset as
whole registers or computed it as bytes or bytes*2 in SIMD16 mode. This
commit cleans all this up and makes us properly treat it as bytes
everywhere.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I thought this would be a clever way to make spilling less expensive.
However, it appears that the oword read/write messages we are using for
spilling ignore the execution size and assume SIMD16 whenever working with
more than one register.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We only get here if the VS/GS compiled programs change, but we can even
skip it if the VS/GS size didn't change.
Affects cairo runtime on glamor by -1.26471% +/- 0.674335% (n=234)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just remove the parameter. Silences:
brw_program.c: In function 'brw_dump_ir':
brw_program.c:566:33: warning: unused parameter 'brw' [-Wunused-parameter]
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally I just fixed some unused parameter warnings in this
function. However, Ken pointed out:
"You could instead remove this driver hook. If the dd pointer is
NULL, arbprogram.c will return true. I think I'd prefer that."
Way, way back in time, I think _mesa_GetProgramivARB had the opposite
behavior. Given that it works the way it now works, I also prefer
removing the driver hook.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It was identical to the default implementation in
_mesa_new_shader_program.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes many piglit failures on IVB since 85edaa8.
Signed-off-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85425
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we were allowing the register allocation code to do the
computation for us in ra_set_finalize. However, the runtime for this
computation is O(c^4 * g) where c is the number of classes and g is the
number of GRF registers. However, these q-values are directly computable
based on the way we lay out our register classes so there is no need for
the aweful runtime algorithm.
We were doing ok until commit 7210583eb where we bumped the number of
register classes from 11 to 16. While startup times don't normally matter,
this caused piglit to take 4 times as long to run on Bay Trail. This patch
should make generating the ra_set much faster and melt the piglit run
times.
v2: Fixed a couple of bugs. I have now verified that the same q-values are
generated both ways.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
On older GENs in SIMD16 mode, we were accidentally building too much
interference into our register classes. Since everything is divided by 2,
the reigster allocator thinks we have 64 base registers instead of 128.
The actual GRF mapping still needs to be doubled, but as far as the ra_set
is concerned, we only have 64. We were accidentally adding way too much
interference.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For GEN6 SIMD16 mode, we have to 2-align all the registers, so we only have
the even-numbered ones. This means that we have to divide the register
number by 2 when we precolor. This wasn't a problem before because we were
setting up the interference between ra_node registers wrong. This will be
fixed in the next commit.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Restore clip control to the default state if MESA_META_VIEWPORT
or MESA_META_DEPTH_TEST is requested.
v3:
Handle clip control state with MESA_META_TRANSFORM.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is for preparation of ARB_clip_control.
v3:
Add comments.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
The compiler isn't privy to the knowledge that we're doing at least one
framebuffer write.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we generated an extra CMP instruction:
cmp.ge.f0(8) g6<1>D g1<0,4,1>F 0F
cmp.nz.f0(8) null g6<4,4,1>D 0D
(+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F
The first operand is always a boolean, and we want to predicate the SEL
on that. Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.
Now we generate:
cmp.ge.f0(8) null g1<0,4,1>F 0F
(+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F
No difference in shader-db.
v2: Remember to delete the old code (thanks Matt).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using dst_reg(this, ir->type) automatically sets the writemask to the
proper size for the type; src_reg(dst_reg) preserves that. This should
be equivalent, but less code.
Note that src_reg(dst_reg) either uses SWIZZLE_XXXX or SWIZZLE_XYZW, so
the old code did need the manual writemask adjustment, since it
constructed the registers the other way around.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Nothing uses the vector_elements temporary variable.
Setting this->result.file is dead because we overwrite this->result a
few lines later.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we generated an extra CMP instruction:
cmp.ge.f0(8) g4<1>D g2<0,1,0>F 0F
cmp.nz.f0(8) null g4<8,8,1>D 0D
(+f0) sel(8) g120<1>F g2.4<0,1,0>F g3<0,1,0>F
The first operand is always a boolean, and we want to predicate the SEL
on that. Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.
Now we generate:
cmp.ge.f0(8) null g2<0,1,0>F 0F
(+f0) sel(8) g124<1>F g2.4<0,1,0>F g3<0,1,0>F
total instructions in shared programs: 5473459 -> 5473253 (-0.00%)
instructions in affected programs: 6219 -> 6013 (-3.31%)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Suppresses a bunch of warning noise about sample_map possibly being used
uninitialized.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we used the a signed d-word for booleans and the immedates we
emitted varried between signed and unsigned. This commit changes the type
to unsigned (I think that makes more sense) and makes immediates more
consistent. This allows copy propagation to work better cleans up some
instructions.
total instructions in shared programs: 5473519 -> 5465864 (-0.14%)
instructions in affected programs: 432849 -> 425194 (-1.77%)
GAINED: 27
LOST: 0
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
gl_SampleID is a built-in variable that always is of type "int".
Suggested by Connor Abbott.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
NewBufferObject took a "target" parameter, which it blindly passed to
_mesa_initialize_buffer_object(), which ignored it.
Not much point in passing it around.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
This is used to implement GLSL's atomicCounter() intrinsic. Previously
it *worked*, but the disassembly was bogus.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
This would have *almost never* actually been an issue, since other state
tends to get flagged at the same time as new ABOs -- but still bogus.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
This didn't make any sense, but papered over the missing TexBO flagging
we've just fixed, in a bunch of cases.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that we've made all the texture emit code mostly independent of GLSL
IR, this isn't necessary any more.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we had 3 different emit functions for various different gen's,
as well as some ancilliary work that was the same across all gen's which
was either contained in functions or duplicated across the GLSL IR and
Mesa IR backends. Now, we have a single method, emit_texture(), that
takes all the information needed to make a texture instruction and
handles all the setup, and all we have to do to emit a texture
instruction while converting from GLSL IR, Mesa IR, or any new backend
is to extract the information emit_texture() needs and then call it.
v2: Significant rebasing (by Ken).
Signed-off-by: Connor Abbott <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
Our new IR won't have ir_texture objects.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
Our new IR won't have ir_texture objects.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This happened to work before, but it would convert the output to a float
and then back to an integer which seems bad.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
At this point, the only thing it's used for is the opcode.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
We already have the type from the original destination.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This drops a dependency on ir_texture objects.
v2 (Ken): Rename lod_components to grad_components, as it only has a
meaningful value for ir_txd. We could set it to 1 for TXL,
but there's no real need.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
This drops a dependency on ir_texture objects.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Our new IR won't have ir_texture objects, but using glsl_type is fine.
v2 (Ken): Drop redundant ir->coordinate NULL check; rebase.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
| |
Our new IR won't have ir_texture objects.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
| |
This is slightly clearer. Based on a patch by Connor Abbott.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
Our new IR won't have ir_texture objects.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Ken): Refactor the Gen7 code separately; rebase.
Signed-off-by: Connor Abbott <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the handling of non-constant texel offset subexpression trees
to the place where we visit other such subtrees. It also removes some
uses of ir->offset in emit_texture_gen7, which will be useful when we
write the backend for our new upcoming IR.
Based on a patch by Connor Abbott.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
| |
brw_lower_unnormalized_offset sets ir->offset to NULL if it applies the
texelFetchOffset workarounds, so there's no need to special case it
here---there won't be an offset for ir_txf.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|