| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
The intel_context and tiling parameters were not used by any of the
i9[14]5_miptree_layout functions or the functions they call, and the tiling
parameter was not used by brw_miptree_layout. Remove the unnecessary parameters.
|
|
|
|
|
| |
Further reduces instruction count by 4.0% in 40.7% of the vertex
shaders.
|
|
|
|
|
| |
This only occurs for GRFs, and hasn't mattered until now because we
only copy propagated non-GRFs.
|
|
|
|
| |
Removes 2.0% of the instructions from 35.7% of vertex shaders in shader-db.
|
|
|
|
|
|
|
|
|
|
|
| |
This differs from the FS in that we track constants in each
destination channel, and we have to look at all the swizzled source
channels. Also, the instruction stream walk is done in an O(n) manner
instead of O(n^2).
Across shader-db, this removes 8.0% of the instructions from 60.0% of
the vertex shaders, leaving us now behind the old backend by 11.1%
overall.
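As an illustration of the per-channel tracking described above, here is a minimal sketch in C. It is not the code from this commit: the struct and function names are hypothetical, registers are modeled as four float channels, and only immediate MOVs are treated as constant sources. Tracked state is updated as each instruction is visited, which is what allows a single forward walk over the instruction stream.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model; not Mesa's actual data structures. */
struct chan_const {
    bool known;   /* true if this channel currently holds a known constant */
    float value;  /* the constant, valid only when known is true */
};

/* One tracked four-channel register (x, y, z, w). */
struct reg_consts {
    struct chan_const chan[4];
};

/* Record "MOV dst.<writemask>, imm": only the written channels become known. */
static void record_mov_imm(struct reg_consts *dst, unsigned writemask, float imm)
{
    for (int c = 0; c < 4; c++) {
        if (writemask & (1u << c)) {
            dst->chan[c].known = true;
            dst->chan[c].value = imm;
        }
    }
}

/* Any other write invalidates the channels it touches. */
static void record_other_write(struct reg_consts *dst, unsigned writemask)
{
    for (int c = 0; c < 4; c++) {
        if (writemask & (1u << c))
            dst->chan[c].known = false;
    }
}

/*
 * A source read through swizzle[0..3] can be replaced by one immediate only
 * if every channel the swizzle selects is known and all hold the same value.
 */
static bool try_propagate(const struct reg_consts *src, const int swizzle[4],
                          float *imm_out)
{
    for (int i = 0; i < 4; i++) {
        assert(swizzle[i] >= 0 && swizzle[i] < 4);
        const struct chan_const *ch = &src->chan[swizzle[i]];
        if (!ch->known || ch->value != src->chan[swizzle[0]].value)
            return false;
    }
    *imm_out = src->chan[swizzle[0]].value;
    return true;
}
```

With only .xy written by an immediate MOV, a source swizzled as .xxyy can be propagated while .xyzw cannot, because z and w are still unknown.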
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tracking virtual GRFs has tension between using a packed array per
virtual GRF (which is good for register allocation), and sparse arrays
where there's an element per actual register (so the first and second
column of a mat2 can be distinguished inside of an optimization pass).
The FS mostly avoided the need for this second sparse array by doing
virtual GRF splitting, but that meant that in instances where virtual
GRF splitting didn't work, instructions using those registers got much
less optimized.
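One way to picture the two views is a packed size array plus a prefix sum that maps (virtual GRF, register offset) to a slot in a per-register array. The sketch below assumes that arrangement; the struct, field, and function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>

#define MAX_VGRFS 64

/* Hypothetical bookkeeping; names are illustrative, not Mesa's. */
struct vgrf_table {
    int count;                    /* number of virtual GRFs */
    int size[MAX_VGRFS];          /* packed view: registers used by each virtual GRF */
    int first_reg[MAX_VGRFS + 1]; /* prefix sums of size[]; last entry is the total */
};

/* Build the prefix sums that link the packed view to the per-register view. */
static void build_first_reg(struct vgrf_table *t)
{
    assert(t->count >= 0 && t->count <= MAX_VGRFS);
    t->first_reg[0] = 0;
    for (int i = 0; i < t->count; i++)
        t->first_reg[i + 1] = t->first_reg[i] + t->size[i];
}

/*
 * Index into an array that has one element per actual register, so that an
 * optimization pass can tell column 0 of a mat2 apart from column 1.
 */
static int flat_index(const struct vgrf_table *t, int vgrf, int reg_offset)
{
    assert(vgrf >= 0 && vgrf < t->count);
    assert(reg_offset >= 0 && reg_offset < t->size[vgrf]);
    return t->first_reg[vgrf] + reg_offset;
}
```

The register allocator would keep working on size[], while a pass that needs per-column precision indexes its own state with flat_index().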
|
|
|
|
|
|
|
|
|
|
|
| |
Now instead of env INTEL_NEW_VS=1 to get it, you need INTEL_OLD_VS=1
to not get it. While it's not quite to the same codegen efficiency as
the old backend, it is not regressing piglit on G965 and G45, and
actually fixing bugs on gen6, and the remaining codegen quality
regressions all appear tractable.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes glsl-vs-uniform-array-4.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33742
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We don't generally expect uniform accesses to be removed as dead code
at this point, and we will want to have uniforms packed before
spilling them out to pull constants when we are forced to do that.
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes assertion failure from double-free in oglc
glsl-arrayobject constructor.declaration.structure
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The offset to the arrays after the first was mis-scaled, so we'd go
access off the end of the surface and read 0s. Fixes
glsl-vs-uniform-array-3.
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
While we had nice debug output for most of the instruction stream, it
was terminated by a series of anonymous MOVs and a send.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It maps to MESA_FORMAT_RGBA8888_REV. Surfaces of the format can only be
sampled from but not rendered to.
Only i915 is tested.
Reviewed-by: Eric Anholt <[email protected]>
[olv: add a check in intel_image_target_renderbuffer_storage]
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
The opcodes and strings were reversed. Quotient means division, and
modulus means remainder.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
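To make the distinction concrete, here is a small sketch in C. The enum values and the name table are hypothetical stand-ins, not the driver's real opcode definitions. Designated initializers tie each string to its opcode, which makes this kind of swap harder to introduce.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical opcode names, for illustration only. */
enum int_math_op {
    OP_INT_QUOTIENT,   /* a / b */
    OP_INT_REMAINDER,  /* a % b */
    OP_COUNT
};

/* Each string is bound to its opcode by index, not by position in the list. */
static const char *const op_name[OP_COUNT] = {
    [OP_INT_QUOTIENT]  = "quotient",
    [OP_INT_REMAINDER] = "remainder",
};

/* Evaluate the integer operation the opcode names. */
static int eval_int_math(enum int_math_op op, int a, int b)
{
    assert(b != 0);
    return op == OP_INT_QUOTIENT ? a / b : a % b;
}
```

For 7 and 2, the quotient is 3 and the remainder is 1.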
|
|
|
|
|
|
|
|
|
| |
In particular, S3TC compressed textures need align_h == 4.
Fixes skybox errors in Quake 4 and FEAR.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34628
Signed-off-by: Kenneth Graunke <[email protected]>
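S3TC stores texels in 4x4 blocks, which is why a vertical alignment of 4 matters for these textures. The sketch below shows the rounding involved; align_up() and s3tc_block_rows() are hypothetical helper names, not functions from the driver.

```c
#include <assert.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static unsigned align_up(unsigned v, unsigned a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* S3TC compresses texels in 4x4 blocks. */
#define S3TC_BLOCK_DIM 4u

/* Number of block rows needed to cover a texture of the given height. */
static unsigned s3tc_block_rows(unsigned height)
{
    return align_up(height, S3TC_BLOCK_DIM) / S3TC_BLOCK_DIM;
}
```

A 6-texel-tall mip level, for example, still occupies two full block rows, so its height has to be laid out as 8.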
|
|
|
|
|
|
| |
Fixes glsl-vs-point-size.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This is required to ensure ordering between reads and writes within a
thread.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We were failing to relocate, so on the first draw run our scratch
would tend to get written to 0x0.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We were passing an MRF as the source argument, instead of using the
implied move and putting the MRF number in the proper place in the
instruction encoding.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
The second vertex was getting a garbage index.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes a giant pile of VS tests on gen4.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
On the old backend, we used scalar mode because Mesa IR math is
result.xyzw = math(op0.xxxx), which matched up well. However, in GLSL
IR we do things like result.xy = math(op0.xy), so we want vector mode.
For the common case of result.x = math(op0.x), performance will be the
same (no cost for un-executed channels), though result.xyzw =
math(op0.xxxx) would be worse.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes vs-pow-float-float and friends.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
When we tried to retype a brw_null_reg() in CMP(), the retyping didn't
take effect because HW_REG just ignores the type field.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
If you get your total GRF count wrong, you write over some other
shader's g0, and the GPU fails shortly thereafter.
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Since we now lay out the VUE the same way regardless of whether
two-sided color is enabled, brw_compute_vue_map() no longer needs to
know whether two-sided color is enabled. This allows the two-sided
color flag to be removed from the clip, GS, and VS keys, so that fewer
GPU programs need to be recompiled when turning two-sided color on and
off.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When doing two-sided color on GEN6+, we use the SF unit's
INPUTATTR_FACING mode to cause front colors to be used on front-facing
triangles, and back colors to be used on back-facing triangles. This
mode requires that the front and back colors be adjacent in the VUE.
Previously, we would only place front and back colors adjacent in the
VUE when two-sided color was enabled. Now we place them adjacent in
the VUE whether two-sided color is enabled or not. (We still only
swizzle the colors when two-sided color is enabled, so there should be
no user-visible change).
This simplifies the implementation of the VUE map and reduces the
amount of code that is dependent on two-sided color mode.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The previous computation had two bugs: (a) it used a formula based on
Gen5 for Gen6 and Gen7 as well, and (b) it failed to account for the fact
that PSIZ is stored in the VUE header. Fortunately, both bugs caused
it to compute a URB size that was too large, which was benign. This
patch computes the URB size directly from the VUE map, so it gets the
result correct in all circumstances.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
The variables offset[], idx_to_attr[], nr_bytes, nr_attrs, and
header_regs were all serving purposes which are now served by the VUE
map.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, brw_clip_interp_vertex() iterated only through the
"non-header" elements of the VUE when performing interpolation
(because header elements don't need interpolation). This code now
refers exclusively to the VUE map to figure out which elements need
interpolation, so that brw_clip_interp_vertex() doesn't need to know
the header size.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch replaces some ad-hoc computations using ATTR_SIZE and the
offset[] array with calls to the VUE map functions
brw_vert_result_to_offset() and brw_vue_slot_to_offset().
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously we would examine the offset[] array (since an offset of 0
meant "not in use"). This paves the way for removing the offset[]
array.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
This makes header_regs available for computing VUE offsets within clip code.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The offsets within the VUE of HPOS and NDC are needed only in a few
auxiliary clipping functions. This patch moves computation of those
offsets into the functions that need them, and does the computation
using the VUE map.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
map.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch changes get_attr_override() (which computes the
relationship between vertex shader outputs and fragment shader inputs)
to use the VUE map.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the variables nr_attrs and nr_setup_attrs, whose
purpose is now being served by the VUE map. nr_attr_regs and
nr_setup_regs are still needed, however they are now computed using
the VUE map rather than by counting the number of vertex shader
outputs (which caused subtle bugs when gl_PointSize was written).
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously, the SF used nr_setup_attrs to determine whether it was
looking at the last element of the VUE. Changed this code to use the
VUE map.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
These data structures were serving the same purpose as the VUE map,
but were buggy. Now that the code has been transitioned to use the
VUE map, they are not needed.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously, SF code used the idx_to_attr[] array to compute the
location of entries in the VUE map. This array didn't properly
account for gl_PointSize. Now we use the VUE map directly.
Reviewed-by: Eric Anholt <[email protected]>
|