| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
We used to steal it out of the brw_compile struct, but that won't be
initialized in time soon (and is eventually going away).
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
| |
We used to steal it out of the brw_compile struct...but vec4_visitor
isn't going to have one of those in the future.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This leaves only the final code generation stage in brw_vec4_emit.cpp,
moving the payload setup, run(), and brw_vs_emit functions to brw_vec4.cpp.
The fragment shader backend puts these functions in brw_fs.cpp, so this
patch also helps with consistency.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I ran across this while running a glGenerateMipmap() test.
_meta_GenerateMipmap sets MESA_META_TRANSFORM, which causes
_mesa_meta_begin to try and set a default orthographic projection.
Unfortunately, if the drawbuffer isn't set up, ctx->DrawBuffer->Width
and Height are 0, which just causes an GL_INVALID_VALUE error.
Fixes oglconform's fbo/mipmap.automatic, mipmap.manual, and
mipmap.manualIterateTexTargets.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes a ton of piglit regressions since the depthstencil fixes for gen6+.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57309
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965,
and the new piglit test glsl-fs-clamp-5.
We were trying to emit a saturating move into a uniform, which the code
generator appropriately choked on. This was broken in the change in
32ae8d3b321185a85b73ff703d8fc26bd5f48fa7.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166
NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The brw_compile structure contains the brw_instruction store and the
brw_eu_emit.c state tracking fields. These are only useful for the
final assembly generation pass; the earlier compilation stages doesn't
need them.
This also means that the code generator for future hardware won't have
access to the brw_compile structure, which is extremely desirable
because it prevents accidental generation of Gen4-7 code.
v2: rzalloc p, as suggested by Eric.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Compiling shaders requires several main steps:
1. Generating FS IR from either GLSL IR or Mesa IR
2. Optimizing the IR
3. Register allocation
4. Generating assembly code
This patch splits out step 4 into a separate class named "fs_generator."
There are several reasons for doing so:
1. Future hardware has a different instruction encoding. Splitting
this out will allow us to replace fs_generator (which relies
heavily on the brw_eu_emit.c code and struct brw_instruction) with
a new code generator that writes the new format.
2. It reduces the size of the fs_visitor monolith. (Arguably, a lot
more should be split out, but that's left for "future work.")
3. Separate namespaces allow us to make helper functions for
generating instructions in both classes: ADD() can exist in
fs_visitor and create IR, while ADD() in fs_generator() can
create brw_instructions. (Patches for this upcoming.)
Furthermore, this patch changes the order of operations slightly.
Rather than doing steps 1-4 for SIMD8, then 1-4 for SIMD16, we now:
- Do steps 1-3 for SIMD8, then repeat 1-3 for SIMD16
- Generate final assembly code for both modes together
This is because the frontend work can be done independently, but final
assembly generation needs to pack both into a single program store to
feed the GPU.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Final code generation should never fail. This is a bug, and there
should be no user-triggerable cases where this could occur.
Also, we're not going to have a fail() method in a moment.
v2: Just abort() rather than assert, to cover the NDEBUG case
(suggested by Eric).
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
All we really need is a memory context and the instruction list; passing
a backend_visitor is just convenient at times.
This will be necessary two patches from now.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The brw_compile structure is closely tied to the Gen4-7 hardware
encoding. However, do_wm_prog is very generic: it just calls out to
get a compiled program and then uploads it.
This isn't ultimately where we want it, but it's a step in the right
direction: it's now closer to the code generator.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
We used to steal it out of the brw_compile struct...but fs_visitor
isn't going to have one of those in the future.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
Also change it from a brw_fragment_program to a gl_fragment_program,
since that seems to be what everything wants anyway.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We can easily recover it from prog, and this makes it clear that we
aren't passing additional information in.
v2: Use an if-statement rather than the ?: operator (suggested by Eric).
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, rather than having brw_wm_fs_emit poke at it directly, make it a
parameter to the fs_visitor constructor.
All other changes generated by search and replace (with occasional
whitespace fixup).
v2: Make dispatch_width const (as suggested by Paul); fix doxygen
mistake (pointed out by Eric); update for rebase.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
This necessitates compiling brw_wm_iz.c as C++.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that we only have the one backend, there's no real point in keeping
this separate. Moving it should allow some future simplifications.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Everybody determines this by checking if fp's OutputsWritten field
contains the FRAG_RESULT_DEPTH bit. Rather than having payload setup
check this and set the computes_depth flag, we can just do the check in
the only place that actually used it: emit_fb_writes().
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Kayden): Move the enable into an existing intel->gen >= 4 block
(as suggested by Ian).
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implements BGRA swizzle, sign recovery, and normalization
as required by ARB_vertex_type_10_10_10_2_rev.
V2: Ported to the new VS backend, since that's all that's left;
fixed normalization.
V3: Moved fixups out of the GLSL-only path, so it works for FF/VP too.
V4 (Kayden): Rework ES3 normalization, don't heap allocate registers;
tidy comments.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Flag the need for various workarounds to be applied by
the vertex shader.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Next few patches build on this to add other workarounds
for packed formats.
V2: rename BRW_ATTRIB_WA_COMPONENTS to BRW_ATTRIB_WA_COMPONENT_MASK;
V3 (Kayden): remove separate bit for ES3 signed normalization
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Always use R10G10B10A2_UINT; Most of the other formats we'd like
don't actually work on the hardware. Will emit w/a for scaling,
sign recovery and BGRA swizzle in the VS.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We can't support IF statements in 16-wide on these. To get back to 16-wide
for these shaders, we need to support predicate on discard instructions in the
backend IR, which is something we've sort of got on the list to do anyway.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55828
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Commit 774fb90db3e83d5e7326b7a72e05ce805c306b24 introduced a ralloc context to
each user of struct brw_compile, but for this one a NULL context was used,
causing the later ralloc_free(mem_ctx) to not do anything.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55175
NOTE: This is a candidate for the stable branches.
|
|
|
|
|
|
|
|
|
|
| |
We have a special case where non-shadow comparison with LOD requires using a
SIMD16 vec4 in an 8-wide shader, which appears in the register allocator as a
size 8 vgrf.
Fixes assertions in various piglit tests and webgl conformance.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56521
|
|
|
|
| |
Signed-off-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The bug was found by Coverity.
NOTE: This is a candidate for the stable branches.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
v2: Rebase on gen6-if fix.
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
|
|
|
|
|
|
|
|
| |
This gives us checking of our arguments (no more passing 1 operand to
BRW_OPCODE_MUL!), at the cost of a couple of extra parens.
v2: Rebase on gen6-if fix.
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was a regression in the brw_fs_fp.cpp change. We just need to return
something good enough to get the IR generation to the end without crashing,
but ir->type isn't initialized and we wanted something of the coordinate's
type anyway.
Fixes around 30 piglit cases on my ilk system in drawpixels and framebuffer
blit.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56962
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The theory of the guardband is that you extend the clip volume to avoid
expensive clipping computation, and just let fragments outside the viewport
get clipped by the drawable's bounds. But if a smaller-than-window-size
viewport is set, and we don't also happen to have a scissor set, then
rendering could incorrectly extend outside of the viewport when it should have
been clipped to the viewport.
Fixes the new piglit triangle-guardband-viewport test.
Reviewed-by: Kenneth Graunke <[email protected]>
NOTE: This is a candidate for the 9.0 branch.
|
|
|
|
|
|
|
|
| |
When you're comparing to the spec, you're trying to immediately see what
numbered dword of the packet your bit ends up in.
Reviewed-by: Kenneth Graunke <[email protected]>
NOTE: This is a candidate for the 9.0 branch.
|
|
|
|
|
|
|
| |
This reverts commit cf0bbb30f6bd9d3fa61b5207320e8f34c563a2c6. It
was just papering over the bug fixed in the previous commit.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes oglconform shad-compiler advanced.TestLessThani.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
NOTE: This is a candidate for the 9.0 branch.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All Intel code is compiled with -std=c99. There is no excuse to not use
designated initializers.
As a nice benefit, the code is now more friendly to grep. Without
designated initializers, psychic prowess is required to find the
initialization of DRI extension function pointers with grep. I have
observed several people, when they first encounter the DRI code, fail at
statically chasing the DRI function pointers due to this problem.
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The dri directory is compiled with -std=c99. There is no excuse to not use
designated initializers.
As a nice benefit, the code is now more friendly to grep. Without
designated initializers, psychic prowess is required to find the
initialization of DRI extension function pointers with grep. I have
observed several people, when they first encounter the DRI code, fail at
statically chasing the DRI function pointers due to this problem.
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a packed depth/stencil buffer on separate stencil hardware, the
separate depth miptree is set up with alignment of 4,4 and the separate
stencil miptree is setup with alignment of 8,8. We can't just use the
irb->draw_{x,y} offsets for stencil, since that is the offset in the
depth miptree.
Fixes 12 piglit depthstencil testcases on ivb.
Acked-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Given that we have the mask information here (assuming the rebase is to
the same tiling, which is safe), we can just save a set of miptrees and
offsets and the global intra-tile offset in the context and cut out a
bunch of logic. This will also save emitting the next fix I need to do
twice.
Acked-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a theoretical problem where we had an aligned depth buffer and a
misaligned stencil buffer with a matching tile offset, so we would fail
to rebase depth even after the needed tile offset changed due to the
rebase of stencil.
It should also fix double-rebase of a misaligned packed depth/stencil
renderbuffer, which may have been a performance issue.
Acked-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
We were always passing 0 for one of the two fields, and the code just used
whichever one wasn't 0.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
I noticed these in the next patch where these paths were using the Face
of a teximage but didn't have array handling.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
| |
The kind of data you're copying is definitely an interesting variable.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
| |
I removed that code almost a year ago.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new brw_reg always had type BRW_REGISTER_TYPE_F, rather than
inheriting the original type of the ATTR file register.
In the past, this hasn't been a problem since we only execute this code
when fixing up GL_FIXED attributes, which always have float types.
However, we'll soon be using it for ARB_vertex_type_10_10_10_2 support,
which uses D and UD types.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For GLES1 and GLES2, brwCreateContext neglected to validate the requested
context version received from the DRI layer. If DRI requested an OpenGL
ES2 context with version 3.9, we provided it one.
Before this fix, the switch statement that validated the requested GL
context flavor was an ugly #ifdef copy-paste mess. Instead of reproducing
the copy-past-mess for GLES1 and GLES2, I first refactored it. Now the
switch statement is readable.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|