| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
For most drivers this if statement is always going to fail so check the constant value first.
Signed-off-by: Timothy Arceri <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Note that this covers the Begin/End rendering path, but not user vertex
arrays (so we can't drop copy_array_to_vbo_array() code). Improves
performance of isosurf GLVERTEX|TRIANGLES by 16.7506% +/- 4.98934%
(n=20). No difference on openarena (n=10), which was why this was reverted
back in cbde2765804a4fc62bcf092230a01376aedbf2cd.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_mesa_meta_setup_blit_shader() currently generates a fragment shader
which, irrespective of the number of draw buffers, writes the color
to only one 'out' variable. Current shader rely on an undefined
behavior and possibly works by chance.
From OpenGL 4.0 spec, page 256:
"If a fragment shader writes to gl_FragColor, DrawBuffers specifies a
set of draw buffers into which the single fragment color defined by
gl_FragColor is written. If a fragment shader writes to gl_FragData,
or a user-defined varying out variable, DrawBuffers specifies a set
of draw buffers into which each of the multiple output colors defined
by these variables are separately written. If a fragment shader writes
to none of gl_FragColor, gl_FragData, nor any user defined varying out
variables, the values of the fragment colors following shader execution
are undefined, and may differ for each fragment color."
OpenGL 4.4 spec, page 463, added an additional line in this section:
"If some, but not all user-defined output variables are written, the
values of fragment colors corresponding to unwritten variables are
similarly undefined."
V2: Write color output to gl_FragColor instead of writing to multiple
'out' variables. This'll avoid recompiling the shader every time
draw buffers count is updated.
Cc: <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Cc: <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modern applications frequencly use both UNORM buffers and FLOAT buffers
with color clamping disabled. (FLOAT with clamping explicitly enabled
and SNORM buffers appear to be less common.) We don't need to emit
saturates in the fragment shader in either of the common cases.
Mesa sets ctx->Color._ClampFragmentColor to false if all the color
buffers are UNORM. Also, for GL_FIXED_ONLY mode (the default in
legacy OpenGL), it will be false if any FLOAT buffers are bound.
Since the common case is false, that should be our default.
Thanks to Roland Scheidegger for pointing out some faulty logic
in v1 of this patch (unnecessary code and incorrect explanations).
v2: Drop superfluous code and reword commit message.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a replacement for bd44ac8b5ca08016bb064b37edaec95eccfdbcd5
that should actually work.
Fixes Piglit's copyteximage-border on swrast, as well as one of
es3conform's packed_pixels_pixelstore test.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78546
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Separating the software fallbacks from the rest of the meta path (which
is usually hardware accelerated) gives callers better control over their
blitting options.
For example, i965 might want to try meta blit, hardware blits, then
swrast as a last resort. Splitting it makes that possible.
This updates all callers to maintain the existing behavior (even in the
few cases where it isn't desirable behavior - later patches can change
that).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
These aren't necessary - all of the following code is predicated on mask
being non-zero, so no code will get executed anyway.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Courtney Goeltzenleuchter <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit bd44ac8b5ca08016bb064b37edaec95eccfdbcd5.
Fixes:
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78842
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78843
Re-breaks:
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
but that will be fixed properly in a few commits.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I don't have an ILK at hand but the fix should be trivial.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78872
Cc: "10.2" <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-and-tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's not properly implemented in the meta code, and we don't have time
to fix it for 10.2.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes build error introduced with commit
4b04152db055babb8b06929a0c9ebea5c7f4fb92.
CC test_eu_compact.o
test_eu_compact.c: In function ‘test_compact_instruction’:
test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]
brw_disasm(stderr, &src, brw->gen, false);
^
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78888
Signed-off-by: Vinson Lee <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We already have a perfectly good copy of the program key, and nobody is
going to modify it. The only reason we copied it was because the
brw_wm_compile structure embedded the key rather than pointing to it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Instead, just pass the key and prog_data as separate parameters.
This moves it up a level - one step further toward getting rid of it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
'c' is going away, but we still need a memory context that lives
for the duration of the compile.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the memory context situation was a bit of a mess:
fs_visitor allocated its own memory context, and freed it in the
destructor. However, some data produced by fs_visitor (such as the list
of instructions) needs to live beyond when fs_visitor is "done", so the
caller can pass it to fs_generator.
Everything worked out because brw_wm_fs_emit's fs_visitor variables
happen to not go out of scope until the end of the function. But that
meant that moving the declaration of, say, the SIMD16 fs_visitor
instance, could cause everything to explode.
Using a memory context that exists for the duration of the compile is
clearer, and should be equivalent.
Ultimately, we don't want to use 'c', but this matches the behavior of
fs_generator and gen8_fs_generator, so it'll be simple to change later.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We throw away the data generated during compilation on the success path,
so we really ought to on the failure path as well. The caller has no
access to it anyway, so it's purely leaked.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
'c' is going away. This is also a bit shorter.
Marking the key pointer as const will also deter people from changing
it in these classes, as that's absolutely not OK.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
'c' is going away. This is also shorter.
Marking the key pointer as const will also deter people from changing
it in fs_visitor, as it's absolutely not OK to modify it there.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
| |
'c' is going away. This is also a bit shorter.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
| |
'c' is going away. This is also a bit shorter.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
| |
runtime_check_aads_emit isn't actually used currently, but I believe
we should be using it on Gen4-5, so I haven't eliminated it.
See https://bugs.freedesktop.org/show_bug.cgi?id=78679 for details.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This data is created by fs_visitor and only used when emitting code,
so keeping it in fs_visitor makes sense. I decided it would be
reasonable to group these all together in a struct, since they're
highly related.
v2: s/nr_payload_regs/payload.num_regs/ in some comments (chrisf).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As far as I can tell, there's no point in allocating an extra register
and generating a MOV---we can just use the copy provided as part of our
thread payload directly. It's already in the right format.
Of course, there are zero Piglit tests for this. We don't actually ship
the extension (GL_ARB_gpu_shader5) that exposes this functionality
either.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
This is actually for gl_SampleMaskIn, which is quite different than
gl_SampleMask. Renaming should help avoid confusion.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
Nothing outside of fs_visitor uses it, so we may as well keep it
internal.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
With this one use gone, c->last_scratch is now only used inside
fs_visitor. The rest of the driver uses prog_data->total_scratch.
We already compute similar prog_data fields in fs_visitor, so this
seems reasonable.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The if (!allocated_without_spills) block is an obvious spot for this
performance warning message.
In the Vec4 backend, scratch is also used for indirect access of
temporary arrays. The FS backend doesn't implement that yet, but
if it did, this message would be inaccurate, since scratch access
wouldn't necessarily mean spilling. Moving it preemptively fixes that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
| |
"Disassemble" is an accurate description of what this function does.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
We're going to use "disassemble" for the function that disassembles
the whole program.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
dump_prog_cache has interpreted compacted instructions as full size
instructions, decoding garbage and complaining about invalid values.
We can just use brw_dump_compile to handle this correctly in less code.
The output format changes slightly, but it's still perfectly acceptable.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looping over the instructions and calling brw_disasm doesn't handle
compacted instructions. In most cases, this hasn't been a problem since
we don't compact prior to Sandybridge.
However, Sandybridge's transform feedback GS program should already be
compacted, and so this ought to fix decoding of that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We allocate dispatch tables for BeginEnd and OutsideBeginEnd. But
when we destroy the context we were freeing the BeginEnd and Exec
tables. If Exec==BeginEnd we did a double-free. This would happen
if the context was destroyed while inside a glBegin/End pair. Now
free the BeginEnd and OutsideBeginEnd pointers.
Cc: "10.1", "10.2" <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the valgrind report below and random crashes with piglit on radeonsi.
==30005== Conditional jump or move depends on uninitialised value(s)
==30005== at 0xB13584E: st_translate_program (st_glsl_to_tgsi.cpp:5100)
==30005== by 0xB14698B: st_translate_fragment_program (st_program.c:747)
==30005== by 0xB14777D: st_get_fp_variant (st_program.c:824)
==30005== by 0xB11219C: get_color_fp_variant (st_cb_drawpixels.c:1042)
==30005== by 0xB1131AE: st_DrawPixels (st_cb_drawpixels.c:1154)
==30005== by 0xAFF8806: _mesa_DrawPixels (drawpix.c:162)
==30005== by 0x4EB86DB: stub_glDrawPixels (generated_dispatch.c:6640)
==30005== by 0x4F1DF08: piglit_visualize_image (piglit-util-gl.c:1574)
==30005== by 0x40691D: draw_image_to_window_system_fb(int, bool) (draw-buffers-common.cpp:733)
==30005== by 0x406C8B: draw_reference_image(bool, bool) (draw-buffers-common.cpp:854)
==30005== by 0x40722A: piglit_display (alpha-to-coverage-dual-src-blend.cpp:117)
==30005== by 0x4EA7168: run_test (piglit_fbo_framework.c:52)
Cc: "10.1 10.2" <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
gen8_dump_compile will be called indirectly by code common used by
generations before and after the gen8 instruction format change.
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
brw_dump_compile will be called indirectly by code common used by
generations before and after the gen8 instruction format change.
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Has been misaligned since we added instruction offset prefixes.
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
brw_disasm doesn't disassemble compacted instructions, so we uncompact
before disassembling them which would unset the compaction control bit.
Instead pass it as a separate argument.
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
In order to remove bblock_link's inheritance of exec_node. Also makes
linked list walk code much nicer.
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Only bblock_link's inheritance left.
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Cc: "10.2" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Cc: "10.2" <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is effective only on gen8 for now as previous generations still
go through blorp.
Cc: "10.2" <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Create the intel renderbuffer with level hardcoded to zero instead
of overriding it in the surface state configuration. Also moved the
dimension adjustments for tiling, mip level, msaa into the render
buffer creation. Finally prepares for another blit path needed for
miptree updownsampling.
v3 (Ken): Dropped unnecessary memory context for "ralloc_asprintf()"
Cc: "10.2" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|