| Commit message | Author | Age | Files | Lines |
|
We're starting to get apps utilizing more than 16 varyings and
most current hardware supports 32 anyway.
Tested with r600g.
swrast, softpipe and llvmpipe still advertise 16 varyings.
This fixes a WebGL crash after launching this demo:
https://developer.mozilla.org/en-US/demos/detail/falling-cubes
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54402
NOTE: This is a candidate for the stable branches.
Acked-by: Kenneth Graunke <[email protected]>
|
elements"
This reverts commit ebd8df7a3152e34805e2863c8471ee1a2de38fe1.
accidentally pushed.
|
This is a leftover from when we had to split those two functions due to
the separate BO validation step.
Reviewed-by: Kenneth Graunke <[email protected]>
|
"Active" is an already-used term for the query being between
glBeginQuery() and glEndQuery(), while this is tracking whether the
start of the packet pair for emitting state has been inserted into the
current batchbuffer.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Helpful for debugging.
|
Consider the following code, which reinterprets a register as a
different type:
    mov(8) g6<1>F g1.4<0,4,1>.xF
    and(8) g5<1>.xUD g6<4,4,1>.xUD 0x7fffffffUD
Copy propagation would notice that we can replace the use of g6 with
g1.4 and eliminate the MOV. Unfortunately, it failed to preserve the UD
type, incorrectly generating:
    and(8) g5<1>.xUD g1.4<0,4,1>.xF 0x7fffffffUD
Found while debugging Ian's uncommitted ARB_vertex_program LOG opcode
test with my new Mesa IR -> Vec4 IR translator.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
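The type-preservation rule described in this message can be sketched in a few lines of C. This is only an illustration: `reg_type`, `src_reg` and `propagate_copy()` are hypothetical names invented here, not the driver's actual data structures.

```c
#include <assert.h>

/* Hypothetical, simplified register types. */
enum reg_type { TYPE_F, TYPE_UD };

/* Hypothetical source operand: a register number plus the type the
 * instruction reads it as. */
struct src_reg {
    int reg;
    enum reg_type type;
};

/* Rewrite a use of the MOV's destination to read the MOV's source
 * instead, keeping the type the using instruction asked for. */
static struct src_reg
propagate_copy(struct src_reg use, int mov_dst, struct src_reg mov_src)
{
    /* Only valid when the use actually reads the MOV's destination. */
    assert(use.reg == mov_dst);

    struct src_reg result = mov_src;
    result.type = use.type;   /* preserve the reinterpreting type (UD above) */
    return result;
}
```

With the example from the message, the use reads g6 as UD and the MOV copies g1.4 as F, so the rewritten operand would point at g1.4 while still being read as UD.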
|
Consider the following code sequence:
    mul(8) g4<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
    mov.sat(8) m1<1>.xyF g4<4,4,1>F
    mul(8) g4<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
    mov.sat(8) m1<1>.zwF g4<4,4,1>F
The compute-to-MRF pass will discover the first mov.sat and attempt to
replace it by rewriting earlier instructions. Everything works out,
so it replaces scan_inst's destination file, reg, and reg_offset,
resulting in:
    mul(8) m1<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
    mul(8) g4<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
    mov.sat(8) m1<1>.zwF g4<4,4,1>F
Unfortunately, it loses the .xy writemask on the mov.sat's MRF
destination. While this doesn't pose an immediate problem, it then
proceeds to transform the second mov.sat, resulting in:
    mul(8) m1<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
    mul(8) m1<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
Instead of writing both halves of the vector (like the original code),
it overwrites the full vector both times, clobbering the desired .xy
values.
When encountering a MOV, the compute-to-MRF code scans for instructions
which generate channels of the MOV source. It ensures that all
necessary channels are available (possibly written by several
instructions). In this case, *more* channels are available than
necessary, so we want to take the subset that's actually used.
Taking the bitwise AND of both writemasks should accomplish that.
This was discovered by analyzing an ARB_vertex_program test
(glean/vertProg1/MUL test (with swizzle and masking)) with my new
Mesa IR -> Vec4 IR translator code. However, it should be possible
with GLSL programs as well.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
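The writemask intersection described above can be illustrated with a small C sketch. The `WRITEMASK_*` values and the `merged_writemask()` helper are assumptions made for this example (one bit per channel), not code taken from the pass itself.

```c
#include <assert.h>

/* Assumed one-bit-per-channel encoding of a vec4 writemask. */
#define WRITEMASK_X    0x1u
#define WRITEMASK_Y    0x2u
#define WRITEMASK_Z    0x4u
#define WRITEMASK_W    0x8u
#define WRITEMASK_XYZW 0xfu

/* Hypothetical helper: when the generating instruction is rewritten to
 * target the MRF directly, keep only the channels that the original MOV
 * actually wrote. */
static unsigned
merged_writemask(unsigned scan_inst_mask, unsigned mov_mask)
{
    assert((scan_inst_mask & ~WRITEMASK_XYZW) == 0);
    assert((mov_mask & ~WRITEMASK_XYZW) == 0);
    return scan_inst_mask & mov_mask;
}
```

For the sequence in the message, the first MUL writes all four channels while its MOV writes only .xy, so the rewritten MUL would keep .xy; the second keeps .zw, and the two writes no longer overlap.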
|
While copying the values into the batch space, we advance the param
pointer. The debug code then tries to iterate over all the uploaded
values, starting at param...which is now the end of the uploaded data,
rather than the start.
This patch saves a pointer to the start of push constant space before
it gets altered and switches the debug code to use that.
Tested by uncommenting the code and examining the output of
glsl-vs-clamp-1.shader_test. Previously all values appeared to be zero.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
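The pointer bug described in this message is a common C pattern, sketched below under simplifying assumptions: `upload_and_debug_sum()` and its parameters are hypothetical, and a sum stands in for the debug printout.

```c
#include <assert.h>
#include <stddef.h>

/* Copy `count` values into `batch` (standing in for push constant space),
 * then walk the uploaded data the way a debug loop would. Returns the sum
 * of the values the debug loop sees. */
static float
upload_and_debug_sum(float *batch, const float *values, size_t count)
{
    float *param = batch;
    float *const start = param;   /* saved before param gets advanced */

    for (size_t i = 0; i < count; i++)
        *param++ = values[i];

    /* param now points one past the uploaded data, so iterating from
     * param here would read whatever follows it, not the upload. */
    assert(param == start + count);

    float sum = 0.0f;
    for (size_t i = 0; i < count; i++)
        sum += start[i];          /* iterate from the saved start */
    return sum;
}
```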
|
Since ES3.0 is backward compatible with 2.0, we check that all the 2.0
functions and additional 3.0 functions exist.
Reviewed-by: Jordan Justen <[email protected]>
|
Will be useful for the next patch, adding GLES 3 testing.
Reviewed-by: Jordan Justen <[email protected]>
|
Previously we just printed the dispatch table index and the user had
to convert it to a function name. That was a pain because when
FEATURE_remap_table is defined, the assignment of functions to
dispatch table entries is done at run time.
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
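As a rough illustration of mapping a dispatch table index back to a function name at run time, here is a minimal sketch. The `dispatch_name` struct and `name_for_offset()` are hypothetical and do not reflect how the real lookup is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record pairing a dispatch table slot with a function name.
 * With a remap table, the offsets are only known at run time. */
struct dispatch_name {
    int offset;
    const char *name;
};

/* Linear search for the name assigned to a dispatch table offset.
 * Returns NULL if no function is mapped to that slot. */
static const char *
name_for_offset(const struct dispatch_name *table, size_t count, int offset)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; i++) {
        if (table[i].offset == offset)
            return table[i].name;
    }
    return NULL;
}
```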
|
4-bit and 3-bit quantization values differ significantly for
values other than 0 and 1.
Fixes piglit draw-pixels for softpipe/llvmpipe.
NOTE: Probably a candidate for stable branches.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
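Why the two bit depths agree only at the endpoints can be seen with a generic unsigned-normalized quantizer. This is a sketch with hypothetical helper names and a round-to-nearest convention chosen for the example; it is not the code this commit changes.

```c
#include <assert.h>

/* Quantize a value in [0, 1] to an unsigned normalized integer with the
 * given number of bits, rounding to nearest. */
static unsigned
quantize_unorm(float value, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    assert(value >= 0.0f && value <= 1.0f);
    unsigned max = (1u << bits) - 1u;
    return (unsigned)(value * (float)max + 0.5f);
}

/* Convert a quantized value back to [0, 1]. */
static float
dequantize_unorm(unsigned q, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    unsigned max = (1u << bits) - 1u;
    assert(q <= max);
    return (float)q / (float)max;
}
```

Under these assumptions 0.0 and 1.0 survive a round trip at either depth, while 0.5 comes back as 4/7 with 3 bits and 8/15 with 4 bits.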
|
This fixes an issue where glsl_to_tgsi_visitor::get_opcode() would emit the
wrong opcode because the register type was GLSL_TYPE_ARRAY/STRUCT instead of
GLSL_TYPE_FLOAT/INT/UINT/BOOL, so the function would use the float opcodes for
operations on integer or boolean values dereferenced from an array or
structure. Assertions have been added to get_opcode() to prevent this bug
from reappearing in the future.
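The kind of assertion described above might look like the following sketch. The enums and `pick_add_opcode()` are hypothetical stand-ins, and treating booleans like integers is an assumption of this example rather than something the message states.

```c
#include <assert.h>

/* Hypothetical base types: four scalar kinds plus two aggregate kinds. */
enum base_type {
    TYPE_UINT, TYPE_INT, TYPE_FLOAT, TYPE_BOOL,
    TYPE_STRUCT, TYPE_ARRAY
};

/* Hypothetical opcodes for one operation. */
enum opcode { OP_ADD_FLOAT, OP_ADD_INT };

/* Choose an opcode from the operand's base type. Aggregate types must be
 * resolved to their element or member type before calling; otherwise the
 * float fallback would be picked silently. */
static enum opcode
pick_add_opcode(enum base_type type)
{
    assert(type != TYPE_ARRAY && type != TYPE_STRUCT);
    return type == TYPE_FLOAT ? OP_ADD_FLOAT : OP_ADD_INT;
}
```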
|
This silences a zillion GCC warnings like:
../../../src/mesa/main/pack.c: In function '_mesa_pack_rgba_span_from_uints':
../../../src/mesa/main/pack.c:560:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
The layer dimension of array textures is not subject to mipmap minification.
OTOH we were missing an assertion for the depth dimension.
Fixes assertion failures with piglit {f,v}s-textureSize-sampler1DArrayShadow.
For some reason, they only resulted in piglit 'warn' results for me, not
failures.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56211
NOTE: This is a candidate for the stable branches.
Signed-off-by: Michel Dänzer <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Andreas Boll <[email protected]>
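The rule that the layer count of an array texture is not minified can be sketched as follows. `minify()`, `level_size` and `array_1d_level_size()` are hypothetical helpers written for this example.

```c
#include <assert.h>

/* Size of one dimension at a given mipmap level, never below 1. */
static unsigned
minify(unsigned size, unsigned level)
{
    unsigned s = size >> level;
    return s > 0 ? s : 1;
}

/* Hypothetical per-level size of a 1D array texture. */
struct level_size {
    unsigned width;
    unsigned layers;
};

/* The width shrinks with the mipmap level; the number of layers is the
 * same at every level. */
static struct level_size
array_1d_level_size(unsigned width0, unsigned layers, unsigned level)
{
    assert(width0 > 0 && layers > 0);
    assert(level < 32);   /* keep the shift well defined */
    struct level_size s = { minify(width0, level), layers };
    return s;
}
```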
|
This patch sets up the dispatch table for the following GLES3
functions when a GLES3 context is in use:
- BeginQuery
- BeginTransformFeedback
- BindSampler
- BindTransformFeedback
- BlitFramebuffer
- ClearBufferfi
- ClearBufferfv
- ClearBufferiv
- ClearBufferuiv
- ClientWaitSync
- CopyBufferSubData
- DeleteQueries
- DeleteSamplers
- DeleteSync
- DeleteTransformFeedbacks
- EndQuery
- EndTransformFeedback
- FenceSync
- FramebufferTextureLayer
- GenQueries
- GenSamplers
- GenTransformFeedbacks
- GetInteger64v
- GetQueryObjectuiv
- GetQueryiv
- GetSamplerParameterfv
- GetSamplerParameteriv
- GetStringi
- GetSynciv
- GetTransformFeedbackVarying
- GetVertexAttribIiv
- GetVertexAttribIuiv
- IsQuery
- IsSampler
- IsSync
- IsTransformFeedback
- PauseTransformFeedback
- RenderbufferStorageMultisample
- ResumeTransformFeedback
- SamplerParameterf
- SamplerParameterfv
- SamplerParameteri
- SamplerParameteriv
- TransformFeedbackVaryings
- VertexAttribDivisor
- VertexAttribIPointer
- WaitSync
And it avoids setting up the dispatch table for these non-GLES3
functions:
- ColorMaski
- GetBooleani_v
- Enablei
- Disablei
- IsEnabledi
- ClearColorIiEXT
- ClearColorIuiEXT
- TextureStorage2DEXT
- TextureStorage3DEXT
- GetActiveUniformName
- GetnUniformdv
- GetnUniformfv
- GetnUniformiv
- GetnUniformuiv
Reviewed-by: Brian Paul <[email protected]>
v2: Make the ctx argument to _mesa_init_transform_feedback_dispatch()
a const pointer. Add a comment to remind us to add
GetBufferParameteri64v once tests exist for it. Also add
VertexAttribDivisor for GLES3, and remove GetActiveUniformName and
GetnUniform{dv,fv,iv,uiv} for GLES3.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
This function is only useful for the ARB_{vertex,fragment}_program
extensions, which we don't expose in core contexts.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
glGetPointerv was de-deprecated in GL 4.3, because GL 4.3 adds
functionality from KHR_debug and ARB_debug_output, which require
glGetPointerv.
This patch modifies _mesa_create_exec_table() to populate
glGetPointerv in the dispatch table for core contexts.
Technically this is not in compliance with the spec--what we really
ought to do for core contexts is expose glGetPointerv only when a GL
4.3 context is in use or one of the two extensions is present.
However, it seems silly to go to that extra work, since the only
client-visible effect would be for glGetPointerv to raise an
INVALID_OPERATION error instead of an INVALID_ENUM error. Besides,
the other functions set up by _mesa_create_exec_table() only depend on
the API in use, not on the GL version or extensions supported.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
There's no reason to have separate slots in the dispatch table for
these two functions, since they are synonymous.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
This eliminates a warning in GCC 4.7.1.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
With the previous two commits, this fixes piglit
GL_ARB_occlusion_query2/api.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
v2: Add a comment about what we're checking for.
Reviewed-by: Brian Paul <[email protected]> (v1)
Reviewed-by: Ian Romanick <[email protected]>
|
There's a similar test below, but it's not the same: that one checks whether
this query object is already active (potentially on another target).
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
v2: Fix mangled sentence in the comment, and make the loop exit early.
Reviewed-by: Ian Romanick <[email protected]> (v1)
|
since they're allocated by ureg_get_tokens().
NOTE: This is a candidate for the 8.0 and 9.0 branches.
|
We should use the latter since we're freeing the memory with free(),
not the gallium FREE() macro.
This fixes a mismatch when using the gallium debug memory functions.
NOTE: This is a candidate for the 9.0 branch.
|
Given the use case we have of trying to measure timestamps across individual
draw calls, flushing will totally mess up what people are trying to measure.
Reviewed-by: Kenneth Graunke <[email protected]>
|
The theory I had when I wrote the code was that you wanted to minimize latency
on your queries because the app was going to ask soon. Only, it turns out
that everybody batches up their queries and asks for the results later (often
after the next SwapBuffers!), so this was a pessimization.
Until now, I had no workload where it mattered enough to benchmark. Recently
I started playing some Minecraft, which uses tons of queries to decide whether
to render chunks of the terrain. For that app, avoiding the flush in the
query-generation loop improves performance 22.7% +/- 4.7% (n=3) on an apitrace
capture of it (confirmed in game by watching the fps meter found by pressing
F3, 15/16 -> 20/21 fps).
Reviewed-by: Kenneth Graunke <[email protected]>
|
I'm amazed that my usual warnings check didn't catch this, and that this
passed piglit.
|
otherwise some compilers will throw the error
"error: format not a string literal and no format arguments"
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
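The compiler error quoted above typically comes from passing a non-literal string as the format argument of a printf-style function. A minimal sketch of the usual fix, using standard snprintf() and a hypothetical `format_message()` wrapper (not the function this commit touches):

```c
#include <assert.h>
#include <stdio.h>

/* Copy `msg` into `buf` through a printf-style call. Passing `msg` as
 * the format string itself, as in snprintf(buf, size, msg), is what
 * triggers the diagnostic; routing it through "%s" keeps any '%'
 * characters in `msg` from being interpreted. */
static int
format_message(char *buf, size_t size, const char *msg)
{
    assert(buf != NULL && msg != NULL && size > 0);
    return snprintf(buf, size, "%s", msg);
}
```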
|
Fixes WebGL texture mips conformance test, no piglit regressions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44912
NOTE: This is a candidate for the stable branches.
Signed-off-by: Michel Dänzer <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Andreas Boll <[email protected]>
|
If GL_BASE_LEVEL==0 and GL_MAX_LEVEL==0 that's a pretty good hint that
there'll be a single mipmap level in the texture.
Google Earth sets the texture's state this way before the first glTexImage
call. This saves a bit of texture memory.
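The heuristic described above could be sketched like this. `guess_num_levels()` is a hypothetical helper, and falling back to a full mipmap chain in the other case is an assumption of this example, not necessarily what the driver does.

```c
#include <assert.h>

/* Guess how many mipmap levels to allocate before any image data has
 * been specified. */
static unsigned
guess_num_levels(unsigned base_level, unsigned max_level,
                 unsigned width, unsigned height)
{
    assert(width > 0 && height > 0);

    /* GL_BASE_LEVEL == 0 and GL_MAX_LEVEL == 0: expect a single level. */
    if (base_level == 0 && max_level == 0)
        return 1;

    /* Assumed fallback: room for a full chain down to 1x1. */
    unsigned size = width > height ? width : height;
    unsigned levels = 1;
    while (size > 1) {
        size >>= 1;
        levels++;
    }
    return levels;
}
```

For a 256x256 texture this would allocate one level instead of nine when both parameters are 0, which is where the memory saving comes from.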
|
Fixes piglit tests "unpack-teximage2d --pbo=* --format=GL_BGRA" on
Sandybridge+.
The fastpath was checking an incomplete set of pixel unpack state. This
patch adds checks for all the fields of gl_pixelstore_attrib that affect
2D texture uploads. Also, it begins permitting the case where
GL_UNPACK_ROW_LENGTH is 0.
Ideally, we would just ask a unicorn to JIT this fastpath for us in
a way that safely handles the unpacking state. Until then, it's safer if
only a small set of situations activate the fastpath.
v2: Use _mesa_is_bufferobj(), per Anholt.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
Reviewed-by: Kenneth Graunke <[email protected]>
|
Now that we've replaced all the variable settings other than reg_width, it's
easy to hang on to this (the expensive part of setting up the allocator).
Reviewed-by: Kenneth Graunke <[email protected]>
|
This should also reduce register pressure on gen7+, like the previous commit.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Improves performance of the Lightsmark penumbra shadows scene by 15.7% +/-
1.0% (n=15), by eliminating register spilling. (tested by smashing the list of
scenes to have all other scenes have 0 duration -- includes additional
rendering of scene description text that normally doesn't appear in that
scene)
v2: Allow allocation of all but g0/g1 of the payload.
v3: Pull count_to_loop_end() out to a helper function.
Reviewed-by: Kenneth Graunke <[email protected]> (v2, recommended v3)
|
For now, nothing else can get allocated over them, but that will change.
Reviewed-by: Kenneth Graunke <[email protected]>
|
This was to slot in the magic aligned pairs class, but it got moved to a
descriptive name later.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Based on split_virtual_grfs(), we choose the same set every time, so set it in
stone. This will help us avoid regenerating the somewhat expensive
class/register set setup every compile.
Reviewed-by: Kenneth Graunke <[email protected]>
|
This is derived from the FS visitor code for the same, but tracks each channel
separately (otherwise, some typical fill-a-channel-at-a-time patterns would
produce excessive live intervals across loops and cause spilling).
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48375
(crash -> failure, can turn into pass by forcing unrolling still)
|
These messages always have m0 = g0 and m1 = offset, and write has m2 = data.
Avoids regression in opt_compute_to_mrf() with a change to scratch writes to
set up the data as an MRF write in the IR.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Reviewed-by: Kenneth Graunke <[email protected]>
|
Note that BRW_PREDICATE_NONE is 0 and BRW_PREDICATE_NORMAL is 1, so that's a
lot like the true/false we had in the FS before.
Reviewed-by: Kenneth Graunke <[email protected]>
|
fs_bblock_link -> bblock_link
fs_bblock -> bblock_t (to avoid conflicting with all the fs_bblock *bblock)
fs_cfg -> cfg_t (to avoid conflicting with all the fs_cfg *cfg)
Reviewed-by: Kenneth Graunke <[email protected]>
|
Reviewed-by: Kenneth Graunke <[email protected]>
|
This will let us reuse brw_fs_cfg.cpp from brw_vec4_*.
Reviewed-by: Kenneth Graunke <[email protected]>
|
This fixes confusion by the upcoming live variable analysis which saw e.g. use
of temp.w when only temp.xyz were initialized in the basic block, and
concluded that temp.w must have come from outside of the block (even though it
was never initialized anywhere).
Reviewed-by: Kenneth Graunke <[email protected]>
|
Both callers were doing basically the same thing, just written differently.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Both callers used (effectively) inst->dst as the argument, so just reference
it.
Reviewed-by: Kenneth Graunke <[email protected]>
|