| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
If the driver actually supports ETC2, don't decode it in software.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
v2: add alignment restrictions to docs, fix indentation in headers
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
everytime I open this file in emacs with show trailing whitespace
or git add from it my screen flares with red.
Just do a general cleanup, makes working on fp64 support not as
jarring.
I'm not saying this is perfect, its just better than before.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a transform feedback buffer's size is 0, st_bufferobj_data doesn't
end up creating a buffer for it. There's no point in trying to write to
such a buffer, so just pretend as if it's not really there.
This fixes arb_gpu_shader5-xfb-streams-without-invocations on nvc0.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.4 10.5" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The latter currently implies CPU read access, so only PIPE_USAGE_STAGING
can be expected to be fast.
Mesa demos src/tests/streaming_rect on Kaveri (radeonsi):
Unpatched: 42 frames in 1.023 seconds = 41.056 FPS
Patched: 615 frames in 1.000 seconds = 615.000 FPS
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658
Cc: "10.3 10.4" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This saves about 0.5k in the text section for a gallium driver
on amd64.
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement
mod(x,y) as y * fract(x/y). This implementation has a down side though:
it introduces precision errors due to the fract() operation. Even worse,
since the result of fract() is multiplied by y, the larger y gets the
larger the precision error we produce, so for large enough numbers the
precision loss is significant. Some examples on i965:
Operation Precision error
-----------------------------------------------------
mod(-1.951171875, 1.9980468750) 0.0000000447
mod(121.57, 13.29) 0.0000023842
mod(3769.12, 321.99) 0.0000762939
mod(3769.12, 1321.99) 0.0001220703
mod(-987654.125, 123456.984375) 0.0160663128
mod( 987654.125, 123456.984375) 0.0312500000
This patch replaces the current lowering pass with a different one
(MOD_TO_FLOOR) that follows the recommended implementation in the GLSL
man pages:
mod(x,y) = x - y * floor(x/y)
This implementation eliminates the precision errors at the expense of
an additional add instruction on some systems. On systems that can do
negate with multiply-add in a single operation this new implementation
would come at no additional cost.
v2 (Ian Romanick)
- Do not clone operands because when they are expressions we would be
duplicating them and that can lead to suboptimal code.
Fixes the following 16 dEQP tests:
dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_*
dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In preparation for glBlitNamedFramebuffer, the DD table function
BlitFramebuffer needs to accept two arbitrary framebuffer objects rather
than assuming ctx->ReadBuffer and ctx->DrawBuffer.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
The below code crashes when vector_elements <= 0
Fixes Warray-bounds warnings
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: s/unsigned int/unsigned/ in prog_optimize.c
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
And update some comments.
|
|
|
|
|
|
|
|
| |
The 8888 suggests 8-bit components which is not correct, so
replace that with the actual size of the components in each
format.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Instead of using _mesa_pack_rgba_span_float. This should allow us to remove
that function in a later patch.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If you had a conditional assignment of an array or struct (say, from the
if-lowering pass), we'd try doing swizzle_for_size() on the aggregate
type, and it would assertion fail due to vector_elements==0. Instead,
extend emit_block_mov() to handle emitting the conditional operations,
which also means we'll have appropriate writemasks/swizzles on the CMPs
within a struct containing various-sized members.
Fixes 20 testcases in es3conform on vc4.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
This reflects the new naming convention for software fallbacks. To avoid
confusion with ARB_DIRECT_STATE_ACCESS backend functions, software fallbacks
now have the form _mesa_[Driver function name]_sw.
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
| |
This reflects the new naming convention for software fallbacks. To avoid
confusion with ARB_DIRECT_STATE_ACCESS backend functions, software fallbacks
now have the form _mesa_[Driver function name]_sw.
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
| |
Cc: 10.2 10.3 10.4 <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
This involved adding a new st_texture_image_const() helper also.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This fixes the new piglit test: arb_uniform_buffer_object/2-buffers-bug
Cc: 10.2 10.3 10.4 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Tested with llvmpipe by setting the cap bit temporarily, seems to work,
though no driver requests it for now.
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
There is a bug in the current lowering pass implementation where we lower saturate
to clamp only for vertex shaders on drivers supporting SM 3.0. The correct behavior
is to actually lower to clamp only when we don't support saturate which happens
on drivers that don't support SM 3.0
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original idea was to optimize away the condition by integrating it directly
into the CMP instruction. However, with native integers this requires an extra
I2F instruction. It is also fishy because the negation used didn't really honor
ieee754 float comparison rules, not to mention the CMP instruction itself
(being pretty much a legacy instruction) doesn't really have defined special
float value behavior in any case.
So, use UCMP and adjust the code trying to optimize the condition away
accordingly (I have absolutely no idea if such conditions are actually hit
or would be translated away somewhere else already).
v2: cosmetic changes
No piglit regressions on llvmpipe.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For drivers building up to GL(ES)3, only expose the actual extension if
the API will let it be used (e.g. via overrides/debug flags that enable
higher versions).
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not all drivers can set gl_Layer from VS. Add a fallback that passes the
instance id from VS to GS, and then uses the GS to set the layer.
Tested by adding
quad_buffers |= clear_buffers;
clear_buffers = 0;
to the st_Clear logic, and forcing set_vertex_shader_layered in all
cases. No piglit regressions (on piglits with 'clear' in the name).
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.4 10.3" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The sampler_array_size field was added by "mesa/st: add support for
dynamic sampler offsets". But the field wasn't getting copied in
the get_pixel_transfer_visitor() or get_bitmap_visitor() functions.
The count_resources() function then didn't properly compute the
glsl_to_tgsi_visitor::samplers_used bitmask. Then, we didn't declare
all the sampler registers in st_translate_program(). Finally, we
asserted when we tried to emit a tgsi ureg src register with File =
TGSI_FILE_UNDEFINED.
Add the missing assignments and some new assertions to catch the
invalid register sooner.
Cc: "10.3, 10.4" <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Almost all drivers ignore them.
|
|
|
|
| |
Tested-by: Nick Sarnie <[email protected]>
|
|
|
|
| |
Tested-by: Nick Sarnie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This removes the need for the gallium rasterizer state
to listen to viewport changes.
Thanks to Marek Olšák <[email protected]>.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
|
|
|
|
|
|
| |
When we're checking if the framebuffer is sRGB capable, call
is_format_supported() with the PIPE_BIND_DISPLAY_TARGET flag.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 20836c81851e0df29a8ee9c86e5e5388738c840b.
255 is a huge number. If you have a loop with 255 iterations, unrolling it
will exceed the SM3 instruction limit. Let's use the default again.
The comment about a SM3 limit doesn't make sense. For SM3, we generally
want 32 (default) or a lower number due to the SM3 instruction limit, which
is 512 instructions. For SM4, we can try higher numbers if needed, but
some shaders can end up being pretty huge and shader compilation can take
more time.
This fixes a shader compile failure on R500/SM3. Reported on IRC.
Cc: 10.2 10.3 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It was identical to the default implementation in _mesa_new_shader.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was identical to the default implementation in
_mesa_new_shader_program.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gallium should be prepared fine for ARB_clip_control.
So enable this and mention it in the release notes.
v2:
Only enable for drivers announcing the freshly introduced
PIPE_CAP_CLIP_HALFZ capability.
v3:
Use extension enable infrastructure to connect PIPE_CAP_CLIP_HALFZ
with ARB_clip_control.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is for preparation of ARB_clip_control.
v3:
Add comments.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
|
|
|
| |
This deduplicates some code.
|
|
|
|
|
|
|
|
|
|
|
| |
We must convert it to boolean from the DX9 float encoding that Gallium
specifies.
Later, we should probably define that FACE should be 0 or ~0 if native
integers are supported.
Cc: 10.2 10.3 <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Useful for tessellation.
|
|
|
|
|
|
|
|
| |
With 5 shader stages and various combinations of enabled and disabled shaders,
the maximum number of outputs in one shader doesn't have to be equal to
the maximum number of inputs in the following shader.
v2: return 32 for softpipe and llvmpipe
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a crash when exiting Firefox. I have really no idea how Firefox
does it. It seems to involve multiple contexts and multithreading.
v2: added an XXX comment
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81680
Acked by Christian König.
Cc: 10.2 10.3 <[email protected]>
Tested-by: Benjamin Bellec <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
NewBufferObject took a "target" parameter, which it blindly passed to
_mesa_initialize_buffer_object(), which ignored it.
Not much point in passing it around.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also move num_state_slots inside ir_variable_data for better packing.
The payoff for this will come in a few more patches.
No change Valgrind massif results for a trimmed apitrace of dota2.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
| |
Earlier in the function we assert layers==6 for PIPE_TEXTURE_CUBE so
there's no reason to special-case the pt.array_size = layers assignment.
Reviewed-by: Ilia Mirkin <[email protected]>
|