| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Elie Tournier <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Elie Tournier <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
From Seciton 7.6 (UNIFORM VARIABLES) of the OpenGL 4.5 spec:
"If the value of location is -1, the Uniform* commands will
silently ignore the data passed in, and the current uniform values
will not be changed.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ARB_bindless_texture spec says:
"Samplers are represented using 64-bit integer handles."
and,
"Images are represented using 64-bit integer handles."
It seems simpler to always consider sampler and image types
as 64-bit unsigned integer.
This introduces a temporary workaround in _mesa_get_uniform()
because at this point no flag are used to distinguish between
bound and bindless samplers. This is going to be removed in a
separate series. This avoids breaking arb_shader_image_load_store-state.
v3: - update the comment slightly
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't see any reasons why vector_elements is 1 for images and
0 for samplers. This increases consistency and allows to clean
up some code a bit.
This will also help for ARB_bindless_texture.
No piglit regressions with RadeonSI.
This time the Intel CI system doesn't report any failures.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
This reverts commit 75a31a20af269c047661af33e28f793269537b79.
This breaks thousands of tests on i965 with malloc corruption.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't see any reasons why vector_elements is 1 for images and
0 for samplers. This increases consistency and allows to clean
up some code a bit.
This will also help for ARB_bindless_texture.
No piglit regressions with RadeonSI.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we were only making sure types were the same within a
single stage. This looks to have regressed with 953a0af8e3f73.
Fixes: 953a0af8e3f73 ("mesa: validate sampler uniforms during gluniform calls")
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
https://bugs.freedesktop.org/show_bug.cgi?id=97524
|
|
|
|
|
|
|
| |
V2: restore lost comment, add static to validate_uniform(),
simplify array offset logic.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
No performance testing has been done, because it makes sense to make this
change regardless of that. Also, _NEW_TEXTURE is still used in many places,
but the obvious occurences are replaced here.
It's now possible to split _NEW_TEXTURE_OBJECT further.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Fixes a compile error with mingw.
|
|
|
|
|
|
| |
V2: actually use PRIu64
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This fixes these like the double version does.
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Saves a measly 20 bytes on IA32 and nothing on x64. Depending on
exactly when this is applied, a lot of variation is possible due to
function alignment.
text data bss dec hex filename
6670131 228340 22552 6921023 699b3f lib/i965_dri.so before
6670111 228340 22552 6921003 699b2b lib/i965_dri.so after
6342932 293872 29880 6666684 65b9bc lib64/i965_dri.so before
6342932 293872 29880 6666684 65b9bc lib64/i965_dri.so after
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By putting the parameters first that match the parameters to the call
site, 4 (of 14) instructions are saved at _mesa_Uniform4fv on x64. On
IA32, the details of the instructions change, but it is the same count
and mix of instructions.
Before:
0000000000000830 <_mesa_Uniform4fv>:
830: 48 83 ec 10 sub $0x10,%rsp
834: 49 89 d0 mov %rdx,%r8
837: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 83e <_mesa_Uniform4fv+0xe>
83e: 89 f8 mov %edi,%eax
840: 89 f1 mov %esi,%ecx
842: 41 b9 02 00 00 00 mov $0x2,%r9d
848: 64 48 8b 3a mov %fs:(%rdx),%rdi
84c: 48 8b 97 c8 01 02 00 mov 0x201c8(%rdi),%rdx
853: 48 8b 72 70 mov 0x70(%rdx),%rsi
857: 6a 04 pushq $0x4
859: 89 c2 mov %eax,%edx
85b: e8 00 00 00 00 callq 860 <_mesa_Uniform4fv+0x30>
860: 48 83 c4 18 add $0x18,%rsp
864: c3 retq
After:
00000000000007f0 <_mesa_Uniform4fv>:
7f0: 48 83 ec 10 sub $0x10,%rsp
7f4: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 7fb <_mesa_Uniform4fv+0xb>
7fb: 41 b9 02 00 00 00 mov $0x2,%r9d
801: 64 48 8b 08 mov %fs:(%rax),%rcx
805: 48 8b 81 c8 01 02 00 mov 0x201c8(%rcx),%rax
80c: 6a 04 pushq $0x4
80e: 4c 8b 40 70 mov 0x70(%rax),%r8
812: e8 00 00 00 00 callq 817 <_mesa_Uniform4fv+0x27>
817: 48 83 c4 18 add $0x18,%rsp
81b: c3 retq
Saves a measly 416 bytes of text on x64. Depending on exactly when this
is applied, a lot of variation is possible due to function alignment.
text data bss dec hex filename
6670131 228340 22552 6921023 699b3f lib/i965_dri.so before
6670131 228340 22552 6921023 699b3f lib/i965_dri.so after
6343348 293872 29880 6667100 65bb5c lib64/i965_dri.so before
6342932 293872 29880 6666684 65b9bc lib64/i965_dri.so after
There is likely to be no performance change with just this patch.
_mesa_uniform immediately calls validate_uniform_parameters with
parameters in the "wrong" (different from the call site) order.
v2: Rebase on GL_ARB_gpu_shader_fp64.
v3: Rebase on GL_ARB_gpu_shader_int64.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By putting the parameters first that match the parameters to the call
site, 4 (of 16) instructions are saved at _mesa_UniformMatrix4fv on
x64. On IA32, the details of the instructions change, but it is the
same count and mix of instructions.
Before:
0000000000001380 <_mesa_UniformMatrix4fv>:
1380: 48 83 ec 10 sub $0x10,%rsp
1384: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 138b <_mesa_UniformMatrix4fv+0xb>
138b: 41 89 f8 mov %edi,%r8d
138e: 41 89 f1 mov %esi,%r9d
1391: 0f b6 d2 movzbl %dl,%edx
1394: 64 48 8b 38 mov %fs:(%rax),%rdi
1398: 48 8b b7 c8 01 02 00 mov 0x201c8(%rdi),%rsi
139f: 48 8b 76 70 mov 0x70(%rsi),%rsi
13a3: 68 06 14 00 00 pushq $0x1406
13a8: 51 push %rcx
13a9: 52 push %rdx
13aa: b9 04 00 00 00 mov $0x4,%ecx
13af: ba 04 00 00 00 mov $0x4,%edx
13b4: e8 00 00 00 00 callq 13b9 <_mesa_UniformMatrix4fv+0x39>
13b9: 48 83 c4 28 add $0x28,%rsp
13bd: c3 retq
After:
0000000000001360 <_mesa_UniformMatrix4fv>:
1360: 48 83 ec 10 sub $0x10,%rsp
1364: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 136b <_mesa_UniformMatrix4fv+0xb>
136b: 0f b6 d2 movzbl %dl,%edx
136e: 64 4c 8b 00 mov %fs:(%rax),%r8
1372: 49 8b 80 c8 01 02 00 mov 0x201c8(%r8),%rax
1379: 68 06 14 00 00 pushq $0x1406
137e: 6a 04 pushq $0x4
1380: 6a 04 pushq $0x4
1382: 4c 8b 48 70 mov 0x70(%rax),%r9
1386: e8 00 00 00 00 callq 138b <_mesa_UniformMatrix4fv+0x2b>
138b: 48 83 c4 28 add $0x28,%rsp
138f: c3 retq
Saves a measly 576 bytes of text on x64.
text data bss dec hex filename
6670131 228340 22552 6921023 699b3f lib/i965_dri.so before
6670131 228340 22552 6921023 699b3f lib/i965_dri.so after
6343924 293872 29880 6667676 65bd9c lib64/i965_dri.so before
6343348 293872 29880 6667100 65bb5c lib64/i965_dri.so after
v2: Rebase on GL_ARB_gpu_shader_fp64.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is C++, so we can mix code and declarations. Doing so allows
constification.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes much more sense and should be more performant in some
critical paths such as SSO validation which is called at draw time.
Previously the CurrentProgram array could have contained multiple
pointers to the same struct which was confusing and we would often
need to fish out the information we were really after from the
gl_program anyway.
Also it was error prone to depend on the _LinkedShader array for
programs in current use because a failed linking attempt will lose
the infomation about the current program in use which is still
valid.
V2: fix validate_io() to compare linked_stages rather than the
consumer and producer to decide if we are looking at inward
facing shader interfaces which don't need validation.
Acked-by: Edward O'Callaghan <[email protected]>
To avoid build regressions the following 2 patches were squashed in to
this commit:
mesa/meta: rewrite _mesa_shader_program_use() and _mesa_program_use()
These are rewritten to do what the function name suggests, that is
_mesa_shader_program_use() sets the use of all stage and
_mesa_program_use() sets the use of a single stage.
Reviewed-by: Lionel Landwerlin <[email protected]>
Acked-by: Edward O'Callaghan <[email protected]>
mesa: update active relinked program
This likely fixes a subroutine bug were
_mesa_shader_program_init_subroutine_defaults() would never have been
called for the relinked program as we previously just set
_NEW_PROGRAM as dirty and never called the _mesa_use* functions when
linking.
Acked-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This hooks up the API to the internals for 64-bit integer uniforms.
v2: update to use non-strict aliased alternatives
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This will help allow us to simplify the handling of samplers by
storing them in a single location rather than duplicating them in
both gl_linked_shader and gl_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Now that we create gl_program earlier there is no need to mess about
copying things to gl_linked_shader then to gl_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
There is no need to loop over active samplers the code above this
would have already exited if the sampler was inactive, or errored
if the count was larger than the uniforms array size.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having it here rather than in gl_linked_shader allows us to simplify
the code.
Also it is error prone to depend on the gl_linked_shader for programs
in current use because a failed linking attempt will free infomation
about the current program. In i965 we could be trying to recompile
a shader variant but may have lost some required fields.
We drop the memset on ImageUnits because gl_program is already
created using rzalloc().
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
gl_shader_program
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
This should prevent us from rebuilding the world.
Signed-off-by: Thomas Helland <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When possible, do the memcpy on larger blocks. This reduces cycles
spent in _mesa_propagate_uniforms_to_driver_storage from
1.51 % to 0.62% according to perf during the Unigine Heaven benchmark.
It did not affect the framerate of the benchmark. The system used for
testing was an i5 6600K with a Radeon R9 380.
Piglit hangs randomly on this system both with and without the patch
so i could not make a comparison.
v2: fixed whitespace
Signed-off-by: Nils Wallménius <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
The call to _mesa_update_shader_textures_used() already takes
care of copying for us.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
| |
Fix build error cased by 6a524c76f5.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code was inspired from _mesa_update_shader_textures_used
However unlike _mesa_update_shader_textures_used that only check for a single
stage, it will check all stages.
It avoids to loop on all uniforms, only active samplers are checked.
For my use case: high FS frequency switches with few samplers.
Perf event (relative to nouveau_dri.so) goes from 5.01% to 1.68% for
the _mesa_sampler_uniforms_pipeline_are_valid function.
Signed-off-by: Gregory Hainaut <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two distinctly different uses of this struct. The first
is to store GL shader objects. The second is to store information
about a shader stage thats been linked.
The two uses actually share few fields and there is clearly confusion
about their use. For example the linked shaders map one to one with
a program so can simply be destroyed along with the program. However
previously we were calling reference counting on the linked shaders.
We were also creating linked shaders with a name even though it
is always 0 and called the driver version of the _mesa_new_shader()
function unnecessarily for GL shader objects.
Acked-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Replaces an iterate and test bit in a bitmask loop by a
loop only iterating over the bits set in the bitmask.
v2: Use _mesa_bit_scan{,64} instead of open coding.
v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
| |
This just moves to the new interfaces in advance of int64.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
When computing the offset in the uniform storage table, take into account
the size multiplier so double precision matrices are handled correctly.
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
The only place this was used was in a gallium debug function that
had to be manually enabled.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the case where we convert a double to an int, we should
round the same as we do for floats.
This fixes GL41-CTS.gpu_shader_fp64.state_query
v2: add IROUNDD (Ilia)
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Cc: "11.1" <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Cc: Kenneth Graunke <[email protected]>
https://bugs.freedesktop.org/show_bug.cgi?id=93180
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial glGetUniformdv support didn't cover all the
casting cases that are apparantly legal, and cts seems to
test for them.
I've updated the piglit test to cover these cases now.
v2: fix indentation - it's all broken in this file (Ilia)
fix src/dst index tracking in light of fp64 support (Ilia)
cc: "11.0" <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
The uniform will only be of a single type so store the data for
opaque types in a single array.
Cc: Francisco Jerez <[email protected]>
Cc: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we would allow glUniformMatrix4fv on a dmat4 and
glUniformMatrix4dv on a mat4. Both are illegal. That later also
overwrites the storage for the mat4 and causes bad things to happen.
Should fix the (new) arb_gpu_shader_fp64-wrong-type-setter piglit test.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This matches _mesa_uniform, and it enables the bug fix in the next
patch.
v2: s/type/basicType/ in the assert in _mesa_uniform_matrix.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]> [v1]
Cc: Dave Airlie <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was missed when I did fp64, I've sent a piglit test to cover
the case as well.
Reviewed-by: Timothy Arceri <[email protected]>
Cc: "11.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
The GLES 3.1 spec removed support for updating the image unit bound to
an image uniform using glUniform1i() calls.
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Cc: Tapani Pälli <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The type stored in gl_uniform_storage is the type of a single array
element not the array type so size was always 1.
V2: Dont validate sampler units pointing to 0
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|