| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
for L2 prefetch
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
it doesn't interact with compute shaders in any way
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
it's exactly the same as the other ones
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a number of transform feedback tests when run with Linux 4.8,
which allows us to use the MI_LOAD_REGISTER_REG command, at which point
we started using this new broken path.
ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.*
and Piglit's arb_transform_feedback2/draw-auto are both fixed by this
patch, for example.
Thanks to Chris Wilson for catching this mistake!
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were failing to zero m0.2 of the sampler message header for TCS and
GS messages in the simple case. fs_generator has done this for about
a year now, but we missed it in vec4_generator.
Fixes ES31-CTS.core.texture_cube_map_array.sampling,
GL45-CTS.texture_cube_map_array.sampling, and many
dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler subtests:
- dynamically_uniform.tessellation_control.isampler3d
- dynamically_uniform.tessellation_control.isamplercube
- dynamically_uniform.tessellation_control.sampler2d
- dynamically_uniform.tessellation_control.usamplercube
- dynamically_uniform.tessellation_control.sampler2darray
- dynamically_uniform.tessellation_control.isampler2darray
- dynamically_uniform.tessellation_control.usampler3d
- dynamically_uniform.tessellation_control.usampler2darray
- dynamically_uniform.tessellation_control.usampler2d
- dynamically_uniform.tessellation_control.sampler3d
- dynamically_uniform.tessellation_control.samplercube
- dynamically_uniform.tessellation_control.isampler2d
- uniform.tessellation_control.isampler3d
- uniform.tessellation_control.isamplercube
- uniform.tessellation_control.usampler2d
- uniform.tessellation_control.usampler3d
- uniform.tessellation_control.sampler2darray
- uniform.tessellation_control.isampler2darray
- uniform.tessellation_control.usampler2darray
- uniform.tessellation_control.sampler2d
- uniform.tessellation_control.usamplercube
- uniform.tessellation_control.sampler3d
- uniform.tessellation_control.samplercube
- uniform.tessellation_control.isampler2d
Cc: [email protected]
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Fix incorrect swizzling in SIMD16 Transpose_16_16 breaking the
two-channel 16-bpc formats like R16G16_FLOAT.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Honor the colorHottileEnable mask when accessing colorBuffer pointers.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Fix routines for 8-bit and 16-bit formats used by optimized tile store.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed Transpose_16 methods of following formats:
Transpose8_8_8_8
Transpose8_8
Transpose32_32
Transpose16_16_16_16
Transpose16_16_16
Transpose16_16
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A while ago, we stopped using Luca's GLSL IR lower_jumps pass in favor
of nir_lower_returns(). Marek's commit d3cb79e043338b0e55a3fba8df652f3
put it in do_common_optimization, which resulted in us calling it again.
Dropping the EmitNoMainReturn setting makes us skip that pass again.
Apparently that pass doesn't work properly, because this fixes Piglit's
tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99287
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The T images are composed of effectively swizzled-around blocks of LT (4x4
utile) images, so we can reduce the t_utile_address() calls by 16x by
calling into the simpler LT loop.
This also adds support for calling down with non-utile-aligned
coordinates, which will be part of lifting the utile alignment requirement
on our callers and avoiding the RMW on non-utile-aligned stores.
Improves 1024x1024 TexSubImage by 2.55014% +/- 1.18584% (n=46)
Improves 1024x1024 GetTexImage by 2.242% +/- 0.880954% (n=32)
|
|
|
|
|
|
| |
I want these inlined in the callers, particularly with the tiling
changes coming up, but we're not building with lto so some caller
would suffer.
|
|
|
|
|
| |
They don't have any other callers outside of this file, and I'm hoping
they get inlined soon.
|
|
|
|
|
|
|
|
|
| |
They now have less of a dependency on the cpp, and don't have to do a
divide.
Hacking up mesa-demos teximage to do only one subtest and not draw
points, I saw 1024x1024 glTexSubImage2D() improve by 4.86939% +/-
1.40408% (n=30) and glGetTexImage() by 2.18978% +/- 0.140268% (n=5).
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If we get up toward 256MB (or whatever the CMA area size is),
VC4_GEM_CREATE will start throwing errors. Even if we don't trigger
that, when we flush the kernel's BO allocation for the CLs or bin
memory may end up throwing an error, at which point our job won't get
rendered at all.
Just flush early (half of maximum CMA size) so that hopefully we never
get to that point.
|
|
|
|
|
|
|
|
| |
This will help allow us to simplify the handling of samplers by
storing them in a single location rather than duplicating them in
both gl_linked_shader and gl_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Now that we create gl_program earlier there is no need to mess about
copying things to gl_linked_shader then to gl_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
There is no need to loop over active samplers the code above this
would have already exited if the sampler was inactive, or errored
if the count was larger than the uniforms array size.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Making this point to a gl_program struct rather than a gl_shader_program
struct will allow use to later also make the CurrentProgram array hold
gl_program structs which in turn will allow for code simpilifcation.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
We no longer need it.
While we are at it we mark the vs, gs, and wm codegen functions as static.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Now that we have the is_arb_asm flag we can just skip the
initialisation.
V2: remove hack from standalone compiler where it was never
needed since it only compiles glsl shaders.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set the flag via the _mesa_init_gl_program() and NewProgram()
helpers.
In i965 we currently check for the existance of gl_shader_program
to decide if this is an ARB assembly style program or not.
Adding a flag makes the code clearer and will help removes a
dependency on gl_shader_program in the i965 codegen functions.
Also this will allow use to skip initialising sampler units for
linked shaders, we currently memset it to zero again during linking.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
This is the only thing we use from gl_shader_program so pass it directly.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
We can now just get the data needed from the gl_shader_program_data
pointer in gl_program.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
There is no need to pass gl_linked_shader anymore.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
brw_assign_common_binding_table_offsets()
We now get everything we need directly from gl_program so there is
no need for this.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having it here rather than in gl_linked_shader allows us to simplify
the code.
Also it is error prone to depend on the gl_linked_shader for programs
in current use because a failed linking attempt will free infomation
about the current program. In i965 we could be trying to recompile
a shader variant but may have lost some required fields.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Here we also remove the duplicate field in gl_linked_shader and always
get the value from shader_info instead.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This will help allow us to store pointers to gl_program structs in the
CurrentProgram array resulting in a bunch of code simplifications.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This also removes the duplicate field in gl_linked_shader, and
gets num_ubos from shader_info instead.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having it here rather than in gl_linked_shader allows us to simplify
the code.
Also it is error prone to depend on the gl_linked_shader for programs
in current use because a failed linking attempt will free infomation
about the current program. In i965 we could be trying to recompile
a shader variant but may have lost some required fields.
We drop the memset on ImageUnits because gl_program is already
created using rzalloc().
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This removes another dependency on gl_shader_program from the codegen functions,
this will help allow us to use gl_program for the CurrentProgram array rather
than gl_shader_program.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather then passing gl_shader_program.
The only field use was Name which is the same as the Id field in
gl_program.
For wm and vs we also make the functions static and move them before
the codegen functions.
This change reduces the codegen functions dependency on gl_shader_program.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Fix typo using wrong (uninitialized) build context introduced by
4634cb5921b985f04f2daf00cda2d28036143bd3. (This only affects very rare
small packed formats which have a PIPE_SWIZZLE_0 channel, such as
r4a4, which is never used by mesa/st. Nevertheless it broke lp_test_format.)
|