| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
The function will be used later to create the filedescriptor
for other metrics.
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Dump values for every selected data source in GALLIUM_HUD.
Every data source has its own file and the filename is
equal to the data source identifier.
Set GALLIUM_HUD_DUMP_DIR to dump values to files in this directory.
No values are dumped if the environment variable is not set, the
directory doesn't exist or the user doesn't have write access.
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The SPIR-V capability isn't even marked as enabled, and there are no
tests in Vulkan-CTS. Per Jason Ekstrand, this won't work in anv as such
write-only surfaces require additional setup which is currently not
performed.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
upload_3dstate_streamout can be called when there's no currently bound
transform feedback object. In this case, we get the default object,
which has a NULL shader (previously gl_shader_program, now gl_program).
The old code did something sketchy, but which worked:
const struct gl_transform_feedback_info *linked_xfb_info =
&xfb_obj->shader_program->LinkedTransformFeedback;
Here, if shader_program is NULL, this would be a bogus pointer of 0x60.
But we never actually dereferenced it, so it worked out.
With Timothy's recent reworks, we actually end up dereferencing
xfb_obj->program along the way, which crashes since it's NULL.
The solution is to move this pointer initialization into the "active"
block, where we know it actually exists and won't be bogus.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99231
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We also add the stubs for the standalone compiler in this change.
By adding a reference here we can now refactor some code to use
gl_program where we were previously awkwardly using gl_shader_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
i915 is mixing the use of these fields, for now change this to a
struct and add a FIXME.
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99229
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We were passing around a void *mem_ctx and using that to initialize the
builder which was wrong since that pointed to ralloc_parent(impl) which
is the shader but the builder is supposed to be initialized with the
nir_function_impl.
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
This lets us get rid of the void *mem_ctx parameter and make things a
bit more type safe.
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We rename it to nir_deref_clone, re-order the sources to match the other
clone functions, and expose nir_deref_var_clone. This past part, in
particular, lets us get rid of quite a few lines since we no longer have
to call nir_copy_deref and wrap it in deref_as_var.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
See:
dEQP-GLES2.functional.shaders.swizzles.vector_swizzles.mediump_vec2_yyyy_fragment
if we only access (in FS) varying.y then it ends up in slot zero.. I'm
not sure the hw likes that..
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I forgot to do this in commit 76b97d544e ("anv: enable storage image
extended formats"). Since both drivers support this now, no need for the
conditional enable.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that the SPIR-V -> NIR translation is in place, no additional logic
is required.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
I had this on transfers due to the clear color cmd, but
it seems like that path shouldn't get fast clears.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise we don't get the barriers to flush dcc etc.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
This keeps some of Connor's original code. However, while I was at it,
I updated this very old pass to a bit more modern NIR.
|
|
|
|
|
|
|
|
|
| |
After we figure out the value that we are going to return, we have a
loop that walks up the dominance tree and sets the value in each of the
blocks that doesn't have one yet. In the case of the phi, the def is
set to NEEDS_PHI not NULL, so the last one where the phi node actually
goes never gets filled out. This can lead to duplicating the phi node
unnecessarily.
|
| |
|
|
|
|
| |
This matches the naming of nir_lower_vars_to_ssa, the other to-SSA pass.
|
|
|
|
|
|
|
|
|
| |
After removing brw_shader in the previous commit this is no longer
needed.
V2: remove use in src/compiler/glsl/test_optpass.cpp
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This allows us to delete brw_shader and removes the last use of
gl_linked_shader in the codegen paths.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will let us to make _CurrentFragmentProgram a gl_program pointer
allowing for simpilifications to be made.
We also need to add a field to gl_shader to hold it during parsing.
In gl_program we put it inside a union in anticipation of moving
more fields here that can be only fs or vertex stage fields.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
gl_shader_program
This will allow us to make the CurrentProgram array store gl_program which allows
us to do a bunch of simplifications.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will help allow us to store gl_program in the CurrentProgram array rather
than gl_shader_program which will allow a bunch of simplifications.
Note that we make LinkedTransformFeedback a pointer so we don't waste
memory creating a struct for each stage. We also store a pointer to
the gl_program that will contain the pointer in gl_shader_program so
we can get easy access to the correct stage.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We have already set the gl_shader_program pointer to the correct
shader program in _mesa_BeginTransformFeedback() so use it.
This is more consistent with how we do it for gen7.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
We no longer need to initialise it because gl_program is never reused.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This will be used in api_validate.c in a following patch when we
switch to using gl_program pointers for the pipelines CurrentProgram
array.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
This now contains everything we need.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This will allow us to store gl_program rather than gl_shader_program
as the current program perstage which allows us to simplify code
that makes use of the CurrentProgram list.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow us to simplify the current program logic for SSO.
Also since we aim to detach shader_info from nir_shader this will come
in handy avoiding passing nir_shader around just to keep track of
the stage we are dealing with.
V2: set stage for arb asm programs also.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jonas's patch got us most of the benefit of scheduling instructions into
the delay slots of thread switch, but if there had been nothing to pair
the thrsw with, it would move the thrsw up and leave a NOP where the thrsw
was.
Instead, don't pair anything with thrsw through the normal scheduling
path, and have a separate helper function that inserts the thrsw earlier
if possible and inserts any necessary NOPs.
total instructions in shared programs: 93027 -> 92643 (-0.41%)
instructions in affected programs: 14952 -> 14568 (-2.57%)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scan for instructions without a signal set in front of the switching
instruction and move the signal up there.
shader-db results:
total instructions in shared programs: 94494 -> 93027 (-1.55%)
instructions in affected programs: 23545 -> 22078 (-6.23%)
v2: Fix re-emitting of the instruction in the loop trying to emit NOPs,
drop a scheduling change from branch delay slots. (by anholt)
Signed-off-by: Jonas Pfeil <[email protected]>
|
|
|
|
|
| |
This successfully unrolls a new shader in GLB2.7, which also gets that
shader to successfully compile in multithreaded mode.
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I'm sure anv has support for these as well, but this is just
a first use of the interface to allow different supported spir-v
features.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I expect over time the struct contents will change as all
drivers support stuff etc, but for now this should be a good
starting point.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use atomic ops when updating gl_shader::RefCount.
Fixes intermittent failures and crashes in
'dEQP-EGL.functional.sharing.gles2.multithread.*'.
All tests in that group now pass except
'dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.textures.copyteximage2d_texsubimage2d_render'.
Tested with:
mesa: branch 'master' at d6545f2
deqp: branch 'nougat-cts-dev' at 4acf725 with additional local fixes
DEQP_TARGET: x11_egl
hw: Intel Broadwell 0x1616
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99085
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Cc: [email protected]
Cc: Mark Janes <[email protected]>
Cc: Haixia Shi <[email protected]>
|
|
|
|
|
|
|
| |
It should actually be 32 for a4xx/a5xx.. we still only advertise 16 but
for a5xx the linkage map includes position/psize.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
We need this in case it is streamed out. Not sure why we were treating
it specially before. Having it as a VS out is harmless if FS doesn't
have a matching input.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We'll need to revisit when adding hw binning pass support, whether we
can still do this in main draw step, as we do w/ a3xx/a4xx, or if we
needed to move it to the binning stage.
Still some failing piglits but most tests pass and the common cases seem
to work.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Pull in a5xx streamout related regs. Also fixes a couple incorrect
register definitions.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Update address calculation to support 64b addresses.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Rework how we lay out driver constants (driver-params, UBO/TFBO buffer
addresses, immediates) for more flexibility. For a5xx+ we need to deal
with the fact that gpu ptrs are 64b instead of 32b, which makes the
fixed offset scheme not work so well. While we are dealing with that
we might also make the layout more dynamic to account for varying # of
UBOs, etc.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Reloc for the buffer address is two dwords on 64b devices (a5xx+)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Seems to be imilar to a4xx, and sampler state "array-pitch" needs
to be aligned to page size.
Signed-off-by: Rob Clark <[email protected]>
|