| Commit message | Author | Age | Files | Lines |
|
| |
Reduce likelihood of collision with real buffers by placing the
hole at the top of the 4G area. This fixes some indirect draw+compute
tests with large buffers.
Suggested by Ilia Mirkin.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
| |
Uniform buffer objects will be stuck to the driver constant buffer,
just like buffers, because the launch descriptor only allows 8 CBs.
Input kernel parameters for OpenCL are still uploaded to screen->parm,
which is bound on c0, but this will be changed later with a new series.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
| |
Instead of using the screen->parm buffer object, which will be removed,
upload auxiliary constants to uniform_bo to be consistent with
what we already do for Fermi.
This breaks surface support (for compute only), but it will be
properly reintroduced later for ARB_shader_image_load_store.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
| |
By using os_log_message directly, as _debug_vprintf truncates messages
to 4K.
Also clean up the disassemble interface.
Spotted by Roland.
Trivial.
|
| |
Just noticed this in passing. gl_shader_stage already has tess, so this
comment no longer applies.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
Mesa demos are no longer part of the main Mesa tree/tarball.
Add Gallium and GLX code to list of major components.
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
Android Bionic does not support the strchrnul() string function;
the gallium auxiliary util/u_string.h provides util_strchrnul() instead.
This change avoids the following build error:
external/mesa/src/gallium/drivers/radeonsi/si_shader.c:3863: error:
undefined reference to 'strchrnul'
collect2: error: ld returned 1 exit status
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
|
| |
Set EGL_FRAMEBUFFER_TARGET_ANDROID and EGL_RECORDABLE_ANDROID config
attributes to true for Android. These are required in Marshmallow.
The definition of the EGL_RECORDABLE_ANDROID extension gives 2 options for
implementing it. Android implements the 2nd option, in which the encoder
must support RGB input. The requested input format
is RGB888, so setting the attribute on all the native Android visual
formats should be sufficient.
Similarly, setting EGL_FRAMEBUFFER_TARGET_ANDROID for all configs with
an EGL_NATIVE_VISUAL_ID should be sufficient. Most likely, the HWC should
support the same set of formats the underlying DRM driver supports.
Cc: [email protected]
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
| |
This is used by Android to select an eglconfig compatible with screen
recording.
Cc: [email protected]
Signed-off-by: Rob Herring <[email protected]>
[Emil Velikov: add the _eglIsConfigAttribValid check]
Signed-off-by: Emil Velikov <[email protected]>
|
| |
This is used by Android to select an eglconfig compatible with HWComposer.
Cc: [email protected]
Signed-off-by: Rob Herring <[email protected]>
[Emil Velikov: add the _eglIsConfigAttribValid check]
Signed-off-by: Emil Velikov <[email protected]>
|
| |
Builds with gallium enabled fail on x86 with a linker error:
external/mesa3d/src/mesa/vbo/vbo_exec_array.c:127: error: undefined reference to '_mesa_uint_array_min_max'
The problem is that sse_minmax.c is not included in the libmesa_st_mesa
library. Since the SSE4.1 files are needed for both libmesa_st_mesa
and libmesa_dricore, move the SSE4.1 files into a separate static library
that can be used by both.
Cc: "11.1 11.2" <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
| |
This is an old patch I had around.
Vector selects seem to work well since LLVM 3.3. Using them should
improve code quality, as it might make the constant propagation pass more
effective.
Tested with lp_test_*.
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
variables
Needed because not all the built-in variables are marked as system
values, so they still have the mode ir_var_auto. For now, this stops the
warning from being raised when gl_GlobalInvocationID and
gl_LocalInvocationIndex are used.
v2: use is_gl_identifier instead of filtering for some names (Ilia
Mirkin)
Reviewed-by: Kenneth Graunke <[email protected]>
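The v2 approach can be pictured with a minimal sketch. It assumes that a built-in is recognised purely by its `gl_` prefix, which is what the name is_gl_identifier suggests; the helper `names_to_warn_about` and the user variable name are hypothetical and only illustrate the filtering idea, not the actual GLSL compiler code.

```python
def is_gl_identifier(name):
    """Assumed behaviour: a name counts as built-in if it starts with "gl_"."""
    return name is not None and name.startswith("gl_")


def names_to_warn_about(names):
    """Hypothetical helper: drop built-ins before the warning is emitted."""
    return [name for name in names if not is_gl_identifier(name)]
```

Filtering on the prefix rather than on a fixed list of names means that built-ins added later are covered without touching the check again.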
|
| |
vdpau has recently come to rely on this, so make sure to check it
properly.
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
This is the same ext as ARB_draw_buffers_blend (plus some core
functionality that already exists). Add the alias entrypoints.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
Haswell GT2 and GT3 have a minimum of 64 entries. Hardcoding 32
is not legal.
v2: Delete stale comment (caught by Alejandro).
Cc: [email protected]
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
| |
According to the Sandybridge PRM's description of the resinfo message,
the .z value returned will be Depth == 0 ? 0 : Depth + 1. The earlier
PRMs have the same table.
This means we return 0 for array textures with a single slice, when
we ought to return 1. Just override it to max(depth, 1).
Fixes 10 dEQP-GLES3.functional tests on Sandybridge:
shaders.texture_functions.texturesize.sampler2darray_fixed_vertex
shaders.texture_functions.texturesize.sampler2darray_fixed_fragment
shaders.texture_functions.texturesize.sampler2darray_float_vertex
shaders.texture_functions.texturesize.sampler2darray_float_fragment
shaders.texture_functions.texturesize.isampler2darray_vertex
shaders.texture_functions.texturesize.isampler2darray_fragment
shaders.texture_functions.texturesize.usampler2darray_vertex
shaders.texture_functions.texturesize.usampler2darray_fragment
shaders.texture_functions.texturesize.sampler2darrayshadow_vertex
shaders.texture_functions.texturesize.sampler2darrayshadow_fragment
Cc: [email protected]
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
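The override described above can be modelled in a few lines. This is only a sketch of the arithmetic: `resinfo_z` mirrors the table quoted from the PRM, and it assumes the surface's Depth field holds the slice count minus one, so a single-slice array has Depth == 0. Both function names are hypothetical.

```python
def resinfo_z(depth_field):
    """The .z value per the quoted table: Depth == 0 ? 0 : Depth + 1."""
    return 0 if depth_field == 0 else depth_field + 1


def texture_size_z(depth_field):
    """Apply the override from the commit message: max(depth, 1)."""
    return max(resinfo_z(depth_field), 1)
```

With Depth == 0 the raw value is 0 and the override turns it into 1; any non-zero Depth passes through unchanged.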
|
| |
Oddly, this did not affect the shader where I first noticed the pattern.
That particular shader doesn't get its if-statement converted to a bcsel
because there are two assignments in the else-statement. This led to me
submitting https://bugs.freedesktop.org/show_bug.cgi?id=94747.
shader-db results:
Sandy Bridge
total instructions in shared programs: 8467384 -> 8467069 (-0.00%)
instructions in affected programs: 36594 -> 36279 (-0.86%)
helped: 46
HURT: 0
total cycles in shared programs: 117573448 -> 117568518 (-0.00%)
cycles in affected programs: 339114 -> 334184 (-1.45%)
helped: 46
HURT: 0
Ivy Bridge / Haswell / Broadwell / Skylake:
total instructions in shared programs: 7774258 -> 7773999 (-0.00%)
instructions in affected programs: 30874 -> 30615 (-0.84%)
helped: 46
HURT: 0
total cycles in shared programs: 65739190 -> 65734530 (-0.01%)
cycles in affected programs: 180380 -> 175720 (-2.58%)
helped: 45
HURT: 1
No change on G45 or Ironlake.
I also tried these expressions, but none of them affected any shaders in
shader-db:
(('bcsel', a, 'a@bool', 'b@bool'), ('ior', a, b)),
(('bcsel', a, 'b@bool', False), ('iand', a, b)),
(('bcsel', a, 'b@bool', 'a@bool'), ('iand', a, b)),
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
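The three extra expressions listed above are plain Boolean identities, which a small truth-table check can confirm. This sketch models NIR booleans as Python `bool` values, which is an assumption made for illustration; `bcsel` here is a stand-in for the opcode's semantics (select the second operand when the condition is true, else the third), and `check_identities` is a hypothetical helper.

```python
from itertools import product


def bcsel(cond, if_true, if_false):
    """Stand-in for bcsel: pick if_true when cond holds, else if_false."""
    return if_true if cond else if_false


# (description, left-hand side, right-hand side) for each listed pattern.
IDENTITIES = [
    ("bcsel(a, a, b) -> ior(a, b)",
     lambda a, b: bcsel(a, a, b), lambda a, b: a or b),
    ("bcsel(a, b, False) -> iand(a, b)",
     lambda a, b: bcsel(a, b, False), lambda a, b: a and b),
    ("bcsel(a, b, a) -> iand(a, b)",
     lambda a, b: bcsel(a, b, a), lambda a, b: a and b),
]


def check_identities():
    """Return {description: True/False} over all four (a, b) combinations."""
    inputs = list(product((False, True), repeat=2))
    return {
        name: all(lhs(a, b) == rhs(a, b) for a, b in inputs)
        for name, lhs, rhs in IDENTITIES
    }
```

All three hold for every input, so the rewrites are valid even though none of them changed any shader in shader-db.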
|
| |
None of the callers actually wanted what it did. In ptn_xpd, you only
ever want a vec3 swizzle. In ptn_tex, you want a swizzle that matches
the number of required texture coordinates.
shader-db results:
G45:
total instructions in shared programs: 4011240 -> 4010911 (-0.01%)
instructions in affected programs: 59232 -> 58903 (-0.56%)
helped: 114
HURT: 0
total cycles in shared programs: 84314194 -> 84313220 (-0.00%)
cycles in affected programs: 779150 -> 778176 (-0.13%)
helped: 110
HURT: 13
Ironlake:
total instructions in shared programs: 6397262 -> 6396605 (-0.01%)
instructions in affected programs: 117402 -> 116745 (-0.56%)
helped: 227
HURT: 0
total cycles in shared programs: 128889798 -> 128888524 (-0.00%)
cycles in affected programs: 1214644 -> 1213370 (-0.10%)
helped: 179
HURT: 44
Sandy Bridge:
total instructions in shared programs: 8467391 -> 8467384 (-0.00%)
instructions in affected programs: 3107 -> 3100 (-0.23%)
helped: 10
HURT: 6
total cycles in shared programs: 117580120 -> 117573448 (-0.01%)
cycles in affected programs: 103158 -> 96486 (-6.47%)
helped: 84
HURT: 11
Ivy Bridge:
total instructions in shared programs: 7774255 -> 7774258 (0.00%)
instructions in affected programs: 1677 -> 1680 (0.18%)
helped: 8
HURT: 6
total cycles in shared programs: 65743828 -> 65739190 (-0.01%)
cycles in affected programs: 89312 -> 84674 (-5.19%)
helped: 78
HURT: 23
Haswell:
total instructions in shared programs: 7107172 -> 7107150 (-0.00%)
instructions in affected programs: 2048 -> 2026 (-1.07%)
helped: 16
HURT: 0
total cycles in shared programs: 64653636 -> 64647486 (-0.01%)
cycles in affected programs: 86836 -> 80686 (-7.08%)
helped: 85
HURT: 17
Broadwell and Skylake:
total instructions in shared programs: 8447529 -> 8447507 (-0.00%)
instructions in affected programs: 2038 -> 2016 (-1.08%)
helped: 16
HURT: 0
total cycles in shared programs: 66418670 -> 66413416 (-0.01%)
cycles in affected programs: 90110 -> 84856 (-5.83%)
helped: 83
HURT: 20
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
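The percentages in the shader-db figures above can be reproduced with a relative-change calculation. This is a sketch under the assumption that each percentage is (after - before) / before, printed with two decimals; the function names are hypothetical and this is not shader-db's own code.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def format_change(before, after):
    """Render a pair in the same layout as the lines above."""
    return f"{before} -> {after} ({percent_change(before, after):.2f}%)"
```

For example, `format_change(59232, 58903)` gives `59232 -> 58903 (-0.56%)`, matching the G45 affected-programs line. A tiny decrease such as 8467391 -> 8467384 rounds to `-0.00%` with the sign kept, which is why some totals read as a negative zero.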
|
| |
The KIL instruction doesn't have a destination, so ptn_kil never uses
dest.
program/prog_to_nir.c: In function ‘ptn_kil’:
program/prog_to_nir.c:547:38: warning: unused parameter ‘dest’ [-Wunused-parameter]
ptn_kil(nir_builder *b, nir_alu_dest dest, nir_ssa_def **src)
^
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
The unit variable can be used uninitialized.
Fixes: 24e77cb09 ("tgsi: handle indirect sampler arrays. (v2)")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
The number of channels must be 4 for all RGBA components.
Fixes: 22d129601 ("tgsi: add support for image operations to tgsi_exec. (v2.1)")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
It was kind of overloaded, returning two different things. Now get
the index of the shadow reference src register with a new
tgsi_util_get_shadow_ref_src_index() function.
To verify the new code, I added some temporary debug code which looped
over all TGSI_TEXTURE_x values, calling both the old and the new function
and checking that the returned indexes matched.
Also tested piglit "shadow" tests with softpipe/llvmpipe.
The ilo and radeonsi changes are untested.
Reviewed-by: Dave Airlie <[email protected]>
|
| |
Should fix the assertion in piglit
spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the
TXQ instruction specifies a 2D target but the sampler view was
declared as SHADOW2D.
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
|
| |
This fixes a null pointer dereference during the register allocation pass,
if a function had arguments.
Function arguments get a definition from the function itself, a definition
which is therefore not linked to any instruction. If a value ends up having
a definition but no linked instruction, the register allocation pass doesn't
need to consider whether that value is generated by an instruction that
can only handle "short" registers (on nv50).
Signed-off-by: Pierre Moreau <[email protected]>
|
| |
The extension is identical to GL_OES_copy_image. But dEQP has tests that
want the EXT variant.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
We require the full ARB_gpu_shader5 for now, but in the future some
other CAP could get exposed to indicate that only the multisample-related
behavior of ARB_gpu_shader5 is available.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
| |
Ken did this earlier, and this is just me reimplementing his patch a
little differently.
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
|
| |
Instead of removing every instruction in add_insts_from_block(), just
move the instruction to its scheduled location. This is a step towards
doing both bottom-up and top-down scheduling without conflicts.
Note that this patch changes cycle counts for programs because it begins
including control flow instructions in the estimates.
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
I think when this code was written, basic blocks were always ended by a
control flow instruction or an end-of-thread message. That's no longer
the case, and removing this restriction actually helps things:
instructions in affected programs: 7267 -> 7244 (-0.32%)
helped: 4
total cycles in shared programs: 66559580 -> 66431900 (-0.19%)
cycles in affected programs: 28310152 -> 28182472 (-0.45%)
helped: 9577
HURT: 879
GAINED: 2
The addition of the is_control_flow() checks is not a functional change,
since add_insts_from_block() does not put control flow instructions in the
list of instructions to schedule. I plan to change this in a later patch.
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
Missing this causes an assertion failure in the scheduler with the next
patch.
Additionally, this gives cmod propagation enough information to optimize
code better.
total instructions in shared programs: 7112991 -> 7112852 (-0.00%)
instructions in affected programs: 25704 -> 25565 (-0.54%)
helped: 139
total cycles in shared programs: 64812898 -> 64810674 (-0.00%)
cycles in affected programs: 127224 -> 125000 (-1.75%)
helped: 139
Acked-by: Francisco Jerez <[email protected]>
|
| |
This reverts commit d0e1d6b7e27bf5f05436e47080d326d7daa63af2.
The change in the vec4 code is a mistake -- there's never an
FS_OPCODE_FB_WRITE in vec4 code.
The change in the fs code had the (harmless) effect of not recognizing
an FB_WRITE as a scheduling barrier even if it was marked EOT --
harmless because the scheduler marked the last instruction of a block as
a barrier, something I'm changing in the following patches.
This will be reimplemented later in the series.
|
| |
All of these were simply code for "architecture register file" (and in
the case of destinations, "not the null register").
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
These printed the cycle count of the last basic block (sched.time is set
per basic block!). We have accurate, full-program data printed
elsewhere.
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
Failed to update state tracker with new buffer interface.
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
| |
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|