| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
See GL_NV_viewport_array2::gl_ViewportMask for how this is supposed
to work.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4529>
|
|
|
|
|
|
|
| |
This restores support for promoting UBO loads to constant loads when
using LDC.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4568>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I had missed that LDC actually uses vec4 units for its offset. This
means that we have to create a new instruction, and lower it in
ir3_nir_lower_io_offsets, similar to the existing SSBO instructions.
Unfortunately we can't assume that loads are always vec4-aligned, so we
have to use the alignment information that NIR gives us. Unfortunately,
it's currently woefully inadequate, and will have to be fixed to give us
good codegen in the future.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4568>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit a0e57432b76c32f2109dab0ad3df0ba03967441c.
It's unclear what caused the test to fail back then. Now it's seems to be
reversed. I tested with a close enough piglit and mesa branch and wasn't
able to reproduce the same test result I've got in some older piglit runs.
Fixes:
dEQP-GLES2.functional.rasterization.primitives.lines_wide
dEQP-GLES2.functional.rasterization.primitives.line_strip_wide
dEQP-GLES2.functional.rasterization.primitives.line_loop_wide
dEQP-GLES2.functional.rasterization.limits.points
dEQP-GLES2.functional.clipping.line.wide_line_z_clip
dEQP-GLES2.functional.clipping.line.wide_line_z_clip_viewport_center
dEQP-GLES2.functional.clipping.line.wide_line_z_clip_viewport_corner
dEQP-GLES2.functional.clipping.line.wide_line_clip
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
dEQP-GLES2.functional.clipping.line.wide_line_attrib_clip
dEQP-GLES2.functional.polygon_offset.default_result_depth_clamp
dEQP-GLES2.functional.polygon_offset.default_factor_1_slope
dEQP-GLES2.functional.polygon_offset.fixed16_result_depth_clamp
dEQP-GLES2.functional.polygon_offset.fixed16_factor_1_slope
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4575>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes two bugs: First, if the same block index showed up twice, we
only pick the first one. Second, we weren't multiplying by 32. This
didn't show up in tests because RBA testing is garbage. Found while
looking at shaders from the UE4 Shooter demo.
Fixes: e03f9652 "anv: Bounds-check pushed UBOs when..."
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4578>
|
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4578>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Iris allocates gem buffers using buckets of allocation sizes that are
page aligned. We always ask for batch buffers of size BATCH_SZ +
BATCH_RESERVED, which is not page aligned: we ask for 65552 bytes,
which ends up in the bucket of size 81920, resulting in 20% unused
space. Adjust things so there is no waste of space: BATCH_SZ +
BATCH_RESERVED is now 65536.
Reviewed-by: Lionel Landwerlin <[email protected]>
Signed-off-by: Paulo Zanoni <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4561>
|
|
|
|
|
|
|
|
|
| |
We assign a real value a few lines below, and none of the lines in
between rely on the zeroed bo->gtt_offset value.
Reviewed-by: Lionel Landwerlin <[email protected]>
Signed-off-by: Paulo Zanoni <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4561>
|
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Signed-off-by: Paulo Zanoni <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4561>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This decreases the size of the struct on a 64bit machine from 144 to
136. While that's not a lot, this is one of the structs that we're
allocating all the time.
For a full Aztec run on BDW we allocate this struct 3273 times, and we
can have up to 3259 of them live at the same time. So we end up saving
just a little over 6 pages for this benchmark.
Spotted this while trying to add another bool for an unrelated
feature.
Reviewed-by: Lionel Landwerlin <[email protected]>
Signed-off-by: Paulo Zanoni <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4561>
|
|
|
|
|
|
|
|
| |
It seems meson returns the filename with extension for full_path(), even
though Cygwin does it's best to pretend the file doesn't have that
extension.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4514>
|
|
|
|
|
|
| |
Fixes: 18f896e55d96 (llvmpipe: add initial nir support)
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4563>
|
|
|
|
|
|
| |
Fixes: 0d02a7b8ca794 (draw: add main tessellation code)
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4563>
|
|
|
|
|
|
|
|
|
| |
Not sure how I missed this, the ownership was a bit blurry,
free the NIR.
Fixes: bf12bc2dd7a2 (draw: add nir info gathering and building support)
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4563>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After some experimentation, I believe that GRAS_LAYER_CNTL is
actually just a count register storing the number of layers in the
render target. While debugging cube_array geometry tests, I noticed
that the blob was setting an unknown 0x8 to LAYER_CNTL, so I checked
the value of LAYER_CNTL for various layer sizes:
1: LAYER_CNTL=0
2: LAYER_CNTL=1
3: LAYER_CNTL=2
4: LAYER_CNTL=3
9: LAYER_CNTL=8
256: LAYER_CNTL=255
2000: LAYER_CNTL=1999
Seems like this register just stores a count of the largest layer
that can be written to via gl_Layer. This commit updates the reg
docs, freedreno's gs implementation, and turnip's gs implementation.
Fixes dEQP-VK.geometry.layered.cube_array.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>
|
|
|
|
|
|
|
|
|
| |
Without these consts, the geometry shader is unable to read from
textures or uniforms.
Fixes dEQP-VK.geometry.layered.*.readback
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we were using layout.layer_size for the layer stride, but
in Vulkan, you can alias a 3D image as an array of 2D images via the
VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT flag. One reason to use
this behavior is so the geometry shader can write to a specific
depth in a 3D framebuffer with gl_Layer.
Since the 3D image is not a *true* layered image, layer_size is 0.
Instead, we can copy what freedreno does and use the slice size.
Fixes dEQP-VK.geometry.layered.3d.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4541>
|
|
|
|
|
|
|
|
| |
Fixes: dEQP-GLES2.functional.fragment_ops.depth_stencil.stencil_* and more
Fixes: 4137a79c2a7e ("gallium: add viewport swizzling state and cap")
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4567>
|
|
|
|
|
|
|
|
| |
Fixes: ff168b297d94 ("mesa: add GL_NV_viewport_swizzle support")
Reported-by: Roy Spliet <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4567>
|
|
|
|
|
|
|
|
|
| |
It's unsupported because small bitsizes are still not completely
supported. It should have been disabled by default with ACO.
Acked-by: Daniel Schürmann <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4549>
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4554>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier commit added the code unconditionally, since the loader code
itself is already built on macOS.
Although it did not consider the #include mayhem that src/glx is.
In particular, none of the __GLXDRI{screen,context,drawable) are
available for macOS... those are pulled by dri_common.[ch].
Ideally we'll untangle that, but for the time being simply #ifdef out
the include/call.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2726
Fixes: b699d070a6d ("glx: set the loader_logger early and for everyone")
Signed-off-by: Emil Velikov <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4490>
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2076>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: use static array to keep name -> func mapping
v3: use unordered_map
v4: handle ARM constants
reorder dispatch table
wrap enqueue APIs as the command value differs between khr and arm
v5: move declarations into dispatch.hpp
handle CL_MEM_USES_SVM_POINTER_ARM in clGetMemObjectInfo
v6: breaking long lines
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2076>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
all of the functionality can be mapped to malloc/free if the device supports
fine grained system SVM.
v2: fix some API bugs found with the OpenCL CTS
v3: remove validate_even_wait_list
improve implementation of clSetKernelExecInfo
make clEnqueueSVMFree spec compliant
rename can_emulate_non_system_svm to has_system_svm and make it a member method
improve validation in clEnqueueSVMMemFill
handle CL_MEM_USES_SVM_POINTER in clGetMemObjectInfo
v4: break long lines and other minor cosmetic adjustments
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2076>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it is pretty much identical to a clSetKernelArg for a scalar field, except
it is only valid for global and constant memory pointers.
Also the type equals void* on the Host, so we can just check the size of it.
v2: prefer using target_size to extend the pointer value
v3: handle more corner cases in combiation to clSetKernelArg
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2076>
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: without supporting userptrs SVM can't be implemented as it's impossible
to ensure memory consistency with HOST_PTR buffers
v3: fix comment style
v4: fixes typo in comment
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2076>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
although most of those are 2.0 core functions, there is
cl_arm_shared_virtual_memory to expose those in a 1.2 context. But we
should be able to expose this extension with 1.1 as well as there is no
technicaly reason why this shouldn't work.
v2: move svm functions into existing files
v3: rename func args to match convention
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2076>
|
|
|
|
|
|
|
|
|
| |
v2: split enum in specific caps to abstract the CL enum
v3: remove BUFFER_SVM caps
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2076>
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Fixes: 6f718edcedd ('aco: simplify gathering of MIMG address components')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4550>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have to assume any external buffer could be used by the display HW.
In the case that buffer is also CPU mapped, we want to assume no cache
coherency as it is only available between GT & CPU, not display.
Many thanks to Michel Dänzer for the hint!
v2: Move cache coherent drop to bufmgr (Chris)
v3: Also make BO external if created with PIPE_BIND_SHARED (Eric)
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: <[email protected]>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2552
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4533>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v_frexp_exp_i16_f16 returns the two's complement for negative
exponents. For example, with 0.333252 it returns 0.666504 for
the mantissa and 65535 for the exponent (-1 in decimal).
RADV/LLVM and AMDVLK do a v_bfe_i32 and AMDGPU-PRO uses SDWA with
the sign extension bit set. The latter is probably what we want to
do in long term but for now RA doesn't support changing non-SDWA
instructions to SDWA if useful/needed.
Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.frexp.compute.*.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4546>
|
|
|
|
|
|
|
| |
Fixes: KHR-GL45.packed_depth_stencil.blit.depth32f_stencil8
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GL spec requires user culling, then clipping then face culling.
llvmpipe was doing clipping then user culling then face culling.
Fix the ordering by adding a new user_cull stage that does the user
culling
Fixes piglit clip_cull-4.shader_test
v2: simplify this a lot (Roland)
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
|
|
| |
This just appears to be missing:
Fixes:
KHR-GL45.cull_distance.functional
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
|
|
|
| |
You have to count the stats pre-culling here.
Fixes:
KHR-GL45.pipeline_statistics_query_tests_ARB.functional_primitives_vertices_submitted_and_clipping_input_output_primitives
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
|
| |
this only applies to the TGSI path, fixes
KHR-GLES31.core.geometry_shader.api.program_pipeline_vs_gs_capture
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
| |
Otherwise masked off channels can access random bad memory
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
| |
Fixes some sampling issues in vertex shaders
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
|
| |
The any queries need to signal if any stream has overflowed,
so we have to track all the streams.
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
| |
Fixes KHR-GL45.pipeline_statistics_query_tests_ARB.functional_tess_queries
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
|
|
|
| |
Make sure we unreference all resources for all shaders on context
destruction.
Fixes: eb5227173f03 (llvmpipe: add support for tessellation shaders)
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4560>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
before:
$ file src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py
src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py: Python script text executable, UTF-8 Unicode (with BOM) text
after:
$ file src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py
src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py: Python script text executable, ASCII text
This patch also fixes this build error.
File "src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py", line 1
# Copyright (C) 2014-2018 Intel Corporation. All Rights Reserved.
^
SyntaxError: invalid character in identifier
Fixes: c6e67f5a9373 ("gallium/swr: add OpenSWR rasterizer")
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Jan Zielinski <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4221>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These macros produced a lot of errors with ubsan preventing us from
expanding the ubsan coverage on CIs.
C++ spec has such clause:
"If the prvalue of type "pointer to cv1 B" points to a B that is
actually a subobject of an object of type D, the resulting pointer
points to the enclosing object of type D. Otherwise, the result
of the cast is undefined."
Ubsan error example:
../src/compiler/glsl/builtin_functions.cpp:4945:4: runtime error: downcast of address 0x559b926abb50 which does not point to an object of type 'ir_instruction'
0x559b926abb50: note: object has invalid vptr
9b 55 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 58 ba 6a 92 9b 55 00 00 01 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
#0 0x559b914dbe1a in call ../src/compiler/glsl/builtin_functions.cpp:4945
Signed-off-by: Danylo Piliaiev <[email protected]>
Acked-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4129>
|
|
|
|
|
|
|
|
|
|
| |
Fixes (with other patches to allow these tests to run):
dEQP-VK.ycbcr.query.size_lod.vertex.*
Suggested-by: Rob Clark <[email protected]>
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4557>
|
|
|
|
|
|
|
|
|
| |
Fixes a "free(): invalid next size (fast)" error in:
dEQP-VK.glsl.texture_functions.query.texturequerylevels.*
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4557>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the alignment is now checked in the validator we must set it.
v2: Use alignement of 4, i.e. dest bit size by eight.
v3: Use alignment 16 (Rhys Perry & Jason Ekstand)
v4: Use nir_intrinsic_set_align to make it clear that align offset is 0
(Jason)
Fixes: e78a7a182524f091e2d77ba97bfbe057c3975cab
nir: Assert memory loads are aligned
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4544>
|