aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
...
* freedreno: a2xx: fix non-zero texture base offsetsJonathan Marek2019-01-211-1/+1
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix VERTEX_REUSE/DEALLOC on a20xJonathan Marek2019-01-213-18/+34
| | | | | | | | | | On a20x, set VGT_VERTEX_REUSE_BLOCK_CNTL to 2 and don't change it. Small rearrangement on a220 to reduce the size of draw commands. Only set DEALLOC_CNTL on a20x because the correct a220 value is not known. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix gmem2mem viewportJonathan Marek2019-01-211-0/+7
| | | | | | | Fixes cases where previous viewport values might case gmem2mem to fail. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: cleanup REG_A2XX_PA_CL_VTE_CNTLJonathan Marek2019-01-212-18/+20
| | | | | | | | | | Doesn't change much, but reduces the size of fd2_emit_state gmem2mem does not need to change the value: no Z clipping on resolve mem2gmem now needs to restore the common value after rendering Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: cleanup init_shader_constJonathan Marek2019-01-213-14/+11
| | | | | | | | | Only 3 vertices are used so we can drop the data for vertex 4 It doesn't make sense to have 1.1 for some coordinates, use 1.0 instead Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: add bit_size parameter to system values with multiple allowed bit sizesKarol Herbst2019-01-211-2/+2
| | | | | | | | | v2: add assert to verify we have at least one valid bit_size v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nv50,nvc0: add missing CAPs for unsupported featuresRhys Kidd2019-01-202-0/+4
| | | | | Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nir: rename nir_var_ssbo to nir_var_mem_ssboKarol Herbst2019-01-191-1/+1
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_ubo to nir_var_mem_uboKarol Herbst2019-01-191-1/+1
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_function to nir_var_function_tempKarol Herbst2019-01-192-3/+3
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* freedreno/a6xx: Turn on texture tiling by defaultKristian H. Kristensen2019-01-187-43/+64
| | | | | | | | The color swap isn't available for tiled formats and it's not needed either. We pick one channel order and use for all non-linear formats. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: Synchronize batch and flush for staging resourceKristian H. Kristensen2019-01-181-1/+15
| | | | | | | | Staging blit downloads would wait on the src resource instead of the staging resource and didn't make sure to submit the blit batch first. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* gm107/ir: disable TEXS for tex with derivAll setKarol Herbst2019-01-182-1/+3
| | | | | | | | | | | | | | | | | | | | | | | fixes deqp tests: dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_float_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isamplercube_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usamplercube_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_float_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isampler3d_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usampler3d_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2dshadow_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_float_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.isampler3d_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.usampler3d_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler2dshadow_vertex Fixes: f821e80213e38e93f96255b3deacb737a600ed40 "gm107/ir: use scalar tex instructions where possible" Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: disable tryCollapseChainedMULs in ConstantFolding for precise ↵Karol Herbst2019-01-181-1/+1
| | | | | | | | | | instructions fixes dEQP-GLES2.functional.shaders.invariance.mediump.loop_3 CC: <[email protected]> Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* v3d: Restructure RO allocations using resource_from_handle.Eric Anholt2019-01-161-29/+38
| | | | | | | | | | | | | | | I had bugs in the old path where I was laying out as tiled (so we'd render tiled) but then only allocating space in the shared object for linear rendering. The resource_from_handle makes it so the same layout choices are made in both the import and export scanout cases. Also, fixes a leak of the fd that was tripping up the CTS. Now that we're checking PIPE_BIND_SHARED to choose to use RO, the DRM_FORMAT_MOD_LINEAR check wasn't needed any more. Fixes visual corruption and MMU faults in X in renderonly mode. Fixes: bd09bb1629a7 ("v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.")
* v3d: If the modifier is not known on BO import, default to linear for RO.Eric Anholt2019-01-161-1/+3
| | | | | | Part of fixing DRI3 rendering with RO on X11. Fixes: e113b21cb779 ("v3d: Add renderonly support.")
* radeonsi/nir: get correct type for images inside structsTimothy Arceri2019-01-171-1/+2
| | | | Reviewed-by: Marek Olšák <[email protected]>
* swr/rast: Store cached files in multiple subdirsAlok Hota2019-01-163-38/+52
| | | | | | This improves cache filesystem performance, especially during CI tests Also updated jitcache magic number due to codegen parameter changes Removed 2 `if constexpr` to prevent C++17 requirement
* swr/rast: New execution engine per JITAlok Hota2019-01-162-42/+65
| | | | Fixes relocation errors with LLVM 7.0.0
* swr/rast: Scope MEM_CLIENT enum for mem usagesAlok Hota2019-01-166-40/+38
| | | | | | | | Avoids confusion with other defaulted integer parameters - fixed some unspecified usages - removed unnecessary includes - removed unecessary protected access specifier in buckets framework
* swr/rast: Unaligned and translations in gathersAlok Hota2019-01-161-21/+35
| | | | | | | - added graphics address translation in odd gathers - added support for unaligned gathers in fetch shader - changed how 2+ GB offsets are handled to make them compatible with unaligned offsets
* swr/rast: partial support for Tiled ResourcesAlok Hota2019-01-162-0/+164
| | | | | | | | | - updated sample from TRTT surfaces correctly - implemented mapped status return for TRTT surfaces - implemented per-sample instruction minLod clamp - updated bilinear filter weight calculation to be closer to D3D specs - implemented "ReducedTexcoordRange" operation from D3D specs to avoid loss of precision on high-value normalized coordinates
* swr/rast: Add annotator to interleave isa textAlok Hota2019-01-162-3/+36
| | | | To make debugging simpler
* swr/rast: Use gfxptr_t value in JitGatherVerticesAlok Hota2019-01-161-18/+16
| | | | | Use gfxptr_t type value for stream pointer uses in gather and similar calls
* gallium/swr: Fix multi-context sync fence deadlock.Bruce Cherniak2019-01-161-1/+3
| | | | | | | | | | | | | | | | Various recreation scenarios lead to API thread getting stuck in swr_fence_finish(). This is a multi-context issue, whereby one context overwrites the fence read-value with a previous sync's lesser value. The fence sync value is supposed to be always increasing. In swr_fence_cb(), only update the "read" value if the new value is greater. (This may seem like we're not waiting on the other context to finish, but had we needed for it to finish there would have been a wait prior to submitting a new sync.) cc: [email protected]
* radeonsi: also apply the GS hang workaround to draws without tessellationMarek Olšák2019-01-141-11/+14
| | | | | | | ported from AMDVLK. Cc: 18.3 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.Eric Anholt2019-01-141-1/+1
| | | | | We don't have a way to talk to RO about modifiers it can do yet, so assume the minimum.
* v3d: Add support for shader_image_load_store.Eric Anholt2019-01-145-2/+196
| | | | | | This is only exposed on V3D 4.1+, because we didn't have the TMU write operations for images on 3.3 (To do GLES 3.1 there, you have to lower it to SSBO load/stores, which is a problem to solve later).
* v3d: Add SSBO/atomic counters support.Eric Anholt2019-01-146-1/+102
| | | | | So far I assume that all the buffers get written. If they weren't, you'd probably be using UBOs instead.
* v3d: Drop the GLSL version level.Eric Anholt2019-01-141-1/+1
| | | | | | This was an arbitrary "we support lots of stuff" value when I started the driver. However, at 400 we expose OES_gpu_shader5, which claims support for dynamically indexing samplers, which the driver doesn't do yet.
* v3d: Add an isr to the simulator to catch GMP violations.Eric Anholt2019-01-143-0/+39
| | | | | | | Otherwise, the simulator raises the GMP interrupt and waits for it to be handled, and v3d ends up spinning in v3d_hw_tick(). Aborting right when violation happens gives us a chance to look at the backtrace of whatever thread triggered the violation.
* v3d: Add support for GL_ARB_framebuffer_no_attachments.Eric Anholt2019-01-143-2/+19
| | | | | | Fixes dEQP-GLES31.functional.state_query.integer.max_framebuffer_height_getboolean when GLES3 is enabled.
* v3d: Add support for flushing dirty TMU data at job end.Eric Anholt2019-01-142-0/+20
| | | | This will be needed for SSBOs and image_load_store.
* freedreno/a6xx: fix 3d+tiled layoutRob Clark2019-01-101-34/+52
| | | | | | | | The last round of fixing 3d layer+level layout skipped the tiled case, since tiled texture support was not in place yet. This finishes the job. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: move tile_mode to sampler-view CSORob Clark2019-01-102-7/+7
| | | | | | | | | | | This is known when the CSO is created, so no need to patch it in later. Also, it seems like smaller textures where the first level is small enough to be linear, it seems like we should set linear tile mode. See: dEQP-GLES3.functional.texture.format.unsized.rgb_unsigned_byte_3d_pot Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: separate stencil restore/resolve fixesRob Clark2019-01-101-14/+21
| | | | | | | | | | | | | Previously we'd use format/etc from the primary (z32) buffer for the stencil (s8), due to confusion about rsc vs psurf. Rework this to drop extra arg and push down handling of separate stencil case (and make sure we take the fmt from the right place). This doesn't completely fix separate-stencil, but at least it avoids the GPU scribbling over random other cmdstream buffers and causing a bunch of bogus fails in dEQP. Signed-off-by: Rob Clark <[email protected]>
* etnaviv: fix typo in cflush_all descriptionGuido Günther2019-01-101-1/+1
| | | | | Signed-off-by: Guido Günther <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* Revert "llvmpipe: Always return some fence in flush (v2)"Roland Scheidegger2019-01-091-2/+0
| | | | | | | | | This reverts commit f6a6da8131383d8eeee07cd59326a70f4b15866b. With this commit we see massive amounts of asserts triggering in lp_fence_wait(), assert(f->issued), for instance with libgl_xlib state tracker and piglit. Not entirely sure if the assert could just be removed.
* radeonsi: Fix use of 1- or 2- component GL_DOUBLE vbo's.Mario Kleiner2019-01-091-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With Mesa 18.1, commit be973ed21f6e, si_llvm_load_input_vs() changed the number of source 32-bit wide dword components used for fetching vertex attributes into the vertex shader from a constant 4 to a variable num_channels number, depending on input data format, with some special case handling for input data formats like 64-Bit doubles. In the case of a GL_DOUBLE input data format with one or two components though, e.g, submitted via ... a) glTexCoordPointer(1, GL_DOUBLE, 0, buffer); b) glTexCoordPointer(2, GL_DOUBLE, 0, buffer); ... the input format would be SI_FIX_FETCH_RG_64_FLOAT, but no special case handling was implemented for that case, so in the default path the number of 32-bit dwords would be set to the number of float input components derived from info->input_usage_mask. This ends with corrupted input to the vertex shader, because fetching a 64-bit double from the vbo requires fetching two 32-bit dwords instead of 1, and fetching a two double input requires 4 dword fetches instead of 2, so in these cases the vertex shader receives incomplete/truncated input data: a) float v = gl_MultiTexCoord0.x; -> v.x is corrupted. b) vec2 v = gl_MultiTexCoord0.xy; -> v.x is assigned correctly, but v.y is corrupted. This happens with the standard TGSI IR compiled shaders. Under NIR with R600_DEBUG=nir, we got correct behavior because the current radeonsi nir code always assigns info->input_usage_mask = TGSI_WRITEMASK_XYZW, thereby always fetches 4 dwords regardless of what the shader actually needs. Fix this by properly assigning 2 or 4 dword fetches for one or two component GL_DOUBLE input. Fixes: be973ed21f6e ("radeonsi: load the right number of components for VS inputs and TBOs") Signed-off-by: Mario Kleiner <[email protected]> Cc: [email protected] Cc: Marek Olšák <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsicsRhys Perry2019-01-091-0/+3
| | | | | | | | | | | | | | Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is enabled with DXVK. It also fixes artifacts with Fallout 4's god rays with DXVK. Various piglit interpolateAt*() tests under NIR are also fixed. v2: formatting fix update commit message to include Fallout 4 and the Fixes tag Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595 Signed-off-by: Rhys Perry <[email protected]>
* llvmpipe: Always return some fence in flush (v2)Tomasz Figa2019-01-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is no last fence, due to no rendering happening yet, just create a new signaled fence and return it, to match the expectations of the EGL sync fence API. Fixes random "Could not create sync fence 0x3003" assertion failures from Skia on Android, coming from the following code: https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427 Reproducible especially with thread count >= 4. One could make the driver always keep the reference to the last fence, but: - the driver seems to explicitly destroy the fence whenever a rendering pass completes and changing that would require a significant functional change to the code. (Specifically, in lp_scene_end_rasterization().) - it still wouldn't solve the problem of an EGL sync fence being created and waited on without any rendering happening at all, which is also likely to happen with Android code pointed to in the commit. Therefore, the simple approach of always creating a fence is taken, similarly to other drivers, such as radeonsi. Tested with piglit llvmpipe suite with no regressions and following tests fixed: egl_khr_fence_sync conformance eglclientwaitsynckhr_flag_sync_flush eglclientwaitsynckhr_nonzero_timeout eglclientwaitsynckhr_zero_timeout eglcreatesynckhr_default_attributes eglgetsyncattribkhr_invalid_attrib eglgetsyncattribkhr_sync_status v2: - remove the useless lp_fence_reference() dance (Nicolai), - explain why creating the dummy fence is the right approach. Signed-off-by: Tomasz Figa <[email protected]>
* v3d: Enable GL_ARB_texture_gather on V3D 4.x.Eric Anholt2019-01-081-0/+5
| | | | | This is part of GLES 3.1, and with the NIR lowering we're now passing the GLES31 testcases.
* freedreno: Move register constant files to src/freedreno.Bas Nieuwenhuizen2019-01-0810-22475/+1
| | | | | | | | This way they can be shared. Build tested with meson, but not too sure on the autotools stuff though. Reviewed-by: Dylan Baker <[email protected]> Acked-by: Rob Clark <[email protected]>
* nir: rename global/local to private/function memoryKarol Herbst2019-01-082-3/+3
| | | | | | | | | | | | | | | | | | the naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all thrads of a work group the solution where everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* virgl: use primconvert provoking vertex properlyDave Airlie2019-01-082-8/+24
| | | | | | | | This stores the raster state and calls the correct primconvert interface using the currently bound raster state. Reviewed-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* spirv: Add support for using derefs for UBO/SSBO accessJason Ekstrand2019-01-081-0/+1
| | | | | | | | | For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir: Distinguish between normal uniforms and UBOsJason Ekstrand2019-01-081-2/+3
| | | | | | | | | | | | | | | | Previously, NIR had a single nir_var_uniform mode used for atomic counters, UBOs, samplers, images, and normal uniforms. This commit splits this into nir_var_uniform and nir_var_ubo where nir_var_uniform is still a bit of a catch-all but the nir_var_ubo is specific to UBOs. While we're at it, we also rename shader_storage to ssbo to follow the convention. We need this so that we can distinguish between normal uniforms and UBO access at the deref level without going all the way back variable and seeing if it has an interface type. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* etnaviv: annotate variables only used in debug buildLucas Stach2019-01-071-7/+4
| | | | | | | | | Some of the status variables in the compiler are only used in asserts and thus may be unused in release builds. Annotate them accordingly to avoid 'unused but set' warnings from the compiler. Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: enable full overwrite in a few more casesLucas Stach2019-01-071-4/+7
| | | | | | | | | Take into account the render target format when checking if the color mask affects all channels of the RT. This allows to enable full overwrite in a few cases where a non-alpha format is used. Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* v3d: Fix up VS output setup during precompiles.Eric Anholt2019-01-041-6/+10
| | | | | | | | | I noticed that a VS I was debugging was missing all of its output stores -- outputs_written was for POS, VAR0, VAR3, while the shader's variables were POS, VAR9, and VAR12. I'm not sure what outputs_written is supposed to be doing here, but we can just walk the declared variables and avoid both this bug and the emission of extra stvpms for less-than-vec4 varyings.