summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* drm-uapi: use local files, not system libdrmEric Engestrom2019-02-1432-36/+35
| | | | | | | | | There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: Move panfrost's isign lowering to nir_opt_algebraic.Eric Anholt2019-02-142-1/+1
| | | | | | | I wanted to reuse this from v3d. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* freedreno/a6xx: Fix point coordKristian H. Kristensen2019-02-133-8/+4
| | | | | | | | | | | Use ir3_next_varying() for iterating through varyings and unset the global point coord invert bit. Fixes: dEQP-GLES3.functional.shaders.builtin_variable.pointcoord Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Front facing needs UNK3 bitKristian H. Kristensen2019-02-131-2/+5
| | | | | | | | | | | We need to set UNK3 in GRAS_CNTL and RB_RENDER_CONTROL0 for the value to be reliably delivered. Fixes: dEQP-GLES3.functional.shaders.builtin_variable.frontfacing Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Update headersKristian H. Kristensen2019-02-132-7/+7
| | | | | | | | This pulls in changes for compute shaders and a6xx ssbo/image support. FACENESS bit moved from position 1 to 2 and there's a global invert bit for point coord. Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Clean up mixed use of swap and swizzle for texture stateKristian H. Kristensen2019-02-132-39/+28
| | | | Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: small compiler warning fixRob Clark2019-02-131-0/+2
| | | | Signed-off-by: Rob Clark <[email protected]>
* radeonsi: Fix guardband computation for large render targetsOscar Blumberg2019-02-121-2/+28
| | | | | | | | | Stop using 12.12 quantization for viewports that are not contained in the lower 4k corner of the render target as the hardware needs to keep both absolute and relative coordinates representable. Signed-off-by: Marek Olšák <[email protected]> Cc: 18.3 19.0 <[email protected]>
* gallium/swr: Param defaults for unhandled PIPE_CAPsAlok Hota2019-02-121-4/+3
| | | | | | | Without using this function, we fail the -Wswitch flag when compiling the default debugoptimized mode in Meson Reviewed-by: Bruce Cherniak <[email protected]>
* radeonsi: use MEM instead of MEM_GRBM in COPY_DATA.DST_SELMarek Olšák2019-02-121-3/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add AMD_DEBUG env var as an alternative to R600_DEBUGMarek Olšák2019-02-121-1/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* freedreno/a6xx: Fall back to masked RGBA blits for depth/stencilKristian H. Kristensen2019-02-111-5/+44
| | | | | | | | | | | | | | | | | | | The blitter doesn't seem to have a write mask, so for depth only and stencil only blits to Z24S8 we cast the Z24S8 buffer to an RGBA UNORM8 buffer and fall back to pipeline blits with corresponding write mask. Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_stencil_only dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_depth dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_depth dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8 dEQP-GLES3.functional.fbo.msaa.4_samples.stencil_index8 Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Add format argument to fd6_tex_swiz()Kristian H. Kristensen2019-02-114-8/+10
| | | | | | | | We need to allow overriding the format with that of the image or sampler view, so we can't take it from the resource in fd6_tex_swiz(). Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Support y-inverted blitsKristian H. Kristensen2019-02-111-5/+2
| | | | | | | | | | | | | | | | | The src coordinates are s24.8. For an inverted blit that ends at y=0 we need to program -1 for sy2, so we need to handle negative values correctly. Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_color dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_color Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Support some depth/stencil blits on blitterKristian H. Kristensen2019-02-111-1/+84
| | | | | | | | | | | | | | | | | | We can rewrite almost all depth stencil blits to various red-only blits. The exception is depth-only or stencil-only blits into z24s8 combined depth stencil buffer. We can fall back for depth-only, but stencil-only remains broken. Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Move blit check so as to restore commentKristian H. Kristensen2019-02-111-4/+4
| | | | | | | | | | | | | | The explanation for the compressed format check is broken across two comments: /* We can blit if both or neither formats are compressed formats... */ /* ... but only if they're the same compression format. */ but the ok_format() checks were inserted between, breaking up the flow of the sentence. Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno: Don't tell the blitter what it can't doKristian H. Kristensen2019-02-111-2/+4
| | | | | | | | Call ctx->blit() and let it reject blits it can't do instead of giving up on stencil blits and blits u_blitter can't do. Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno: Consolidate u_blitter functions in freedreno_blitter.cKristian H. Kristensen2019-02-115-149/+153
| | | | | Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Combine emit_blit and fd6_blitKristian H. Kristensen2019-02-111-12/+5
| | | | | Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Use the right resource for separate stencil strideKristian H. Kristensen2019-02-111-1/+1
| | | | | Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno: Log number of draw for sysmem passesKristian H. Kristensen2019-02-111-2/+3
| | | | | Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Drop render condition check in blitterKristian H. Kristensen2019-02-111-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | We already check earlier in the call chain in fd_blit(). glBlitFramebuffer always sets render_condition_enable and thus we would never try the blitter path for that. Now that we get all of dEQP-GLES3.functional.fbo.blit.conversion.* down this path, it turs out that the fail_if(info->mask != util_format_get_mask(info->src.format)); fail_if(info->mask != util_format_get_mask(info->dst.format)); conditions weren't accurate. util_format_get_mask() returns PIPE_MASK_RGBA for any format with any color channels, while info->mask is the exact set of channels to blit. So we reject things we could blit - for example, PIPE_FORMAT_R16G16_FLOAT where info->mask is RG while util_format_get_mask() returns RGBA - and accept things we can't. It turns out that the blitter is happy to blit different number of channels, but fails to blit formats with different numerical formats and srgb formats. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0Marek Olšák2019-02-111-2/+5
| | | | | Cc: 18.3 19.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0: we have 16k-sized framebuffers, fix default scissorsIlia Mirkin2019-02-101-2/+2
| | | | | | | | | For some reason we don't use view volume clipping by default, and use scissors instead. These scissors were set to an 8k max fb size, while the driver advertises 16k-sized framebuffers. Signed-off-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
* panfrost: Specify supported draw modes per-contextAlyssa Rosenzweig2019-02-112-12/+11
| | | | | | | | | | | | Midgard has native support for QUADS and POLYGONS; Bifrost seemingly does not. Thus, Midgard generally skips prim_convert whereas Bifrost needs the pass; this patch allows the setting of allowed primitives to occur on a per-context basis (for runtime hardware selection). v2: Use (POLYGONS + 1) instead of LINES_ADJACENCY. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Robert Foss <[email protected]>
* panfrost: Elucidate texture op scheduling commentAlyssa Rosenzweig2019-02-101-8/+1
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove speculative if 0'd format bit codeAlyssa Rosenzweig2019-02-101-6/+0
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove if 0'd dead codeAlyssa Rosenzweig2019-02-105-83/+0
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add kernel-agnostic resource managementAlyssa Rosenzweig2019-02-102-15/+172
| | | | | | | | | | Various methods relating to resource management were previously marked as kernel-specific, forcing them to stay downstream in the vendor overlay and eventually be duplicated for DRM code. This patch adds back this code in kernel-neutral space, allowing for code sharing and minimising the diff to downstream. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't hardcode number of nir_ssa_defsAlyssa Rosenzweig2019-02-101-14/+14
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Clean-up one-argument passing quirkAlyssa Rosenzweig2019-02-102-114/+112
| | | | | | | | | | | | | | | | | Most Midgard instructions take two-arguments logically; there are always two arguments at the assembly level. For the few instructions that take only a single argument, generally the second argument slot is unused, with a zero inline constant occupying the space. fmov/imov are the exception, where the first argument is filled with r24 and the logical argument is in the second slot. Previously, these constraints were handled by a delicate, buggy series of hacks. This commit removes these hacks. Instead, we look at the logical number of arguments (from NIR), switching between two argument and one-argument-one-zero style. We then introduce a quirk for the flipped style, which applies to fmov/imov. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* nouveau: Silence unhandled cap warningsKenneth Graunke2019-02-082-0/+2
| | | | | | | | | | Nouveau apparently uses the u_screen helper but prints a warning in the default case, so running any GL program would start grumbling. Fixes: 8fa54bc5490 gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit. Reviewed-by: Karol Herbst <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_CAP_MAX_VARYINGSKarol Herbst2019-02-0716-12/+48
| | | | | | | | | | | | | | | | | Some NVIDIA hardware can accept 128 fragment shader input components, but only have up to 124 varying-interpolated input components. We add a new cap to express this cleanly. For most drivers, this will have the same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader. Fixes KHR-GL45.limits.max_fragment_input_components Signed-off-by: Karol Herbst <[email protected]> [imirkin: rebased, improved docs/commit message] Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: 19.0 <[email protected]>
* panfrost: Include glue for out-of-tree legacy codeAlyssa Rosenzweig2019-02-075-7/+29
| | | | | | | | | | | | | | | | | In addition to the DRM interface in active development, for legacy kernels Panfrost has a small, optional, out-of-tree glue repository. For various reasons, this legacy code should not be included in Mesa proper, but this commit allows it to coexist peacefully with upstream Panfrost. If the nondrm repo is cloned/symlinked to the directory `src/gallium/drivers/panfrost/nondrm`, legacy functionality will be built. Otherwise, the driver will build normally, though a runtime error message will be printed if a legacy kernel is detected. This workaround is icky, but it allows a nearly-upstream Panfrost to work on real hardware, today. Ideally, this patch will be reverted when the Panfrost kernel module is mature and we drop legacy support. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Check in sources for command streamAlyssa Rosenzweig2019-02-0722-5/+5441
| | | | | | | | This patch includes the command stream portion of the driver, complementing the earlier compiler. It provides a base for future work, though it does not integrate with any particular winsys. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use u_pipe_screen_get_param_defaultsAlyssa Rosenzweig2019-02-071-151/+6
| | | | | | | | Switching to the defaults function cleans up pan_screen.h markedly and futureproofs for when new PIPE_CAPs are added. Signed-off-by: Alyssa Rosenzweig <[email protected]> Suggested-by: Eric Anholt <[email protected]>
* nvc0: add compute invocation counterRhys Perry2019-02-068-4/+207
| | | | | | | | | | | | | | | | | The strategy is to keep a CPU-side counter of the direct invocations, and a GPU-side counter of the indirect invocations, and then add them together for queries. The specific technique is a macro which multiplies a list of integers together and accumulates the product into SCRATCH registers held inside of the context. Another macro will read those values out and add them to the passed-in cpu-side counter to be stored in a query buffer the same way that all the other statistics are stored. Original implementation by Rhys Perry, redone by Ilia Mirkin to use the SCRATCH temporaries. Signed-off-by: Ilia Mirkin <[email protected]>
* gm107/ir: add fp64 rsqKarol Herbst2019-02-063-3/+128
| | | | | Acked-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* gm107/ir: add fp64 rcpKarol Herbst2019-02-063-4/+270
| | | | | Acked-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* gk104/ir: Use the new rcp/rsq in libraryKarol Herbst2019-02-063-15/+334
| | | | | | [imirkin: add a few more "long" prefixes to safen things up] Acked-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* gk110/ir: Use the new rcp/rsq in libraryBoyan Ding2019-02-065-0/+42
| | | | | | | | | | v2: (Karol Herbst <[email protected]> * fix Value setup for the builtins Signed-off-by: Boyan Ding <[email protected]> [imirkin: track the fp64 flag when switching ops to calls] Signed-off-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* gk110/ir: Add rsq f64 implementationBoyan Ding2019-02-062-2/+109
| | | | | | Signed-off-by: Boyan Ding <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* gk110/ir: Add rcp f64 implementationBoyan Ding2019-02-062-4/+235
| | | | | | Signed-off-by: Boyan Ding <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* nvc0: stick zero values for the compute invocation countsIlia Mirkin2019-02-061-0/+2
| | | | | | | | | | Not quite perfect, but at least we don't end up with random values in the query buffer. Fixes KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values Signed-off-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* nv50,nvc0: use condition for occlusion queries when already completeIlia Mirkin2019-02-066-28/+25
| | | | | | | | | | | | | | | | | | For the NO_WAIT variants, we would jump into the ALWAYS case for both nested and inverted occlusion queries. However if the query had previously completed, the application could reasonably expect that the render condition would follow that result. To resolve this, we remove the nesting distinction which unnecessarily created an imbalance between the regular and inverted cases (since there's no "zero" condition mode). We also use the proper comparison if we know that the query has completed (which could happen as a result of an earlier get_query_result call). Fixes KHR-GL45.conditional_render_inverted.functional Signed-off-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* nvc0: fix 3d images on keplerIlia Mirkin2019-02-062-35/+34
| | | | | | | | | | | | | Looks like SUBFM.3D and SUEAU are perfectly capable of dealing with 3d tiling, they just need the correct inputs. Supply them. We also have to deal with the case where a 2d "layer" of a 3d image is bound. In this case, we supply the z coordinate separately to the shader, which has to optionally treat every 2d case as if it could be a slice of a 3d texture. Signed-off-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* nvc0/ir: fix second tex argument after levelZero optimizationIlia Mirkin2019-02-062-25/+24
| | | | | | | | | | | | | | | | | | | | We used to pre-set a bunch of extra arguments to a texture instruction in order to force the RA to allocate a register at the boundary of 4. However with the levelZero optimization, which removes a LOD argument when it's uniformly equal to zero, we undid that logic by removing an extra argument. As a result, we could end up with insufficient alignment on the second wide texture argument. Instead we switch to a different method of achieving the same result. The logic runs during the constraint analysis of the RA, and adds unset sources as necessary right before being merged into a wide argument. Fixes MISALIGNED_REG errors in Hitman when run with bindless textures enabled on a GK208. Fixes: 9145873b152 ("nvc0/ir: use levelZero flag when the lod is set to 0") Signed-off-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* nvc0/ir: always use CG mode for loads from atomic-only buffersIlia Mirkin2019-02-061-2/+12
| | | | | | | | | | | | | | | | Atomic operations don't update the local cache, which means that we would have to issue CCTL operations in order to get the updated values. When we know that a buffer is primarily used for atomic operations, it's easier to just avoid the caching at that level entirely. The same issue persists for non-atomic buffers, which will have to be fixed separately. Fixes the failing dEQP-GLES31.functional.atomic_counter.* tests. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]> Cc: 19.0 <[email protected]>
* nvc0: add support for handling indirect draws with attrib conversionIlia Mirkin2019-02-063-1/+82
| | | | | | | | | | | | | | | | The hardware does not natively support FIXED and DOUBLE formats. If those are used in an indirect draw, they have to be converted. Our conversion tries to be clever about only converting the data that's needed. However for indirect, that won't work. Given that DOUBLE or FIXED are highly unlikely to ever be used with indirect draws, read the indirect buffer on the CPU and issue draws directly. Fixes the failing dEQP-GLES31.functional.draw_indirect.random.* tests. Signed-off-by: Ilia Mirkin <[email protected]> Cc: 19.0 <[email protected]>
* freedreno/a6xx: Use tiling for all resourcesKristian H. Kristensen2019-02-061-1/+0
| | | | | | | | We used to restrict this to just PIPE_BIND_SAMPLER_VIEW resources, but most resources benefit from being tiled. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>