Commit message | Author | Age | Files | Lines
* panfrost: Elucidate texture op scheduling comment | Alyssa Rosenzweig | 2019-02-10 | 1 | -8/+1
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove speculative if 0'd format bit code | Alyssa Rosenzweig | 2019-02-10 | 1 | -6/+0
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove if 0'd dead code | Alyssa Rosenzweig | 2019-02-10 | 5 | -83/+0
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add kernel-agnostic resource management | Alyssa Rosenzweig | 2019-02-10 | 2 | -15/+172
    Various methods relating to resource management were previously marked as kernel-specific, forcing them to stay downstream in the vendor overlay and eventually be duplicated for DRM code. This patch adds back this code in kernel-neutral space, allowing for code sharing and minimising the diff to downstream.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't hardcode number of nir_ssa_defs | Alyssa Rosenzweig | 2019-02-10 | 1 | -14/+14
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Clean-up one-argument passing quirk | Alyssa Rosenzweig | 2019-02-10 | 2 | -114/+112
    Most Midgard instructions take two arguments logically; there are always two arguments at the assembly level. For the few instructions that take only a single argument, generally the second argument slot is unused, with a zero inline constant occupying the space. fmov/imov are the exception, where the first argument is filled with r24 and the logical argument is in the second slot.

    Previously, these constraints were handled by a delicate, buggy series of hacks. This commit removes these hacks. Instead, we look at the logical number of arguments (from NIR), switching between the two-argument and one-argument-one-zero styles. We then introduce a quirk for the flipped style, which applies to fmov/imov.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
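The slot selection described in this message could be sketched roughly as follows. This is a minimal illustration, not the driver's code: `struct alu_slots`, `enum slot_kind`, `pack_args` and the `flipped` flag are hypothetical names, and only the r24 placeholder and the zero inline constant come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot encoding, for illustration only. */
enum slot_kind { SLOT_REG, SLOT_ZERO_CONST };

struct alu_slots {
    enum slot_kind kind[2];
    int reg[2]; /* register number when kind == SLOT_REG, else 0 */
};

#define PLACEHOLDER_R24 24 /* r24 fills the first slot of fmov/imov */

/* nr_args: logical argument count taken from NIR (1 or 2).
 * flipped: true for ops such as fmov/imov, whose single logical
 *          argument lives in the second assembly slot. */
static struct alu_slots
pack_args(unsigned nr_args, bool flipped, int arg0, int arg1)
{
    struct alu_slots s;

    if (nr_args == 2) {
        /* Two-argument style: both slots carry real arguments. */
        s.kind[0] = SLOT_REG; s.reg[0] = arg0;
        s.kind[1] = SLOT_REG; s.reg[1] = arg1;
    } else if (flipped) {
        /* Flipped quirk: placeholder first, logical argument second. */
        s.kind[0] = SLOT_REG; s.reg[0] = PLACEHOLDER_R24;
        s.kind[1] = SLOT_REG; s.reg[1] = arg0;
    } else {
        /* One-argument-one-zero style: zero inline constant in slot 1. */
        s.kind[0] = SLOT_REG;        s.reg[0] = arg0;
        s.kind[1] = SLOT_ZERO_CONST; s.reg[1] = 0;
    }
    return s;
}
```

Keying the decision on the logical argument count plus a single per-opcode flag keeps the special case in one place, which appears to be the point of the cleanup.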
* glsl_type: initialize offset and location to -1 for glsl_struct_field | Karol Herbst | 2019-02-09 | 1 | -2/+2
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nouveau: Silence unhandled cap warnings | Kenneth Graunke | 2019-02-08 | 2 | -0/+2
    Nouveau apparently uses the u_screen helper but prints a warning in the default case, so running any GL program would start grumbling.

    Fixes: 8fa54bc5490 gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit.
    Reviewed-by: Karol Herbst <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
* intel/compiler: use 0 as sampler in emit_mcs_fetch | Caio Marcelo de Oliveira Filho | 2019-02-08 | 2 | -2/+2
    The sampler will be ignored since the underlying 'ld_mcs' operation won't use it, so just fill the field with 0 instead of the texture to make it clearer that's the case.

    This will also keep is_high_sampler() from kicking in unnecessarily when we are using the operation for a texture with index >= 16.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* wsi: query the ICD's max dimensions instead of hard-coding them | Eric Engestrom | 2019-02-08 | 6 | -12/+32
    anv and radv both happened to already return 2^14 for these, but querying the ICD is safer and will help if vdreno (or whatever it's called) doesn't have the same max.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Convert a bcsel with only phi node sources to a phi node | Ian Romanick | 2019-02-08 | 1 | -0/+220
    v2: Remove the original ALU instruction after all of its readers are modified to read the new ALU instruction.

    v3: Fix an issue where a bcsel that may not be executed on a loop iteration due to a break statement is converted to a phi (and therefore incorrectly "executed"). Noticed by Tim.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109216
    Fixes: 8fb8ebfbb05 ("intel/compiler: More peephole select")
    Reviewed-by: Timothy Arceri <[email protected]>
* nir: Split ALU instructions in loops that read phis | Ian Romanick | 2019-02-08 | 1 | -0/+294
    A single shader in Unigine Superposition is affected by this change. A single iadd is moved to the end of a loop. This iadd is involved in a complex set of logic to terminate the loop, and an extra mov instruction is inserted. This shader really needs the optimization suggested by bugzilla #94747, and I expect that to make this tiny regression go away.

    All Gen7+ platforms had similar results. (Skylake shown)
    total instructions in shared programs: 15047543 -> 15047545 (<.01%)
    instructions in affected programs: 565 -> 567 (0.35%)
    helped: 0
    HURT: 2

    total cycles in shared programs: 369977253 -> 369978253 (<.01%)
    cycles in affected programs: 127910 -> 128910 (0.78%)
    helped: 0
    HURT: 2

    v2: Skip nir_op_vec{2,3,4} and nir_op_[fi]mov instructions to avoid infinite optimization loops. Remove the original ALU instruction after all of its readers are modified to read the new ALU instruction.

    v3: Extend to the more general case. If the prev-block value from the phi is not undef, the ALU instruction has to be duplicated in both the prev-block and the continue-block.

    Fixes: 8fb8ebfbb05 ("intel/compiler: More peephole select")
    Reviewed-by: Timothy Arceri <[email protected]>
* nir: Select phi nodes using prev_block instead of continue_block | Ian Romanick | 2019-02-08 | 1 | -11/+10
    This simplifies some changes coming later.

    Fixes: 8fb8ebfbb05 ("intel/compiler: More peephole select")
    Reviewed-by: Timothy Arceri <[email protected]>
* nir: Refactor code that checks phi nodes in opt_peel_loop_initial_if | Ian Romanick | 2019-02-08 | 1 | -16/+36
    This will be used in a couple more places soon.

    The function name is... horribly long. Neither Matt nor I could think of anything that was shorter and still more descriptive than "is_phi_foo". I'm willing to entertain suggestions.

    Fixes: 8fb8ebfbb05 ("intel/compiler: More peephole select")
    Reviewed-by: Timothy Arceri <[email protected]>
* nir: Document some fields of nir_loop_terminator | Ian Romanick | 2019-02-08 | 1 | -0/+5
    Reviewed-by: Timothy Arceri <[email protected]>
* intel/compiler: Silence warning about value that may be used uninitialized | Ian Romanick | 2019-02-08 | 1 | -1/+1
    For some reason, this warning only occurs for me in release builds.

    In file included from src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:25:0:
    src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c: In function ‘brw_nir_lower_mem_access_bit_sizes’:
    src/compiler/nir/nir_builder.h:501:26: warning: ‘src_swiz[2]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
           alu_src.swizzle[i] = swiz[i];
           ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
    src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c:225:16: note: ‘src_swiz[2]’ was declared here
           unsigned src_swiz[4];
                    ^~~~~~~~

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* nir: Silence zillions of unused parameter warnings in release builds | Ian Romanick | 2019-02-08 | 1 | -1/+1
    Fixes: cd56d79b59f "nir: check NIR_SKIP to skip passes by name"
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* gitlab-ci: workaround docker bug for users with uppercase characters | Eric Engestrom | 2019-02-08 | 1 | -1/+1
    CI_REGISTRY_IMAGE == lower($CI_REGISTRY/$CI_PROJECT_PATH)

    Suggested-by: Daniel Stone <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
* i965: consider a 'base level' when calculating width0, height0, depth0 | Andrii Simiklit | 2019-02-07 | 1 | -1/+25
    I guess that when calculating the width0, height0, depth0 to pass to the 'intel_miptree_create' function, we need to consider the 'base level', as is done in the 'intel_miptree_create_for_teximage' function.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107987
    Signed-off-by: Andrii Simiklit <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
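As a rough illustration of what "considering the base level" means arithmetically: each mip level halves a dimension, so the level-0 size can be recovered from the size at the base level by shifting left. The helper names below are hypothetical and the sketch is not taken from the i965 patch; it also assumes the level-0 size is an exact multiple of 2^base_level and ignores dimensions that are legitimately 1 (for example the height of a 1D texture).

```c
#include <assert.h>

/* Forward direction: size of one dimension at a given mip level.
 * Each level halves the size, rounding down, with a minimum of 1. */
static unsigned
minify_dim(unsigned size0, unsigned level)
{
    unsigned s = size0 >> level;
    return s ? s : 1;
}

/* Hypothetical inverse: recover the level-0 size of one dimension
 * from its size at base_level by undoing base_level halvings. */
static unsigned
level0_dim(unsigned size_at_base, unsigned base_level)
{
    return size_at_base << base_level;
}
```

For example, a 64-texel-wide image at base level 2 would imply a width0 of 256 under these assumptions.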
* nir: rewrite varying component packing | Timothy Arceri | 2019-02-08 | 1 | -102/+254
    There are a number of reasons for the rewrite.

    1. Adding support for packing tess patch varyings in a sane way.
    2. Making use of qsort allowing the code to be much easier to follow.
    3. Fixes a bug where different interp types caused component packing to be skipped for all varyings in some scenarios.
    4. Allows us to add a crude live range analysis for deciding which components should be packed together. This support can optionally be added in a future patch.

    Reviewed-by: Jason Ekstrand <[email protected]>
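Point 2 can be illustrated with a small qsort sketch. The record layout and comparator below are assumptions made for illustration (the real pass tracks more state and may order by different keys); the idea shown is only that sorting component records groups those with the same interpolation type next to each other, so a later packing loop can walk them linearly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-component record; field names are illustrative. */
struct comp_info {
    int interp_type; /* interpolation qualifier, as a small integer */
    int location;    /* varying slot */
    int component;   /* 0..3 within the slot */
};

/* Order by interpolation type first, then slot, then component. */
static int
cmp_comp_info(const void *a, const void *b)
{
    const struct comp_info *ca = (const struct comp_info *)a;
    const struct comp_info *cb = (const struct comp_info *)b;

    if (ca->interp_type != cb->interp_type)
        return ca->interp_type - cb->interp_type;
    if (ca->location != cb->location)
        return ca->location - cb->location;
    return ca->component - cb->component;
}

static void
sort_components(struct comp_info *comps, size_t count)
{
    qsort(comps, count, sizeof(*comps), cmp_comp_info);
}
```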
* nir: add is_packing_supported_for_type() helper | Timothy Arceri | 2019-02-08 | 1 | -15/+13
    This will be used in the following patches to determine if we support packing the components of a varying.

    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add glsl_type_is_32bit() helper | Timothy Arceri | 2019-02-08 | 3 | -0/+17
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add support for marking used patches when packing varyings | Timothy Arceri | 2019-02-08 | 1 | -23/+51
    This adds the support needed for marking the varyings as used, but we don't actually support packing patches in this patch.

    Reviewed-by: Jason Ekstrand <[email protected]>
* st/glsl_to_nir: call nir_remove_dead_variables() after lowing local indirects | Timothy Arceri | 2019-02-08 | 1 | -0/+7
    Reviewed-by: Jason Ekstrand <[email protected]>
* util: move BITFIELD macros to util/macros.h | Timothy Arceri | 2019-02-08 | 2 | -24/+18
    Reviewed-by: Jason Ekstrand <[email protected]>
* st/mesa: require RGBA2, RGB4, and RGBA4 to be renderable | Karol Herbst | 2019-02-07 | 1 | -0/+2
    If the driver does not support rendering to these formats but does support texturing, we can end up with incompatibilities between textures and renderbuffers that are then copied to.

    Fixes KHR-GL45.copy_image.functional on nvc0

    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Cc: 19.0 <[email protected]>
* gallium: add PIPE_CAP_MAX_VARYINGS | Karol Herbst | 2019-02-07 | 19 | -16/+54
    Some NVIDIA hardware can accept 128 fragment shader input components, but only has up to 124 varying-interpolated input components. We add a new cap to express this cleanly. For most drivers, this will have the same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader.

    Fixes KHR-GL45.limits.max_fragment_input_components

    Signed-off-by: Karol Herbst <[email protected]>
    [imirkin: rebased, improved docs/commit message]
    Signed-off-by: Ilia Mirkin <[email protected]>
    Acked-by: Rob Clark <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Cc: 19.0 <[email protected]>
* kmsro: Silence warning if missing | Alyssa Rosenzweig | 2019-02-08 | 1 | -1/+0
    Regardless of whether the build uses kmsro, kmsro is the default driver descriptor when the static loader is used. Thus, in an edge case where the static loader is used, no static targets are loaded, and kmsro is not compiled, a spurious warning is printed. There's no harm in executing the stub function in this case, but it's not "an error" to not have kmsro in the build; the driver-missing warning should not be printed for kmsro.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Karol Herbst <[email protected]>
* radv: assert that colorAttachment is valid for CmdClearAttachment | Lionel Landwerlin | 2019-02-08 | 1 | -3/+1
    This partially reverts a change from b7a93cbdede05a ("radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment") which fixed actual issues but also started to accept invalid values for the colorAttachment field. This change asserts that the field is valid for the current pass.

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: b7a93cbdede05a ("radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment")
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: assert that color attachment are valid | Lionel Landwerlin | 2019-02-08 | 1 | -4/+1
    This reverts commit d76e7779884775bcebf235adb0e8367816b9b95d.

    Let's make it obvious that there is an application issue if it tries to access an attachment that doesn't exist in the current pass.

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: d76e7779884775 ("anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment")
    Reviewed-by: Jason Ekstrand <[email protected]>
* docs: update qbo support for virgl | Dave Airlie | 2019-02-08 | 1 | -1/+1
    Signed-off-by: Dave Airlie <[email protected]>
* travis: fix osx make build | Eric Engestrom | 2019-02-07 | 1 | -0/+4
    This variable was removed in commit 087af992a276e7478c9c "travis: remove unused linux code path" because it looked like it was only used by the Linux build. Turns out I was wrong, so let's restore it.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* README: Drop the badges from the readme | Jason Ekstrand | 2019-02-07 | 1 | -19/+0
    They have been added as badges directly to the GitLab project.

    Reviewed-by: Eric Engestrom <[email protected]>
* driconf: drop unused macro | Eric Engestrom | 2019-02-07 | 1 | -4/+0
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* meson: add script to print the options before configuring a builddir | Eric Engestrom | 2019-02-07 | 2 | -1/+66
    Signed-off-by: Eric Engestrom <[email protected]>
* panfrost: Include glue for out-of-tree legacy code | Alyssa Rosenzweig | 2019-02-07 | 5 | -7/+29
    In addition to the DRM interface in active development, for legacy kernels Panfrost has a small, optional, out-of-tree glue repository. For various reasons, this legacy code should not be included in Mesa proper, but this commit allows it to coexist peacefully with upstream Panfrost.

    If the nondrm repo is cloned/symlinked to the directory `src/gallium/drivers/panfrost/nondrm`, legacy functionality will be built. Otherwise, the driver will build normally, though a runtime error message will be printed if a legacy kernel is detected.

    This workaround is icky, but it allows a nearly-upstream Panfrost to work on real hardware, today. Ideally, this patch will be reverted when the Panfrost kernel module is mature and we drop legacy support.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Check in sources for command stream | Alyssa Rosenzweig | 2019-02-07 | 22 | -5/+5441
    This patch includes the command stream portion of the driver, complementing the earlier compiler. It provides a base for future work, though it does not integrate with any particular winsys.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use u_pipe_screen_get_param_defaults | Alyssa Rosenzweig | 2019-02-07 | 1 | -151/+6
    Switching to the defaults function cleans up pan_screen.h markedly and futureproofs for when new PIPE_CAPs are added.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Suggested-by: Eric Anholt <[email protected]>
* kmsro: Move DRM entrypoints to shared block | Alyssa Rosenzweig | 2019-02-07 | 1 | -10/+8
    As kmsro allows an essentially mix-and-match hodgepodge of display drivers and renderonly GPUs, it doesn't make sense to couple the display driver entrypoint definition with the driver. Instead, we move *all* kmsro entrypoints to a shared kmsro block at the end (avoiding clutter and distraction since this list may snowball in the future).

    v2: Alphabetize driver list.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* nvc0: add compute invocation counter | Rhys Perry | 2019-02-06 | 8 | -4/+207
    The strategy is to keep a CPU-side counter of the direct invocations, and a GPU-side counter of the indirect invocations, and then add them together for queries.

    The specific technique is a macro which multiplies a list of integers together and accumulates the product into SCRATCH registers held inside of the context. Another macro will read those values out and add them to the passed-in cpu-side counter to be stored in a query buffer the same way that all the other statistics are stored.

    Original implementation by Rhys Perry, redone by Ilia Mirkin to use the SCRATCH temporaries.

    Signed-off-by: Ilia Mirkin <[email protected]>
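The counting strategy can be modelled on the CPU as a minimal sketch. Everything here is a stand-in: `struct invocation_counters` and the function names are hypothetical, a plain 64-bit field plays the role of the SCRATCH registers, and the real GPU-side work is done by command macros rather than C code.

```c
#include <assert.h>
#include <stdint.h>

struct invocation_counters {
    uint64_t cpu_direct;   /* bumped on the CPU for direct dispatches */
    uint64_t gpu_indirect; /* stands in for the SCRATCH accumulator */
};

/* Multiply a list of integers together, as the accumulate macro is
 * described to do with the dispatch dimensions. */
static uint64_t
product(const uint32_t *values, unsigned count)
{
    uint64_t p = 1;
    for (unsigned i = 0; i < count; i++)
        p *= values[i];
    return p;
}

static void
count_direct(struct invocation_counters *c, const uint32_t *dims, unsigned n)
{
    c->cpu_direct += product(dims, n);
}

static void
count_indirect(struct invocation_counters *c, const uint32_t *dims, unsigned n)
{
    c->gpu_indirect += product(dims, n);
}

/* Query time: the two counters are summed into one result. */
static uint64_t
query_invocations(const struct invocation_counters *c)
{
    return c->cpu_direct + c->gpu_indirect;
}
```

Splitting the count this way avoids a CPU readback of indirect dispatch parameters, which the CPU never sees, while keeping direct dispatches cheap to account for.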
* gm107/ir: add fp64 rsq | Karol Herbst | 2019-02-06 | 3 | -3/+128
    Acked-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* gm107/ir: add fp64 rcp | Karol Herbst | 2019-02-06 | 3 | -4/+270
    Acked-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* gk104/ir: Use the new rcp/rsq in library | Karol Herbst | 2019-02-06 | 3 | -15/+334
    [imirkin: add a few more "long" prefixes to safen things up]
    Acked-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* gk110/ir: Use the new rcp/rsq in library | Boyan Ding | 2019-02-06 | 5 | -0/+42
    v2: (Karol Herbst <[email protected]>)
     * fix Value setup for the builtins

    Signed-off-by: Boyan Ding <[email protected]>
    [imirkin: track the fp64 flag when switching ops to calls]
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* gk110/ir: Add rsq f64 implementation | Boyan Ding | 2019-02-06 | 2 | -2/+109
    Signed-off-by: Boyan Ding <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* gk110/ir: Add rcp f64 implementation | Boyan Ding | 2019-02-06 | 2 | -4/+235
    Signed-off-by: Boyan Ding <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* nvc0: stick zero values for the compute invocation counts | Ilia Mirkin | 2019-02-06 | 1 | -0/+2
    Not quite perfect, but at least we don't end up with random values in the query buffer.

    Fixes KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* nv50,nvc0: use condition for occlusion queries when already complete | Ilia Mirkin | 2019-02-06 | 6 | -28/+25
    For the NO_WAIT variants, we would jump into the ALWAYS case for both nested and inverted occlusion queries. However if the query had previously completed, the application could reasonably expect that the render condition would follow that result.

    To resolve this, we remove the nesting distinction which unnecessarily created an imbalance between the regular and inverted cases (since there's no "zero" condition mode). We also use the proper comparison if we know that the query has completed (which could happen as a result of an earlier get_query_result call).

    Fixes KHR-GL45.conditional_render_inverted.functional

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* nvc0: fix 3d images on kepler | Ilia Mirkin | 2019-02-06 | 2 | -35/+34
    Looks like SUBFM.3D and SUEAU are perfectly capable of dealing with 3d tiling, they just need the correct inputs. Supply them.

    We also have to deal with the case where a 2d "layer" of a 3d image is bound. In this case, we supply the z coordinate separately to the shader, which has to optionally treat every 2d case as if it could be a slice of a 3d texture.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
* nvc0/ir: fix second tex argument after levelZero optimization | Ilia Mirkin | 2019-02-06 | 2 | -25/+24
    We used to pre-set a bunch of extra arguments to a texture instruction in order to force the RA to allocate a register at the boundary of 4. However with the levelZero optimization, which removes a LOD argument when it's uniformly equal to zero, we undid that logic by removing an extra argument. As a result, we could end up with insufficient alignment on the second wide texture argument.

    Instead we switch to a different method of achieving the same result. The logic runs during the constraint analysis of the RA, and adds unset sources as necessary right before being merged into a wide argument.

    Fixes MISALIGNED_REG errors in Hitman when run with bindless textures enabled on a GK208.

    Fixes: 9145873b152 ("nvc0/ir: use levelZero flag when the lod is set to 0")
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 19.0 <[email protected]>
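The padding arithmetic implied by "adds unset sources as necessary" might look like the sketch below. It is an assumption for illustration only: `padding_sources` is a hypothetical helper, and the actual constraint pass works on IR sources rather than plain counts. It shows how many placeholder sources would bring a group up to the next multiple of 4, so that whatever follows starts on a 4-register boundary.

```c
#include <assert.h>

/* Number of placeholder (unset) sources needed to round a source
 * count up to the next multiple of 4. Returns 0 when the count is
 * already aligned. */
static unsigned
padding_sources(unsigned nr_sources)
{
    return (4 - (nr_sources % 4)) % 4;
}
```

Computing the padding at constraint-analysis time, after optimizations such as levelZero have already removed arguments, is what keeps the alignment from being undone later.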