path: root/src/gallium
Commit message | Author | Age | Files | Lines
* freedreno/a6xx: fix issues with gallium HUD | Rob Clark | 2019-06-07 | 1 | -5/+8
    In some cases the draw for the text wasn't working. This seems to be fixed by resyncing some of the "golden registers" from blob (initial values were based on a somewhat older blob version). Perhaps good to have a bit of soak time on master, but would be good to eventually land in 19.x stable branches. Cc: [email protected] Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* iris: Rename bind_state to bind_shader_state. | Kenneth Graunke | 2019-06-07 | 1 | -9/+9
    bind_state is possibly the worst name ever. For create, we used create_shader_state, which is more descriptive. Put shader in the name.
* panfrost/ci: Texture wrap tests are legitimately fixed | Alyssa Rosenzweig | 2019-06-07 | 1 | -58/+0
    These depended on the wallpaper reload. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower inot to inor with 0 | Alyssa Rosenzweig | 2019-06-07 | 1 | -1/+2
    We were previously lowering to inand, but the second arg was not duplicated so inot would always return ~0. Oops. Signed-off-by: Alyssa Rosenzweig <[email protected]>
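The identity behind this fix can be sketched in C. This is an illustrative model only: `inor`, `inand`, and the `lower_inot*` helpers are hypothetical names standing in for two-source bitwise ops, not code from the Midgard compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Two-source bitwise ops, modelled on 32-bit values (hypothetical helpers). */
static uint32_t inor(uint32_t a, uint32_t b)  { return (uint32_t)~(a | b); }
static uint32_t inand(uint32_t a, uint32_t b) { return (uint32_t)~(a & b); }

/* Lowering described by the commit: ~a == nor(a, 0). */
static uint32_t lower_inot(uint32_t a) { return inor(a, 0); }

/* Model of the old bug: nand needs the source duplicated (nand(a, a) == ~a).
 * With the second source left at 0, nand(a, 0) == ~0 for every input. */
static uint32_t lower_inot_buggy(uint32_t a) { return inand(a, 0); }

/* Checks both observations above for one input value. */
static void check_inot(uint32_t a)
{
    assert(lower_inot(a) == (uint32_t)~a);
    assert(lower_inot_buggy(a) == UINT32_MAX);
}
```

Calling `check_inot()` with any value exercises both the correct lowering and the always-`~0` failure mode.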
* panfrost/midgard: Cleanup tag fetch in disassembler | Alyssa Rosenzweig | 2019-06-07 | 1 | -2/+3
    Trivial. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Use fancy iterator | Alyssa Rosenzweig | 2019-06-07 | 1 | -1/+1
    Trivial cleanup. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Cull dead branches | Alyssa Rosenzweig | 2019-06-07 | 2 | -2/+31
    This fixes bugs with complex control flow. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add mir_print_bundle helper | Alyssa Rosenzweig | 2019-06-07 | 2 | -0/+14
    This helps with debugging scheduling/emission. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard/disasm: Pretty-print branch tags | Alyssa Rosenzweig | 2019-06-07 | 1 | -7/+34
    Just makes it a little more obvious what's going on. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/ci: Note some since-fixed tests | Alyssa Rosenzweig | 2019-06-07 | 1 | -26/+0
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Vectorize I/O | Alyssa Rosenzweig | 2019-06-07 | 3 | -7/+18
    This uses the new mesa/st functionality for NIR I/O vectorization, which eliminates a number of corner cases (resulting in assorted dEQP failures and regressions) and should improve performance substantially due to lessened pressure on the load/store pipe. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Remove varyings delay pass | Alyssa Rosenzweig | 2019-06-07 | 2 | -75/+9
    This pass interfered with the more delicate path required for non-vectorized I/O. It's also ugly and duplicating the job of an actual honest-to-goodness scheduler. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Apply component to load_input | Alyssa Rosenzweig | 2019-06-07 | 1 | -0/+4
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* freedreno/a6xx: Drop struct stage array | Kristian H. Kristensen | 2019-06-07 | 1 | -144/+80
    This now boils down to just picking between binning or vertex shader and dummy_fs or real fs, which we can do in a couple of lines of code instead. The constlen logic isn't doing what it thinks it's doing: both constlens at this point, in MAX2(s[VS].constlen, align(state->bs->constlen, 4)), are binning shader constlens. We'll have to revisit the constlen logic, but this commit doesn't change how it works. Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: Drop support for SS6_DIRECT shader upload | Kristian H. Kristensen | 2019-06-07 | 1 | -30/+3
    a6xx only supports indirect shaders. Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: Share shader_t_to_opcode | Kristian H. Kristensen | 2019-06-07 | 3 | -35/+21
    We have a similar function in fd6_program.c. Move to fd6_emit.h and share. Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: Consolidate more of dword 0 building in fd6_draw_vbo | Kristian H. Kristensen | 2019-06-07 | 1 | -31/+24
    There's already a bit of duplicated logic here and tessellation will add more. Build up dword 0 in fd6_draw_vbo() and drop the a4xx in the process. Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno: Move fd4_size2indextype() helper to freedreno_util.h | Kristian H. Kristensen | 2019-06-07 | 2 | -13/+13
    In preparation for refactoring fd6_draw.c a bit. Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* iris: Sweep the NIR in iris_create_uncompiled_shader(). | Kenneth Graunke | 2019-06-07 | 1 | -0/+2
    We run a ton of backend specific passes here (mostly brw_preprocess_nir) and ought to sweep up any unused memory at this point, since we're going to hang on to this NIR for as long as the linked program lives.
* v3d: don't emit point coordinates varyings if the FS doesn't read them | Iago Toral Quiroga | 2019-06-07 | 1 | -0/+5
    We still need to emit them in V3D 3.x since there is no mechanism to disable them there. Reviewed-by: Eric Anholt <[email protected]>
* tests/graw: use C99 print conversion specifier for 32 bit builds | Mark Janes | 2019-06-06 | 1 | -1/+2
    Fixes formatting errors for 32 bit compilations, e.g.: error: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Werror,-Wformat] printf("result1 = %lu result2 = %lu\n", res1.u64, res2.u64); Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
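A minimal sketch of the portable form, assuming the fix uses the C99 `PRIu64` macro from `<inttypes.h>` (the commit subject says "C99 print conversion specifier" but does not quote the new line). `format_results` is a hypothetical helper that writes to a buffer so the result can be inspected.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format two 64-bit results. PRIu64 expands to the conversion specifier
 * that matches uint64_t on the target, so the same source compiles cleanly
 * on 32-bit and 64-bit builds. */
static int format_results(char *buf, size_t len, uint64_t res1, uint64_t res2)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "result1 = %" PRIu64 " result2 = %" PRIu64 "\n",
                    res1, res2);
}
```

With `%lu`, the argument type only matches where `unsigned long` is 64 bits wide, which is why the warning appears on 32-bit builds alone.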
* panfrost/midgard: Fix crash with unused SSA values | Alyssa Rosenzweig | 2019-06-06 | 1 | -0/+4
    Crash introduced in "b38dab101ca7e0896255dccbd85fd510c47d84d1" but not adding a Fixes tag since it's our bug anyway. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Report sRGB colorspace as not supported | Boris Brezillon | 2019-06-06 | 1 | -0/+4
    The driver does not support sRGB yet, so let's report it as unsupported. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* radeonsi: Don't force dcc disable for loads | Connor Abbott | 2019-06-06 | 2 | -13/+0
    When e9d935ed0e2 added force_dcc_off(), we forced it off for any preloaded image descriptor which had stores associated with it, since the same preloaded descriptors were used for loads and stores. However, when the preloading was removed in 16be87c9042, the existing logic was kept despite it not being necessary anymore. The comment above force_dcc_off() only mentions stores, so only force DCC off for stores. Cc: Nicolai Hähnle <[email protected]> Cc: Marek Olšák <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* virgl: Enable CAP_CLIP_HALFZ if host supports it | Gert Wollny | 2019-06-06 | 2 | -1/+3
    On hosts that support it, this makes the following piglits pass: arb_clip_control-* v2: sync flag with host Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Chia-I Wu <[email protected]> (v1) Reviewed-by: Emil Velikov <[email protected]>
* svga: Remove unnecessary check for the pre flush bit for setting vertex buffers | Charmaine Lee | 2019-06-06 | 1 | -4/+4
    This fixes the missing rebind when the can_pre_flush bit is not set and the vertex buffers are the same as what has been sent. Cc: [email protected] Reviewed-by: Neha Bhende <[email protected]> Signed-off-by: Charmaine Lee <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
* winsys/svga/drm: Fix 32-bit RPCI send message | Deepak Rawat | 2019-06-06 | 1 | -12/+23
    Depending on whether compiled with frame-pointer or not, the temporary memory location used for the bp parameter in these macros is referenced relative to the stack pointer or the frame pointer. Hence we can never reference that parameter when we've modified either the stack pointer or the frame pointer, because then the compiler would generate an incorrect stack reference. Fix this by pushing the temporary memory parameter on a known location on the stack before modifying the stack and frame pointers. Also, in case of failure the RPCI channel is not closed, which leads to vmx running out of channels. Cc: [email protected] Signed-off-by: Deepak Rawat <[email protected]> Reviewed-by: Sinclair Yeh <[email protected]> Reviewed-by: Thomas Hellstrom <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
* lima/ppir: add missing handling of min/max ops for vec4 add slot | Vasily Khoruzhick | 2019-06-06 | 1 | -0/+6
    Signed-off-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: fix crash when program uses no registers at all | Vasily Khoruzhick | 2019-06-06 | 1 | -0/+4
    A program may need no regalloc at all, e.g. when it consists of a single discard op. Signed-off-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* radeonsi: Enable NIR's lower_fmod option. | Kenneth Graunke | 2019-06-05 | 1 | -0/+1
    Currently, st/mesa is always calling the GLSL IR lower_instructions() pass with MOD_TO_FLOOR set, so mod operations will be lowered before ever reaching NIR. This enables the same lowering at the NIR level, which will let me shut off the GLSL IR path for NIR-based drivers. The AMD NIR backend also has code to handle fmod, so we could potentially skip this and still be fine. I don't have an opinion on that. Reviewed-by: Marek Olšák <[email protected]>
* vc4: Enable NIR's lower_fmod option. | Kenneth Graunke | 2019-06-05 | 1 | -0/+1
    Currently, st/mesa is always calling the GLSL IR lower_instructions() pass with MOD_TO_FLOOR set, so mod operations will be lowered before ever reaching NIR. This enables the same lowering at the NIR level, which will let me shut off the GLSL IR path for NIR-based drivers. Reviewed-by: Marek Olšák <[email protected]> Acked-by: Eric Anholt <[email protected]>
* nir: Combine lower_fmod16/32 back into a single lower_fmod. | Kenneth Graunke | 2019-06-05 | 3 | -3/+3
    We originally had a single lower_fmod option. In commit 2ab2d2e5, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit ca31df6f. Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which needs lowering for one bitsize and not the other. Reviewed-by: Marek Olšák <[email protected]>
* panfrost: Switch to nir_lower_doubles instead of lower_fmod64. | Kenneth Graunke | 2019-06-05 | 1 | -1/+2
    I don't think panfrost actually does doubles yet, but it at least claims to support PIPE_CAP_DOUBLES, so at least pretend to switch to the new lowering. Reviewed-by: Marek Olšák <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]>
* nouveau: Use nir_lower_doubles instead of lower_fmod64 on nvc0. | Kenneth Graunke | 2019-06-05 | 1 | -2/+1
    We currently have two duplicate mechanisms for lowering fmod@64. One is a nir_opt_algebraic rule keyed off of options->lower_fmod64, and the other is nir_lower_doubles, which offers a full gamut of fp64 lowering. The latter works slightly better in some corner cases, so I'm trying to eliminate lower_fmod64 and drop the redundancy. Reviewed-by: Marek Olšák <[email protected]>
* gallium: Drop lower_fmod64 from drivers that don't support doubles. | Kenneth Graunke | 2019-06-05 | 1 | -1/+0
    Neither freedreno nor nv50 expose PIPE_CAP_DOUBLES, so there's no fmod64 to be lowered. Reviewed-by: Marek Olšák <[email protected]>
* panfrost/midgard: Verify SSA claims when pipelining | Alyssa Rosenzweig | 2019-06-05 | 3 | -0/+24
    The pipeline register creation algorithm is only valid for SSA indices; NIR registers and such cannot be pipelined without more complex analysis. However, there is the occasional class of "liars" -- indices that claim to be SSA but are not. This occurs in the blend shader prologue, for example. Detect this and just bail quietly for now. Eventually we need to rewrite the blend shader prologue to occur in NIR anyway (which would mitigate the issue), but that's more involved and depends on a better understanding of pixel formats in blend shaders (for non-RGBA8888/UNORM cases). Fixes some blend shader regressions. Signed-off-by: Alyssa Rosenzweig <[email protected]>
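One way such a check could look is sketched below. It relies only on the defining property of SSA, that each index is written at most once; the `struct ins` type and `is_single_assignment` function are hypothetical and much simpler than the real Midgard IR, and whether the driver performs the check this way is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction: only the destination index matters. */
struct ins {
    unsigned dest;
};

/* An index is genuinely SSA only if the program writes it at most once. */
static bool is_single_assignment(const struct ins *prog, size_t count,
                                 unsigned index)
{
    assert(prog != NULL || count == 0);
    size_t writes = 0;
    for (size_t i = 0; i < count; i++) {
        if (prog[i].dest == index && ++writes > 1)
            return false; /* a "liar": claims SSA but is written twice */
    }
    return true;
}
```

A pipelining pass in this style would skip any index for which the function returns false instead of asserting.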
* panfrost/midgard: Don't assign var locations ourselves | Alyssa Rosenzweig | 2019-06-05 | 2 | -434/+0
    This piece of code was cargo-culted from the ir3 standalone compiler and made sense when we were a standalone compiler ourselves. Unfortunately, for the online compiler, mesa/st *already handles this for us* and if we duplicate it here, we're duplicating it *incorrectly*. So just delete these lines and fix a heck of a lot of tests. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Reload framebuffer contents if there's no clear | Tomeu Vizoso | 2019-06-05 | 4 | -311/+88
    If by flush time the client hasn't submitted a clear, add jobs for reloading the framebuffer contents as the first draw in the frame. This is required by programs such as Weston which don't do clears and rely on the previous contents of the framebuffer being there. Reloading the whole framebuffer on every frame without regard to what is needed or what is going to be covered is very inefficient, but future work will introduce support for damage regions and partial updates so we know what needs to be actually reloaded. Fixes quite a few tests in dEQP-EGL.functional.buffer_age.*. [Alyssa: The context is that tilers do an implicit glClear() on every frame, whether you asked them to or not. If you want a clear, this is very efficient. But if you don't, you have to explicitly blit the backbuffer back into tile memory, accomplished by a dummy texturing draw. This patch generates that draw via u_blitter, although we could do a bit better ourselves by eliding the vertex job. This fixes "black rectangles in Weston/sway" as well as "video not displaying when UI visible in mpv"] Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't flip scanout | Alyssa Rosenzweig | 2019-06-05 | 9 | -82/+47
    The mesa/st flips the viewport, so we respect that rather than trying to flip the framebuffer itself and ignoring the viewport and using a messy heuristic. However, this brings an underlying disagreement about the interpretation of winding order to light. The blob uses a different strategy than Mesa for handling viewport Y flipping, so the meanings of the winding order bit are flipped for it. To keep things clean on our end, we rename to explicitly use Gallium (rather than flipped OpenGL) conventions. Fixes upside-down Xwayland/egl windows. v2: Adjust lowering configuration to correctly flip gl_PointCoord.y and gl_FragCoord.y. v1 was R-b'd by Tomeu, but then retracted due to these regressions which are now fixed. Suggested-by: Rob Clark <[email protected]> Signed-off-by: Alyssa Rosenzweig <[email protected]> Sort-of-reviewed-by: Tomeu Vizoso <[email protected]>
* st/nine: Use tgsi_to_nir when preferred IR is NIR. | Timur Kristóf | 2019-06-05 | 4 | -6/+135
    This patch allows nine to read the preferred IR from pipe caps and use NIR when that is preferred by the driver, by calling tgsi_to_nir. Also adds some debug options that allow overriding it. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* intel/nir: Stop returning the shader from helpers | Jason Ekstrand | 2019-06-05 | 1 | -1/+1
    Now that NIR_TEST_* doesn't swap the shader out from under us, it's sufficient to just modify the shader rather than having to return in case we're testing serialization or cloning. Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Only recompile CS when needed | Caio Marcelo de Oliveira Filho | 2019-06-05 | 1 | -1/+1
    Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Sagar Ghuge <[email protected]>
* freedreno/a6xx: Use VALIDREG in next_regid() helper | Kristian H. Kristensen | 2019-06-05 | 1 | -6/+6
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Remove dead code from a5xx | Kristian H. Kristensen | 2019-06-05 | 1 | -10/+0
    Reviewed-by: Rob Clark <[email protected]>
* panfrost/midgard: Always break up fragment writeout | Alyssa Rosenzweig | 2019-06-05 | 1 | -68/+21
    In a fragment shader, r0 is written out with a special branch sequence. r0 is not a real register here, but essentially a pipeline register -- as such, it needs to be written out in full and on time, with hanging dependencies in the bundle. Otherwise, we break up the bundle, which costs an extra ALU cycle and adds a move. When the scheduler ran last thing, we could do this analysis within the scheduler. Now that RA can run after scheduling, that's no longer valid, so we remove the analysis and always break it up (at a performance penalty). Future work can add a post-RA/post-schedule pass to merge writeout blocks if possible. It's a bit of a low-priority next to fixing conformance regressions, of course. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix cubemap regression | Alyssa Rosenzweig | 2019-06-05 | 1 | -2/+9
    Fixes: 2d9802233 ("panfrost/midgard: Extend RA to non-vec4 sources") Signed-off-by: Alyssa Rosenzweig <[email protected]>
* winsys/drm: Fix out of scope variable usage | Deepak Rawat | 2019-06-02 | 1 | -12/+13
    In this particular instance, a struct member was used outside of the block where it was defined. Fix this by moving the definition outside of the block. Signed-off-by: Deepak Rawat <[email protected]> Fixes: 569f83898768 ("winsys/svga: Add support for new surface ioctl, multisample pattern") Reviewed-by: Brian Paul <[email protected]>
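The general shape of this class of bug, and of the fix, can be sketched as follows. All names (`struct request`, `submit`, `send_request`) are hypothetical and unrelated to the actual winsys code; the sketch only shows why the definition has to move to the enclosing scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request that refers to optional extra data by pointer. */
struct request {
    const int *payload;
};

static int submit(const struct request *req)
{
    assert(req != NULL);
    return req->payload ? *req->payload : 0;
}

/* Buggy shape (not compiled here):
 *
 *     if (with_payload) {
 *         int extra = 42;          // lifetime ends with this block
 *         req.payload = &extra;
 *     }
 *     return submit(&req);         // reads a dead object
 *
 * Fixed shape: define the object in the scope where it is consumed. */
static int send_request(int with_payload)
{
    struct request req = { NULL };
    int extra = 0; /* lives as long as req does */

    if (with_payload) {
        extra = 42;
        req.payload = &extra;
    }
    return submit(&req);
}
```

Compilers do not always warn about the buggy shape, and it may appear to work until the stack slot is reused.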
* panfrost/midgard: Lower integer division | Alyssa Rosenzweig | 2019-06-05 | 2 | -144/+1
    We use the shared nir_lower_idiv pass to lower integer division, fixing 144 dEQP tests. This pass was not applied in the past due to breakage from iabs fixed earlier in the series. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-By: Ryan Houdek <[email protected]>
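To illustrate what lowering integer division means on hardware without a divide instruction, here is a C sketch that builds division out of shifts, compares, and subtracts. It uses plain restoring (shift-subtract) division, which is a stand-in technique and not a description of what nir_lower_idiv emits. The signed wrapper divides absolute values and fixes up the sign, which is also why a working iabs matters. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned division by restoring (shift-subtract) long division. */
static uint32_t udiv_lowered(uint32_t n, uint32_t d)
{
    assert(d != 0);
    uint32_t q = 0;
    uint64_t r = 0; /* 64-bit so the shift below cannot overflow */

    for (int i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u); /* bring down the next bit of n */
        if (r >= d) {
            r -= d;
            q |= UINT32_C(1) << i;
        }
    }
    return q;
}

/* Signed division: divide magnitudes, then negate if the signs differ.
 * Rounds toward zero, like C's '/' operator. */
static int32_t idiv_lowered(int32_t n, int32_t d)
{
    uint32_t un = n < 0 ? 0u - (uint32_t)n : (uint32_t)n;
    uint32_t ud = d < 0 ? 0u - (uint32_t)d : (uint32_t)d;
    uint32_t q = udiv_lowered(un, ud);

    return ((n < 0) != (d < 0)) ? -(int32_t)q : (int32_t)q;
}
```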
* panfrost/midgard: Fix 1-arg ALU memory corruption | Alyssa Rosenzweig | 2019-06-05 | 1 | -1/+2
    Certain ops that only take one argument have an imaginary "zero" constant for their second argument. For instance, conversions: i2f [dest], [source], #0 Memory corruption meant that #0 was instead random noise. For some ops, that doesn't matter (manifested as abnormally large code size and poor scheduling due to extra constants in random places). But for others, where a 1-op is emulated by a 2-op with an implicit 0 second argument, that broke things. Fixes iabs (emulated by iabsdiff). Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-By: Ryan Houdek <[email protected]>
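The iabs case can be modelled in a few lines of C. The helper names are hypothetical and the code models only the arithmetic, not the Midgard encoding: with a zero second source the absolute difference is the absolute value, and with a garbage second source it is not.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute difference of two signed 32-bit values, computed in unsigned
 * arithmetic so that extreme inputs do not overflow. */
static uint32_t iabsdiff(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    return a > b ? ua - ub : ub - ua;
}

/* One-source op emulated by the two-source op with an implicit 0. */
static uint32_t iabs_via_absdiff(int32_t a)
{
    return iabsdiff(a, 0);
}
```

If the second operand holds noise instead of 0, say 123, then `iabsdiff(-7, 123)` yields 130 rather than 7, which matches the kind of breakage the commit describes.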
* panfrost/midgard: Add a bunch of new ALU ops | Alyssa Rosenzweig | 2019-06-05 | 3 | -4/+32
    These ops are used to accelerate various functions exposed in OpenCL. This commit only includes the routine additions to the table. They are not wired through the compiler; rather, they are just here to keep a reference for the disassembler. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-By: Ryan Houdek <[email protected]>