summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/panfrost
Commit message (Collapse)AuthorAgeFilesLines
* panfrost/midgard: Split up midgard_compile.c (RA)Alyssa Rosenzweig2019-05-1911-928/+1149
| | | | | | | | | | | | This commit moves the register allocator out of midgard_compile.c and into its own midgard_ra.c file. In doing so, a number of dependencies are identified and moved into their own files in turn. midgard_compile.c is still fairly monolithic, but this should help. Code churn, but no functional changes should be introduced by this commit. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Improve fixed-function blendingAlyssa Rosenzweig2019-05-192-978/+34
| | | | | | | | This fixes a few miscellaneous issues with the fixed-function blending programming, though it is far from complete. For cases known to be buggy, we force a fallback to blend shaders. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Wire up nir_lower_blendAlyssa Rosenzweig2019-05-191-14/+33
| | | | | | | | This implements blend shaders via nir_lower_blend, by creating dummy fragment shaders simply passing through the source color and using the new lowering pass to inject blendability. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Route new blending intrinsicsAlyssa Rosenzweig2019-05-191-106/+117
| | | | | | | To prepare for the new nir_lower_blend pass, we wire up the intrinsics for tilebuffer reads and constant colour loading. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/nir: Add nir_lower_blend passAlyssa Rosenzweig2019-05-194-1/+305
| | | | | | | | | | | | | | | | | | | | | | | | | | | This new lowering pass implements the OpenGL ES blend pipeline in shaders, applicable to hardware lacking full-featured blending hardware (including Midgard/Bifrost and vc4). This pass is run on a fragment shader, rewriting the store to a blended version, loading in the framebuffer destination color and constant color via intrinsics as necessary. This pass is sufficient for OpenGL ES 2.0 and is verified to pass dEQP's blend tests. MIN/MAX modes are included and tested as well. That said, at present it has the following limitations: - MRT is not supported (ES3). - sRGB support is missing (ES3). - Extended blending is not yet ported from GLSL IR lowering (ES3.2) - Dual-source blending is not supported. (N/A) - Logic ops are not supported. (N/A) v2: Fix code conventions (per Ian Romanick's feedback). Implement color masks. This pass should be in common nir/ space, but due to non-technical reasons, for now it's in Panfrost space. In the future, depending if other drivers need some of the functionality, we can move this back to src/compiler/nir space. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix Bifrost-specific paddingAlyssa Rosenzweig2019-05-191-7/+1
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost: Cleanup panfrost_job commentsAlyssa Rosenzweig2019-05-191-6/+3
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/decode: Decode blend constantAlyssa Rosenzweig2019-05-192-2/+31
| | | | | | | | | | | | | This adds a forgotten decode line on Midgard and adds the field of a blend constant on Bifrost. The Bifrost encoding is fairly weird; whereas Midgard is just a regular 32-bit float, Bifrost uses a fancy fixed-point-esque encoding. The decode logic here is experimentally correct. The encode logic is a sort of "guesstimate", assuming that the high byte is just int(f / 255.0) and then solving algebraicly for the low byte. This might be slightly off in some cases. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost: Hoist blend constant into Midgard-specific structAlyssa Rosenzweig2019-05-196-16/+21
| | | | | | | | | This eliminates one major source of #ifdef parity between Midgard and Bifrost, better representing how the struct acts on Midgard and allowing proper decodes on Bifrost. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/decode: Disassemble Bifrost shadersAlyssa Rosenzweig2019-05-192-8/+10
| | | | | | | | | We already have the Bifrost disassembler in-tree, so now that panwrap is able to dump Bifrost command streams, hook up the disassembler to pandecode. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: TypofixAlyssa Rosenzweig2019-05-171-1/+1
| | | | | Reported-by: Ryan Houdek <[email protected]> Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Cleanup leak todosAlyssa Rosenzweig2019-05-173-16/+9
| | | | | | | Many of these are now patched; one of them we patch here. Regardless, this is one less thing to worry about in the code, I suppose. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: assert(0) -> unreachable for some switchAlyssa Rosenzweig2019-05-162-31/+18
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add load/store opcodesAlyssa Rosenzweig2019-05-164-52/+131
| | | | | | | | | This commit adds a bunch of new load/store opcodes, largely related to OpenCL, as well as adjusting the name of existing opcodes to be more uniform. The immediate effect is compute shaders are substantially easier to interpret now. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Enable integer constant inliningAlyssa Rosenzweig2019-05-161-4/+0
| | | | | | | | | | | | Midgard ALU features two types of constants: embedded constants (128-bit chunk, zero/one per schedule bundle) and inline constants (16-bit splattered into the op, second source if present). Inline constants are much more efficient from a space and scheduling freedom standpoint, so it's desirable to inline when possible. Now that integer ops are well understood and in use, we enable inlining of integers constants in addition to floats (which have been inlined since forever). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Remove imov workaroundAlyssa Rosenzweig2019-05-161-26/+0
| | | | | | The previous commit fixes the issue this patched around. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Set int outmod for ops writing integersAlyssa Rosenzweig2019-05-162-7/+23
| | | | | | | | | | | | | | | | | | | | | By default, the "normal" output modifier is set on ALU ops. This is the correct default for float outputs -- for floats, it preserves the semantic value. Unfortunately, when used with integers, it does not preserve the bitstream encoding, causing misbehaviour. (It's an open question what happens when `normal` is used with integers -- does it apply some other transformation? or does it do floating point normalization/etc on the ints as if they were floats?). Instead, we default to the "clamp to integer" output modifier for ops writing integers. Semantically, this makes sense (clamping an integer to the nearest integer is the identity function). In the hardware with an integer opcode, this is the actual "normal". This fixes numerous sporadic and sometimes bizarre bugs relating to integers, especially integer moves. With this in place, we no longer care about the types involved; it's just bits on the wire again. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Set custom stride for textures when necessaryAlyssa Rosenzweig2019-05-161-0/+25
| | | | | | | | | | | | | | | From Gallium (and our) perspective, the stride of a BO is arbitrary. For internal buffers, we can make it something nice, but for imported linear buffers (e.g. EGL clients), we don't always have that luxury. To cope, we calculate the expected stride of a texture, compare it to the BO's actual reported stride, and if they differ, set the latter as a custom stride. Fixes rendering of windows not on tile boundaries (noticeable in Weston with es2gears_wayland, for instance). Also, this should fix stride issues with bufer reloading. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/decode: Stride decodingAlyssa Rosenzweig2019-05-162-3/+33
| | | | | | | | | With a special flag, texture descriptors can include custom stride(s). We haven't seen a case of this used for mipmaps/cubemaps, so it's not clear how that will be encoded, but this dumps correctly for single one-level 2D textures. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/decode: Futureproof texture dumpingAlyssa Rosenzweig2019-05-161-2/+13
| | | | | | | | | | | One field was not dumped for some reason. It's observed to be 0, but it's still good to have it available. Also, extra fields might be snuck in the bitmaps array (it's variable-lengthed at the end), and we want to guard against that possibility, so we dump a little more. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Reduce batch size to 3000Tomeu Vizoso2019-05-141-1/+1
| | | | | | | | As with the previous value of 5000 we seemed to be reaching OOM in some circumstances. Signed-off-by: Tomeu Vizoso <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Update expectationsTomeu Vizoso2019-05-141-2/+0
| | | | | | | | | | Since last Friday, these two tests have been fixed: dEQP-GLES2.functional.shaders.functions.control_flow.return_in_nested_loop_fragment dEQP-GLES2.functional.shaders.linkage.varying_7 Signed-off-by: Tomeu Vizoso <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]>
* gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.Eric Anholt2019-05-131-1/+2
| | | | | | | | The _LEVELS assumes that the max is always power of two. For V3D 4.2, we can support up to 7680 non-power-of-two MSAA textures, which will let X11 support dual 4k displays on newer hardware. Reviewed-by: Marek Olšák <[email protected]>
* panfrost/midgard: Handle csel correctlyAlyssa Rosenzweig2019-05-125-152/+128
| | | | | | | | | | | | We use an algebraic pass for the csel optimizations, and use proper vectorized csel ops (i/fcsel_v) for mixed, rather lowering. To avoid regressions along the way, we fix an issue with the copy propagation pass (it should not attempt to propagate constants). Similarly, we take care to break bundles when using csel to fix some scheduler corner cases. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add CAPFs for conservative rasterizationTomeu Vizoso2019-05-101-0/+5
| | | | | | | | | | | | | | Just do what everybody else but Nouveau does and return 0.0f. This prevents the repeated logging of these messages on startup: Unexpected PIPE_CAPF 6 query Unexpected PIPE_CAPF 7 query Unexpected PIPE_CAPF 8 query Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Only take the fast paths on buffers aligned to block sizeTomeu Vizoso2019-05-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the functions operate on 16-byte blocks. Fixes this Valgrind error: Invalid read of size 4 at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85) by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171) by 0x584F587: panfrost_tile_texture (pan_resource.c:489) by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525) by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516) by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515) by 0x5875F13: u_default_texture_subdata (u_transfer.c:80) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd at 0x483F5C8: malloc (vg_replace_malloc.c:299) by 0x584F47D: panfrost_transfer_map (pan_resource.c:467) by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243) by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2) Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Cc: 19.1 <[email protected]>
* panfrost: Fix two uninitialized accesses in compilerTomeu Vizoso2019-05-101-1/+1
| | | | | | | | | | | | | | | | | Valgrind was complaining of those. NIR_PASS only sets progress to TRUE if there was progress. nir_const_load_to_arr() only sets as many constants as components has the instruction. This was causing some dEQP tests to flip-flop, such as: dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Fixes: 14531d676b11 ("nir: make nir_const_value scalar")
* panfrost: ci: Skip running some testsTomeu Vizoso2019-05-101-0/+2
| | | | | | | | These tests add too much time to the total run time, and some of them even hang the DUTs, even if I haven't been able to reproduce it locally. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Don't restart WestonTomeu Vizoso2019-05-101-8/+1
| | | | | | | | | | | | There doesn't seem to actually be any noticeably memory leaks on Weston when running dEQP. We do seem to leak quiet a bit in the client, so we still have to run the dEQP runner in batches. This removes the risk of Weston not restarting properly and introducing spurious failures. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Update list of expected failuresTomeu Vizoso2019-05-101-79/+7
| | | | | | | | This matches the current state of things on both RK3288 and RK3399. Hopefully, from now on we'll only remove stuff from this list. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Tweak dEQP to improve throughputTomeu Vizoso2019-05-101-2/+8
| | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Fix list of tests to runTomeu Vizoso2019-05-101-2/+2
| | | | | | | | Make sure we have only test case names in the list, excluding names of test groups. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Check for incomplete runsTomeu Vizoso2019-05-101-0/+1
| | | | | | | | | To improve robustness, check that we got the expected number of results. Right now we hard-code the expected number of tests run, but with some effort we may be able to infer it. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Add tests to flip-flop listTomeu Vizoso2019-05-101-5/+48
| | | | | | | These tests aren't giving reliable results. Mask them for now. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Add support for running the tests on RK3288Tomeu Vizoso2019-05-106-75/+193
| | | | | | | | Build artifacts for armhf and schedule them on a Veyron Chromebook with RK3288. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* nir: Initialize lower_flrp_progress everywhereIan Romanick2019-05-091-1/+1
| | | | | | | | | | | | | | | | I don't know why I thought NIR_PASS always set the progress variable. Derp. Fixes: d41cdef2a59 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Coverity CID: 1444996 Coverity CID: 1444995 Coverity CID: 1444994 Coverity CID: 1444993 Coverity CID: 1444991 Coverity CID: 1444989
* nir: Use the flrp lowering pass instead of nir_opt_algebraicIan Romanick2019-05-061-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | I tried to be very careful while updating all the various drivers, but I don't have any of that hardware for testing. :( i965 is the only platform that sets always_precise = true, and it is only set true for fragment shaders. Gen4 and Gen5 both set lower_flrp32 only for vertex shaders. For fragment shaders, nir_op_flrp is lowered during code generation as a(1-c)+bc. On all other platforms 64-bit nir_op_flrp and on Gen11 32-bit nir_op_flrp are lowered using the old nir_opt_algebraic method. No changes on any other Intel platforms. v2: Add panfrost changes. Iron Lake and GM45 had similar results. (Iron Lake shown) total cycles in shared programs: 188647754 -> 188647748 (<.01%) cycles in affected programs: 5096 -> 5090 (-0.12%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12% Reviewed-by: Matt Turner <[email protected]>
* nir: nir_shader_compiler_options: drop native_integersChristian Gmeiner2019-05-071-2/+0
| | | | | | | | Driver which do not support native integers should use a lowering pass to go from integers to floats. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* panfrost: Refactor blend descriptorsAlyssa Rosenzweig2019-05-073-120/+122
| | | | | | | | | | | | | | | | | This commit does a fairly large cleanup of blend descriptors, although there should not be any functional changes. In particular, we split apart the Midgard and Bifrost blend descriptors, since they are radically different. From there, we can identify that the Midgard descriptor as previously written was really two render targets' descriptors stuck together. From this observation, we split the Midgard descriptor into what a single RT actually needs. This enables us to correctly dump blending configuration for MRT samples on Midgard. It also allows the Midgard and Bifrost blend code to peacefully coexist, with runtime selection rather than a #ifdef. So, as a bonus, this will help the future Bifrost effort, eliminating one major source of compile-time architectural divergence. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: iabs cannot run on mulAlyssa Rosenzweig2019-05-041-1/+1
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower mixed csel (NIR)Alyssa Rosenzweig2019-05-042-12/+83
| | | | | | | | | Basically, when the conditions of a csel diverge, we scalarize to avoid going into weird code paths during emit. We could be doing better, but this case can't occur organically from GLSL as far as I can, though it does fix lowered atan2. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix RA when temp_count = 0Alyssa Rosenzweig2019-05-042-50/+70
| | | | | | | | A previous commit by Tomeu aborted RA early, which solves the memory corruption issue, but then generates an incorrect compile. This fixes that. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix integer selectionAlyssa Rosenzweig2019-05-042-33/+10
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Support RGB565 FBOsAlyssa Rosenzweig2019-05-044-29/+80
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard/disasm: Handle dest_override generalizedAlyssa Rosenzweig2019-05-041-22/+68
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard/disasm: Stub out 64-bitAlyssa Rosenzweig2019-05-041-5/+15
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard/disasm: Print 8-bit sourcesAlyssa Rosenzweig2019-05-041-23/+43
| | | | | | | | | | | | | | | | | | | | | | | | | This handles the usual case. 8-bit register access parallels 16-bit access, but with one major caveat: in 8-bit mode, only half of the register file is actually (directly) accessible as sources. In particular, for each 16-bit integer register (hrN), we can only index a *single* 8-bit integer (qrN), corresponding to the lower 8-bits. To get the upper 8-bits, it is required to do an explicit shift. For example, to add the bytes of a 16-bit integer hr0.x and get the result as an 8-bit qr0, you'd need to do something like: ilsr hr1.x, hr0.x, #8 iadd qr0.x, qr0.x, qr1.x This scheme diverges from 32-bit registers, in that both the upper and lower halves of a 32-bit register are individually accessible as a pair of half registers. For contrast, to add the lower and upper 16-bits of a 32-bit integer r0.x, you can just: iadd hr0.x, hr0.x, hr1.x Since hr1.x = upper 16-bit of r0.x. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard/disasm: Support 8-bit destinationAlyssa Rosenzweig2019-05-041-18/+21
| | | | | | | | Meanwhile, we're forced to disable dest_override, since it's not yet clear how this interacts with other bitnesses (it'll likely need to be overhauled in any case). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Rename ilzcnt8 -> iclzAlyssa Rosenzweig2019-05-042-2/+2
| | | | | | Per OpenCL. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix crash on unknown opAlyssa Rosenzweig2019-05-041-2/+6
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>