Commit message (Author, Age, Files, Lines)
...
* freedreno/a2xx: ir2: fix saturate in cp (Jonathan Marek, 2019-09-06, 1 file, -0/+4)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a2xx: ir2: set lower_fdph (Jonathan Marek, 2019-09-06, 1 file, -0/+1)
| | | | | | | | The fdph opcode is not supported. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a2xx: ir2: remove pointcoord y invert (Jonathan Marek, 2019-09-06, 1 file, -4/+2)
| | | | | | | | | Fixes the following deqp test: dEQP-GLES2.functional.shaders.builtin_variable.pointcoord Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a2xx: ir2: fix lowering of instructions after float lowering (Jonathan Marek, 2019-09-06, 1 file, -3/+2)
| | | | | | | | | | | Some instructions generated by int/bool float lowering need to be lowered by opt_algebraic. Fixes: 43dbd7d6 Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* lima/ppir: don't lower vector {b,f}csel to scalar if condition is scalar (Vasily Khoruzhick, 2019-09-06, 1 file, -5/+21)
| | | | | | | | | | Utgard PP has a vector fcsel operation, but its condition is scalar. Add a filtering callback that checks whether the {b,f}csel condition is a vector, and lower {b,f}csel to scalar only in that case. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* nir: allow specifying filter callback in lower_alu_to_scalar (Vasily Khoruzhick, 2019-09-06, 16 files, -67/+113)
| | | | | | | | | | | | | A set of opcodes isn't flexible enough in certain cases. E.g. Utgard PP has a vector conditional select operation, but its condition is always scalar. Lowering all the vector selects to scalar increases the instruction count, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
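The filtering idea in the two commits above can be sketched in isolation. This is a minimal model, not the NIR API: `struct alu_instr`, `lower_filter_cb`, `lower_csel_filter` and `count_lowered` are hypothetical names, and the rule that every non-select op is still lowered is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an ALU instruction (not the NIR types). */
enum alu_op { OP_FADD, OP_BCSEL, OP_FCSEL };

struct alu_instr {
    enum alu_op op;
    unsigned dest_components;  /* width of the result vector */
    unsigned cond_components;  /* width of the select condition (csel only) */
};

/* Filter callback: return true if the instruction should be scalarized. */
typedef bool (*lower_filter_cb)(const struct alu_instr *instr, const void *data);

/* Modelled on the Utgard PP case: a vector select can stay a vector as long
 * as its condition is scalar, so only vector-condition selects are lowered. */
bool
lower_csel_filter(const struct alu_instr *instr, const void *data)
{
    (void)data;
    if (instr->op == OP_BCSEL || instr->op == OP_FCSEL)
        return instr->cond_components > 1;
    return true; /* assumption: all other vector ops are still lowered */
}

/* Count how many vector instructions a pass would scalarize.
 * A NULL callback means "lower everything", as an opcode-agnostic pass would. */
size_t
count_lowered(const struct alu_instr *instrs, size_t n,
              lower_filter_cb cb, const void *data)
{
    size_t lowered = 0;
    for (size_t i = 0; i < n; i++) {
        if (instrs[i].dest_components > 1 && (!cb || cb(&instrs[i], data)))
            lowered++;
    }
    return lowered;
}
```

With a vec4 fcsel whose condition is scalar, the filter keeps the instruction intact, while the same pass without a filter would split it into four scalar selects.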
* util: android logging support (Rob Clark, 2019-09-06, 2 files, -2/+21)
| | | | | | | | | In particular, it would be nice for failed debug_assert() msgs to show up in logcat. Signed-off-by: Rob Clark <[email protected]> Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* freedreno/ir3: allow copy propagation for relative (Rob Clark, 2019-09-06, 1 file, -9/+19)
| | | | | | | | | | This appears to work fine (with the additional constraint of keeping the indirect load in the same block that a0.x was loaded). We can probably lift this restriction on earlier gens after testing. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: fix cp cmps.s opt (Rob Clark, 2019-09-06, 1 file, -1/+1)
| | | | | | | | | | | Need to use ir3_instr_set_address(), otherwise the instruction might not get added to the indirects table. This becomes a problem when we turn on copy propagation for relative accesses, as check_instr() in the sched pass won't realize there is an indirect consumer of address register load that is ready to be scheduled. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: assert that only single address (Rob Clark, 2019-09-06, 2 files, -0/+5)
| | | | | | | | | | | | | | An instruction can reference only a single address register value. Add an assert to catch bugs. Also, the address value should be local to the same block as the instruction. (The one spot where changing the instruction address is actually legit needs to clear the address first.) Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: fix mad copy propagation special case (Rob Clark, 2019-09-06, 1 file, -9/+35)
| | | | | | | | | | | | After the next patch enabling copy propagation for relative sources, we'll need to dereference the n'th src in valid_flags(), so we actually need to swap the sources before calling valid_flags(). But the logic was already a bit cumbersome, so move it into a helper function. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: fix addr/pred spilling (Rob Clark, 2019-09-06, 1 file, -7/+42)
| | | | | | | | | The live_values and use_count were not being properly updated. This starts triggering problems with the next patch, where we allow copy propagation for RELATIV access. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: cleanup "partially const" ubo srcs (Rob Clark, 2019-09-06, 1 file, -4/+52)
| | | | | | | | | | Move the constant part of the indirect offset into nir intrinsic base. When we have multiple indirect accesses with different constant offsets, this lets other opt passes clean up things to use a single address register value. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
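A rough illustration of why splitting off the constant part helps. This is a sketch only: `struct ubo_offset`, `struct ubo_load`, `split_offset` and `can_share_address` are hypothetical names, and identifying the dynamic part by a plain integer id is an assumption made to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a UBO offset: an optional runtime part plus a constant. */
struct ubo_offset {
    bool     has_indirect;  /* true if a runtime value contributes */
    uint32_t indirect_id;   /* identifies the runtime value (e.g. an SSA def) */
    uint32_t constant;      /* compile-time part of the offset */
};

/* After the cleanup: the constant lives in the intrinsic base, and only the
 * runtime part has to go through the address register. */
struct ubo_load {
    uint32_t base;
    bool     has_indirect;
    uint32_t indirect_id;
};

struct ubo_load
split_offset(struct ubo_offset off)
{
    struct ubo_load ld = {
        .base = off.constant,
        .has_indirect = off.has_indirect,
        .indirect_id = off.has_indirect ? off.indirect_id : 0,
    };
    return ld;
}

/* Two loads can use one address register value when their runtime parts match,
 * regardless of how their constant bases differ. */
bool
can_share_address(struct ubo_load a, struct ubo_load b)
{
    return a.has_indirect && b.has_indirect && a.indirect_id == b.indirect_id;
}
```

For example, accesses at `i + 16` and `i + 32` end up with bases 16 and 32 and the same runtime part `i`, so later passes have the chance to reuse a single address register load.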
* lima/ppir: improve regalloc spill cost calculation (Erico Nunes, 2019-09-05, 1 file, -5/+49)
| | | | | | | | | | | | Now that spilling ops can be inserted into existing instructions, it makes sense to increase cost to spill registers that would cause the creation of a new instruction. Experimental results showed that penalizing too much due to this caused worse results, however it is beneficial as a tie resolver between registers with the same number of components. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima/ppir: optimizations in regalloc spilling code (Erico Nunes, 2019-09-05, 1 file, -90/+88)
| | | | | | | | | | | | | | | Avoid creating unnecessary instructions for the load/store temp nodes when not required, to further reduce register pressure. The store_temp operation seems to be unable to do any spilling. At least the offline shader seems to never output instructions accessing swizzled components, and attempting to output that in ppir results in errors. So, force spilled registers to allocate a full vec4 register. This seems to be the optimal way as it is possible to always keep stores and temps in a single instruction that can be pipelined. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima/ppir: mark regalloc created ssa unspillable (Erico Nunes, 2019-09-05, 1 file, -0/+1)
| | | | | | | | | | | One ssa created in the spilling code in ppir_update_spilled_src was not properly being marked 'spilled', which made it a candidate for future spilling attempts. Since it was being inserted by the spilling code itself, let's mark it unspillable to avoid an infinite spilling loop. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
* v3d: writes to magic registers aren't RF writes after THREND (Jose Maria Casanova Crespo, 2019-09-05, 1 file, -1/+3)
| Shaders must not attempt to write to the register files in the last
| three instructions, but that doesn't include the magic registers:
|
| nop ; nop ; thrsw; ldtmu.-
| *** ERROR ***
| nop ; nop
| nop ; nop
|
| v2: Simplify validation rules. (Eric Anholt)
| v3: Adjust validation even more. (Eric Anholt)
|
| Reviewed-by: Eric Anholt <[email protected]>
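The rule described above can be sketched as a small check. This is not the v3d validator: `enum write_dest`, `struct instr` and `thread_end_writes_ok` are hypothetical, and treating "the last three instructions" as the final three entries of an array is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of where an ALU result goes. */
enum write_dest { WRITE_NONE, WRITE_RF, WRITE_MAGIC };

/* Hypothetical instruction with one add-ALU and one mul-ALU destination. */
struct instr {
    enum write_dest add_dest;
    enum write_dest mul_dest;
};

/* Returns true if none of the last three instructions writes the register
 * file. Writes to magic registers are deliberately not counted. */
bool
thread_end_writes_ok(const struct instr *prog, size_t n)
{
    size_t first = n > 3 ? n - 3 : 0;
    for (size_t i = first; i < n; i++) {
        if (prog[i].add_dest == WRITE_RF || prog[i].mul_dest == WRITE_RF)
            return false;
    }
    return true;
}
```

A program whose thread-end instruction carries a magic-register write passes this check, whereas one with a register-file write in the same slot is rejected.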
* intel/dri: finish proper glthread (Sergii Romantsov, 2019-09-05, 1 file, -1/+1)
| KWin was able to hit a NULL context in the call to intelUnbindContext,
| but _mesa_glthread_finish is not resistant to that case.
|
| The case can be reproduced with these steps:
| 1. Create both glx and egl contexts
| 2. Make glx current
| 3. Make egl current
| 4. Reset glx context
| 5. Make egl current
|
| The solution finishes the glthread context properly: the context is
| taken from the dri-context requested for unbinding, not from the
| saved current context.
|
| Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87
|
| Cc: 19.1 19.2 <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271
| Fixes: dca36d5516d0 (i965: Implement threaded GL support)
| Signed-off-by: Sergii Romantsov <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* radv: Call nir_propagate_invariant() (Connor Abbott, 2019-09-05, 1 file, -0/+2)
| | | | | | | | Without this, invariant qualifiers don't do anything. Together with a fix to the game, this fixes flickering in No Man's Sky. Cc: [email protected] Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/nir: Don't lower constant arrays to uniforms (Connor Abbott, 2019-09-05, 1 file, -0/+5)
| shader-db results:
|
| Totals:
| SGPRS: 3955968 -> 3954960 (-0.03 %)
| VGPRS: 2220220 -> 2220092 (-0.01 %)
| Spilled SGPRs: 11387 -> 11325 (-0.54 %)
| Spilled VGPRs: 97 -> 97 (0.00 %)
| Private memory VGPRs: 2528 -> 2528 (0.00 %)
| Scratch size: 2656 -> 2656 (0.00 %) dwords per thread
| Code Size: 76002204 -> 75994988 (-0.01 %) bytes
| LDS: 740 -> 740 (0.00 %) blocks
| Max Waves: 772776 -> 772787 (0.00 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Totals from affected shaders:
| SGPRS: 16840 -> 15832 (-5.99 %)
| VGPRS: 16452 -> 16324 (-0.78 %)
| Spilled SGPRs: 1416 -> 1354 (-4.38 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 2016 -> 2016 (0.00 %)
| Scratch size: 2040 -> 2040 (0.00 %) dwords per thread
| Code Size: 953624 -> 946408 (-0.76 %) bytes
| LDS: 303 -> 303 (0.00 %) blocks
| Max Waves: 1622 -> 1633 (0.68 %)
| Wait states: 0 -> 0 (0.00 %)
|
| There were a large number of regressions in code size, but they seem to
| be because NIR unrolls some loop, which results in the table being
| replaced by a bunch of immediates on multiplies etc. This bloats code
| size since the table size is now included, but means that there are
| fewer loads, so it's still a net positive.
|
| Reviewed-by: Timothy Arceri <[email protected]>
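The percentages in the shader-db figures above read as relative changes, (after - before) / before * 100. A minimal sketch of that arithmetic follows; `percent_change` is a hypothetical helper, and returning 0 when the "before" value is 0 is an assumption chosen to match the "0 -> 0 (0.00 %)" rows.

```c
/* Relative change in percent between two counters.
 * Assumption: a zero baseline is reported as 0.00 %. */
double
percent_change(double before, double after)
{
    if (before == 0.0)
        return 0.0;
    return (after - before) / before * 100.0;
}
```

Applied to the affected-shader SGPR counts (16840 -> 15832), this gives roughly -5.99 %, and for Max Waves (1622 -> 1633) roughly 0.68 %, in line with the reported figures.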
* gallium: Plumb through a way to disable GLSL const lowering (Connor Abbott, 2019-09-05, 7 files, -1/+20)
| | | | | | | | | | For radeonsi, we will prefer the NIR pass as it'll generate better code (some index calculation and a single load vs. a load, then index calculation, then another load) and oftentimes NIR optimization can kick in and make all the access indices constant. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* st/nir: Don't lower indirects when linking (Connor Abbott, 2019-09-05, 1 file, -17/+1)
| I believe this was stuck here early because otherwise
| nir_opt_copy_prop_vars could undo what lower_io_to_temporaries does.
| However that has since been fixed. Also, we now use scratch for large
| variables so the comment is stale.
|
| On radeonsi these are the shader-db results:
|
| Totals:
| SGPRS: 3955968 -> 3955968 (0.00 %)
| VGPRS: 2220208 -> 2220220 (0.00 %)
| Spilled SGPRs: 11387 -> 11387 (0.00 %)
| Spilled VGPRs: 97 -> 97 (0.00 %)
| Private memory VGPRs: 2528 -> 2528 (0.00 %)
| Scratch size: 2656 -> 2656 (0.00 %) dwords per thread
| Code Size: 76002108 -> 76002204 (0.00 %) bytes
| LDS: 740 -> 740 (0.00 %) blocks
| Max Waves: 772779 -> 772776 (-0.00 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Totals from affected shaders:
| SGPRS: 176 -> 176 (0.00 %)
| VGPRS: 144 -> 156 (8.33 %)
| Spilled SGPRs: 0 -> 0 (0.00 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 0 -> 0 (0.00 %) dwords per thread
| Code Size: 12104 -> 12200 (0.79 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 28 -> 25 (-10.71 %)
| Wait states: 0 -> 0 (0.00 %)
|
| The few small regressions are due to nir_opt_large_constants kicking in
| when indirect lowering happens to result in smaller code after
| optimization since the array is very simple.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* st/nir: Call nir_remove_unused_variables() in the opt loop (Connor Abbott, 2019-09-05, 1 file, -0/+10)
| | | | | | | | | | | | | This prevents regressions when disabling indirect lowering. Sometimes the only use of an input array was copying it to the array created by nir_lower_io_to_temporaries, and without lowering indirects we wouldn't have eliminated the temporary array until after linking, which was too late to remove unused code in the producer. No shader-db changes with radeonsi NIR. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: Enable nir_opt_large_constants (Connor Abbott, 2019-09-05, 2 files, -0/+14)
| vkpipeline-db numbers:
|
| Totals:
| SGPRS: 1740306 -> 1741322 (0.06 %)
| VGPRS: 1331124 -> 1331712 (0.04 %)
| Spilled SGPRs: 21201 -> 21316 (0.54 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 256 -> 256 (0.00 %) dwords per thread
| Code Size: 79022628 -> 78694788 (-0.41 %) bytes
| LDS: 6500 -> 6500 (0.00 %) blocks
| Max Waves: 301413 -> 301302 (-0.04 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Totals from affected shaders:
| SGPRS: 53633 -> 54649 (1.89 %)
| VGPRS: 53000 -> 53588 (1.11 %)
| Spilled SGPRs: 3454 -> 3569 (3.33 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 0 -> 0 (0.00 %) dwords per thread
| Code Size: 5284232 -> 4956392 (-6.20 %) bytes
| LDS: 2 -> 2 (0.00 %) blocks
| Max Waves: 4239 -> 4128 (-2.62 %)
| Wait states: 0 -> 0 (0.00 %)
|
| (The biggest VGPR and max wave regression is due to unrolling a loop,
| which made the scheduler more aggressive, but in this case it's able to
| effectively hide latency so it's actually probably a win.)
|
| shader-db numbers with radeonsi NIR:
|
| Totals:
| SGPRS: 3526496 -> 3526512 (0.00 %)
| VGPRS: 2198576 -> 2198576 (0.00 %)
| Spilled SGPRs: 10463 -> 10463 (0.00 %)
| Spilled VGPRs: 86 -> 86 (0.00 %)
| Private memory VGPRs: 3182 -> 2528 (-20.55 %)
| Scratch size: 3308 -> 2640 (-20.19 %) dwords per thread
| Code Size: 74117280 -> 74106140 (-0.02 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 775846 -> 775844 (-0.00 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Totals from affected shaders:
| SGPRS: 856 -> 872 (1.87 %)
| VGPRS: 680 -> 680 (0.00 %)
| Spilled SGPRs: 0 -> 0 (0.00 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 654 -> 0 (-100.00 %)
| Scratch size: 668 -> 0 (-100.00 %) dwords per thread
| Code Size: 49652 -> 38512 (-22.44 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 182 -> 180 (-1.10 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Support load_constant intrinsics (Connor Abbott, 2019-09-05, 1 file, -0/+55)
| | | | | | | Setup a constant global variable that LLVM will stick in a .rodata section and generate PC-relative loads for. Reviewed-by: Marek Olšák <[email protected]>
* radv/radeonsi: Don't count read-only data when reporting code size (Connor Abbott, 2019-09-05, 6 files, -4/+14)
| | | | | | | | | | We usually use these counts as a simple way to figure out if a change reduces the number of instructions or shrinks an instruction. However, since .rodata sections aren't executed, we shouldn't be counting their size for this analysis. Make the linker return the total executable size, and use it to report the more useful size in both drivers. Reviewed-by: Marek Olšák <[email protected]>
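The accounting change can be pictured as summing only executable sections. A minimal sketch under stated assumptions: `struct section` and `total_exec_size` are hypothetical and do not reflect the actual linker interface used by the drivers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one section of a linked shader binary. */
struct section {
    const char *name;
    uint64_t    size;        /* bytes */
    bool        executable;  /* false for data such as .rodata */
};

/* Code size for statistics: executable bytes only, so constant tables
 * placed in read-only data do not show up as "more instructions". */
uint64_t
total_exec_size(const struct section *sections, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (sections[i].executable)
            total += sections[i].size;
    }
    return total;
}
```

With a 256-byte `.text` and a 64-byte `.rodata`, the reported code size is 256 bytes rather than 320.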
* headers: remove redundant GL token from GL wrapper (Heinrich Fink, 2019-09-05, 1 file, -4/+0)
| | | | | | | | Removing GL_FRAMEBUFFER_FLIP_Y_MESA token from glheader.h as it is now provided by glext.h Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* specs: Sync framebuffer_flip_y text with GL registry (Heinrich Fink, 2019-09-05, 1 file, -2/+5)
| | | | | | | | | | | Sync extension spec of MESA_framebuffer_flip_y to what has been merged upstream in the GL registry. Update now carries the accepted GL extension no. v2: split GL headers update off to separate commit Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* include: sync GL headers with registry (Heinrich Fink, 2019-09-05, 3 files, -14/+113)
| Integrating headers from upstream registry [0] master branch.
|
| Effective GL registry commit integrated:
| 9d534f9312e56c72df763207e449c6719576fd54
|
| Keeping the following quirks local to Mesa:
|
| - glext.h: BUILDING_MESA guard (see !1492)
| - glxext.h: glXQueryGLXPbufferSGIX: 'int' return type (Mesa) vs.
|   'void' (GL registry)
| - glxext.h: GLX_RENDERER_ID_MESA is still expected by some mesa tests,
|   even though its token has been removed from the spec (see
|   docs/specs/MESA_query_renderer.spec)
| - glxext.h: glXGetTransparentIndexSUN /
|   PFNGLXGETTRANSPARENTINDEXSUNPROC argument pTransparentIndex has type
|   'unsigned long *' (Mesa) vs. 'long *' (GL registry)
|
| [0] https://github.com/KhronosGroup/OpenGL-Registry
|
| Reviewed-by: Eric Anholt <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* clover: Fix build after clang r370122. (Hal Gentz, 2019-09-04, 2 files, -2/+16)
| ../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp: In function ‘std::unique_ptr<clang::CompilerInstance> {anonymous}::create_compiler_instance(const clover::device&, const std::vector<std::__cxx11::basic_string<char> >&, std::string&)’:
| ../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp:203:81: error: no matching function for call to ‘clang::CompilerInvocation::CreateFromArgs(clang::CompilerInvocation&, const char* const*, const char* const*, clang::DiagnosticsEngine&)’
| 203 | c->getInvocation(), copts.data(), copts.data() + copts.size(), diag))
| | ^
| In file included from /opt/llvm64/include/clang/Frontend/CompilerInstance.h:15,
| from ../mesa/src/gallium/state_trackers/clover/llvm/codegen.hpp:37,
| from ../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp:49:
| /opt/llvm64/include/clang/Frontend/CompilerInvocation.h:157:15: note: candidate: ‘static bool clang::CompilerInvocation::CreateFromArgs(clang::CompilerInvocation&, llvm::ArrayRef<const char*>, clang::DiagnosticsEngine&)’
| 157 | static bool CreateFromArgs(CompilerInvocation &Res,
| | ^~~~~~~~~~~~~~
| /opt/llvm64/include/clang/Frontend/CompilerInvocation.h:157:15: note: candidate expects 3 arguments, 4 provided
|
| Signed-off-by: Hal Gentz <[email protected]>
| Reviewed-by: Aaron Watry <[email protected]>
* scons: Add coroutines component to build. (Vinson Lee, 2019-09-04, 1 file, -0/+3)
| | | | | | Fixes: d32690b43c91 ("gallivm: add coroutine pass manager support") Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium/osmesa: Move 565 format selection checks where the rest are. (Eric Anholt, 2019-09-04, 1 file, -4/+2)
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* gallium/osmesa: Fix a race in creating the stmgr. (Eric Anholt, 2019-09-04, 1 file, -9/+17)
| | | | | | Noticed while looking at other OSMesa bugs. Reviewed-by: Timothy Arceri <[email protected]>
* gallium/osmesa: Introduce a test. (Eric Anholt, 2019-09-04, 2 files, -0/+52)
| | | | | | | | Given that we occasionally touch this code and probably nobody really wants to think about it, introduce a minimal test so that we know we haven't completely broken OSMesa. Reviewed-by: Timothy Arceri <[email protected]>
* docs: Mark 19.2.0-rc2 as done and push back rc3 and rc4/final (Dylan Baker, 2019-09-04, 1 file, -9/+3)
|
* glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX. (Hal Gentz, 2019-09-04, 2 files, -1/+9)
| When run in optirun, applications that linked to `libGLX.so` and then
| proceeded to query Mesa for extension strings caused a SEGV in Mesa.
|
| `glXQueryExtensionsString` was calling a chain of functions that
| eventually led to `__glXQueryServerString`. This function would call
| `xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`.
| The latter for some unknown reason returned `NULL`. Passing this `NULL`
| to `xcb_glx_query_server_string_string_length` would cause a SEGV as the
| function tried to dereference it.
|
| The reason behind the function returning `NULL` is yet to be determined;
| however, simply checking that the ptr is not `NULL` resolves this. A
| similar check has been added to `__glXGetString` for completeness' sake,
| although it is not immediately necessary.
|
| In addition to that, we stumbled into a similar problem in
| `AllocAndFetchScreenConfigs`, which tries to access the configs to free
| them if `__glXQueryServerString` fails. This, of course, SEGVs, because
| the configs have not been allocated yet. Simply continuing past the
| configs if their config ptrs are `NULL` resolves this.
|
| We also switch to `calloc` to make sure that the config ptrs are `NULL`
| by default, and not some uninitialized value.
|
| Cc: [email protected]
| Fixes: 24b8a8cfe821 "glx: implement __glXGetString, hide __glXGetStringFromServer"
| Fixes: cb3610e37c4c "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it "
| Reviewed-by: Adam Jackson <[email protected]>
| Signed-off-by: Hal Gentz <[email protected]>
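The two defensive patterns described above (checking a reply pointer before use, and zero-initialising pointer arrays so cleanup can skip unset entries) can be sketched generically. Nothing here is the GLX or XCB code: `struct server_string_reply`, `copy_server_string`, `struct screen_config` and `free_screen_configs` are hypothetical stand-ins.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical reply object; a NULL pointer models a failed request. */
struct server_string_reply {
    size_t      length;
    const char *text;
};

/* Copy the string out of a reply, tolerating a missing reply. */
char *
copy_server_string(const struct server_string_reply *reply)
{
    if (reply == NULL)
        return NULL;                 /* do not dereference a failed reply */

    char *buf = calloc(reply->length + 1, 1);
    if (buf == NULL)
        return NULL;
    memcpy(buf, reply->text, reply->length);
    return buf;                      /* already NUL-terminated by calloc */
}

/* Hypothetical per-screen state with two owned allocations. */
struct screen_config {
    void *visuals;
    void *configs;
};

/* Cleanup that skips screens which were never allocated. This relies on the
 * array having been created with calloc, so unset entries are NULL. */
void
free_screen_configs(struct screen_config **screens, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (screens[i] == NULL)
            continue;
        free(screens[i]->visuals);
        free(screens[i]->configs);
        free(screens[i]);
        screens[i] = NULL;
    }
}
```

Allocating the array with `calloc(n, sizeof(*screens))` is what makes the `NULL` test in the cleanup loop meaningful; with `malloc` the entries would hold indeterminate values.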
* egl: Enable 10bpc EGLConfigs for platform_{device,surfaceless} (Adam Jackson, 2019-09-04, 2 files, -0/+4)
| | | | | | It's somewhat annoying that these are so similar for so little benefit. Reviewed-by: Eric Engestrom <[email protected]>
* glsl: Store the precision for a function return type (Neil Roberts, 2019-09-04, 3 files, -1/+30)
| | | | | | | | | The precision for a function return type is now stored in ir_function_signature. This will later be useful to implement mediump to float16 lowering. In the meantime it is also useful to catch errors where a function is redeclared with a different precision. Reviewed-by: Timothy Arceri <[email protected]>
* docs: add llvmpipe features for fb_no_attach and compute shaders (Dave Airlie, 2019-09-04, 1 file, -4/+4)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: enable compute shaders if LLVM has coroutines (Dave Airlie, 2019-09-04, 1 file, -1/+1)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add local memory allocation path (Dave Airlie, 2019-09-04, 2 files, -0/+12)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add compute shader parameter fetching support (Dave Airlie, 2019-09-04, 1 file, -0/+54)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add compute shader images support (Dave Airlie, 2019-09-04, 4 files, -1/+111)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add ssbo support to compute shaders (Dave Airlie, 2019-09-04, 4 files, -0/+61)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add compute sampler + sampler view support. (Dave Airlie, 2019-09-04, 4 files, -4/+292)
| | | | | | This is ported from the fragment shader code. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add support for compute constant buffers. (Dave Airlie, 2019-09-04, 4 files, -2/+72)
| | | | | | This is mostly ported from the fragment shader code. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add compute pipeline statistics support. (Dave Airlie, 2019-09-04, 2 files, -1/+3)
| | | | | | This just adds the CS invocations counter. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add grid launch (Dave Airlie, 2019-09-04, 1 file, -0/+76)
| | | | | | | | | This adds the dispatch code. It creates a job for the number of blocks in the grid, and dispatches them to the threadpool implementation. The threadpool then calls the JIT code to execute the coroutines. Reviewed-by: Roland Scheidegger <[email protected]>
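As a rough picture of "one job per block in the grid", the sketch below computes the job count and maps a linear job index back to block coordinates. `struct grid`, `struct block_id`, `grid_num_blocks` and `block_from_job` are hypothetical, and the x-fastest ordering is an assumption, not something the commit states.

```c
#include <stdint.h>

/* Hypothetical grid dimensions, counted in blocks (work groups). */
struct grid {
    uint32_t x, y, z;
};

/* Hypothetical 3D coordinate of one block within the grid. */
struct block_id {
    uint32_t x, y, z;
};

/* One job per block, so the job count is the product of the dimensions. */
uint64_t
grid_num_blocks(struct grid g)
{
    return (uint64_t)g.x * g.y * g.z;
}

/* Map a linear job index to block coordinates, x varying fastest.
 * Precondition: all grid dimensions are non-zero and job < grid_num_blocks(g). */
struct block_id
block_from_job(struct grid g, uint64_t job)
{
    struct block_id b;
    b.x = (uint32_t)(job % g.x);
    b.y = (uint32_t)((job / g.x) % g.y);
    b.z = (uint32_t)(job / ((uint64_t)g.x * g.y));
    return b;
}
```

For a 4x3x2 grid this yields 24 jobs, and job 23 maps to block (3, 2, 1); a thread pool worker would run the shader for that block when it picks up the job.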
* llvmpipe: add compute shader generation. (Dave Airlie, 2019-09-04, 2 files, -0/+337)
| | | | | | | | | | | This creates the coroutine execution environment and the main compute shaders that get executed inside it. Each compute shader block is executed in its own coroutine execution shader, with each "thread" being a coroutine executed inside it in sequence. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: introduce variant building infrastructure. (Dave Airlie, 2019-09-04, 1 file, -1/+185)
| | | | | | | | This doesn't actually build any of the shaders yet, but just builds up the framework necessary to start building the shaders and variants. Reviewed-by: Roland Scheidegger <[email protected]>