summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* gallium/tests: fix build with clang compilerSamuel Pitoiset2016-01-031-273/+330
| | | | | | | | | | | | Nested functions are supported as an extension in GNU C, but Clang don't support them. This fixes compilation errors when (manually) building compute.c, or by setting --enable-gallium-tests to the configure script. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* nv50,nvc0: optimize coherent buffer checking at draw timeSamuel Pitoiset2016-01-036-68/+82
| | | | | | | | | | Instead of iterating over all the buffer resources looking for coherent buffers, we keep track of a context-wide count. This will save some iterations (and CPU cycles) in 99.99% case because usually coherent buffers are not so used. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Fix build from upload changes.Eric Anholt2016-01-021-1/+1
|
* gallium/radeon: send LLVM diagnostics as debug messagesNicolai Hähnle2016-01-021-15/+46
| | | | | | | | | | | | | Diagnostics sent during code generation and the every error message reported by LLVMTargetMachineEmitToMemoryBuffer are disjoint reporting mechanisms. We take care of both and also send an explicit message indicating failure at the end, so that log parsers can more easily tell the boundary between shader compiles. Removed an fprintf that could never be triggered. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: pass pipe_debug_callback into radeon_llvm_compile (v2)Nicolai Hähnle2016-01-027-9/+18
| | | | | | | This will allow us to send shader debug info via the context's debug callback. Reviewed-by: Edward O'Callaghan <[email protected]> (v1) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: send shader info as debug messages in addition to stderr outputNicolai Hähnle2016-01-021-14/+55
| | | | | | | | | | | | | | The output via stderr is very helpful for ad-hoc debugging tasks, so that remains unchanged, but having the information available via debug messages as well will allow the use of parallel shader-db runs. Shader stats are always provided (if the context is a debug context, that is), but you still have to enable the appropriate R600_DEBUG flags to get disassembly (since it is rather spammy and is only generated by LLVM when we explicitly ask for it). Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass pipe_debug_callback down into si_shader_binary_read (v2)Nicolai Hähnle2016-01-024-14/+22
| | | | | | | This will allow us to send shader debug info. Reviewed-by: Edward O'Callaghan <[email protected]> (v1) Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: implement set_debug_callbackNicolai Hähnle2016-01-022-0/+14
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* u_upload_mgr: allow specifying PIPE_USAGE_* for the upload bufferMarek Olšák2016-01-0216-23/+37
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: remove alignment parameter from u_upload_createMarek Olšák2016-01-0216-23/+15
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_buffer manuallyMarek Olšák2016-01-023-2/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_data manuallyMarek Olšák2016-01-0216-19/+29
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_alloc manuallyMarek Olšák2016-01-0221-23/+32
| | | | | | | | | | The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: rework the application of alignmentMarek Olšák2016-01-021-10/+14
| | | | | | | | | | | | | | | The function only aligned the size, but not the offset. The offset was aligned only when the previous suballocation was aligned. That yielded the correct offset alignment if the alignment was constant for all suballocations. Instead, directly align the offset, but allow an unaligned size. There is no change in behavior, because the alignment is constant at the moment. This a prerequisite for allowing a variable alignment for suballocations. Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50,nvc0: make sure there's pushbuf space and that we ref the bo earlyIlia Mirkin2016-01-014-6/+5
| | | | | | | | | | | First off, we can't flush in the middle of a command. Secondly requesting the extra push space might cause a flush to happen. If that flush happens, we'd have to do the PUSH_REFN again. So instead do PUSH_REFN after the push space request. This helps avoid rare crashes with supertuxkart in libdrm due to assertion failures. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* nvc0: Set winding order regardless of domain.Kenneth Graunke2015-12-301-2/+4
| | | | | | | | | | Quads need to respect winding order, too - not just triangles. Fixes rendering in GFXBench 4.0's tessellation benchmark. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* nvc0: add ARB_shader_draw_parameters supportIlia Mirkin2015-12-3013-15/+73
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add a drawid to pipe_draw_infoIlia Mirkin2015-12-301-0/+2
| | | | | | | | | This will allow the state tracker to inform the driver where in a broken-up multidraw we currently are. This can then be passed into the vertex shader. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add PIPE_CAP_DRAW_PARAMETERSIlia Mirkin2015-12-3016-2/+19
| | | | | | | | This allows the state tracker to know that the various draw parameters are available in vertex shaders. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add baseinstance/drawid semanticsIlia Mirkin2015-12-303-1/+18
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* nv50/ir: attempt to do more constant folding on mad -> add conversionIlia Mirkin2015-12-301-11/+10
| | | | | | | | | The add might actually have a 0 as an argument, which would convert it into a mov. Make sure to detect that. Also avoid the hack of putting the immediate directly into the instruction, instead use a mov to put it into place and let the later LoadPropagation pass place it if possible. Signed-off-by: Ilia Mirkin <[email protected]>
* nir/builder: Add an init function that creates a simple shader for youJason Ekstrand2015-12-291-8/+4
| | | | | | | | | | | A hugely common case when using nir_builder is to have a shader with a single function called main. This adds a helper that gives you just that. This commit also makes us use it in the NIR control-flow unit tests as well as tgsi_to_nir and prog_to_nir. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* nv50/ir: float(s32 & 0xff) = float(u8), not s8Ilia Mirkin2015-12-291-0/+3
| | | | | | | | Make sure to make conversion unsigned when we're ANDing the high bits away. Fixes corruption in dolphin. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* radeonsi: add RADEON_REPLACE_SHADERS debug optionNicolai Hähnle2015-12-293-5/+105
| | | | | | | | | | | This option allows replacing a single shader by a pre-compiled ELF object as generated by LLVM's llc, for example. This can be useful for debugging a deterministically occuring error in shaders (and has in fact helped find the causes of https://bugs.freedesktop.org/show_bug.cgi?id=93264). v2: drop the debug flag, use DEBUG_GET_ONCE_OPTION instead Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: count compilations in si_compile_llvmNicolai Hähnle2015-12-292-1/+2
| | | | | | | | This changes the count slightly (because of si_generate_gs_copy_shader), but this is only relevant for the driver-specific num-compilations query. It sets the stage for the next commit. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: add DEBUG_GET_ONCE_OPTIONNicolai Hähnle2015-12-291-0/+13
| | | | | | This is analogous to the alreading existing macros for BOOL, NUM, and FLAGS. Reviewed-by: Marek Olšák <[email protected]>
* r600: fix constant buffer size programmingGrazvydas Ignotas2015-12-292-2/+2
| | | | | | | | | | | | When buffer size is less than 16, zero ends up being programmed as size, which prevents the hardware from fetching the correct values. Fix it by combining shift and align so that the value is always rounded up. Cc: "11.1 11.0 10.6" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92229 Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: Get rid of function overloadsJason Ekstrand2015-12-287-26/+25
| | | | | | | | | | | | | | | | | When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can hav a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]> ir3 bits are Reviewed-by: Rob Clark <[email protected]>
* nvc0: don't forget to reset VTX_TMP bufctx slot after blit completionIlia Mirkin2015-12-271-0/+2
| | | | | | | Also release the scratch allocation if any. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* nv50,nvc0: add a note when converting vertex elements using CPUIlia Mirkin2015-12-272-0/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium/auxiliary: don't build NIR sources with MSVC2008 flagsConnor Abbott2015-12-232-7/+15
| | | | | | | | | | | | | NIR has never been built with MSVC2008, so we shouldn't add MSVC2008_COMPAT_CFLAGS to anything that uses it. This allows us to get rid of the pragma in tgsi_to_nir.c. Build tested with freedreno. v2: Use MSVC2013_COMPAT_CLFAGS instead. Reviewed-by: Jose Fonseca <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* freedreno/ir3: spelling..Rob Clark2015-12-231-6/+6
| | | | Signed-off-by: Rob Clark <[email protected]>
* nir: Add a writemask to store intrinsics.Kenneth Graunke2015-12-222-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tessellation control shaders need to be careful when writing outputs. Because multiple threads can concurrently write the same output variables, we need to only write the exact components we were told. Traditionally, for sub-vector writes, we've read the whole vector, updated the temporary, and written the whole vector back. This breaks down with concurrent access. This patch prepares the way for a solution by adding a writemask field to store_var intrinsics, as well as the other store intrinsics. It then updates all produces to emit a writemask of "all channels enabled". It updates nir_lower_io to copy the writemask to output store intrinsics. Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks by doing a read-modify-write cycle (which is safe, because local variables are specific to a single thread). This should have no functional change, since no one actually emits partial writemasks yet. v2: Make nir_validate momentarily assert that writemasks cover the complete value - we shouldn't have partial writemasks yet (requested by Jason Ekstrand). v3: Fix accidental SSBO change that arose from merge conflicts. v4: Don't try to handle writemasks in ir3_compiler_nir - my code for indirects was likely wrong, and TTN doesn't generate partial writemasks today anyway. Change them to asserts as requested by Rob Clark. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> [v3]
* nouveau: enable use of new kernel interfacesBen Skeggs2015-12-221-2/+0
| | | | | | | Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nvc0: remove use of deprecated sw class identifierBen Skeggs2015-12-221-3/+5
| | | | | | | | | | | Also emits a method to properly bind the class to a subchannel, which was missing previously. The kernel currently doesn't care, but this will break if it ever decides to (ie. to support multiple sw classes). Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nv50: fix g98+ vdec class allocationBen Skeggs2015-12-221-6/+51
| | | | | | | | | | | | | | | | | | | The kernel previously exposed incorrect classes for some of the chipsets that this code supports. It no longer does, but the older object ioctls have compatibility to avoid breaking userspace. This needs to be fixed before switching over to the newer interfaces. Rather than hardcoding chipset->class like the rest of the driver does, this makes use of (new) sclass queries to determine what's available. v2. - update to use symbolic class identifier from <nvif/class.h> Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nouveau: remove use of deprecated nouveau_device_wrap()Ben Skeggs2015-12-223-8/+28
| | | | | | | | | | | | | | Switching to the newer libdrm entry-points tells libdrm that it's OK to make use of newer kernel interfaces. We want to be able to isolate any bugs to either the interfaces changes, or the use of NVIF itself. As such, this commit has a slight hack which forces libdrm to continue using the older kernel interfaces. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nouveau: fix screen creation failure pathsBen Skeggs2015-12-225-25/+33
| | | | | | | | | | | | | | The winsys layer would attempt to cleanup the nouveau_device if screen init failed, however, in most paths the pipe driver would have already destroyed it, resulting in accesses to freed memory etc. This commit fixes the problem by allowing the winsys to detect whether the pipe driver's destroy function needs to be called or not. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nouveau: return nouveau_screen from hw-specific creation functionsBen Skeggs2015-12-225-11/+11
| | | | | | | | | Kills off a void cast. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nouveau: remove use of deprecated nouveau_device::drm_versionBen Skeggs2015-12-227-12/+15
| | | | | | | | | v2. update for libdrm nouveau_drm::lib_version removal Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nouveau: remove use of deprecated nouveau_device::fdBen Skeggs2015-12-223-1/+3
| | | | | | | Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* r600: fix viewport clipping handling (v2)Dave Airlie2015-12-223-12/+15
| | | | | | | | | | | | | | If oViewport is written, vertex reuse need to be turned off. If oViewport is constant, vertex reuse is fine, and VPORT_PROVOKE_DISABLE need to be set. (we don't have enough info to program VPORT_PROVOKE). Fixes: arb_viewport_array-render-viewport-2 and some CTS tests. v2: drop vport provoke write, drop initial state writing this on evergreen, only program it on evergreen. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix viewport clipping handling. (v2)Dave Airlie2015-12-221-1/+4
| | | | | | | | | | | | | | | If oViewport is written, vertex reuse need to be turned off. If oViewport is constant, vertex reuse is fine, and VPORT_PROVOKE_DISABLE need to be set. (We don't know if oViewport is constant so we skip this.) Fixes: arb_viewport_array-render-viewport-2 and some CTS tests. v2: drop writing to provoke disable, drop write in initial state. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: drop VTX_CNT_EN write from initial stateDave Airlie2015-12-221-8/+4
| | | | | | | we always program this in shader stages atom now. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/radeon: fix regression in a number of driver queriesNicolai Hähnle2015-12-211-3/+3
| | | | | | | | This rather silly mistake was introduced by commit 01910676. Cc: "11.1" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vc4: Do instruction scheduling on the QIR to hide texture fetch latency.Eric Anholt2015-12-184-0/+624
| | | | | | | | | | | | | | | | | | | | This is a rewrite of vc4_opt_qpu_schedule.c to operate on QIR. Texture fetch can probably take as much as the rest of the cycles of the program, so it's important to hide our other cycles during it (which is hard to do after register allocation). Also, we can queue up multiple texture requests before collecting the resulting samples, so that we keep the texture unit busy more of the time. High-settings openarena performance +2.35849% +/- 0.221154% (n=7). Also about 2-3% on the multiarb demo. 8 piglit tests (ext_framebuffer_multisample accuracy depthstencil) go from failing in rendering to failing in register allocation, but hopefully I can fix that up with some better register pressure handling here. total instructions in shared programs: 87723 -> 88448 (0.83%) instructions in affected programs: 78411 -> 79136 (0.92%) total estimated cycles in shared programs: 276583 -> 246306 (-10.95%) estimated cycles in affected programs: 265691 -> 235414 (-11.40%)
* vc4: Fix latency handling for QPU texture scheduling.Eric Anholt2015-12-181-32/+50
| | | | | | There's only high latency between a complete texture fetch setup and collecting its result, not between each step of setting up the texture fetch request.
* vc4: Keep sample mask writes from being reordered after TLB writesEric Anholt2015-12-181-1/+2
| | | | | | Fixes a regression I noticed after introducing scheduling on the QIR. Cc: "11.1" <[email protected]>
* freedreno/ir3: fix 32-bit builds with pointer-to-int-cast error enabledRob Herring2015-12-181-1/+1
| | | | | | | | | Android builds with -Werror=pointer-to-int-cast causing an error on 32-bit builds. Cc: "11.0 11.1" <[email protected]> Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* nir: Delete bany, ball, fany, fall.Matt Turner2015-12-181-1/+3
| | | | | | | | | | | As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <[email protected]>