path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* r600g/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 | Tom Stellard | 2014-10-31 | 8 | -97/+180
    v2:
      - Drop dependency on LLVM >= 3.5.1
* gallium/radeon: Add query for symbol specific config information | Tom Stellard | 2014-10-31 | 3 | -0/+86
    This adds a query which allows drivers to access the config
    information of a specific function within the LLVM generated ELF
    binary. This makes it possible for the driver to handle ELF binaries
    with multiple kernels / global functions.
* r600g: Delete unused variable 'max_global_size' in 'r600_get_compute_param' | Dieter Nützel | 2014-10-30 | 1 | -1/+0
    Signed-off-by: Dieter Nützel <[email protected]>
* radeon/llvm: Dynamically allocate branch/loop stack arrays | Michel Dänzer | 2014-10-29 | 2 | -6/+37
    This prevents us from silently overflowing the stack arrays, and
    allows arbitrary stack depths.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454
    Cc: [email protected]
    Reported-and-Tested-by: Nick Sarnie <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
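The idea behind this change can be sketched roughly as follows. This is a minimal illustration of a stack that grows on demand instead of overflowing a fixed-size array. The type and function names (`ptr_stack`, `ptr_stack_push`, and so on), the initial capacity of 16, and the doubling policy are assumptions made for the sketch, not the identifiers or policy used in the driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable stack of opaque pointers; not the driver's real type. */
struct ptr_stack {
    void   **items;     /* heap-allocated storage, NULL while empty */
    unsigned depth;     /* number of entries currently in use */
    unsigned capacity;  /* number of entries allocated */
};

/* Push one entry, growing the storage when it is full.
 * Returns 0 on success and -1 if the allocation fails. */
int ptr_stack_push(struct ptr_stack *s, void *item)
{
    if (s->depth == s->capacity) {
        /* Assumed policy: start at 16 entries, then double. */
        unsigned new_cap = s->capacity ? s->capacity * 2 : 16;
        void **tmp = realloc(s->items, new_cap * sizeof(*tmp));
        if (!tmp)
            return -1;  /* the old storage is still valid */
        s->items = tmp;
        s->capacity = new_cap;
    }
    s->items[s->depth++] = item;
    return 0;
}

/* Pop the most recent entry; popping an empty stack is a caller bug. */
void *ptr_stack_pop(struct ptr_stack *s)
{
    assert(s->depth > 0);
    return s->items[--s->depth];
}

/* Release the storage and reset the stack to its empty state. */
void ptr_stack_free(struct ptr_stack *s)
{
    free(s->items);
    s->items = NULL;
    s->depth = 0;
    s->capacity = 0;
}
```

A zero-initialized `struct ptr_stack` is a valid empty stack, so a caller only has to check the return value of `ptr_stack_push` rather than track a maximum depth.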
* vc4: Add support for ARL and indirect register access on TGSI_FILE_CONSTANT. | Eric Anholt | 2014-10-28 | 10 | -34/+407
    Fixes 14 ARB_vp tests (which had no lowering done), and should
    improve performance of indirect uniform array access in GLSL.
* vc4: Fix mixup of return type in reloc_tex(). | Eric Anholt | 2014-10-28 | 1 | -2/+2
* vc4: Drop redundant check for is_tmu_write(). | Eric Anholt | 2014-10-28 | 1 | -3/+0
    This function is only called when it would return true.
* vc4: Don't forget to validate code that's got PROG_END on it. | Eric Anholt | 2014-10-28 | 1 | -5/+6
    This signal doesn't terminate the program now, it terminates the
    program soon. So you have to actually validate the code in the
    instruction.
* vc4: Add .dir-locals.el for kernel style in the kernel code. | Eric Anholt | 2014-10-28 | 1 | -0/+12
* vc4: Fix a couple missing '\n's in error output. | Eric Anholt | 2014-10-28 | 2 | -2/+2
* r300g: only set clip_halfz for chips with HW TCL | Marek Olšák | 2014-10-28 | 1 | -1/+1
    I forgot that we cannot emit vertex shader state on a chip without
    VS. In such a case, clip_halfz is handled by the Draw module.
* radeonsi: fix incorrect index buffer max size for lowered 8-bit indices | Marek Olšák | 2014-10-28 | 1 | -1/+1
    Cc: 10.2 10.3 [email protected]
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix polygon mode for points and lines and point/line fill modes | Marek Olšák | 2014-10-28 | 1 | -3/+3
    Fixes piglit/polygon-mode-offset.

    Cc: 10.2 10.3 [email protected]
    Reviewed-by: Michel Dänzer <[email protected]>
* r600g: fix polygon mode for points and lines and point/line fill modes | Marek Olšák | 2014-10-28 | 2 | -6/+6
    Fixes piglit/polygon-mode-offset.

    Cc: 10.2 10.3 [email protected]
    Reviewed-by: Michel Dänzer <[email protected]>
* r600g: Implement sm5 UBO/sampler indexing | Glenn Kennard | 2014-10-28 | 7 | -19/+164
    Caveat: Shaders using UBO/sampler indexing will not be optimized by
    SB, due to SB not currently supporting the necessary CF_INDEX_[01]
    index registers.

    Signed-off-by: Glenn Kennard <[email protected]>
* r600g: Implement sm5 interpolation functions | Glenn Kennard | 2014-10-28 | 2 | -3/+237
    Requires evergreen/cayman

    Signed-off-by: Glenn Kennard <[email protected]>
* nv50: handle inverted render conditions | Tobias Klausmann | 2014-10-26 | 4 | -10/+51
    This enables ARB_conditional_render_inverted.

    Signed-off-by: Tobias Klausmann <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: consider instruction neighbors in cp | Rob Clark | 2014-10-25 | 2 | -11/+178
    Fanin (merge) nodes require their srcs to be "adjacent" in
    consecutive scalar registers. Keep track of instruction neighbors in
    the copy-propagation step and avoid eliminating mov's which would
    cause an instruction to need multiple distinct left and/or right
    neighbors.

    This lets us not fall on our face when we encounter things like:

      1: MOV TEMP[2], IN[0].xyzw
      2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
      3: MOV TEMP[2].xy, IN[0].yxzz
      4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
      5: END

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: always mov tex coords | Rob Clark | 2014-10-25 | 1 | -54/+30
    Always insert extra mov's for the tex coord into the fanin. This
    simplifies things a bit, and avoids a scenario where multiple sam
    instructions can have mutually exclusive inputs to their fanins, for
    example:

      1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
      2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D

    The CP pass can always remove the mov's that are not actually
    needed, so better to start out with too many mov's in the front end,
    than not enough.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno: rename a couple debug flags | Rob Clark | 2014-10-25 | 3 | -7/+7
    dscis -> noscis
    dbypass -> nobypass

    a bit more consistent w/ nobin, etc. And IMO a bit more sensible
    names.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: skip virtual outputs in standalone compiler | Rob Clark | 2014-10-25 | 1 | -0/+3
    Kills get added to the outputs list, to ensure they get scheduled.
    But they aren't *really* outputs so skip them in the header comment
    block.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: standalone compiler updates for ir3test | Rob Clark | 2014-10-25 | 4 | -18/+51
    In order to test compiler changes more easily, spit out the
    assembled shader with some header information so that we can know
    about inputs/outputs more easily. See:

      git://people.freedesktop.org/~robclark/ir3test

    In ir3test we have a big collection of tgsi shaders and reference
    ir3_compiler outputs. When making compiler changes, regenerate the
    compiler outputs and feed to ir3test to compare the new vs reference
    shader.

    Signed-off-by: Rob Clark <[email protected]>
* ilo: improve blob decoding | Chia-I Wu | 2014-10-25 | 1 | -8/+31
    The last few dwords were skipped if the total number of dwords was
    not a multiple of 4. Change the formatting for better readability.

    Signed-off-by: Chia-I Wu <[email protected]>
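The bug class described here (dropping a trailing partial row) can be illustrated with a small sketch. It prints dwords four per row and clamps the end of the last row to the total count, so a tail of one to three dwords is still printed. The function name `dump_dwords` and the output layout are assumptions for illustration; they are not taken from the ilo decoder.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Print `count` dwords, four per row, and return how many were printed.
 * The end of each row is clamped to `count`, so a final row holding only
 * one to three dwords is still emitted. */
size_t dump_dwords(FILE *fp, const uint32_t *dw, size_t count)
{
    size_t printed = 0;

    assert(fp != NULL);
    assert(dw != NULL || count == 0);

    for (size_t i = 0; i < count; i += 4) {
        size_t end = (count - i < 4) ? count : i + 4;

        /* Row prefix: byte offset of the first dword in the row. */
        fprintf(fp, "0x%08zx:", i * sizeof(uint32_t));
        for (size_t j = i; j < end; j++) {
            fprintf(fp, " 0x%08" PRIx32, dw[j]);
            printed++;
        }
        fputc('\n', fp);
    }
    return printed;
}
```

A loop written as `for (i = 0; i + 4 <= count; i += 4)` would reproduce the behavior the commit message describes, because it never enters the body for the final partial row.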
* llvmpipe: Ensure the packed input of the lp_test_format is aligned. | José Fonseca | 2014-10-24 | 1 | -2/+10
    Fixes:
    - https://bugs.freedesktop.org/show_bug.cgi?id=85377
    - http://llvm.org/bugs/show_bug.cgi?id=21365

    Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Flush stdout on lp_test_* unit tests. | José Fonseca | 2014-10-24 | 2 | -0/+3
    So that the order of test messages and gallivm/llvmpipe debug output
    is preserved.

    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: introduce PIPE_CAP_CLIP_HALFZ. | Mathias Fröhlich | 2014-10-24 | 13 | -0/+16
    In preparation of ARB_clip_control. Let the driver decide if it
    supports pipe_rasterizer_state::clip_halfz being set to true.

    v3: Initially enable on ilo.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Signed-off-by: Mathias Froehlich <[email protected]>
* vc4: Reuse uniform_data/contents indices when making uniforms. | Eric Anholt | 2014-10-24 | 1 | -0/+7
    This allows vc4_opt_cse.c to CSE-away operations involving the same
    uniform values.

    total instructions in shared programs: 37341 -> 36906 (-1.16%)
    instructions in affected programs: 10233 -> 9798 (-4.25%)
    total uniforms in shared programs: 10523 -> 10320 (-1.93%)
    uniforms in affected programs: 2467 -> 2264 (-8.23%)
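Why reusing indices helps CSE can be shown with a small sketch: if every request for the same (contents, data) pair returns the same slot, two operations that read that uniform end up with identical operands and can be recognized as duplicates. The table layout, the fixed `MAX_UNIFORMS` limit, and the name `get_uniform_slot` are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_UNIFORMS 64  /* hypothetical fixed limit for the sketch */

/* Hypothetical uniform table: slot i holds one (contents, data) pair. */
struct uniform_table {
    int      contents[MAX_UNIFORMS];
    uint32_t data[MAX_UNIFORMS];
    unsigned count;
};

/* Return the slot for (contents, data). An existing slot is reused when the
 * same pair was requested before; otherwise a new slot is appended. */
unsigned get_uniform_slot(struct uniform_table *t, int contents, uint32_t data)
{
    for (unsigned i = 0; i < t->count; i++) {
        if (t->contents[i] == contents && t->data[i] == data)
            return i;
    }

    assert(t->count < MAX_UNIFORMS);
    t->contents[t->count] = contents;
    t->data[t->count] = data;
    return t->count++;
}
```

The linear search is the simplest choice for a sketch and is adequate while tables stay small; a hash lookup would be the obvious alternative for larger ones.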
* vc4: When asked to discard-map a whole resource, discard it. | Eric Anholt | 2014-10-24 | 1 | -14/+28
    This saves a bunch of extra flushes when texsubimaging a whole
    texture that's been used for rendering, or subdataing a whole BO. In
    particular, this massively reduces the runtime of piglit
    texture-packed-formats (when the probes have been moved out of the
    inner loop).
* vc4: Refactor flushing before mapping a BO. | Eric Anholt | 2014-10-24 | 3 | -12/+13
    I'm going to want to make some other decisions here before flushing.
* vc4: Allow dead code elimination of unused varyings. | Eric Anholt | 2014-10-24 | 5 | -5/+31
    total instructions in shared programs: 39022 -> 37341 (-4.31%)
    instructions in affected programs: 26979 -> 25298 (-6.23%)
    total uniforms in shared programs: 11242 -> 10523 (-6.40%)
    uniforms in affected programs: 5836 -> 5117 (-12.32%)
* vc4: Add debug output to match shaderdb info to program dumps. | Eric Anholt | 2014-10-24 | 4 | -7/+29
    I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb
    stats, but when debugging regressions, I want to match shaderdb
    output to shader disassembly.
* radeon: enable Hyper-Z on r600g and radeonsi by default | Andreas Boll | 2014-10-24 | 4 | -5/+5
    This reverts commit 01e637114914453451becc0dc8afe60faff48d84.

    Since then many Hyper-Z issues have been fixed or worked around.
    Enable Hyper-Z by default so that we get enough feedback for the
    upcoming mesa 10.4 release.

    If you have issues with Hyper-Z, try disabling it with the
    environment variable R600_DEBUG=nohyperz and please report the issue
    on the bugtracker.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011
    See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112
    Signed-off-by: Andreas Boll <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* Revert "freedreno/a3xx: only emit dirty consts" | Rob Clark | 2014-10-23 | 2 | -9/+5
    This reverts commit 94bb33617d1e8978dc52b8aaa4eb41bfb6703f79.

    Which somehow broke gnome-shell.. and needs more investigation. For
    now, revert..
* freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE | Rob Clark | 2014-10-23 | 1 | -7/+6
    fd_bo_cpu_prep() doesn't realize the bo is already referenced in
    unflushed cmdstream. It could be made to do so (but would have to be
    implemented twice, ie. both for msm and kgsl). But we still can't do
    the expected thing if the caller isn't using _NOSYNC. Because of the
    way the tiling works, we need to build quite a bit of cmdstream at
    flush time, which is not possible to do at the libdrm level.

    So rather than trying to make fd_bo_cpu_prep() smarter than it can
    possibly be, just *always* discard and reallocate if the
    PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set.

    Signed-off-by: Rob Clark <[email protected]>
* gallium/nouveau: fully build the driver under android | Mauro Rossi | 2014-10-23 | 1 | -1/+1
    Fix the trivial typo in the variable name.

    Cc: "10.2 10.3" <[email protected]>
* gallivm,llvmpipe,clover: Bump required LLVM version to 3.3. | José Fonseca | 2014-10-23 | 2 | -14/+0
    We'll need to update gallivm for the interface changes in LLVM 3.6,
    and the fewer the number of older LLVM versions we support the less
    hairy that will be.

    As a consequence the HAVE_AVX define can disappear. (Note HAVE_AVX
    meant whether the LLVM version supports AVX or not. Runtime support
    for AVX is always checked and enforced independently.)

    Verified llvmpipe builds and runs with LLVM 3.3, 3.4, and 3.5.

    Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: implement pipe_rasterizer_state::clip_halfz | Marek Olšák | 2014-10-22 | 1 | -0/+1
    Reviewed-by: Michel Dänzer <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: implement pipe_rasterizer_state::clip_halfz | Marek Olšák | 2014-10-22 | 2 | -0/+2
    Reviewed-by: Alex Deucher <[email protected]>
* r300g: implement pipe_rasterizer_state::clip_halfz | Marek Olšák | 2014-10-22 | 3 | -0/+9
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: Drop references to destroyed blend state | Michel Dänzer | 2014-10-22 | 1 | -1/+8
    Fixes use-after-free when the currently bound blend state is
    destroyed.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140
    Reviewed-by: Marek Olšák <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
    Cc: [email protected]
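The general pattern behind this kind of fix can be sketched as follows: before freeing a state object, clear any context pointer that still refers to it, so later draws cannot dereference freed memory. The structures and the function name `delete_blend_state` are hypothetical placeholders, not the r600g code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical placeholder for a driver blend state object. */
struct blend_state {
    int dummy;
};

/* Hypothetical context holding a non-owning pointer to the bound state. */
struct blend_context {
    struct blend_state *bound_blend;
};

/* Destroy a blend state. If it is the one currently bound, drop the
 * context's reference first so no dangling pointer is left behind. */
void delete_blend_state(struct blend_context *ctx, struct blend_state *state)
{
    assert(ctx != NULL);

    if (ctx->bound_blend == state)
        ctx->bound_blend = NULL;

    free(state);
}
```

Code that later consumes `bound_blend` then has to tolerate a NULL pointer, which is an explicit and checkable condition, unlike a pointer to freed memory.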
* freedreno/a3xx: fix depth/stencil restore format | Rob Clark | 2014-10-21 | 1 | -1/+5
    Also fix z16 restore format which was completely wrong.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix viewport state during clear | Rob Clark | 2014-10-21 | 1 | -1/+19
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: mark scissor state dirty when enable bit changes | Rob Clark | 2014-10-21 | 1 | -0/+10
    We don't have a scissor enable bit in hw, so when a raster state
    change results in scissor enable bit changing, we need to also mark
    scissor state as dirty.

    Signed-off-by: Rob Clark <[email protected]>
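The rule stated above can be sketched as a bind function that compares the scissor enable of the old and new rasterizer state and flags the scissor for re-emission when they differ. The dirty-bit names, the one-field `raster_state`, and `bind_rasterizer` are assumptions for the sketch and do not mirror the freedreno structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DIRTY_RASTERIZER (1u << 0)  /* hypothetical dirty bits */
#define DIRTY_SCISSOR    (1u << 1)

/* Hypothetical rasterizer state, reduced to the one field that matters. */
struct raster_state {
    bool scissor;
};

/* Hypothetical context tracking the bound state and pending re-emits. */
struct raster_context {
    const struct raster_state *rasterizer;  /* bound state, or NULL */
    unsigned dirty;
};

/* Bind a rasterizer state. With no enable bit in hardware, the scissor
 * registers encode "disabled" themselves, so a change of the enable flag
 * must also mark the scissor state dirty. */
void bind_rasterizer(struct raster_context *ctx, const struct raster_state *rs)
{
    assert(ctx != NULL);

    bool old_scissor = ctx->rasterizer != NULL && ctx->rasterizer->scissor;
    bool new_scissor = rs != NULL && rs->scissor;

    ctx->rasterizer = rs;
    ctx->dirty |= DIRTY_RASTERIZER;

    if (old_scissor != new_scissor)
        ctx->dirty |= DIRTY_SCISSOR;
}
```

Rebinding a state with the same enable value leaves the scissor bit untouched, so the extra re-emit happens only on an actual transition.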
* freedreno: clear vs scissor | Rob Clark | 2014-10-21 | 7 | -13/+96
    The optimization of avoiding restore (mem2gmem) if there was a clear
    falls down a bit if you don't have a fullscreen scissor. We need to
    make the decision logic a bit more clever to keep track of *what*
    was cleared, so that we can (a) completely skip mem2gmem if entire
    buffer was cleared, or (b) skip mem2gmem on a per-tile basis for
    tiles that were completely cleared.

    Signed-off-by: Rob Clark <[email protected]>
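The per-tile decision in (b) amounts to a rectangle containment test, and (a) follows from it: if the cleared region covers the whole buffer, it covers every tile. The sketch below makes the simplifying assumption that a single cleared rectangle is tracked; the `rect` type and both function names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical half-open rectangle covering [x0, x1) x [y0, y1). */
struct rect {
    int x0, y0, x1, y1;
};

/* True if `outer` fully covers `inner`. */
bool rect_contains(const struct rect *outer, const struct rect *inner)
{
    return outer->x0 <= inner->x0 && outer->y0 <= inner->y0 &&
           outer->x1 >= inner->x1 && outer->y1 >= inner->y1;
}

/* A tile needs a restore (mem2gmem) unless the cleared region covers it
 * completely. `cleared` is NULL when nothing was cleared. */
bool tile_needs_restore(const struct rect *cleared, const struct rect *tile)
{
    assert(tile != NULL);
    return cleared == NULL || !rect_contains(cleared, tile);
}
```

For example, with a 64x64 buffer split into 32x32 tiles and a clear scissored to the left half, the two left tiles skip the restore while the two right tiles still need it.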
* r600g,radeonsi: convert TGSI shader type to LLVM shader type | Marek Olšák | 2014-10-21 | 1 | -1/+30
    The values are hardcoded in the LLVM backend, but the TGSI
    definitions are going to be changed with tessellation, e.g.
    TGSI_PROCESSOR_COMPUTE will be increased by 2.

    We'll use VS for LS and HS, because there's nothing special about
    them from the LLVM backend point of view, even though the hardware
    side is different. We do the same for ES.

    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add some missing register definitions | Marek Olšák | 2014-10-21 | 1 | -0/+23
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: load ring resource descriptors only once | Marek Olšák | 2014-10-21 | 1 | -35/+42
    v2: document the new functions

    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: clarify shader constant load functions | Marek Olšák | 2014-10-21 | 1 | -40/+46
    I'll need indexed loads without the meta data flag for tessellation
    later. Also rename load_const to buffer_load_const to distinguish it
    from indexed const loads.

    v2: add comments

    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: statically declare resource and sampler arrays | Marek Olšák | 2014-10-21 | 1 | -8/+2
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove conversion of DX9 FACE input to GL | Marek Olšák | 2014-10-21 | 1 | -14/+1
    st/mesa and gallium expect the DX9 format, so this is useless.

    Reviewed-by: Michel Dänzer <[email protected]>