aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* swr: [rasterizer codegen] Fix generation of knobsTim Rowley2017-03-208-8/+23
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer codegen] Change backend template comment styleTim Rowley2017-03-201-29/+29
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer codegen] Rewrite gen_llvm_ir_macros.py to use makoTim Rowley2017-03-206-349/+204
| | | | | | Don't create/use cpp files, header only now. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer codegen] Quiet gen_backends.py executionTim Rowley2017-03-201-3/+3
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer scripts] Put codegen scripts into a separate directoryTim Rowley2017-03-2032-84/+84
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Fix trifan regression from 9d3442575fTim Rowley2017-03-202-5/+11
| | | | | | | | Fixes piglit triangle-rasterization-overdraw. SIMD16 path not working. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] SIMD16 Frontend WIP - fix tesselation crashesTim Rowley2017-03-203-31/+35
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer jitter] Fix LogicOp blend jit after assert changesTim Rowley2017-03-201-10/+25
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Convert more SWR_ASSERT(false, ...) to SWR_INVALID(...)Tim Rowley2017-03-2018-57/+57
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Fix typo in SIMD16 code pathTim Rowley2017-03-201-1/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core/common] Fix the native AVX512 build under ICCTim Rowley2017-03-203-27/+47
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Allow no arguments to SWR_INVALID macroTim Rowley2017-03-201-1/+13
| | | | | | Turns out this is somewhat tricky with gcc/g++. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Slight assert refactoringTim Rowley2017-03-2017-256/+296
| | | | | | | | Make asserts more robust. Add SWR_INVALID(...) as a replacement for SWR_ASSERT(0, ...) Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Backend code adjustmentsTim Rowley2017-03-205-45/+70
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Fix the early and late depthstencil eventsTim Rowley2017-03-201-5/+5
| | | | | | The coverage and stencil mask arguments were reversed. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Implement double pumped SIMD16 TESSTim Rowley2017-03-201-79/+177
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast/core/scripts] Fix archrast multithreading issueTim Rowley2017-03-206-16/+52
| | | | | | | | Per pixel stats are cached but were not always being flushed as threads moved from one draw context to the next. Added an explicit flush to allow all archrast objects to flush any cached events. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Remove redundant data from archrast filesTim Rowley2017-03-202-137/+103
| | | | | | | If count can be derived from other counts then this can be done in post processing scripts. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast/scripts] Further archrast cleanupsTim Rowley2017-03-203-164/+104
| | | | | | Removed redundant data being written out to file Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Fix RECT_LIST primitive assemblyTim Rowley2017-03-201-2/+2
| | | | | | | The bug would make the 3rd component of attributes on the second triangle of a RECT be invalid. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer common] Add InterpolateComponentFlat utilityTim Rowley2017-03-201-0/+13
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Fix performance issue with archrast statsTim Rowley2017-03-201-15/+15
| | | | | | | Performance is now 50x faster with archrast now that we're properly filtering out all of the rdtsc begin/end. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Implement SIMD16 GS and STREAMOUTTim Rowley2017-03-201-51/+251
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Add additional API eventsTim Rowley2017-03-202-0/+48
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core/scripts] Autogen backend initialization function(s)Tim Rowley2017-03-207-226/+398
| | | | | | | | | | | Autogen functions that instantiates different BackendPixelRate templates. Functions get split into separate files after reaching a user defined threshold (currently 512 per file) to speed up compilation. This change will enable the addition of more template flags in the pixel back end. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] backend.h declares gBackendPixelRateTableTim Rowley2017-03-202-1/+8
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Finish SIMD16 PA OPT including tesselationTim Rowley2017-03-201-21/+247
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Finish SIMD16 PA OPT except tesselationTim Rowley2017-03-202-274/+1405
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Support sparse numa id values on all OSesTim Rowley2017-03-201-27/+53
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* r600g/sb: Fix memory leak by reworking uses list (rebased)Constantine Kharlamov2017-03-204-61/+28
| | | | | | | | | | | | | | | | | | | | | | | | | The author is Heiko Przybyl(CC'ing), the patch is rebased on top of Bartosz Tomczyk's one per Dieter Nützel's comment. Tested-by: Constantine Charlamov <[email protected]> v2: Resend the patch again through git-email. The prev. rebase was sent through Thunderbird, which screwed up tab characters, making the patch not apply. -------------- When fixing the stalls on evergreen I introduced leaking of the useinfo structure(s). Sorry. Instead of allocating a new object to hold 3 values where only one is actually used, rework the list to just store the node pointer. Thus no allocating and deallocation is needed. Since use_info and use_kind aren't used anywhere, drop them and reduce code complexity. This might also save some small amount of cycles. Thanks to Bartosz Tomczyk for finding the bug. Reported-by: Bartosz Tomczyk <bartosz.tomczyk86 at gmail.com <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>> Signed-off-by: Heiko Przybyl <lil_tux at web.de <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>> Supersedes: https://patchwork.freedesktop.org/patch/135852 Signed-off-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: check the IR type before waiting for a compute compilation fenceMarek Olšák2017-03-201-1/+3
| | | | | | | This should fix OpenCL getting stuck. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100288 Reviewed-by: Samuel Pitoiset <[email protected]>
* si_descriptor: move velems nullity check before dereferenceJulien Isorce2017-03-201-4/+11
| | | | | | | | | CID 1399479: Dereference before null check (REVERSE_INULL) check_after_deref: Null-checking velems suggests that it may be null, but it has already been dereferenced on all paths leading to the check. Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* si_pipe: remove nullity check after dereferenceJulien Isorce2017-03-201-3/+0
| | | | | | | | | sscreen cannot be NULL CID 1354483 Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* r600g: Fix out of bounds accessBartosz Tomczyk2017-03-202-20/+22
| | | | | | | | | fc_sp variable should indicate number of elements in fc_stack array, but fc_sp was increased at beginning of fc_pushlevel function. It leads to situation where idx=0 was never used, and last 32 element was stored outside fs_stack array. Signed-off-by: Marek Olšák <[email protected]>
* r600g: update sb documentationConstantine Kharlamov2017-03-201-3/+6
| | | | | | | v2: s/r600/r600g in the title Signed-off-by: Constantine Kharlamov <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* r600g: make condition clearerConstantine Kharlamov2017-03-201-6/+8
| | | | | | | | | | | | | | | | | The second check in the old code looked pretty much unreachable, esp. because it's not obvious that "max_entries" could be zero. To find out that it was intentional I had to run some checks, and to dig into the old versions of the file. So, rewrite the check to make the intention clear. v2: s/r600/r600g in the title, and per Dieter Nützel's comment wrap lines of condition. Signed-off-by: Constantine Kharlamov <[email protected]> Signed-off-by: Marek Olšák <[email protected]> Acked-by: Dieter Nützel <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* nv30: create uploader after pipe->screen is setIlia Mirkin2017-03-191-6/+6
| | | | | | Fixes crashes after recent upload rework. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: enable TEX_LZ and TXF_LZIlia Mirkin2017-03-183-4/+17
| | | | | | | | | There should be minimal gain, if any, for nvc0, but nv50 may end up noticing more often that the lod argument is uniform. This, in turn, will remove the need for some unnecessary transformations, which were being hit due to the checks being done pre-ssa. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: treat FMA like MAD for operand propagationKarol Herbst2017-03-181-0/+1
| | | | | | | | | | | | | | | | | | Helps mainly Feral-ported games, due to their use of fma() shader-db changes: total instructions in shared programs : 3901147 -> 3842505 (-1.50%) total gprs used in shared programs : 471258 -> 467359 (-0.83%) total local used in shared programs : 27405 -> 27361 (-0.16%) total bytes used in shared programs : 35749888 -> 35214176 (-1.50%) local gpr inst bytes helped 17 1829 4091 4091 hurt 4 44 3 3 Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* gallium/radeon: formalize that create_batch_query doesn't need pipe_contextMarek Olšák2017-03-173-13/+12
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* gallium/radeon: formalize that create_query doesn't need pipe_contextMarek Olšák2017-03-173-32/+32
| | | | | | for threaded gallium Reviewed-by: Timothy Arceri <[email protected]>
* gallium/radeon: reference pipe_resource in pipe_transferMarek Olšák2017-03-172-2/+5
| | | | | | for threaded gallium Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: compile all TGSI compute shaders asynchronouslyMarek Olšák2017-03-171-44/+81
| | | | | | required by threaded gallium Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: require that compiler threads are enabledMarek Olšák2017-03-172-11/+13
| | | | | | | threaded gallium can't use pipe_context's LLVM target machine, because create_shader_selector can be called from a non-driver thread. Reviewed-by: Timothy Arceri <[email protected]>
* trace: remove leftover assertions after pipe_resource wrapping removalMarek Olšák2017-03-171-6/+0
|
* swr: support layer output in geometry shadersIlia Mirkin2017-03-151-0/+2
| | | | | | | | | This makes bin/gl-3.2-layered-rendering-gl-layer-render fail only with 2DMS_ARRAY, which is expected given the lackluster MSAA support. However all the regular types pass. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: validate backend state numAttributesTim Rowley2017-03-151-0/+2
| | | | | | | | General protection and prevents us from smashing the stack on the first clear state validation (a7b8d50bcb). Fixes crash using icc. Reviewed-by: Bruce Cherniak <[email protected]>
* radeonsi: implement TGSI opcodes TEX_LZ and TXF_LZMarek Olšák2017-03-152-6/+16
| | | | | | | | | | This massively decreases VGPR spilling for DiRT Showdown, because we no longer have to use v4i32 for 2D fetches when level == 0. We now use v2i32 for those cases. DiRT Showdown - Spilled VGPRs: -26 (-81%) This surprisingly doesn't have any useful effect on performance (+ 0.05%).
* gallium: add PIPE_CAP_TGSI_TEX_TXF_LZMarek Olšák2017-03-1515-0/+15
|
* radeonsi: disable sinking common instructions down to the end blockSamuel Pitoiset2017-03-151-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Initially this was a workaround for a bug introduced in LLVM 4.0 in the SimplifyCFG pass that caused image instrinsics to disappear (because they were badly sunk). Finally, this is a win because it decreases SGPR spilling and increases the number of waves a bit. Although, shader-db results are good I think we might want to remove it in the future once the issue is fixed. For now, enable it for LLVM >= 4.0. This also fixes a rendering issue with the speedometer in Dirt Rally. More information can be found here https://reviews.llvm.org/D26348. Thanks to Dave Airlie for the patch. v2: - add a FIXME comment - use if (HAVE_LLVM >= 0x0400) instead Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988 Signed-off-by: Samuel Pitoiset <[email protected]> Cc: 17.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>