path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* ddebug: move dd_call into dd_pipe.hMarek Olšák2016-07-262-66/+66
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: separate draw call dumping logicMarek Olšák2016-07-261-21/+26
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: move all states into a separate structureMarek Olšák2016-07-263-129/+140
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: write contents of dmesg into hang reportsMarek Olšák2016-07-261-4/+25
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: implement create_batch_queryMarek Olšák2016-07-261-0/+27
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: don't use abort()Marek Olšák2016-07-261-1/+1
| | | | | | We don't want a core dump. Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: make dd_get_file_stream accept the screen onlyMarek Olšák2016-07-261-7/+8
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: clean up ddebug_screen_createMarek Olšák2016-07-261-16/+23
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: rework flags for pipe_context::dump_debug_stateMarek Olšák2016-07-262-15/+29
| | | | | | | | The pipelined hang detection mode will not want to dump everything (doing so is also time-consuming). It will only dump shaders after a draw call, and then dump the status registers separately if a hang is detected. Reviewed-by: Nicolai Hähnle <[email protected]>
* vc4: add hash table look-up for exported dmabufsRob Herring2016-07-264-3/+56
| | | | | | | | | | | | | Existing BOs must be reused when dmabufs are imported. Two cases need to be handled: a dmabuf can be created/exported and imported by the same process, and a dmabuf can be imported multiple times. Following other drivers, add a hash table to track exported BOs so the BOs get reused. v2: Whitespace fixup (by anholt) Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
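The reuse-on-import idea above can be sketched as follows. This is a minimal illustration, not the vc4 code: the `bo` struct, the fixed-size table, and the names `bo_table_lookup`, `bo_table_insert`, and `bo_import` are all hypothetical, and a real driver would more likely use Mesa's own hash table utilities and handle removal when a BO is freed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver buffer object. */
struct bo {
    uint32_t handle;   /* kernel handle identifying the underlying buffer */
    int refcount;      /* number of users sharing this BO */
};

#define TABLE_SIZE 64  /* illustrative capacity */

struct bo_table {
    struct bo *slots[TABLE_SIZE];
};

/* Return the tracked BO for `handle`, or NULL if none is tracked.
 * Linear probing; entries are never removed in this sketch, so an
 * empty slot ends the search. */
static struct bo *bo_table_lookup(struct bo_table *t, uint32_t handle)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        struct bo *b = t->slots[(handle + i) % TABLE_SIZE];
        if (!b)
            return NULL;
        if (b->handle == handle)
            return b;
    }
    return NULL;
}

/* Start tracking a BO. Returns 0 on success, -1 if the table is full. */
static int bo_table_insert(struct bo_table *t, struct bo *b)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        struct bo **slot = &t->slots[(b->handle + i) % TABLE_SIZE];
        if (!*slot) {
            *slot = b;
            return 0;
        }
    }
    return -1;
}

/* Import path: if the handle is already tracked, share the existing BO
 * instead of creating a second object for the same buffer. */
static struct bo *bo_import(struct bo_table *t, struct bo *fresh)
{
    struct bo *existing = bo_table_lookup(t, fresh->handle);
    if (existing) {
        existing->refcount++;
        return existing;
    }
    fresh->refcount = 1;
    if (bo_table_insert(t, fresh) != 0)
        return NULL;
    return fresh;
}
```

Importing the same handle twice through `bo_import` returns the first BO with its reference count raised, which covers both the same-process export/import case and the repeated-import case.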
* vc4: Disable early Z with computed depth.Eric Anholt2016-07-263-2/+11
| | | | | We don't tell the hardware whether we're computing depth, so we need to manage early Z state manually. Fixes piglit early-z.
* nvc0: use nvc0_m2mf_push_linear() to reduce code duplicationSamuel Pitoiset2016-07-261-13/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: use nve4_p2mf_push_linear() to reduce code duplicationSamuel Pitoiset2016-07-261-36/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: upload sample locations on GM20xSamuel Pitoiset2016-07-243-5/+31
| | | | | | | | This fixes a bunch of multisample piglit tests on GM206, like bin/arb_texture_multisample-texelfetch 2 -auto -fbo Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: time-elapsed query should be active for clearsRob Clark2016-07-241-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* nvc0/ir: fix up an assertion in emitUADD()Samuel Pitoiset2016-07-241-4/+3
| | | | | | | | It's illegal to have neg modifiers on both sources for OP_ADD, and it's illegal to have OP_SUB with just src0 neg. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
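One way to read the two rules above is as a single condition, assuming OP_SUB is treated as OP_ADD with the second source negated. That reading, the `op` enum, and the function name `uadd_mods_legal` are assumptions made for illustration, not the assertion from the nvc0 code generator.

```c
#include <stdbool.h>

/* Hypothetical subset of opcodes, for illustration only. */
enum op { OP_ADD, OP_SUB };

/* Fold the implicit negation of src1 for OP_SUB into the modifiers,
 * then require that at most one source ends up negated. */
static bool uadd_mods_legal(enum op op, bool neg0, bool neg1)
{
    if (op == OP_SUB)
        neg1 = !neg1;
    return !(neg0 && neg1);
}
```

Under this reading, OP_ADD with both sources negated and OP_SUB with only src0 negated are the same rejected case.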
* nvc0: fix wrong indentation in nvc0_validate_fb()Samuel Pitoiset2016-07-231-141/+141
| | | | | | Trivial. Signed-off-by: Samuel Pitoiset <[email protected]>
* freedreno/a4xx: timestamp queriesRob Clark2016-07-233-1/+34
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: hw timestamp supportRob Clark2016-07-232-2/+15
| | | | | | If the kernel supports it, use hw counter for timestamps. Signed-off-by: Rob Clark <[email protected]>
* freedreno: prep work for timestamp queriesRob Clark2016-07-233-6/+10
| | | | | | | | | We need the "NULL" state to be a valid bit in the bitmask, because timestamp queries are not restricted to draw/etc. stages (i.e. the only commands to submit may just be to read the timestamp). The absence of draws is not a reason to skip the flush and return zero. Signed-off-by: Rob Clark <[email protected]>
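A small sketch of why the "NULL" state needs its own bit: if it were the value 0, it could never match any active-stage mask. The enum values and the helper `query_active` are hypothetical and only illustrate the bitmask argument, not the freedreno definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage bits. The "null" stage gets a real bit so that a
 * query can list it among the stages during which it is active. */
enum stage {
    STAGE_NULL  = 1u << 0,  /* no draw or clear in progress */
    STAGE_DRAW  = 1u << 1,
    STAGE_CLEAR = 1u << 2,
};

/* A query is active when the current stage is in its mask. */
static bool query_active(uint32_t active_mask, enum stage current)
{
    return (active_mask & (uint32_t)current) != 0;
}
```

With a zero-valued null stage the test `mask & 0` is always false, so a timestamp read with no draws submitted would be skipped.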
* radeonsi: ensure sample locations are set for line and polygon smoothingNicolai Hähnle2016-07-231-2/+1
| | | | | | | Since commit d938b8c, the sample locations are no longer set unconditionally, so we need to set the atom to dirty on all chips, not just Polaris. Cc: 12.0 <[email protected]>
* radeonsi: fix Polaris MSAA regressionNicolai Hähnle2016-07-232-15/+20
| | | | | | | | | | | | | The regression was introduced by commit d938b8c. The problem here is that in order to use the small primitive filter, we need to explicitly set the sample locations to 0. But the DB doesn't properly process the change of sample locations without a flush, and so we can end up with incorrect Z values. Instead of doing a flush, just disable the small primitive filter when MSAA is force-disabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908 Cc: 12.0 <[email protected]>
* freedreno/ir3: Add missing braces in initializer[email protected]2016-07-231-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a2xx: silence missing case 'SHADER_COMPUTE' warning (v2)[email protected]2016-07-231-0/+2
| | | | | | | v2: no need for break after an unreachable (Matt Turner) Signed-off-by: Francesco Ansanelli <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* radeonsi: implement buffer_subdata without indirect callsMarek Olšák2016-07-235-5/+41
| | | | | | There is less noise in CPU profile data now. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/util: don't modify usage in pipe_buffer_writeMarek Olšák2016-07-231-0/+5
| | | | | | All drivers were already doing it except virgl. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: split transfer_inline_write into buffer and texture callbacksMarek Olšák2016-07-2334-132/+186
| to reduce the call indirections with u_resource_vtbl.
|
| The worst call tree you could get was:
|   - u_transfer_inline_write_vtbl
|   - u_default_transfer_inline_write
|   - u_transfer_map_vtbl
|   - driver_transfer_map
|   - u_transfer_unmap_vtbl
|   - driver_transfer_unmap
|
| That's 6 indirect calls. Some drivers only had 5.
| The goal is to have 1 indirect call for drivers that care.
| The resource type can be determined statically at most call sites.
|
| The new interface is:
|   pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
|   pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride)
|
| v2: fix whitespace, correct ilo's behavior
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Acked-by: Roland Scheidegger <[email protected]>
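The single-indirect-call goal can be illustrated with mock types. Only the parameter order of `buffer_subdata` comes from the commit message; the `context` and `resource` structs, the parameter types, and `mock_buffer_subdata` are assumptions for this sketch and do not reproduce the Gallium headers.

```c
#include <stddef.h>
#include <string.h>

/* Mock resource with inline storage, for illustration only. */
struct resource {
    unsigned char storage[64];
};

/* Mock context exposing one callback whose parameter order follows the
 * commit message; the parameter types are assumptions. */
struct context {
    void (*buffer_subdata)(struct context *ctx, struct resource *res,
                           unsigned usage, unsigned offset,
                           unsigned size, const void *data);
};

/* A driver-side implementation: copy straight into the buffer. */
static void mock_buffer_subdata(struct context *ctx, struct resource *res,
                                unsigned usage, unsigned offset,
                                unsigned size, const void *data)
{
    (void)ctx;
    (void)usage;
    /* Reject writes that would run past the end of the storage. */
    if (offset > sizeof(res->storage) ||
        size > sizeof(res->storage) - offset)
        return;
    memcpy(res->storage + offset, data, size);
}

/* Call site: the caller knows statically that it has a buffer, so it
 * reaches the driver through exactly one function pointer. */
static void upload(struct context *ctx, struct resource *res,
                   unsigned offset, unsigned size, const void *data)
{
    ctx->buffer_subdata(ctx, res, 0, offset, size, data);
}
```

Compared with the map/write/unmap chain listed in the commit message, the call site dispatches once and the driver decides how the write is carried out.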
* nv50/ir: allow to swap sources for OP_SUBSamuel Pitoiset2016-07-221-1/+6
| This allows the load-propagation pass to swap the sources in presence of
| immediate values.
|
| Maxwell (GM107):
| total instructions in shared programs :1928187 -> 1927634 (-0.03%)
| total gprs used in shared programs    :330741 -> 330154 (-0.18%)
| total local used in shared programs   :28032 -> 28032 (0.00%)
|
|         local  gpr  inst  bytes
| helped  0      271  425   425
| hurt    0      0    194   194
|
| Fermi (GF114):
| total instructions in shared programs :2334474 -> 2333829 (-0.03%)
| total gprs used in shared programs    :380934 -> 380215 (-0.19%)
| total local used in shared programs   :33304 -> 33264 (-0.12%)
|
|         local  gpr  inst  bytes
| helped  5      314  521   521
| hurt    0      4    195   195
|
| No regressions on GM107 and GF114 with full piglit.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
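The percentages in the statistics are plain relative changes. A minimal sketch of the arithmetic, assuming the usual (after - before) / before definition; the helper name `pct_change` is hypothetical and is not part of any shader-db tooling.

```c
/* Relative change from `before` to `after`, in percent.
 * Negative values mean the count went down. */
static double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For the GM107 instruction count, 1928187 -> 1927634 is a drop of 553 instructions, about -0.029%, which rounds to the reported -0.03%.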
* gallium/radeon: make deferred flushes asynchronousMarek Olšák2016-07-221-0/+2
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* nvc0/mme: fix offsets used for indirect drawsSamuel Pitoiset2016-07-222-8/+8
| | | | | | | | | | This fixes a regression introduced in 1da704a94c57aa0b0cf8faaa3236fe47dfb8f88c because the offset has moved from 0x180 to 0x1a0, and the macros have to be re-compiled. Fixes: 1da704a ("nvc0: increase the tex handles area size in the driver") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix offsets of MP perf counters input parametersSamuel Pitoiset2016-07-221-15/+15
| | | | | | | | | | | | | This fixes a regression introduced in 1da704a94c57aa0b0cf8faaa3236fe47dfb8f88c because the offset has moved from 0x600 to 0x620, and the kernels used for reading MP perf counters have to be re-assembled. This also fixes amd_performance_monitor_measure piglit. Fixes: 1da704a ("nvc0: increase the tex handles area size in the driver") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Return V3D version details in the GL renderer info.Eric Anholt2016-07-202-1/+12
| | | | This is as close as we get to a name for the 3D blocks.
* vc4: Check the V3D version reported by the kernel.Eric Anholt2016-07-202-0/+62
| | | | | | We don't want to bring up an old userspace driver on a kernel for newer hardware. We'll also want to look at the other ident fields in the future.
* vc4: Detect and report kernel support for branching.Eric Anholt2016-07-201-2/+12
|
* vc4: Switch to using the libdrm-provided vc4_drm.h.Eric Anholt2016-07-202-280/+2
| | | | | The required version is set to .69 for the getparam ioctl that will be used in the next commit.
* swr: [rasterizer core] introduce simd16intrin.hTim Rowley2016-07-204-6/+751
| | | | | | | Refactor to leave the existing simd_* intrinsics in "simdintrin.h" unchanged and add corresponding simd16_* intrinsics, with emulation, in "simd16intrin.h". These can be adopted piecemeal to bring up avx512, rather than taking an all-or-nothing approach. Signed-off-by: Tim Rowley <[email protected]>
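The emulation approach can be sketched with plain arrays standing in for vector registers. This assumes a 16-wide operation is emulated as two 8-wide halves; the types `simd_t` and `simd16_t` and the function names are illustrative and are not taken from the swr headers.

```c
#define SIMD_WIDTH 8

/* Mock 8-wide vector; a real build would wrap a hardware register. */
typedef struct {
    float v[SIMD_WIDTH];
} simd_t;

/* Emulated 16-wide vector built from two 8-wide halves. */
typedef struct {
    simd_t lo;
    simd_t hi;
} simd16_t;

/* Existing 8-wide operation, left untouched by the refactoring. */
static simd_t simd_add_ps(simd_t a, simd_t b)
{
    simd_t r;
    for (int i = 0; i < SIMD_WIDTH; i++)
        r.v[i] = a.v[i] + b.v[i];
    return r;
}

/* New 16-wide operation, emulated by applying the 8-wide one per half. */
static simd16_t simd16_add_ps(simd16_t a, simd16_t b)
{
    simd16_t r;
    r.lo = simd_add_ps(a.lo, b.lo);
    r.hi = simd_add_ps(a.hi, b.hi);
    return r;
}
```

Because the 16-wide wrapper has its own name, callers can be converted one at a time, and the body can later be swapped for a native 16-wide implementation without touching them.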
* swr: [rasterizer core] fix for possible int32 overflow conditionTim Rowley2016-07-201-1/+1
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] rename *_MAX enum values to *_COUNTTim Rowley2016-07-205-22/+21
| | | | | | Makes these names semantically correct. Signed-off-by: Tim Rowley <[email protected]>
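A short illustration of the naming point, using a made-up enum rather than one from swr: a trailing sentinel holds the number of entries, not the largest valid value, so "COUNT" describes it better than "MAX".

```c
/* Hypothetical enum, for illustration only. */
enum shader_type {
    SHADER_VERTEX,
    SHADER_PIXEL,
    SHADER_TYPE_COUNT  /* number of real entries, not a valid shader type */
};

/* One counter per shader type; the array length is the count itself. */
static unsigned draws_per_type[SHADER_TYPE_COUNT];

/* Valid values are strictly below the count. */
static int shader_type_valid(int t)
{
    return t >= 0 && t < SHADER_TYPE_COUNT;
}
```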
* swr: [rasterizer core] centroid correctionTim Rowley2016-07-201-9/+17
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] support range of values in TemplateArgUnrollerTim Rowley2016-07-203-26/+56
| | | | | | Fixes Linux warnings. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] ensure adjacent topologies use the cut-aware PATim Rowley2016-07-201-5/+2
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer] attribute swizzling and linkageTim Rowley2016-07-2011-171/+218
| | | | | | | | | Add support for enhanced attribute swizzling. Currently supports constant source overrides to handle PrimitiveID support. No support yet for input select swizzling or wrap shortest. Removes obsoleted linkageMask and associated code. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer common] icc declspec definitionsTim Rowley2016-07-201-1/+17
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] rework vertex/instance ID storage in fetchTim Rowley2016-07-202-64/+36
| | | | | | | | Moved the setting into the existing component control code. Fixes bad interaction between attribute/component setting for vertex/instance ID and component packing. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] avx512 simd utility workTim Rowley2016-07-204-10/+1026
| | | | | | Enabling KNOB_SIMD_WIDTH = 16 for AVX512 pre-work and low level simd utils Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] viewport rounding for disabled scissorTim Rowley2016-07-201-2/+4
| | | | | | | Adjust viewport rounding when scissor rect is disabled during macro tile scissor setup. Signed-off-by: Tim Rowley <[email protected]>
* radeonsi: advertise 8 bits subpixel precision for viewport boundsJózef Kucia2016-07-201-1/+2
| | | | | Signed-off-by: Józef Kucia <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* r600: advertise 8 bits subpixel precision for viewport boundsJózef Kucia2016-07-201-1/+2
| | | | | Signed-off-by: Józef Kucia <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium: add a cap for VIEWPORT_SUBPIXEL_BITS (v2)Józef Kucia2016-07-2015-0/+15
| | | | | | | | | | | | This allows Gallium drivers to advertise the subpixel precision for floating point viewports bounds. v2: - Set ViewportSubpixelBits in st_init_limits. Signed-off-by: Józef Kucia <[email protected]> Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
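A driver-side sketch of advertising the cap. The `cap` enum and `driver_get_param` are mock names for illustration; the value 8 matches what the radeonsi and r600 entries in this log advertise, while returning 0 for everything else is an assumption of this sketch.

```c
/* Mock capability identifiers, for illustration only. */
enum cap {
    CAP_VIEWPORT_SUBPIXEL_BITS,
    CAP_OTHER
};

/* Answer a capability query from the state tracker. */
static int driver_get_param(enum cap cap)
{
    switch (cap) {
    case CAP_VIEWPORT_SUBPIXEL_BITS:
        return 8;  /* subpixel precision for viewport bounds, in bits */
    default:
        return 0;  /* assumption: unknown caps report 0 */
    }
}
```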
* nvc0: disable MS images on GM107+Samuel Pitoiset2016-07-201-0/+7
| | | | | | | | MS images have to be handled explicitly and I don't plan to implement them for now. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>