summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* vc4: Fix the src count on exp2/log2.Eric Anholt2016-05-061-2/+2
| | | | Found by the upcoming QIR validate pass.
* vc4: Reuse QPU disasm's cond flags in QIR.Eric Anholt2016-05-063-27/+46
| | | | In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.
* vc4: When emitting an instruction to an existing temp, mark it non-SSA.Eric Anholt2016-05-061-0/+2
| | | | Prevents a bug in the later control-flow support series.
* vc4: Make sure that we don't overwrite the signal for PROG_END.Eric Anholt2016-05-061-0/+8
| | | | | | | | We should have already emitted a NOP due to the last instruction being a TLB or VPM write. However, if you disable dead code elimination then you might get dead code at the end, and that dead code might have the signal bits set to something non-default, at which point you die in assertion failure.
* nvc0: unreference images when the context is destroyedSamuel Pitoiset2016-05-061-0/+4
| | | | | | | Like other resources, we need to unreference all images. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi: set DECOMPRESS_Z_ON_FLUSH if nr_samples >= 4Marek Olšák2016-05-061-1/+2
| | | | | | | | Vulkan always sets this. It only affects in-place Z decompression. This is recommended for performance, but what app uses MSAA depth texturing? Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: use the hw MSAA resolving if formats are compatibleMarek Olšák2016-05-061-1/+2
| | | | | | | This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Alex Deucher <[email protected]>
* vc4: fixup for new nir_foreach_block()Connor Abbott2016-05-054-48/+20
| | | | Reviewed-by: Eric Anholt <[email protected]>
* ir3: fixup for new nir_foreach_block()Connor Abbott2016-05-051-30/+21
|
* swr: [rasterizer core] Faster modulo operator in ProcessVertsTim Rowley2016-05-051-1/+4
| | | | | | Avoid % operator, since we know that curVertex is always incrementing. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Small warning cleanupTim Rowley2016-05-052-8/+4
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Add SWR_ASSUME / SWR_ASSUME_ASSERT macrosTim Rowley2016-05-052-14/+52
| | | | | | Fix static code analysis errors found by coverity on Linux Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Miscellaneous backend changesTim Rowley2016-05-053-22/+31
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Add support for X24_TYPELESS_G8_UINT formatTim Rowley2016-05-053-7/+41
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer jitter] Fix printing bugs for tracing.Tim Rowley2016-05-051-81/+24
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer memory] Add missing store tiles functionTim Rowley2016-05-051-1/+4
| | | | | | Storing color hot tile to 8bit w-major stencil format. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer jitter] Add asserts for supported formats in fetch shaderTim Rowley2016-05-051-0/+2
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Fix thread allocationTim Rowley2016-05-051-17/+47
| | | | | | | | Fix windows in 32-bit mode when hyperthreading is disabled on Xeons. Some support for asymmetric processor topologies. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Fix threadviz support in bucketsTim Rowley2016-05-053-12/+14
| | | | | | | Need to do lazy eval of the threadviz knob since order of globals is undefined. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Whitespace cleanup and misc changesTim Rowley2016-05-055-5/+2
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* radeonsi: mark descriptor loads as using dynamically uniform indicesNicolai Hähnle2016-05-051-5/+17
| | | | | | | | This tells LLVM to always use SMEM loads for descriptors. It fixes a regression in piglit's arb_shader_storage_buffer_object/execution/indirect.shader_test that was caused by LLVM r268259 (but the proper fix is really here in Mesa). Reviewed-by: Marek Olšák <[email protected]>
* swr: Remove stall waiting for core query counters.Bruce Cherniak2016-05-054-124/+81
| | | | | | | | When gathering query results, swr_gather_stats was unnecessarily stalling the entire pipeline. Results are now collected asynchronously, with a fence marking completion. Reviewed-By: George Kyriazis <[email protected]>
* freedreno: remove null check before freeThomas Hindoe Paaboel Andersen2016-05-051-2/+1
| | | | Reviewed-by: Eduardo Lima Mitev <[email protected]>
* r600,compute: create vtx buffer for text + rodataJan Vesely2016-05-041-2/+10
| | | | | | | Reserve buffer id 2 Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* freedreno: allow ctx->draw_vbo to failRob Clark2016-05-045-30/+37
| | | | | | | Pretty much only happens if shader variant compile fails. But in this case, if we haven't emitted cmdstream, we don't want to set needs_flush. Signed-off-by: Rob Clark <[email protected]>
* freedreno: move shader-stage dirty bits to global dirty flagRob Clark2016-05-048-59/+41
| | | | | | | | | | | This was always a bit overly complicated, and had some issues (like ctx->prog.dirty not getting reset at the end of the batch). It also required some special hacks to avoid resetting dirty state on binning pass. So just move it all into ctx->dirty (leaving some free bits for future shader stages), and make FD_DIRTY_PROG just be the union of all FD_SHADER_DIRTY_*. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix bogus offset for f32x24s8 stencil restoreRob Clark2016-05-041-4/+5
| | | | | | fixes: $piglit/bin/fbo-clear-formats GL_ARB_depth_buffer_float Signed-off-by: Rob Clark <[email protected]>
* freedreno: add some debug_asserts() to catch insane offsetsRob Clark2016-05-041-0/+2
| | | | | | | Ofc won't catch *all* faults, but at least helpful for catching offsets which are completely bogus. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: deal with VS which do not write positionRob Clark2016-05-041-0/+7
| | | | | | | | Fixes $piglit/bin/glsl-1.40-tf-no-position a3xx may need similar? Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove a couple redundant is_flow()sRob Clark2016-05-042-2/+2
| | | | | | | Now that the opc's encode the instruction category (making them unique) we no longer need to check the category in addition to the opc. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cp small negative integers tooRob Clark2016-05-041-1/+2
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix # of registersRob Clark2016-05-041-1/+1
| | | | | | | The instruction encoding allows for more registers, but at least on a3xx/a4xx they don't actually exist. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: lower immeds to constRob Clark2016-05-043-4/+80
| | | | | | | | | | | | | | | | | Helps reduce register pressure and instruction counts for immediates that would otherwise require a mov into gpr. total instructions in shared programs: 4455332 -> 4369297 (-1.93%) total dwords in shared programs: 8807872 -> 8614432 (-2.20%) total full registers used in shared programs: 263062 -> 250846 (-4.64%) total half registers used in shader programs: 9845 -> 9845 (0.00%) total const registers used in shared programs: 1029735 -> 1466993 (42.46%) half full const instr dwords helped 0 10415 0 17861 5912 hurt 0 1157 21458 947 33 Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add ir3_cp_ctxRob Clark2016-05-043-12/+22
| | | | | | Needed in next commit.. just split out to reduce noise. Signed-off-by: Rob Clark <[email protected]>
* nouveau/video: properly detect the decoder class for availability checksIlia Mirkin2016-05-041-8/+17
| | | | | | | | | | | The kernel is now more strict with the class ids it exposes, so we need to check the G98 and MCP89 classes as well as the GT215 class. This effectively caused us to decide there were no decoding capabilities on newer kernel for VP3 chips. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95251 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.2" <[email protected]>
* svga: try to flag surfaces for sampling, in addition to renderingBrian Paul2016-05-031-0/+11
| | | | | | | | | This silences some warnings when we try to sample from surfaces that were created for drawing, such as when blitting from one of the framebuffer surfaces. We were already doing the opposite situation (adding a bind flag for rendering to surfaces declared as texture sources). Reviewed-by: Charmaine Lee <[email protected]>
* svga: fix copying non-zero layers of 1D array texturesBrian Paul2016-05-031-10/+12
| | | | | | | | Like cube maps, we need to convert the z information to a layer index. Also rename the *_face vars to *_face_layer to make things a little more understandable. Reviewed-by: Charmaine Lee <[email protected]>
* svga: clean up svga_pipe_blit.cBrian Paul2016-05-031-68/+13
| | | | | | Remove dead code. Fix formatting. Reviewed-by: Charmaine Lee <[email protected]>
* rbug: s/Elements/ARRAY_SIZE/Brian Paul2016-05-031-1/+1
| | | | Signed-off-by: Brian Paul <[email protected]>
* freedreno: s/Elements/ARRAY_SIZE/Brian Paul2016-05-031-1/+1
| | | | | Signed-off-by: Brian Paul <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* trace: s/Elements/ARRAY_SIZE/Brian Paul2016-05-031-4/+4
| | | | Signed-off-by: Brian Paul <[email protected]>
* ilo: s/Elements/ARRAY_SIZE/Brian Paul2016-05-0314-43/+43
| | | | Signed-off-by: Brian Paul <[email protected]>
* i915g: s/Elements/ARRAY_SIZE/Brian Paul2016-05-036-12/+12
| | | | Signed-off-by: Brian Paul <[email protected]>
* nvc0: compute a percentage for metric-achieved_occupancySamuel Pitoiset2016-05-031-4/+4
| | | | | | | metric-issue_slot_utilization and metric-branch_efficiency are already computed as percentages. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: display some performance metrics with a percentageSamuel Pitoiset2016-05-031-3/+3
| | | | | | This makes more sense for them. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: store the driver query type for performance metricsSamuel Pitoiset2016-05-031-18/+22
| | | | | | | | This will allow to use percentages for some metrics because the Gallium HUD doesn't allow to display floating point numbers and 0 is printed instead. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: fix exposing of metric-issue_slots for SM21/SM30Samuel Pitoiset2016-05-031-2/+22
| | | | | | | | | This is most likely a copy-paste error when I reworked this area few weeks ago. For SM20, metric-issue_slots is equal to inst_issued because there is only one pipeline, so the metric is not exposed there. Signed-off-by: Samuel Pitoiset <[email protected]> Reported-by: Karol Herbst <[email protected]>
* gallium/radeon: remove stencil_tile_split from metadataMarek Olšák2016-05-022-3/+0
| | | | | | | | this is a leftover from the days when depth-stencil buffers were allocated by the DDX Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: remove tile_mode_array_valid flagsMarek Olšák2016-05-022-4/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* winsys/amdgpu: pass PIPE_CONFIG to addrlib on texture importMarek Olšák2016-05-021-0/+1
| | | | | | | This hasn't been needed, but I think we should set it. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>