path: root/src/gallium
Commit message    Author    Age    Files    Lines
* freedreno: move shader-stage dirty bits to global dirty flag    Rob Clark    2016-05-04    8    -59/+41
| This was always a bit overly complicated, and had some issues (like
| ctx->prog.dirty not getting reset at the end of the batch). It also
| required some special hacks to avoid resetting dirty state on binning
| pass. So just move it all into ctx->dirty (leaving some free bits for
| future shader stages), and make FD_DIRTY_PROG just be the union of all
| FD_SHADER_DIRTY_*.
| Signed-off-by: Rob Clark <[email protected]>
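The idea of folding per-stage dirty bits into one global mask can be sketched as follows. This is only an illustration, not the driver's code: the `SK_*` names, the bit positions, and `struct sk_ctx` are hypothetical stand-ins for `FD_DIRTY_*`, `FD_SHADER_DIRTY_*`, and the freedreno context.

```c
#include <stdint.h>

/* Hypothetical bit layout; the real values live in the freedreno headers. */
enum sk_dirty {
    SK_DIRTY_BLEND     = 1u << 0,
    SK_DIRTY_VTXSTATE  = 1u << 1,
    /* One bit per shader stage, leaving free bits for future stages. */
    SK_SHADER_DIRTY_VP = 1u << 8,
    SK_SHADER_DIRTY_FP = 1u << 9,
    /* "Program dirty" is simply the union of the per-stage bits. */
    SK_DIRTY_PROG      = SK_SHADER_DIRTY_VP | SK_SHADER_DIRTY_FP
};

struct sk_ctx {
    uint32_t dirty;   /* single global dirty mask */
};

/* Binding a new fragment shader marks only that stage dirty. */
static void sk_bind_fp(struct sk_ctx *ctx)
{
    ctx->dirty |= SK_SHADER_DIRTY_FP;
}

/* Any stage bit set means the program state must be re-emitted. */
static int sk_prog_dirty(const struct sk_ctx *ctx)
{
    return (ctx->dirty & SK_DIRTY_PROG) != 0;
}

/* With one mask, a single store resets everything at the end of a batch. */
static void sk_batch_end(struct sk_ctx *ctx)
{
    ctx->dirty = 0;
}
```

With a single mask there is no separate per-program flag left to forget when the batch ends, which is the class of bug the commit message mentions.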
* freedreno/a4xx: fix bogus offset for f32x24s8 stencil restore    Rob Clark    2016-05-04    1    -4/+5
| fixes:
|   $piglit/bin/fbo-clear-formats GL_ARB_depth_buffer_float
| Signed-off-by: Rob Clark <[email protected]>
* freedreno: add some debug_asserts() to catch insane offsets    Rob Clark    2016-05-04    1    -0/+2
| Of course, this won't catch *all* faults, but it is at least helpful
| for catching offsets which are completely bogus.
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: deal with VS which do not write position    Rob Clark    2016-05-04    1    -0/+7
| Fixes $piglit/bin/glsl-1.40-tf-no-position
| a3xx may need similar?
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove a couple redundant is_flow()s    Rob Clark    2016-05-04    2    -2/+2
| Now that the opc's encode the instruction category (making them
| unique) we no longer need to check the category in addition to the
| opc.
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cp small negative integers too    Rob Clark    2016-05-04    1    -1/+2
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix # of registers    Rob Clark    2016-05-04    1    -1/+1
| The instruction encoding allows for more registers, but at least on
| a3xx/a4xx they don't actually exist.
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: lower immeds to const    Rob Clark    2016-05-04    3    -4/+80
| Helps reduce register pressure and instruction counts for immediates
| that would otherwise require a mov into gpr.
|
| total instructions in shared programs: 4455332 -> 4369297 (-1.93%)
| total dwords in shared programs: 8807872 -> 8614432 (-2.20%)
| total full registers used in shared programs: 263062 -> 250846 (-4.64%)
| total half registers used in shared programs: 9845 -> 9845 (0.00%)
| total const registers used in shared programs: 1029735 -> 1466993 (42.46%)
|
|           half     full    const    instr   dwords
| helped       0    10415        0    17861     5912
| hurt         0     1157    21458      947       33
|
| Signed-off-by: Rob Clark <[email protected]>
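One way to picture lowering immediates to constants is a small pool that hands out a const slot per distinct immediate value, so an instruction can read the slot instead of needing a mov into a general-purpose register. The sketch below is an assumption about the general shape of such a pass, not the ir3 implementation: `struct sk_const_pool`, `sk_lower_immed()`, and the pool size are all hypothetical.

```c
#include <stdint.h>

#define SK_MAX_IMMEDS 16   /* hypothetical pool capacity */

/* Hypothetical pool of immediates that have been promoted to const slots. */
struct sk_const_pool {
    uint32_t vals[SK_MAX_IMMEDS];
    unsigned count;
};

/*
 * Return the const slot index holding 'val', reusing an existing slot when
 * the same value was lowered before. Returns -1 when the pool is full, in
 * which case the caller would keep the immediate as it was.
 */
static int sk_lower_immed(struct sk_const_pool *pool, uint32_t val)
{
    for (unsigned i = 0; i < pool->count; i++) {
        if (pool->vals[i] == val)
            return (int)i;
    }
    if (pool->count == SK_MAX_IMMEDS)
        return -1;
    pool->vals[pool->count] = val;
    return (int)pool->count++;
}
```

The trade-off matches the numbers in the commit message: fewer instructions and full registers, paid for with more const registers.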
* freedreno/ir3: add ir3_cp_ctx    Rob Clark    2016-05-04    3    -12/+22
| Needed in next commit.. just split out to reduce noise.
| Signed-off-by: Rob Clark <[email protected]>
* nouveau/video: properly detect the decoder class for availability checks    Ilia Mirkin    2016-05-04    1    -8/+17
| The kernel is now more strict with the class ids it exposes, so we
| need to check the G98 and MCP89 classes as well as the GT215 class.
| This effectively caused us to decide there were no decoding
| capabilities on newer kernels for VP3 chips.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95251
| Signed-off-by: Ilia Mirkin <[email protected]>
| Cc: "11.2" <[email protected]>
* gallium/util: change assertion to conditional in util_bitmask_destroy()    Brian Paul    2016-05-03    1    -4/+4
| If we fail to create a context in the VMware driver we call this
| function unconditionally to free a bunch of bit vectors. Instead of
| asserting on a null pointer, just no-op.
| Reviewed-by: Jose Fonseca <[email protected]>
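The pattern described above, a destructor that quietly accepts NULL so error paths can call it unconditionally, might look roughly like this. The `sk_bitmask` type and both functions are hypothetical stand-ins, not the actual `util_bitmask` code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a bit-vector object. */
struct sk_bitmask {
    unsigned *words;
    unsigned size;
};

/* Allocate an empty bitmask; returns NULL on allocation failure. */
static struct sk_bitmask *sk_bitmask_create(void)
{
    return calloc(1, sizeof(struct sk_bitmask));
}

/*
 * Destroy a bitmask. A NULL argument is a no-op rather than an assertion
 * failure, so a half-constructed context can be torn down without checks
 * at every call site.
 */
static void sk_bitmask_destroy(struct sk_bitmask *bm)
{
    if (bm) {
        free(bm->words);
        free(bm);
    }
}
```

This mirrors the contract of `free()` itself, which is also defined to do nothing when passed NULL.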
* cso: null-out previously bound sampler states    Brian Paul    2016-05-03    1    -1/+3
| If, for example, we previously had 2 sampler states bound and now we
| are binding one, we'd leave the second sampler state unchanged. This
| change nulls-out the second sampler state in this situation. We're
| already doing the same thing for sampler views.
| This silences an occasional warning issued by the VMware driver when
| the number of sampler views and sampler states disagreed.
| Reviewed-by: Charmaine Lee <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
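A minimal sketch of the "null out the leftovers" behaviour, assuming a fixed-size slot array. The names (`sk_sampler_slots`, `sk_bind_samplers`) and the slot limit are hypothetical; the real logic lives in the cso context code.

```c
#include <stddef.h>

#define SK_MAX_SAMPLERS 8   /* hypothetical slot limit */

/* Hypothetical record of what is currently bound. */
struct sk_sampler_slots {
    const void *states[SK_MAX_SAMPLERS];
    unsigned num;   /* number of states bound by the previous call */
};

/*
 * Bind 'count' sampler states. Slots that were bound by the previous call
 * but are not covered by this one are reset to NULL instead of being left
 * pointing at stale state.
 */
static void sk_bind_samplers(struct sk_sampler_slots *s, unsigned count,
                             const void *const *states)
{
    unsigned i;

    if (count > SK_MAX_SAMPLERS)
        count = SK_MAX_SAMPLERS;

    for (i = 0; i < count; i++)
        s->states[i] = states[i];

    /* Null out whatever the previous, larger binding left behind. */
    for (; i < s->num; i++)
        s->states[i] = NULL;

    s->num = count;
}
```

Binding two states and then one leaves slot 1 as NULL, so the number of live sampler states stays in step with the number of sampler views.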
* svga: try to flag surfaces for sampling, in addition to rendering    Brian Paul    2016-05-03    1    -0/+11
| This silences some warnings when we try to sample from surfaces that
| were created for drawing, such as when blitting from one of the
| framebuffer surfaces. We were already doing the opposite situation
| (adding a bind flag for rendering to surfaces declared as texture
| sources).
| Reviewed-by: Charmaine Lee <[email protected]>
* svga: fix copying non-zero layers of 1D array textures    Brian Paul    2016-05-03    1    -10/+12
| Like cube maps, we need to convert the z information to a layer index.
| Also rename the *_face vars to *_face_layer to make things a little
| more understandable.
| Reviewed-by: Charmaine Lee <[email protected]>
* svga: clean up svga_pipe_blit.c    Brian Paul    2016-05-03    1    -68/+13
| Remove dead code. Fix formatting.
| Reviewed-by: Charmaine Lee <[email protected]>
* rbug: s/Elements/ARRAY_SIZE/    Brian Paul    2016-05-03    1    -1/+1
| Signed-off-by: Brian Paul <[email protected]>
* freedreno: s/Elements/ARRAY_SIZE/    Brian Paul    2016-05-03    1    -1/+1
| Signed-off-by: Brian Paul <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* trace: s/Elements/ARRAY_SIZE/    Brian Paul    2016-05-03    1    -4/+4
| Signed-off-by: Brian Paul <[email protected]>
* ilo: s/Elements/ARRAY_SIZE/    Brian Paul    2016-05-03    14    -43/+43
| Signed-off-by: Brian Paul <[email protected]>
* i915g: s/Elements/ARRAY_SIZE/    Brian Paul    2016-05-03    6    -12/+12
| Signed-off-by: Brian Paul <[email protected]>
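The commits in this series are a mechanical rename of an element-count macro. For readers unfamiliar with the idiom, a macro of this kind is commonly written as shown below; `SK_ARRAY_SIZE` and the table are stand-ins for illustration, and Mesa's own `ARRAY_SIZE` definition may differ in detail.

```c
#include <stddef.h>

/*
 * Number of elements in a true array (not a pointer): total size in bytes
 * divided by the size of one element.
 */
#define SK_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical table, used only to exercise the macro. */
static const int sk_table[] = { 3, 5, 8, 13 };

/* Sum the table without hard-coding its length. */
static int sk_sum_table(void)
{
    int sum = 0;
    for (size_t i = 0; i < SK_ARRAY_SIZE(sk_table); i++)
        sum += sk_table[i];
    return sum;
}
```

Note that the macro only works on arrays whose size is visible at the point of use; applied to a pointer it silently yields the wrong count.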
* nvc0: compute a percentage for metric-achieved_occupancy    Samuel Pitoiset    2016-05-03    1    -4/+4
| metric-issue_slot_utilization and metric-branch_efficiency are already
| computed as percentages.
| Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: display some performance metrics with a percentage    Samuel Pitoiset    2016-05-03    1    -3/+3
| This makes more sense for them.
| Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: store the driver query type for performance metrics    Samuel Pitoiset    2016-05-03    1    -18/+22
| This will allow using percentages for some metrics, because the
| Gallium HUD cannot display floating-point numbers and prints 0
| instead.
| Signed-off-by: Samuel Pitoiset <[email protected]>
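The motivation can be illustrated with integer arithmetic: a ratio between 0 and 1 truncates to 0 when it is reported as an integer, while the same ratio scaled to a percentage keeps useful precision. The helper below is a hypothetical sketch, not the nvc0 query code.

```c
#include <stdint.h>

/*
 * Express num/den as an integer percentage. A plain integer division of
 * 3/4 would yield 0, whereas the scaled form yields 75. A zero denominator
 * is reported as 0 rather than dividing by zero.
 */
static uint64_t sk_ratio_to_percent(uint64_t num, uint64_t den)
{
    if (den == 0)
        return 0;
    return (num * 100) / den;
}
```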
* nvc0: fix exposing of metric-issue_slots for SM21/SM30    Samuel Pitoiset    2016-05-03    1    -2/+22
| This is most likely a copy-paste error from when I reworked this area
| a few weeks ago. For SM20, metric-issue_slots is equal to inst_issued
| because there is only one pipeline, so the metric is not exposed
| there.
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reported-by: Karol Herbst <[email protected]>
* gallium,utils: Fix trivial sign compare warnings    Jan Vesely    2016-05-03    8    -21/+21
| Signed-off-by: Jan Vesely <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Jakob Sinclair <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* gallium/radeon: remove stencil_tile_split from metadata    Marek Olšák    2016-05-02    3    -7/+0
| This is a leftover from the days when depth-stencil buffers were
| allocated by the DDX.
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: remove tile_mode_array_valid flags    Marek Olšák    2016-05-02    4    -8/+0
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* winsys/amdgpu: pass PIPE_CONFIG to addrlib on texture import    Marek Olšák    2016-05-02    3    -0/+3
| This hasn't been needed, but I think we should set it.
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* winsys/amdgpu: read NUM_BANKS from buffer metadata    Marek Olšák    2016-05-02    3    -21/+3
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove unused tile mode getters    Marek Olšák    2016-05-02    2    -157/+2
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: just read tile mode arrays in SDMA setup    Marek Olšák    2016-05-02    1    -51/+28
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: just read tile mode arrays in SI DMA setup    Marek Olšák    2016-05-02    1    -33/+21
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: just read tile mode arrays in DB setup    Marek Olšák    2016-05-02    1    -36/+19
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: add radeon_surf::macro_tile_index    Marek Olšák    2016-05-02    3    -0/+20
| for indexing cik_macrotile_mode_array
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* winsys/radeon: drop support for kernels lacking tile mode array queries    Marek Olšák    2016-05-02    1    -6/+14
| This will allow us to simplify a lot of code around tiling.
| Kernel 3.10 is required for SI support.
| Kernel 3.13 is required for CIK support.
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* winsys/radeon: count buffer size only once    Marek Olšák    2016-05-02    1    -2/+2
| Reviewed-by: Michel Dänzer <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: count buffer size only once    Marek Olšák    2016-05-02    1    -2/+2
| Reviewed-by: Michel Dänzer <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: loosen up requirements for how much memory IBs can use    Marek Olšák    2016-05-02    1    -4/+10
| Ported from winsys/radeon.
| Reviewed-by: Michel Dänzer <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: when parsing dmesg, skip empty lines    Marek Olšák    2016-05-02    1    -0/+3
| Reviewed-by: Michel Dänzer <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use the hw MSAA resolving if formats are compatible    Marek Olšák    2016-05-02    1    -1/+2
| This allows resolving RGBA into RGBX.
| This should improve HL2 Lost Coast performance.
| Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50,nvc0: re-bind old compute state after reading MP perf counters    Samuel Pitoiset    2016-05-02    2    -0/+4
| This might be useful to avoid breaking the current compute state when
| monitoring MP perf counters, because we use a compute kernel to read
| out those counters.
| This was initially suggested by Ilia Mirkin.
| Signed-off-by: Samuel Pitoiset <[email protected]>
* vc4: Use NIR lowering for sRGB decode.    Eric Anholt    2016-05-02    2    -40/+3
| This should get us the same decode code generated, but with a lot less
| custom code in the driver.
* vc4: Just use NIR lowering for texture projection.    Eric Anholt    2016-05-02    1    -15/+3
| This means doing Newton-Raphson on the RCP, but it's probably actually
| a good thing to be accurate on.
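For context, a Newton-Raphson refinement of a reciprocal estimate uses the textbook iteration x1 = x0 * (2 - a * x0), which roughly doubles the number of correct bits per step. The function below is a generic sketch of that step; `sk_rcp_refine` is a hypothetical name and this is not the code NIR or vc4 actually emits.

```c
/*
 * One Newton-Raphson step towards 1/a, starting from the estimate x0.
 * Derived from applying Newton's method to f(x) = 1/x - a.
 */
static float sk_rcp_refine(float a, float x0)
{
    return x0 * (2.0f - a * x0);
}
```

Starting from a rough estimate of 0.2 for 1/4, one step gives about 0.24 and a second gives about 0.2496, so each extra step costs a couple of ALU operations in exchange for a noticeably more accurate result.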
* vc4: Scalarize phi nodes as well.    Eric Anholt    2016-05-02    1    -0/+1
| This makes fewer programs with loops fail assertions, replacing those
| failures with the rendering failure warning.
* vc4: Add whitespace after each program stage dump.    Eric Anholt    2016-05-02    2    -0/+3
| In particular it's been hard to find the point where we switch from
| dumping pre-optimization QIR to post-optimization QIR.
* vc4: Remove the CSE pass.    Eric Anholt    2016-05-02    4    -162/+0
| It's not doing anything according to shader-db now that we're using
| NIR. It would have had to be reworked significantly anyway, to handle
| control flow.
* vc4: Emit only one FRAG_Z or FRAG_W QIR opcode.    Eric Anholt    2016-05-02    1    -2/+19
| We were generating piles of FRAG_W for interpolation, only to CSE them
| away immediately. Since this is the only thing that CSE is doing for
| us any more, just avoid making the CSE work necessary.
* vc4: Use the NIR cubemap normalization instead of our own.    Eric Anholt    2016-05-02    1    -6/+1
| This is one of two uses of the current QIR CSE pass according to
| shader-db. The NIR pass means that we'll end up doing Newton-Raphson
| on our RCP, which we weren't doing before, but that's probably
| actually a good thing.
* vc4: Drop the support for DCE of texture instructions.    Eric Anholt    2016-05-02    1    -22/+1
| Now that we're using NIR for our optimization, there's no need for
| this tricky code.
* radeonsi: fix PIPE_FORMAT_R11G11B10_FLOAT handling    Nicolai Hähnle    2016-05-02    1    -8/+10
| That format has first_non_void < 0.
| This fixes a regression in piglit
| arb_shader_image_load_store-semantics that was introduced by commit
| 76b8c5cc602, while hopefully still shutting Coverity up (and failing
| in a more obvious way if a similar error should re-appear).
| Reviewed-by: Jakob Sinclair <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>