Commit message | Author | Age | Files | Lines
* panfrost: Expose 4 render targets | Alyssa Rosenzweig | 2019-07-18 | 1 | -2/+2
  Hidden behind deqp flag as usual.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Shrink tiler heap | Alyssa Rosenzweig | 2019-07-18 | 1 | -1/+1
  128MB is excessive and 16MB is still plenty. Saves 112MB/context on kernels without growable/heap support.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* nir/large_constants: De-duplicate constants | Caio Marcelo de Oliveira Filho | 2019-07-18 | 1 | -21/+75
  If a function has a constant and is called more than once, after inlining we may end up with different variables representing the same constant. This commit looks into the data and de-duplicates them.

  The first pass now will collect the constant data in a per variable buffer, then de-duplication happens (by sorting then linear walk), and the second pass will use the data in var->data.location.

  One side-effect of the current implementation is that constants will be reordered. If this turns out to be a problem, it can be fixed.

  An alternative strategy considered was to perform this on a per-function basis and then merge the results; the problem is that we would have to fix up the offsets during the merge. Given the data we have, the current patch is good enough.
  Reviewed-by: Jason Ekstrand <[email protected]>
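  The "sorting then linear walk" step described above can be illustrated with a minimal, self-contained sketch. This is not the code from the commit: `struct const_blob`, `blob_cmp` and `dedup_constants` are hypothetical names, and the record only stands in for whatever per-variable data the pass actually keeps.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-variable record; not the Mesa struct. */
struct const_blob {
    const unsigned char *data; /* constant bytes collected in the first pass */
    size_t size;               /* number of bytes in data */
    int location;              /* slot assigned after de-duplication */
};

/* Order blobs by size, then by content, so equal constants become adjacent. */
static int blob_cmp(const void *a, const void *b)
{
    const struct const_blob *x = a, *y = b;
    if (x->size != y->size)
        return x->size < y->size ? -1 : 1;
    if (x->size == 0)
        return 0;
    return memcmp(x->data, y->data, x->size);
}

/* Sort, then walk once: every run of equal blobs shares one location.
 * Returns the number of unique constants. Sorting reorders the array,
 * which mirrors the reordering side effect mentioned in the message. */
static int dedup_constants(struct const_blob *blobs, size_t count)
{
    int unique = 0;

    assert(blobs != NULL || count == 0);
    qsort(blobs, count, sizeof(*blobs), blob_cmp);
    for (size_t i = 0; i < count; i++) {
        if (i > 0 && blob_cmp(&blobs[i - 1], &blobs[i]) == 0)
            blobs[i].location = blobs[i - 1].location;
        else
            blobs[i].location = unique++;
    }
    return unique;
}
```

  Sorting costs O(n log n) comparisons and the walk is linear, so duplicates are found without comparing every pair of constants.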
* nir/large_constants: Use ralloc for var_infos | Caio Marcelo de Oliveira Filho | 2019-07-18 | 1 | -3/+3
  This will be used later on to allocate constant data for each variable (and then deduplicate). Also drop initializing found_read, as it is already implicitly false in the literal.
  Reviewed-by: Jason Ekstrand <[email protected]>
* freedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper. | Eric Anholt | 2019-07-18 | 1 | -88/+51
  Cuts a bunch of boilerplate.
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Convert load_barycentric_at_sample to the NIR lowering helper. | Eric Anholt | 2019-07-18 | 1 | -48/+30
  Cuts out a ton of boilerplate.
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Convert load_barycentric_at_offset to the NIR lowering helper. | Eric Anholt | 2019-07-18 | 1 | -39/+19
  Cuts out a ton of boilerplate.
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* v3d: Use nir_shader_lower_instructions() for txf_ms lowering. | Eric Anholt | 2019-07-18 | 1 | -26/+16
  Cuts out a bunch of boilerplate.
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Allow internal changes to the instr in nir_shader_lower_instructions(). | Eric Anholt | 2019-07-18 | 2 | -1/+11
  v3d's NIR txf_ms lowering wants to swizzle around the input coordinates in NIR, but doesn't generate a new txf_ms instruction as replacement. It's pretty easy to allow that in nir_shader_lower_instructions, and it may be common in lowering passes.
  Reviewed-by: Jason Ekstrand <[email protected]>
* vc4: Convert vc4_nir_lower_txf_ms to nir_shader_lower_instructions(). | Eric Anholt | 2019-07-18 | 1 | -32/+13
  Cuts out a bunch of boilerplate.
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* v3d: Fix assertion failures in debug builds. | Eric Anholt | 2019-07-18 | 1 | -0/+2
  nir_lower_io leaves around deref_var instructions after lowering away deref intrinsics. This ends up breaking validation after v3d_nir_lower_io removes variables not actually being stored by the shader's store_output()s.
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* panfrost: Handle Z24 textures | Alyssa Rosenzweig | 2019-07-18 | 1 | -1/+1
  Just use the Z32 code.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/ci: Update expectations | Alyssa Rosenzweig | 2019-07-18 | 1 | -14/+0
  We just fixed some stencil tests.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Make scissor test more robust | Alyssa Rosenzweig | 2019-07-18 | 1 | -8/+15
  See v3d implementation.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use correct NO_DITHER field on MFBD | Alyssa Rosenzweig | 2019-07-18 | 3 | -1/+9
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement Z32F(_S8) support | Alyssa Rosenzweig | 2019-07-18 | 2 | -0/+16
  Z32F uses a dedicated float path. Z32F_S8 uses separate stencil planes in the hardware, lowered via u_transfer_helper.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/decode: Don't disassemble NULL shaders | Alyssa Rosenzweig | 2019-07-18 | 1 | -2/+3
  It is legal to load a shader from a NULL address, particularly when the TILER job is used strictly for effects on the Z/S buffer with 0x0 color mask. Don't crash the decoder in this case.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Copy stencil front to back if back disabled | Alyssa Rosenzweig | 2019-07-18 | 1 | -5/+14
  When backside stenciling is disabled, backfacing primitives just do the same thing as frontfacing primitives.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
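  The rule stated above (back faces reuse the front state when back-face stenciling is off) can be sketched as follows. This is only an illustration under assumed names: `struct stencil_face` and `effective_back_stencil` are hypothetical and are not the Gallium or panfrost definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-face stencil state; not the Gallium struct. */
struct stencil_face {
    bool enabled;
    unsigned func;
    unsigned ref;
    unsigned writemask;
};

/* Pick the state to program for back faces: when back-face stenciling is
 * disabled, reuse the front state so both faces behave identically. */
static struct stencil_face
effective_back_stencil(const struct stencil_face *front,
                       const struct stencil_face *back)
{
    assert(front != NULL && back != NULL);
    return back->enabled ? *back : *front;
}
```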
* swr/rast: Refactor memory API between rasterizer core and swr | Jan Zielinski | 2019-07-18 | 30 | -185/+370
  This commit cleans up API between the core of the rasterizer and swr. Some formatting changes are also done.
  Reviewed-by: Alok Hota <[email protected]>
* lima/ppir: Add gl_PointCoord handling | Andreas Baierl | 2019-07-18 | 6 | -5/+34
  Treat gl_PointCoord as a system value and add the necessary bits for correct codegen.
  Signed-off-by: Andreas Baierl <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* gallium: Add PIPE_CAP_TGSI_FS_POINT_IS_SYSVAL | Andreas Baierl | 2019-07-18 | 4 | -0/+6
  This adds an option to treat gl_PointCoord as a system value.
  Signed-off-by: Andreas Baierl <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir/tgsi: Extend tgsi_to_nir.c to support gl_PointCoord as a system value. | Andreas Baierl | 2019-07-18 | 1 | -0/+20
  Signed-off-by: Andreas Baierl <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir: Add gl_PointCoord system value | Andreas Baierl | 2019-07-18 | 3 | -0/+6
  gl_PointCoord handling needs some special bits set in lima/ppir code generation. Treating gl_PointCoord as a system value makes it easier to distinguish from a regular varying.
  Signed-off-by: Andreas Baierl <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* glsl: Optionally declare gl_PointCoord as a system value | Andreas Baierl | 2019-07-18 | 5 | -3/+15
  Signed-off-by: Andreas Baierl <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* lima/gp: Fix problem with complex moves | Connor Abbott | 2019-07-18 | 3 | -9/+125
  When writing the scheduler, we forgot that you can't read the complex unit in certain sources because it gets overwritten to 0 or 1. Fixing this turned out to be possible without giving up and reducing GPIR_VALUE_REG_NUM to 10, although it was difficult in a way I didn't expect.

  There can be at most 4 next-max nodes that can't have moves scheduled in the complex slot, so it actually isn't a problem for getting the number of next-max nodes at 5 or lower. However, it is a problem for stores. If a given node is a next-max node whose move cannot go in the complex slot *and* is used by a store that we decide to schedule, we have to reserve one of the non-complex slots for a move instead of all the slots, or we can wind up in a situation where only the complex slot is free and we fail the move. This means that we have to add another term to the reservation logic, for stores whose children cannot be in the complex slot.
  Acked-by: Qiang Yu <[email protected]>
* lima/gpir: Rework the scheduler | Connor Abbott | 2019-07-18 | 9 | -560/+1187
  Now, we do scheduling at the same time as value register allocation. The ready list now acts similarly to the array of registers in value_regalloc, keeping us from running out of slots. Before this, the value register allocator wasn't aware of the scheduling constraints of the actual machine, which meant that it sometimes chose the wrong false dependencies to insert. Now, we assign value registers at the same time as we actually schedule instructions, making its choices reflect reality much better. It was also conservative in some cases where the new scheme doesn't have to be. For example, in something like:

    1 = ld_att
    2 = ld_uni
    3 = add 1, 2

  It's possible that one of 1 and 2 can't be scheduled in the same instruction as 3, meaning that a move needs to be inserted, so the value register allocator needs to assume that this sequence requires two registers. But when actually scheduling, we could discover that 1, 2, and 3 can all be scheduled together, so that they only require one register. The new scheduler speculatively inserts the instruction under consideration, as well as all of its child load instructions, and then counts the number of live value registers after all is said and done. This lets us be more aggressive with scheduling when we're close to the limit.

  With the new scheduler, the kmscube vertex shader is now scheduled in 40 instructions, versus 66 before.
  Acked-by: Qiang Yu <[email protected]>
* lima/gp: Mark more add-only nodes as maybe-two-slot | Connor Abbott | 2019-07-18 | 1 | -0/+8
  Reviewed-by: Qiang Yu <[email protected]>
* lima/gpir: Fix some bugs in instruction handling | Connor Abbott | 2019-07-18 | 1 | -0/+12
  Reviewed-by: Qiang Yu <[email protected]>
* lima: Reintroduce the standalone compiler | Connor Abbott | 2019-07-18 | 7 | -3/+352
  I used this to test things without needing to have a device handy.
  Acked-by: Qiang Yu <[email protected]>
* nir/lower_viewport: Check variable mode first | Connor Abbott | 2019-07-18 | 1 | -1/+2
  The location is unused for shader_temp and function_temp variables, and due to the way nir_lower_io_to_temporaries demotes shader_out variables to shader_temp variables, it happened to equal VARYING_SLOT_POS for the gl_Position temporary, which made this pass fail with the offline compiler due to this coming before vars_to_ssa.
  Reviewed-by: Qiang Yu <[email protected]>
* radv/gfx10: set BREAK_WAVE_AT_EOI if TES or GS enable the primitive ID | Samuel Pitoiset | 2019-07-18 | 1 | -0/+8
  Signed-off-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: move emitting VGT_PRIMITIVEID_EN into the NGG path | Samuel Pitoiset | 2019-07-18 | 1 | -6/+11
  And do not emit VGT_GS_MODE which is unnecessary on GFX10.
  Signed-off-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: do not always execute a barrier before the second shader | Samuel Pitoiset | 2019-07-18 | 1 | -1/+30
  With NGG, empty waves may still be required to export data. This fixes dEQP-VK.ycbcr.format.*_unorm.geometry_*.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix VGT_GS_MODE if VS uses the primitive ID | Samuel Pitoiset | 2019-07-18 | 1 | -5/+5
  Found by inspection.
  Cc: <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* v3d: emit correct lowering for logic operations with MSAA render targets | Iago Toral Quiroga | 2019-07-18 | 1 | -5/+54
  v2:
   - Drop the writemask from the per-sample color intrinsic (Eric)
  Reviewed-by: Eric Anholt <[email protected]>
* v3d: handle nir_intrinsic_store_tlb_sample_color_v3d | Iago Toral Quiroga | 2019-07-18 | 1 | -20/+44
  v2:
   - Move handling of output intrinsics to ntq_emit_intrinsic() (Eric).
  Reviewed-by: Eric Anholt <[email protected]>
* nir: add a V3D-specific intrinsic for per-sample color writes | Iago Toral Quiroga | 2019-07-18 | 1 | -0/+9
  For per-sample color writes we need the output intrinsic to pack the sample index, which is not provided with regular store_output intrinsics unless we figured out a way to encode it into the base or the offset.
  v2:
   - Drop the writemask (Eric)
  Reviewed-by: Eric Anholt <[email protected]>
* v3d: implement per-sample tlb color writes | Iago Toral Quiroga | 2019-07-18 | 1 | -30/+44
  Reviewed-by: Eric Anholt <[email protected]>
* v3d: refactor the tlb color write code | Iago Toral Quiroga | 2019-07-18 | 1 | -49/+39
  We want to split the tlb specifier setup from the color writes, because when we implement per-sample color writes we want to do the latter for all the samples, but the former only once.
  Reviewed-by: Eric Anholt <[email protected]>
* v3d: move tlb color write emission to a helper function | Iago Toral Quiroga | 2019-07-18 | 1 | -95/+99
  We will soon be adding per-sample color writes, which means additional complexity and more indentation (we will need another loop to emit the writes for each individual sample), so this will help keep things simple and a bit more readable.
  Reviewed-by: Eric Anholt <[email protected]>
* v3d: implement per-sample tlb color reads | Iago Toral Quiroga | 2019-07-18 | 1 | -39/+52
  Reviewed-by: Eric Anholt <[email protected]>
* anv: fix format mapping for depth/stencil formats | Lionel Landwerlin | 2019-07-18 | 1 | -0/+3
  anv_format is supposed to have a pointer back to the associated VkFormat; we were missing this for depth/stencil formats. This doesn't fix anything afaict, but will be needed for future changes.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 465de47bad70 ("anv: associate vulkan formats with aspects")
  Acked-by: Jason Ekstrand <[email protected]>
* radv: put back VGT_FLUSH at ring init on gfx10 | Dave Airlie | 2019-07-18 | 1 | -4/+2
  I can find no evidence that removing this is a good idea.
  Fixes: 9b116173b6a ("radv: do not emit VGT_FLUSH on GFX10")
  Reviewed-by: Samuel Pitoiset <[email protected]>
* softpipe: Clamp border colors when needed | Gert Wollny | 2019-07-18 | 2 | -14/+31
  unorm and snorm require that the border color values are clamped, so when picking the sampler view copy/clamp the border color from the sampler and use these adjusted values.

  Fixes:
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_compressed_color
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_snorm_color
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_srgb_color
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_unorm_color
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_compressed_color
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_snorm_color
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_srgb_color
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_color
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
    dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth_uint_stencil_sample_depth
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
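  The clamping rule described above can be sketched in a few lines. This is a simplified illustration, not softpipe's code: `enum fmt_kind` and `clamp_border_color` are hypothetical names, and the sketch assumes unorm formats clamp to [0, 1], snorm formats clamp to [-1, 1], and every other kind passes the border color through unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format classification; real drivers derive this from the
 * format description of the sampler view. */
enum fmt_kind { FMT_UNORM, FMT_SNORM, FMT_OTHER };

static float clampf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Copy the sampler's border color, clamping each channel to the range the
 * view's format can represent. */
static void clamp_border_color(enum fmt_kind kind,
                               const float in[4], float out[4])
{
    assert(in != NULL && out != NULL);
    for (int i = 0; i < 4; i++) {
        switch (kind) {
        case FMT_UNORM: out[i] = clampf(in[i], 0.0f, 1.0f); break;
        case FMT_SNORM: out[i] = clampf(in[i], -1.0f, 1.0f); break;
        default:        out[i] = in[i]; break;
        }
    }
}
```

  Writing into a separate output array keeps the sampler's original border color intact, so the same sampler can be paired with views of different formats.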
* softpipe: set a lower minimum clamp value for texture coordinate border clamp | Gert Wollny | 2019-07-18 | 1 | -1/+1
  The value of -0.5f is not small enough to produce negative coordinates, so lower the minimum clamp value to -1.0f.

  This fixes a number of tests from dEQP-GLES31.functional.texture.border_clamp.*
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* softpipe: Correct repeat-mirror evaluation | Gert Wollny | 2019-07-18 | 1 | -5/+19
  When mirroring the texture coordinates, the indices must be mirrored as well and the half pixel shift must be applied in reverse.

  Fixes a number of tests from:
    dEQP-GLES31.functional.texture.gather.offset.*
    dEQP-GLES31.functional.texture.gather.offsets.*
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
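  As background for the index mirroring mentioned above, here is a minimal sketch of mirrored-repeat addressing for the nearest-filter case only. It is not softpipe's implementation and does not cover the linear-filter half pixel shift the commit deals with; `mirror_repeat_nearest` is a hypothetical helper.

```c
#include <assert.h>

/* Map a normalized coordinate s to a texel index in [0, size) using
 * mirrored repeat: the texture is traversed forwards on even periods and
 * backwards on odd periods. */
static int mirror_repeat_nearest(float s, int size)
{
    assert(size > 0);

    float u = s * (float)size;
    int i = (int)u;           /* truncates toward zero */
    if ((float)i > u)
        i--;                  /* turn truncation into floor for negatives */

    int period = 2 * size;
    int m = i % period;
    if (m < 0)
        m += period;          /* C's % can be negative; wrap into [0, period) */

    /* Second half of the period walks the texels in reverse order. */
    return m < size ? m : period - 1 - m;
}
```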
* softpipe: Also mark textures as dirty when updating the framebuffer state | Gert Wollny | 2019-07-18 | 1 | -1/+1
  At this point all the draw caches are flushed to the old attached textures, so the read caches of these textures will need to be updated too.

  Fixes: dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.*
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* etnaviv: set DITHER_MODE | Jonathan Marek | 2019-07-17 | 1 | -0/+1
  This fixes a rendering glitch observed in SDL testscale test, where alpha blending samples with value (1.0, 1.0, 1.0, 0.0) whitens the target instead of having no effect.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: update headers from rnndb | Jonathan Marek | 2019-07-17 | 1 | -1/+4
  Update to etna_viv commit a16a418.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: fix blend color on newer GPUs | Jonathan Marek | 2019-07-17 | 4 | -19/+21
  Newer GPUs use the half float ALPHA_COLOR_EXT register.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>