aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4
Commit message (Collapse)AuthorAgeFilesLines
* vc4: Enable dead CF elimination.Eric Anholt2016-07-041-0/+1
| | | | | | Now that we're about to start generating control flow in our NIR, we want this in place. It optimizes things frequently in the CS, when the GL VS has control flow that doesn't affect the vertex position.
* vc4: Optimize out redundant SF updates.Eric Anholt2016-07-042-6/+78
| | | | | | | | | | | Tiny change on shader-db currently, but it will be important when we start emitting a lot of SFs from the same variable as part of control flow support. total instructions in shared programs: 89463 -> 89430 (-0.04%) instructions in affected programs: 1522 -> 1489 (-2.17%) total estimated cycles in shared programs: 250060 -> 250015 (-0.02%) estimated cycles in affected programs: 8568 -> 8523 (-0.53%)
* vc4: Move SF removal to a separate peephole pass.Eric Anholt2016-07-045-17/+85
| | | | | | | | | The DCE pass is going to change significantly to handle control flow, while we don't really need to change it for the SF handling. We also need to add some more SF peephole optimization for SF updates generated by control flow support. No change on shader-db.
* vc4: DCE instructions with a NULL destination.Eric Anholt2016-07-041-2/+3
| | | | | | | | I'm going to add an optimization for redundant SF update removal, which will just remove the SF and leave us (in many cases) with an instruction with a NULL destination and no side effects. Rather than teaching that pass whether the whole instruction can be removed, leave that responsibility to this pass.
* vc4: Mark texturing setup instructions as having side effects.Eric Anholt2016-07-041-5/+5
| | | | | | | We need to not DCE them even though they don't have a destination in QIR. We also shouldn't relocate them in vc4_opt_vpm. Neither of these things happen, but I'm about to make DCE consider instructions with a NULL destination.
* vc4: Fix a pasteo in scheduling condition flag usage.Eric Anholt2016-07-041-1/+1
| | | | | | | Noticed by code inspection. This hasn't been too big of a deal, because our cond usages all start out as adder ops, either MOVs or the FTOI for Z writes. MOVs *can* get converted to mul ops during scheduling, but apparently we hadn't hit this.
* vc4: Drop the dead QIR_PACK() macro.Eric Anholt2016-07-041-8/+0
| | | | | This isn't used since we switched to using the dst.pack field instead of custom instructions.
* gallium: Add a cap for offset_units_unscaledAxel Davy2016-06-251-0/+1
| | | | | | | | | | | | | | D3D9 has a different behaviour for depth bias. For OGL/D3D1X, the depth bias unit is the minimal resolvable value for the depth buffer, which depends on the format (and has different behaviour for float depth buffers). For D3D9, the depth bias unit is 1.0f. Signed-off-by: Axel Davy <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* Remove wrongly repeated words in commentsGiuseppe Bilotta2016-06-232-2/+2
| | | | | | | | | | | | | | | | | Clean up misrepetitions ('if if', 'the the' etc) found throughout the comments. This has been done manually, after grepping case-insensitively for duplicate if, is, the, then, do, for, an, plus a few other typos corrected in fly-by v2: * proper commit message and non-joke title; * replace two 'as is' followed by 'is' to 'as-is'. v3: * 'a integer' => 'an integer' and similar (originally spotted by Jason Ekstrand, I fixed a few other similar ones while at it) Signed-off-by: Giuseppe Bilotta <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* gallium: make constant_buffer constRob Clark2016-06-201-1/+1
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_MAX_WINDOW_RECTANGLES to all driversIlia Mirkin2016-06-181-0/+1
| | | | | | | | This says how many window rectangles are supported by the implementation, although it may not exceed PIPE_MAX_WINDOW_RECTANGLES. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* vc4: fix vc4_resource_from_handle() stride calculationRob Herring2016-06-151-2/+2
| | | | | | | | | | | | | | The expected stride calculation is completely wrong. It should ultimately be multiplying cpp and width rather than dividing. The width also needs to be aligned to the tiling width first before converting to stride bytes. The whole stride check here is possibly pointless. Any buffers which were allocated outside of vc4 may have strides with larger alignment requirements. Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* Android: move libdrm settings to top-level Android.common.mkRob Herring2016-06-131-1/+0
| | | | | | | | | | | | | | Fix warnings like these due to HAVE_LIBDRM being inconsistently defined: external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition] typedef struct drm_clip_rect drm_clip_rect_t; HAVE_LIBDRM needs to be set project wide to fix this. This change also harmlessly links libdrm with everything, but simplifies the makefiles a bit. Signed-off-by: Rob Herring <[email protected]> Acked-by: Emil Velikov <[email protected]>
* gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowedIlia Mirkin2016-06-061-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* vc4: Fix compiler warnings in fail_instr path of QIR validate passRhys Kidd2016-05-311-10/+10
| | | | | | | Introduced in 8e2d0843c02daf5280184f179ae8ed440ac90d7f. Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vc4: Fix doxygen warnings12.0-branchpointRhys Kidd2016-05-302-6/+6
| | | | | | | | Now that vc4 automated code documentation can be generated with doxygen, fix the warnings issued by Doxygen 1.8.11. Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: push offset down to driverStanimir Varbanov2016-05-301-0/+7
| | | | | | | | | | | | | Push offset down to drivers when importing dmabuf. This is needed to more fully support EGL_EXT_image_dma_buf_import when a non-zero offset is specified. Tesing has been done for freedreno, and compile tested following gallium drivers: nouveau,svga,virgl,r600,r300,radeonsi,swrast,i915,ilo Signed-off-by: Stanimir Varbanov <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: Add a pipe cap for whether primitive restart works for patches.Kenneth Graunke2016-05-231-0/+1
| | | | | | | | | | | | | | | Some hardware supports primitive restart on patch primitives, and other hardware does not. Modern GL and ES include a query for this feature; adding a capability bit will allow us to answer it. As far as I know, AMD hardware does not support this feature, while NVIDIA and Intel hardware does. However, most Gallium drivers do not appear to support tessellation shaders yet. So, I've enabled it for nvc0 and disabled it everywhere else. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vc4: Size transfer temporary mappings appropriately for full maps of 3D.Eric Anholt2016-05-181-2/+2
| | | | | | | | | We don't really support reading/writing of 3D textures since the hardware doesn't do 3D, but we do need to make sure that a pipe_transfer for them has enough space to store the image. This was previously not a problem because the state tracker only mapped a slice at a time until fb9fe352ea41c7e3633ba2c483c59b73c529845b. Fixes glean glsl1 tests, which all have setup of a 3D texture at the start.
* vc4: Add support for vertex color clamping in the rasterizer.Eric Anholt2016-05-173-1/+6
| | | | | This gets us precompile of vertex shaders at the state tracker level as well.
* vc4: Move tgsi_to_nir to precompile time.Eric Anholt2016-05-171-12/+15
| | | | | Now we have an immutable nir shader in our shader's CSO that we can clone and lower/optimize.
* vc4: Mark the driver as supporting fragment color clamping in rast.Eric Anholt2016-05-171-1/+1
| | | | | | | We always clamp fragment colors, since they're always 8-bit unorm, so there's no need to have us compile separate shaders based on GL_ARB_color_buffer_float. This gives us precompilation of fragment programs to the vc4_shader_state_create() level.
* vc4: Enable sharing shaders across contexts.Eric Anholt2016-05-172-2/+3
| | | | | | This allows the same pipe_shader_state to be referenced from multiple contexts. Since our pipe_shader_state is treated as immutable (other than the variant number) within the driver, this is no problem.
* vc4: Switch to using nir_load_front_face.Eric Anholt2016-05-172-9/+18
| | | | | | | | This will be generated by glsl_to_nir, and it turns out that this is a more code-efficient path than the floating point math, anyway. No change on shader-db, but drops an instruction in piglit's glsl-fs-frontfacing.
* vc4: Drop the dead export_linkage array.Eric Anholt2016-05-171-5/+0
| | | | This came from deriving from freedreno.
* vc4: Fix a -Wformat-security warning.Eric Anholt2016-05-171-1/+1
| | | | | This is apparently enabled as an error in Android builds, and the compiler can't tell that the return value is safe.
* gallium: Add a pipe cap for arb_cull_distanceTobias Klausmann2016-05-141-0/+1
| | | | | | | | | This lets us safely enable or disable the extension as needed Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* vc4: Add support for loading immediate values in QIR.Eric Anholt2016-05-064-0/+32
| | | | | | | This will be used for resetting the uniform stream in the presence of branching, but may also be useful as an optimization to reduce how many uniforms we have to copy out per draw call (in exchange for increasing icache pressure).
* vc4: Make vc4_qpu_validate() produce more verbose failures.Eric Anholt2016-05-061-35/+71
| | | | | | Seeing the expansion of a QPU_GET_FIELD in an assert isn't very informative, and it's hard find what's going wrong without getting a dump of the instruction that failed.
* vc4: Add a small QIR validate pass.Eric Anholt2016-05-064-0/+127
| | | | | This has caught a couple of bugs during loop development so far, and I should probably have written it long ago.
* vc4: Fix the src count on exp2/log2.Eric Anholt2016-05-061-2/+2
| | | | Found by the upcoming QIR validate pass.
* vc4: Reuse QPU disasm's cond flags in QIR.Eric Anholt2016-05-063-27/+46
| | | | In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.
* vc4: When emitting an instruction to an existing temp, mark it non-SSA.Eric Anholt2016-05-061-0/+2
| | | | Prevents a bug in the later control-flow support series.
* vc4: Make sure that we don't overwrite the signal for PROG_END.Eric Anholt2016-05-061-0/+8
| | | | | | | | We should have already emitted a NOP due to the last instruction being a TLB or VPM write. However, if you disable dead code elimination then you might get dead code at the end, and that dead code might have the signal bits set to something non-default, at which point you die in assertion failure.
* vc4: fixup for new nir_foreach_block()Connor Abbott2016-05-054-48/+20
| | | | Reviewed-by: Eric Anholt <[email protected]>
* vc4: Use NIR lowering for sRGB decode.Eric Anholt2016-05-022-40/+3
| | | | | This should get us the same decode code generated, but with a lot less custom code in the driver.
* vc4: Just use NIR lowering for texture projection.Eric Anholt2016-05-021-15/+3
| | | | | This means doing Newton-Raphson on the RCP, but it's probably actually a good thing to be accurate on.
* vc4: Scalarize phi nodes as well.Eric Anholt2016-05-021-0/+1
| | | | | This makes fewer programs with loops assertion fail, replacing them with the rendering failure warning.
* vc4: Add whitespace after each program stage dump.Eric Anholt2016-05-022-0/+3
| | | | | In particular it's been hard to find the point where we switch from dumping pre-optimization QIR and post-optimization QIR.
* vc4: Remove the CSE pass.Eric Anholt2016-05-024-162/+0
| | | | | | It's not doing anything according to shader-db now that we're using NIR. It would have had to be reworked significantly anyway, to handle control flow.
* vc4: Emit only one FRAG_Z or FRAG_W QIR opcode.Eric Anholt2016-05-021-2/+19
| | | | | | We were generating piles of FRAG_W for interpolation, only to CSE them away immediately. Since this is the only thing that CSE is doing for us any more, just avoid making the CSE work necessary.
* vc4: Use the NIR cubemap normalization instead of our own.Eric Anholt2016-05-021-6/+1
| | | | | | | This is one of two uses of the current QIR CSE pass according to shader-db. The NIR pass means that we'll end up doing Newton-Raphson on our RCP, which we weren't doing before, but that's probably actually a good thing.
* vc4: Drop the support for DCE of texture instructions.Eric Anholt2016-05-021-22/+1
| | | | | Now that we're using NIR for our optimization, there's no need for this tricky code.
* nir: Switch the arguments to nir_foreach_functionJason Ekstrand2016-04-284-5/+5
| | | | | | | | | This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <[email protected]>
* nir: Switch the arguments to nir_foreach_instrJason Ekstrand2016-04-284-5/+5
| | | | | | | | | | | This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <[email protected]>
* nir: rename lower_flrp to lower_flrp32Samuel Iglesias Gonsálvez2016-04-281-1/+1
| | | | | | | A later patch will add lower_flrp64 option to NIR. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* vc4: Make sure we recompile when sample_mask changes.Eric Anholt2016-04-221-0/+1
| | | | | | | Part of fixing piglit EXT_framebuffer_multisample/sample-coverage inverted (there is also a bug with RCL tiled blits) Cc: "11.1 11.2" <[email protected]>
* vc4: Fix validation of full res tile offset if used for non-MSAA.Eric Anholt2016-04-223-2/+14
| | | | | | There's no reason we couldn't do non-MSAA full resolution tile buffer load/stores, but we would have claimed buffer overflow was being attempted. Nothing does this currently.
* vc4: Only do MSAA FB operations if the FB is MSAA.Eric Anholt2016-04-221-5/+8
| | | | | I noticed this as a problem with ET:QW traces emitting coverage code when the framebuffer was supposed to be single sampled.
* vc4: Fix tests for format supported with nr_samples == 1.Eric Anholt2016-04-221-3/+4
| | | | | | | | | | This was a bug from the MSAA enabling. Tests for surfaces with nr_samples==1 instead of 0 (generally GL renderbuffers) would incorrectly fail out. Fixes the ARB_framebuffer_sRGB piglit tests other than srgb_conformance. Cc: "11.1 11.2" <[email protected]>