summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4
Commit message (Collapse)AuthorAgeFilesLines
* vc4: Make sure we recompile when sample_mask changes.Eric Anholt2016-04-221-0/+1
| | | | | | | Part of fixing piglit EXT_framebuffer_multisample/sample-coverage inverted (there is also a bug with RCL tiled blits) Cc: "11.1 11.2" <[email protected]>
* vc4: Fix validation of full res tile offset if used for non-MSAA.Eric Anholt2016-04-223-2/+14
| | | | | | There's no reason we couldn't do non-MSAA full resolution tile buffer load/stores, but we would have claimed buffer overflow was being attempted. Nothing does this currently.
* vc4: Only do MSAA FB operations if the FB is MSAA.Eric Anholt2016-04-221-5/+8
| | | | | I noticed this as a problem with ET:QW traces emitting coverage code when the framebuffer was supposed to be single sampled.
* vc4: Fix tests for format supported with nr_samples == 1.Eric Anholt2016-04-221-3/+4
| | | | | | | | | | This was a bug from the MSAA enabling. Tests for surfaces with nr_samples==1 instead of 0 (generally GL renderbuffers) would incorrectly fail out. Fixes the ARB_framebuffer_sRGB piglit tests other than srgb_conformance. Cc: "11.1 11.2" <[email protected]>
* vc4: Don't try to blit from MSAA surfaces with mismatched width to dst.Eric Anholt2016-04-221-11/+14
| | | | | | | | | I had made the previous blit fix non-MSAA only because I was thinking about how the hardware infers stride from the RENDERING_CONFIG packet. However, I'm also inferring the stride for both MSAA src and dst in vc4_render_cl.c from the width argument in the ioctl. Fixes 15 EXT_framebuffer_multisample piglit tests.
* gallium: add bool return to pipe_context::end_queryNicolai Hähnle2016-04-211-1/+2
| | | | | | | | | Even when begin_query succeeds, there can still be failures in query handling. For example for radeon, additional buffers may have to be allocated when queries span multiple command buffers. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*Marek Olšák2016-04-224-16/+16
| | | | | | | | Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle. Acked-by: Jose Fonseca <[email protected]>
* nir: rename nir_foreach_block*() to nir_foreach_block*_call()Connor Abbott2016-04-204-4/+4
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* vc4: Fix fbo-generatemipmap-formats for NPOT.Eric Anholt2016-04-181-0/+20
| | | | | | | Single-sampled texture miplevels > 1 are stored in POT-aligned areas, but we only get one value to control the stride of the src and dst for single sampled buffers. A RCL tile blit from level != 1 to level == 0 would therefore load from the wrong stride.
* vc4: Remove unused "immediates" fieldEric Anholt2016-04-181-1/+0
| | | | This was for TGSI, which we no longer have to deal with.
* vc4: Add support for rendering to cube map surfaces.Eric Anholt2016-04-181-1/+2
| | | | | | | We need to fix up the offset to point at the face of the cube. Fixes piglit fbo-cubemap, copyteximage CUBE, and glean's fbo test. Cc: "11.1 11.2" <[email protected]>
* vc4: Don't flush on read-only access of buffers read by the CL.Eric Anholt2016-04-183-7/+16
| | | | | | Fixes piglit mixed-immediate-and-vbo, and may significantly improve performance of applications that store a 4-byte IB in the same VBO as vertex data.
* vc4: Sanity check that flushes don't happen between state emit and draw.Eric Anholt2016-04-181-0/+7
| | | | | | Catches the cause of failure in arb_vertex_buffer_object-mixed-immediate-and-vbo, I've had this class of failure before, and it probably won't be the last time.
* vc4: Sanity check strides for imported BOs.Eric Anholt2016-04-181-5/+18
| | | | | | | If we're going to sample from or render to them at some particular size, we'd better make sure that they actually are that size. Causes some tests under simulation to generate appropriate error messages instead of failures.
* vc4: Fix subimage accesses to LT textures.Eric Anholt2016-04-151-4/+4
| | | | | | | | | | | | | This code started out like the T case, iterating over utile offsets, but I had partially switched it to iterating over pixel offsets. I hadn't caught this before because it's unusual to do piecemeal uploads to small textures. Fixes bad text rendering in QT5 apps, which use a 256x16 glyph cache. Also fixes 6 piglit tests related to glTexSubImage() and glGetTexSubImage(). Cc: "11.1 11.2" <[email protected]>
* nir/dead_variables: Configurably work with any variable modeJason Ekstrand2016-04-131-1/+1
| | | | | | | The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone. Reviewed-by: Kenneth Graunke <[email protected]>
* vc4: Work around hardware limits on the number of verts in a single draw.Eric Anholt2016-04-121-18/+92
| | | | | Fixes rendering failures in glmark2's refract and bump:render-mode=high-poly demos, and partially in its terrain demo.
* gallium: Add capability for ARB_robust_buffer_access_behavior.Bas Nieuwenhuizen2016-04-121-0/+1
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add pipe_context::set_active_query_state for pausing queriesMarek Olšák2016-04-121-0/+6
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* vc4: Move FRAG_X/Y/REV_FLAG to a QFILE like VPM or TLB color writes.Eric Anholt2016-04-084-27/+29
| | | | | This gives us one less set of special instruction generation cases, and instead just the case for returning the correct register to read.
* vc4: Allow TLB Z/color/stencil writes from any ALU operation in QIR.Eric Anholt2016-04-085-65/+100
| | | | | | | | This lets us write the Z directly from the FTOI for computed Z, and may let us coalesce color writes in the future. No change in my shader-db, but clearly drops an instruction in piglit's early-z test.
* vc4: Add a helper function for the construction of qregs.Eric Anholt2016-04-084-12/+13
| | | | | The separate declaration of the struct is not helping clarity, and I was going to be writing a whole lot more of these in the upcoming patches.
* vc4: Add missing scheduling dependency for MS color writes.Eric Anholt2016-04-081-0/+1
|
* vc4: Drop the multi_instruction distinction for QIR instructions.Eric Anholt2016-04-082-14/+5
| | | | | | | It wasn't correctly flagged everywhere, and QPU generation now handles the only remaining case that was paying attention to it. No change on shader-db.
* vc4: Handle SF on instructions that write r4.Eric Anholt2016-04-081-10/+14
| | | | | | | Normal SFU writes couldn't have SF because they were marked as multi_instruction, but tex_result and tlb_color_read weren't. This ended up not being a problem according to anything in shader-db, but it seems possible.
* vc4: Allow multi-instruction QIR nodes to get VPM optimization.Eric Anholt2016-04-081-2/+2
| | | | | | | | | | | There used to be multi-instruction operations that would use src[] twice, which is why we couldn't do some optimizations on them. This is no longer the case. total instructions in shared programs: 77973 -> 77969 (-0.01%) instructions in affected programs: 84 -> 80 (-4.76%) total estimated cycles in shared programs: 234165 -> 234157 (-0.00%) estimated cycles in affected programs: 92 -> 84 (-8.70%)
* vc4: Switch to using NIR_PASS macros.Eric Anholt2016-04-085-33/+32
| | | | This gets us better validation of our NIR transformations.
* vc4: Handle nir_intrinsic_load_user_clip_plane as a vec4.Eric Anholt2016-04-082-20/+12
| | | | | | | | I liked having all my NIR be scalar, but nir_validate() complains that the intrinsic writes 4 components but the destination we set up was only 1 component. I could generate a new scalar variant, but it's a lot easier to just leave it as a vec4. This doesn't hurt codegen since we GC unused uniforms, and UCP dot products use all the components anyway.
* vc4: Emit a warning and proceed for handling loops in NIR.Rhys Kidd2016-04-081-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | We don't really suppor control flow yet, but it's a lot nicer to render something and warn on stderr than to crash. Fixes the following piglit tests: - shaders/complex-loop-analysis-bug - shaders/glsl-fs-discard-04 Converts the following piglit tests from crash to fail: - shaders/glsl-fs-continue-inside-do-while - shaders/glsl-fs-loop - shaders/glsl-fs-loop-continue - shaders/glsl-fs-loop-nested - shaders/glsl-texcoord-array - shaders/glsl-vs-continue-inside-do-while - shaders/glsl-vs-loop - shaders/glsl-vs-loop-continue - shaders/glsl-vs-loop-nested No piglit regressions. v2 (Eric): Add stronger stderr warning. Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vc4: Add a stub for NIR->QIR of control flow function nodesRhys Kidd2016-04-081-0/+11
| | | | | | | | We shouldn't have any NIR functions present since all GLSL functions get inlined, but this would be a more informative error if it does happen. Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vc4: Add better debug of NIR->QIR control flow graph failureRhys Kidd2016-04-081-1/+2
| | | | | | | | | | | | | Ensure NIR control flow graph nodes that are unhandled in QIR are reported with sufficient verbosity to aid debugging. This improves piglit outputs, amongst other tools. There are no other remaining uses of assert(0) as a blunt tool within vc4. Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vc4: Remove unused include from vc4_program.cRhys Kidd2016-04-081-1/+0
| | | | | | | | Found with grep and inspection. Test compiled on RPi hw. Assists any future effort to remove TGSI as an intermediate stage. Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENTEdward O'Callaghan2016-04-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add PIPE_CAP to determine if the GL extension 'GL_ARB_framebuffer_no_attachments' shall be supported. The driver is required to support 'PIPE_FORMAT_NONE' via its 'is_format_supported()' callback in order to determine the MSAA modes the hardware supports so that values requested from the application using 'GL_ARB_framebuffer_no_attachments' may be quantized to what the hardware expects. V.2: Fix doc for a more detailed description of the PIPE_CAP and the corresponding GL constant. V.3: Renamed and repurposed once again. V.4: Remove CAP from cap_mapping array. [airlied: fix damaged whitespace] Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* vc4: Remove unused include from vc4_nir_lower_txf_ms.cRhys Kidd2016-03-281-1/+0
| | | | | | | | Found with grep and inspection. Test compiled on RPi hw. Assists any future effort to remove TGSI as an intermediate stage. Signed-off-by: Rhys Kidd <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* nir: add a bit_size parameter to nir_ssa_dest_initConnor Abbott2016-03-174-7/+7
| | | | | | | | | | | | | | | | | | | | | | v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago) v3: Fix tgsi (Samuel) v4: Fix nir_clone.c (Samuel) v5: Fix vc4 and freedreno (Iago) v6 (Sam) - Fix build errors in nir_lower_indirect_derefs - Use helper to get type size from nir_alu_type. Signed-off-by: Iago Toral Quiroga <[email protected]> Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: rename nir_const_value fields to include bitsize informationIago Toral Quiroga2016-03-172-8/+8
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: update opcode definitions for different bit sizesConnor Abbott2016-03-171-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some opcodes need explicit bitsizes, and sometimes we need to use the double version when constant folding. v2: fix output type for u2f (Iago) v3: do not change vecN opcodes to be float. The next commit will add infrastructure to enable 64-bit integer constant folding so this is isn't really necessary. Also, that created problems with source modifiers in some cases (Iago) v4 (Jason): - do not change bcsel to work in terms of floats - leave ldexp generic Squashed changes to handle different bit sizes when constant folding since otherwise we would break the build. v2: - Use the bit-size information from the opcode information if defined (Iago) - Use helpers to get type size and base type of nir_alu_type enum (Sam) - Do not fallback to sized types to guess bit-size information. (Jason) Squashed changes in i965 and gallium/nir drivers to support sized types. These functions should only see sized types, but we can't make that change until we make sure that nir uses the sized versions in all the relevant places. A later commit will address this. Signed-off-by: Iago Toral Quiroga <[email protected]> Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* vc4: Move discard handling to the condition flag.Eric Anholt2016-03-165-34/+29
| | | | | | | | | | | | | | | Now that the field exists in the instruction, we can make discards less special. As a bonus, that means that we should be able to merge some more .sf instructions together when we get around to that. This causes some scheduling changes, as it allows tlb_color_reads to be delayed past the discard condition setup. Since the tlb_color_read ends up later, this may mean performance improvements, but I haven't tested. total instructions in shared programs: 78114 -> 78035 (-0.10%) instructions in affected programs: 1922 -> 1843 (-4.11%) total estimated cycles in shared programs: 234318 -> 234329 (0.00%) estimated cycles in affected programs: 8200 -> 8211 (0.13%)
* vc4: Don't make a temporary for setting flags.Eric Anholt2016-03-161-1/+2
| | | | | | | | | The register allocator doesn't really do anything about the temp, so it doesn't seem like it should matter. However, the scheduler would think that a new def is being created. This doesn't change anything yet, but it avoids a bunch of regressions in the next commit.
* vc4: Add a safety check for setting flags.Eric Anholt2016-03-161-0/+3
| | | | | If a pack was on the src reg, should it be a float, int, or mul unpack? Just complain, instead.
* vc4: Reuse list_for_each_entry_safe_rev().Eric Anholt2016-03-161-6/+2
| | | | This didn't exist when I wrote the code.
* vc4: Coalesce instructions using VPM reads into the VPM read.Varad Gautam2016-03-153-7/+71
| | | | | | | | | | | | | | | This is done instead of copy propagating the VPM reads into the instructions using them, because VPM reads have to stay in order. shader-db results: total instructions in shared programs: 78509 -> 78114 (-0.50%) instructions in affected programs: 5203 -> 4808 (-7.59%) total estimated cycles in shared programs: 234670 -> 234318 (-0.15%) estimated cycles in affected programs: 5345 -> 4993 (-6.59%) Signed-off-by: Varad Gautam <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Tested-by: Rhys Kidd <[email protected]>
* vc4: rename file to group vpm optimizations togetherVarad Gautam2016-03-152-2/+2
| | | | | | | | This file will contain optimization passes for both vpm reads and writes. Signed-off-by: Varad Gautam <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vc4: Fix failures with nir_extract_* since the addition of the opcodes.Eric Anholt2016-03-151-0/+2
|
* gallium: add CAPs returning PCI device locationMarek Olšák2016-03-091-0/+4
| | | | Reviewed-by: Brian Paul <[email protected]>
* gallium: add external usage flags to resource_from(get)_handle (v2)Marek Olšák2016-03-091-1/+2
| | | | | | | | | This will allow drivers to make better decisions about texture sharing for DRI2, DRI3, Wayland, and OpenCL. v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER Reviewed-by: Axel Davy <[email protected]>
* Android: fix build break from nir/glsl move to compiler/Rob Herring2016-02-291-2/+4
| | | | | | | | | | | | | | | | Commits a39a8fbbaa12 ("nir: move to compiler/") and eb63640c1d38 ("glsl: move to compiler/") broke Android builds. Fix them. There is also a missing dependency between generated NIR headers and several libraries. This isn't a new issue, but seems to have been exposed by the NIR move. Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled. Cc: "11.2" <[email protected]> Cc: Mauro Rossi <[email protected]> Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_IMAGESIlia Mirkin2016-02-151-0/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* vc4: Add missing braces in initializerRhys Kidd2016-02-151-1/+1
| | | | | | | | | | | | Silences the following GCC warning: mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c: In function 'qir_schedule_instructions': mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c:578:16: warning: missing braces around initializer [-Wmissing-braces] struct schedule_state state = { 0 }; ^ Signed-off-by: Rhys Kidd <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* vc4: Correct typo setting 'handled_qinst_cond'Rhys Kidd2016-02-151-1/+1
| | | | | | | | | | | | | | | Variable was previously always set to true. Accordingly, the later assert() served no active purpose. Found with GCC warning and code inspection: mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c: In function'vc4_generate_code': mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c:315:22: warning: variable 'handled_qinst_cond' set but not used [-Wunused-but-set-variable] bool handled_qinst_cond = true; ^ Signed-off-by: Rhys Kidd <[email protected]> Signed-off-by: Eric Anholt <[email protected]>