aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4
Commit message (Collapse)AuthorAgeFilesLines
* Android: fix build break from nir/glsl move to compiler/Rob Herring2016-02-291-2/+4
| | | | | | | | | | | | | | | | Commits a39a8fbbaa12 ("nir: move to compiler/") and eb63640c1d38 ("glsl: move to compiler/") broke Android builds. Fix them. There is also a missing dependency between generated NIR headers and several libraries. This isn't a new issue, but seems to have been exposed by the NIR move. Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled. Cc: "11.2" <[email protected]> Cc: Mauro Rossi <[email protected]> Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_IMAGESIlia Mirkin2016-02-151-0/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* vc4: Add missing braces in initializerRhys Kidd2016-02-151-1/+1
| | | | | | | | | | | | Silences the following GCC warning: mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c: In function 'qir_schedule_instructions': mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c:578:16: warning: missing braces around initializer [-Wmissing-braces] struct schedule_state state = { 0 }; ^ Signed-off-by: Rhys Kidd <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* vc4: Correct typo setting 'handled_qinst_cond'Rhys Kidd2016-02-151-1/+1
| | | | | | | | | | | | | | | Variable was previously always set to true. Accordingly, the later assert() served no active purpose. Found with GCC warning and code inspection: mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c: In function'vc4_generate_code': mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c:315:22: warning: variable 'handled_qinst_cond' set but not used [-Wunused-but-set-variable] bool handled_qinst_cond = true; ^ Signed-off-by: Rhys Kidd <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* vc4: Don't treat conditional MOVs as raw MOV.Eric Anholt2016-02-151-0/+1
| | | | | | | The two consumers want to know that the destination will be exactly the source, which is not true if we might not set the destination. Signed-off-by: Eric Anholt <[email protected]>
* gallium: add PIPE_SHADER_CAP_SUPPORTED_IRSSamuel Pitoiset2016-02-131-0/+2
| | | | | | | | | | | | This cap indicates the supported representations of programs. It should be a mask of pipe_shader_ir bits. It will allow to enable ARB_compute_shader if the underlying driver supports TGSI. Changes from v2: - improve description of PIPE_SHADER_CAP_SUPPORTED_IRS Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir/tex_instr: Rename sampler to textureJason Ekstrand2016-02-092-5/+5
| | | | | | | | | We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <[email protected]>
* gallium: add interface for querying memory usage and sizes (v2)Marek Olšák2016-02-051-0/+1
| | | | | | | | | | If you're worried about the duplication of some CAPs, we can remove them later. v2: add fields for memory eviction stats Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* gallium: add PIPE_CAP_QUERY_BUFFER_OBJECTIlia Mirkin2016-02-041-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: Add PIPE_CAP_SURFACE_REINTERPRET_BLOCKSNicolai Hähnle2016-02-031-0/+1
| | | | | | | | | | This cap indicates whether pipe->create_surface can reinterpret a texture as a surface with a format of different block width/height (but equal block size). v2: fix whitespace Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: Add PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLYNicolai Hähnle2016-02-031-0/+1
| | | | | | | | | This cap indicates that the driver only supports R, RG, RGB and RGBA formats for PIPE_BUFFER sampler views. v2: move into "unsupported features" section for nouveau (Ilia Mirkin) Reviewed-by: Edward O'Callaghan <[email protected]>
* vc4: Throttle outstanding rendering after submission.Eric Anholt2016-01-271-0/+9
| | | | | | | | | | | Just make sure that after we've submitted, we get to at least 5 (global) submits ago before we go on to do more. Prevents up to seconds of lag with window movement in X with xcompmgr -c. There may be useful tuning to do in the future, but for now this gets us usability. Cc: "11.0 11.1" <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* vc4: Don't record the seqno of a failed job submit.Eric Anholt2016-01-271-2/+2
| | | | | | | | | On an error return, the returned seqno will probably be unset, so we'd lose track of what we've submitted so far for waiting on in the future. Cc: "11.0 11.1" <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* nir: move to compiler/Emil Velikov2016-01-266-7/+7
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]> Acked-by: Jose Fonseca <[email protected]>
* gallium: add GREMEDY_string_markerRob Clark2016-01-211-0/+1
| | | | | | | | | | Since the GREMEDY extensions are normally only exposed by the gremedy debugger (and could possibly trigger debug paths in the app), we don't expose the extension by default, but instead only with ST_DEBUG=gremedy. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/st: add pipe_context::generate_mipmap()Charmaine Lee2016-01-141-0/+1
| | | | | | | | | | | | | | | | This patch adds a new interface to support hardware mipmap generation. PIPE_CAP_GENERATE_MIPMAP is added to allow a driver to specify if this new interface is supported; if not supported, the state tracker will fallback to mipmap generation by rendering/texturing. v2: add PIPE_CAP_GENERATE_MIPMAP to the disabled section for all drivers v3: add format to the generate_mipmap interface to allow mipmap generation using a format other than the resource format v4: fix return type of trace_context_generate_mipmap() Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: add PIPE_CAP_INVALIDATE_BUFFERNicolai Hähnle2016-01-141-0/+1
| | | | | | | | | It makes sense to re-use pipe->invalidate_resource for the purpose of glInvalidateBufferData, but this function is already implemented in vc4 where it doesn't have the expected behavior. So add a capability flag to indicate that the driver supports the expected behavior. Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENTIlia Mirkin2016-01-081-13/+14
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERSIlia Mirkin2016-01-081-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add caps for POSITION and FACE system valuesMarek Olšák2016-01-081-0/+2
| | | | | | | v2: document the integer behavior Reviewed-by: Edward O'Callaghan <[email protected] Reviewed-by: Brian Paul <[email protected]>
* gallium: add caps to expose support for multi indirect drawsIlia Mirkin2016-01-071-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vc4: Fix driver build from last minute rebase fix.Eric Anholt2016-01-061-21/+20
| | | | | | | I had the driver all tested for the last series, and in my last build I noticed that get_swizzled_channel was unused now, and removed it... apparently without testing to find that I removed the wrong channel swizzle function.
* vc4: Optimize out a comparison for bcsel based on an ALU comparisonEric Anholt2016-01-061-14/+59
| | | | | | | | | | | | | | | | | We routinely have code like: vec1 ssa_220 = fge ssa_104, ssa_61 vec1 ssa_199 = bcsel ssa_220, ssa_106, ssa_105 and we would compare fge's args and choose between ~0 and 0 to generate ssa_220, then compare ssa_220 to 0 and choose between bcsel's args. Instead, try to notice the pattern and compare between fge's args to select between bcsel's args. total instructions in shared programs: 88019 -> 87574 (-0.51%) instructions in affected programs: 9985 -> 9540 (-4.46%) total estimated cycles in shared programs: 245752 -> 245237 (-0.21%) estimated cycles in affected programs: 17232 -> 16717 (-2.99%)
* vc4: Add missing sRGB decode to texel fetches.Eric Anholt2016-01-061-0/+5
| | | | | We only see txf on MSAA textures, currently, and apparently this didn't impact any of our piglit tests.
* vc4: Add support for GL_ARB_texture_swizzle.Eric Anholt2016-01-061-1/+1
| | | | | We already had the code supporting it, since it's needed for the depth mode when doing shadow comparisons.
* vc4: Use NIR texture lowering for texture swizzling.Eric Anholt2016-01-062-57/+63
| | | | | | | | | | | | | | We can't use its other features currently (mostly because we don't want Newton-Raphson on rcps for texture coordinates), but it gets us started. This eliminates some comparisons with constants in GLB2.7 and ETQW traces at the QIR level by moving the comparisons into NIR, where they get constant-folded out. instructions in affected programs: 165 -> 156 (-5.45%) total uniforms in shared programs: 32087 -> 32085 (-0.01%) total estimated cycles in shared programs: 245762 -> 245752 (-0.00%) estimated cycles in affected programs: 461 -> 451 (-2.17%)
* vc4: Replace the SSA-style SEL operators with conditional MOVs.Eric Anholt2016-01-066-201/+128
| | | | | | | | | | | | | I'm moving away from QIR being SSA (since NIR is doing lots of SSA optimization for us now) and instead having QIR just be QPU operations with virtual registers. By making our SELs be composed of two MOVs, we could potentially coalesce the registers for the MOV's src and dst and eliminate the MOV. total instructions in shared programs: 88448 -> 88028 (-0.47%) instructions in affected programs: 39845 -> 39425 (-1.05%) total estimated cycles in shared programs: 246306 -> 245762 (-0.22%) estimated cycles in affected programs: 162887 -> 162343 (-0.33%)
* vc4: Don't try the SF coalescing unless it's on a def.Eric Anholt2016-01-061-3/+3
| | | | | | If you want the SF of the value of a register produced from a series of packing MOVs or conditional MOVs, we can't just SF on the last MOV into the register.
* gallium/drivers: Remove unnecessary semicolonsEdward O'Callaghan2016-01-061-1/+1
| | | | | | Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H supportIlia Mirkin2016-01-031-0/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* vc4: Fix build from upload changes.Eric Anholt2016-01-021-1/+1
|
* u_upload_mgr: allow specifying PIPE_USAGE_* for the upload bufferMarek Olšák2016-01-021-1/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: remove alignment parameter from u_upload_createMarek Olšák2016-01-021-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_data manuallyMarek Olšák2016-01-021-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_alloc manuallyMarek Olšák2016-01-021-1/+1
| | | | | | | | | | The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_DRAW_PARAMETERSIlia Mirkin2015-12-301-0/+1
| | | | | | | | This allows the state tracker to know that the various draw parameters are available in vertex shaders. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* nir: Get rid of function overloadsJason Ekstrand2015-12-284-17/+17
| | | | | | | | | | | | | | | | | When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can hav a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]> ir3 bits are Reviewed-by: Rob Clark <[email protected]>
* vc4: Do instruction scheduling on the QIR to hide texture fetch latency.Eric Anholt2015-12-184-0/+624
| | | | | | | | | | | | | | | | | | | | This is a rewrite of vc4_opt_qpu_schedule.c to operate on QIR. Texture fetch can probably take as much as the rest of the cycles of the program, so it's important to hide our other cycles during it (which is hard to do after register allocation). Also, we can queue up multiple texture requests before collecting the resulting samples, so that we keep the texture unit busy more of the time. High-settings openarena performance +2.35849% +/- 0.221154% (n=7). Also about 2-3% on the multiarb demo. 8 piglit tests (ext_framebuffer_multisample accuracy depthstencil) go from failing in rendering to failing in register allocation, but hopefully I can fix that up with some better register pressure handling here. total instructions in shared programs: 87723 -> 88448 (0.83%) instructions in affected programs: 78411 -> 79136 (0.92%) total estimated cycles in shared programs: 276583 -> 246306 (-10.95%) estimated cycles in affected programs: 265691 -> 235414 (-11.40%)
* vc4: Fix latency handling for QPU texture scheduling.Eric Anholt2015-12-181-32/+50
| | | | | | There's only high latency between a complete texture fetch setup and collecting its result, not between each step of setting up the texture fetch request.
* vc4: Keep sample mask writes from being reordered after TLB writesEric Anholt2015-12-181-1/+2
| | | | | | Fixes a regression I noticed after introducing scheduling on the QIR. Cc: "11.1" <[email protected]>
* vc4: Add support for dumping executed commands to a file.Eric Anholt2015-12-153-0/+94
| | | | | | | | | | The VC4_DEBUG=cl,qpu is nice and all, but I want to be able to get more detailed dumps, and to replay the same exact commands in simulation. For that I need a dump with all of the VBOs, shaders, shader recs, etc. This dump can be parsed by vc4-gpu-tools. For now this is only doable from simulator mode, because otherwise we don't have access to the RCL contents generated by the kernel.
* vc4: Import updated vc4_drm.h with hang state.Eric Anholt2015-12-151-0/+45
|
* vc4: Only update vc4->msaa when the framebuffer changes.Eric Anholt2015-12-151-7/+0
| | | | | | Any update here should have been the same as in vc4_set_framebuffer_state(), except for the point where vc4_blit.c temporarily sets different state for its different buffers.
* vc4: Don't consider nr_samples==1 surfaces to be MSAA.Eric Anholt2015-12-156-21/+25
| | | | | | This is apparently a weirdness of gallium -- nr_samples==1 is occasionally used and means the same thing as nr_samples==0. Fixes a bunch of ARB_framebuffer_srgb blit cases in piglit.
* vc4: Fix min() wrapper definition for the simulator's kernel code.Eric Anholt2015-12-151-1/+1
|
* vc4: Warn instead of abort()ing on exec ioctl failures.Eric Anholt2015-12-151-3/+5
| | | | | | | | It's really harsh to abort() the X Server because of a momentary failure (particularly -ENOMEM). I don't see a way to pass an -ENOMEM up the stack from here, but we can at least log to stderr before proceeding on. Cc: "11.1" <[email protected]>
* vc4: Add quick algebraic optimization for clamping of unpacked values.Eric Anholt2015-12-111-0/+18
| | | | | | | | | | | | | | | | GL likes to saturate your incoming color, but if that color's coming from unpacking from unorms, there's no point. Ideally we'd have a range propagation pass that cleans these up in NIR, but that doesn't seem to be going to land soon. It seems like we could do a one-off optimization in nir_opt_algebraic, except that doesn't want to operate on expressions involving unpack_unorm_4x8, since it's sized. total instructions in shared programs: 87879 -> 87761 (-0.13%) instructions in affected programs: 6044 -> 5926 (-1.95%) total estimated cycles in shared programs: 349457 -> 349252 (-0.06%) estimated cycles in affected programs: 6172 -> 5967 (-3.32%) No SSPD on openarena (which had the biggest gains, in its VS/CSes), n=15.
* vc4: When doing algebraic optimization into a MOV, use the right MOV.Eric Anholt2015-12-111-1/+6
| | | | | If there were src unpacks, changing to the integer MOV instead of float (for example) would change the unpack operation.
* vc4: Fix handling of src packs on in qir_follow_movs().Eric Anholt2015-12-111-2/+8
| | | | | | The caller isn't going to expect it from a return, so it would probably get misinterpreted. If the caller had an unpack in its reg, that's fine, but don't lose track of it.
* vc4: Add missing progress note in opt_algebraic.Eric Anholt2015-12-111-0/+1
|