summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4
Commit message (Collapse)AuthorAgeFilesLines
* nir: add a bit_size parameter to nir_ssa_dest_initConnor Abbott2016-03-174-7/+7
| | | | | | | | | | | | | | | | | | | | | | v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago) v3: Fix tgsi (Samuel) v4: Fix nir_clone.c (Samuel) v5: Fix vc4 and freedreno (Iago) v6 (Sam) - Fix build errors in nir_lower_indirect_derefs - Use helper to get type size from nir_alu_type. Signed-off-by: Iago Toral Quiroga <[email protected]> Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: rename nir_const_value fields to include bitsize informationIago Toral Quiroga2016-03-172-8/+8
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: update opcode definitions for different bit sizesConnor Abbott2016-03-171-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some opcodes need explicit bitsizes, and sometimes we need to use the double version when constant folding. v2: fix output type for u2f (Iago) v3: do not change vecN opcodes to be float. The next commit will add infrastructure to enable 64-bit integer constant folding so this is isn't really necessary. Also, that created problems with source modifiers in some cases (Iago) v4 (Jason): - do not change bcsel to work in terms of floats - leave ldexp generic Squashed changes to handle different bit sizes when constant folding since otherwise we would break the build. v2: - Use the bit-size information from the opcode information if defined (Iago) - Use helpers to get type size and base type of nir_alu_type enum (Sam) - Do not fallback to sized types to guess bit-size information. (Jason) Squashed changes in i965 and gallium/nir drivers to support sized types. These functions should only see sized types, but we can't make that change until we make sure that nir uses the sized versions in all the relevant places. A later commit will address this. Signed-off-by: Iago Toral Quiroga <[email protected]> Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* vc4: Move discard handling to the condition flag.Eric Anholt2016-03-165-34/+29
| | | | | | | | | | | | | | | Now that the field exists in the instruction, we can make discards less special. As a bonus, that means that we should be able to merge some more .sf instructions together when we get around to that. This causes some scheduling changes, as it allows tlb_color_reads to be delayed past the discard condition setup. Since the tlb_color_read ends up later, this may mean performance improvements, but I haven't tested. total instructions in shared programs: 78114 -> 78035 (-0.10%) instructions in affected programs: 1922 -> 1843 (-4.11%) total estimated cycles in shared programs: 234318 -> 234329 (0.00%) estimated cycles in affected programs: 8200 -> 8211 (0.13%)
* vc4: Don't make a temporary for setting flags.Eric Anholt2016-03-161-1/+2
| | | | | | | | | The register allocator doesn't really do anything about the temp, so it doesn't seem like it should matter. However, the scheduler would think that a new def is being created. This doesn't change anything yet, but it avoids a bunch of regressions in the next commit.
* vc4: Add a safety check for setting flags.Eric Anholt2016-03-161-0/+3
| | | | | If a pack was on the src reg, should it be a float, int, or mul unpack? Just complain, instead.
* vc4: Reuse list_for_each_entry_safe_rev().Eric Anholt2016-03-161-6/+2
| | | | This didn't exist when I wrote the code.
* vc4: Coalesce instructions using VPM reads into the VPM read.Varad Gautam2016-03-153-7/+71
| | | | | | | | | | | | | | | This is done instead of copy propagating the VPM reads into the instructions using them, because VPM reads have to stay in order. shader-db results: total instructions in shared programs: 78509 -> 78114 (-0.50%) instructions in affected programs: 5203 -> 4808 (-7.59%) total estimated cycles in shared programs: 234670 -> 234318 (-0.15%) estimated cycles in affected programs: 5345 -> 4993 (-6.59%) Signed-off-by: Varad Gautam <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Tested-by: Rhys Kidd <[email protected]>
* vc4: rename file to group vpm optimizations togetherVarad Gautam2016-03-152-2/+2
| | | | | | | | This file will contain optimization passes for both vpm reads and writes. Signed-off-by: Varad Gautam <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vc4: Fix failures with nir_extract_* since the addition of the opcodes.Eric Anholt2016-03-151-0/+2
|
* gallium: add CAPs returning PCI device locationMarek Olšák2016-03-091-0/+4
| | | | Reviewed-by: Brian Paul <[email protected]>
* gallium: add external usage flags to resource_from(get)_handle (v2)Marek Olšák2016-03-091-1/+2
| | | | | | | | | This will allow drivers to make better decisions about texture sharing for DRI2, DRI3, Wayland, and OpenCL. v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER Reviewed-by: Axel Davy <[email protected]>
* Android: fix build break from nir/glsl move to compiler/Rob Herring2016-02-291-2/+4
| | | | | | | | | | | | | | | | Commits a39a8fbbaa12 ("nir: move to compiler/") and eb63640c1d38 ("glsl: move to compiler/") broke Android builds. Fix them. There is also a missing dependency between generated NIR headers and several libraries. This isn't a new issue, but seems to have been exposed by the NIR move. Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled. Cc: "11.2" <[email protected]> Cc: Mauro Rossi <[email protected]> Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_IMAGESIlia Mirkin2016-02-151-0/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* vc4: Add missing braces in initializerRhys Kidd2016-02-151-1/+1
| | | | | | | | | | | | Silences the following GCC warning: mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c: In function 'qir_schedule_instructions': mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c:578:16: warning: missing braces around initializer [-Wmissing-braces] struct schedule_state state = { 0 }; ^ Signed-off-by: Rhys Kidd <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* vc4: Correct typo setting 'handled_qinst_cond'Rhys Kidd2016-02-151-1/+1
| | | | | | | | | | | | | | | Variable was previously always set to true. Accordingly, the later assert() served no active purpose. Found with GCC warning and code inspection: mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c: In function'vc4_generate_code': mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c:315:22: warning: variable 'handled_qinst_cond' set but not used [-Wunused-but-set-variable] bool handled_qinst_cond = true; ^ Signed-off-by: Rhys Kidd <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* vc4: Don't treat conditional MOVs as raw MOV.Eric Anholt2016-02-151-0/+1
| | | | | | | The two consumers want to know that the destination will be exactly the source, which is not true if we might not set the destination. Signed-off-by: Eric Anholt <[email protected]>
* gallium: add PIPE_SHADER_CAP_SUPPORTED_IRSSamuel Pitoiset2016-02-131-0/+2
| | | | | | | | | | | | This cap indicates the supported representations of programs. It should be a mask of pipe_shader_ir bits. It will allow to enable ARB_compute_shader if the underlying driver supports TGSI. Changes from v2: - improve description of PIPE_SHADER_CAP_SUPPORTED_IRS Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir/tex_instr: Rename sampler to textureJason Ekstrand2016-02-092-5/+5
| | | | | | | | | We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <[email protected]>
* gallium: add interface for querying memory usage and sizes (v2)Marek Olšák2016-02-051-0/+1
| | | | | | | | | | If you're worried about the duplication of some CAPs, we can remove them later. v2: add fields for memory eviction stats Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* gallium: add PIPE_CAP_QUERY_BUFFER_OBJECTIlia Mirkin2016-02-041-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: Add PIPE_CAP_SURFACE_REINTERPRET_BLOCKSNicolai Hähnle2016-02-031-0/+1
| | | | | | | | | | This cap indicates whether pipe->create_surface can reinterpret a texture as a surface with a format of different block width/height (but equal block size). v2: fix whitespace Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: Add PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLYNicolai Hähnle2016-02-031-0/+1
| | | | | | | | | This cap indicates that the driver only supports R, RG, RGB and RGBA formats for PIPE_BUFFER sampler views. v2: move into "unsupported features" section for nouveau (Ilia Mirkin) Reviewed-by: Edward O'Callaghan <[email protected]>
* vc4: Throttle outstanding rendering after submission.Eric Anholt2016-01-271-0/+9
| | | | | | | | | | | Just make sure that after we've submitted, we get to at least 5 (global) submits ago before we go on to do more. Prevents up to seconds of lag with window movement in X with xcompmgr -c. There may be useful tuning to do in the future, but for now this gets us usability. Cc: "11.0 11.1" <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* vc4: Don't record the seqno of a failed job submit.Eric Anholt2016-01-271-2/+2
| | | | | | | | | On an error return, the returned seqno will probably be unset, so we'd lose track of what we've submitted so far for waiting on in the future. Cc: "11.0 11.1" <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* nir: move to compiler/Emil Velikov2016-01-266-7/+7
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]> Acked-by: Jose Fonseca <[email protected]>
* gallium: add GREMEDY_string_markerRob Clark2016-01-211-0/+1
| | | | | | | | | | Since the GREMEDY extensions are normally only exposed by the gremedy debugger (and could possibly trigger debug paths in the app), we don't expose the extension by default, but instead only with ST_DEBUG=gremedy. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/st: add pipe_context::generate_mipmap()Charmaine Lee2016-01-141-0/+1
| | | | | | | | | | | | | | | | This patch adds a new interface to support hardware mipmap generation. PIPE_CAP_GENERATE_MIPMAP is added to allow a driver to specify if this new interface is supported; if not supported, the state tracker will fallback to mipmap generation by rendering/texturing. v2: add PIPE_CAP_GENERATE_MIPMAP to the disabled section for all drivers v3: add format to the generate_mipmap interface to allow mipmap generation using a format other than the resource format v4: fix return type of trace_context_generate_mipmap() Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: add PIPE_CAP_INVALIDATE_BUFFERNicolai Hähnle2016-01-141-0/+1
| | | | | | | | | It makes sense to re-use pipe->invalidate_resource for the purpose of glInvalidateBufferData, but this function is already implemented in vc4 where it doesn't have the expected behavior. So add a capability flag to indicate that the driver supports the expected behavior. Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENTIlia Mirkin2016-01-081-13/+14
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERSIlia Mirkin2016-01-081-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add caps for POSITION and FACE system valuesMarek Olšák2016-01-081-0/+2
| | | | | | | v2: document the integer behavior Reviewed-by: Edward O'Callaghan <[email protected] Reviewed-by: Brian Paul <[email protected]>
* gallium: add caps to expose support for multi indirect drawsIlia Mirkin2016-01-071-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vc4: Fix driver build from last minute rebase fix.Eric Anholt2016-01-061-21/+20
| | | | | | | I had the driver all tested for the last series, and in my last build I noticed that get_swizzled_channel was unused now, and removed it... apparently without testing to find that I removed the wrong channel swizzle function.
* vc4: Optimize out a comparison for bcsel based on an ALU comparisonEric Anholt2016-01-061-14/+59
| | | | | | | | | | | | | | | | | We routinely have code like: vec1 ssa_220 = fge ssa_104, ssa_61 vec1 ssa_199 = bcsel ssa_220, ssa_106, ssa_105 and we would compare fge's args and choose between ~0 and 0 to generate ssa_220, then compare ssa_220 to 0 and choose between bcsel's args. Instead, try to notice the pattern and compare between fge's args to select between bcsel's args. total instructions in shared programs: 88019 -> 87574 (-0.51%) instructions in affected programs: 9985 -> 9540 (-4.46%) total estimated cycles in shared programs: 245752 -> 245237 (-0.21%) estimated cycles in affected programs: 17232 -> 16717 (-2.99%)
* vc4: Add missing sRGB decode to texel fetches.Eric Anholt2016-01-061-0/+5
| | | | | We only see txf on MSAA textures, currently, and apparently this didn't impact any of our piglit tests.
* vc4: Add support for GL_ARB_texture_swizzle.Eric Anholt2016-01-061-1/+1
| | | | | We already had the code supporting it, since it's needed for the depth mode when doing shadow comparisons.
* vc4: Use NIR texture lowering for texture swizzling.Eric Anholt2016-01-062-57/+63
| | | | | | | | | | | | | | We can't use its other features currently (mostly because we don't want Newton-Raphson on rcps for texture coordinates), but it gets us started. This eliminates some comparisons with constants in GLB2.7 and ETQW traces at the QIR level by moving the comparisons into NIR, where they get constant-folded out. instructions in affected programs: 165 -> 156 (-5.45%) total uniforms in shared programs: 32087 -> 32085 (-0.01%) total estimated cycles in shared programs: 245762 -> 245752 (-0.00%) estimated cycles in affected programs: 461 -> 451 (-2.17%)
* vc4: Replace the SSA-style SEL operators with conditional MOVs.Eric Anholt2016-01-066-201/+128
| | | | | | | | | | | | | I'm moving away from QIR being SSA (since NIR is doing lots of SSA optimization for us now) and instead having QIR just be QPU operations with virtual registers. By making our SELs be composed of two MOVs, we could potentially coalesce the registers for the MOV's src and dst and eliminate the MOV. total instructions in shared programs: 88448 -> 88028 (-0.47%) instructions in affected programs: 39845 -> 39425 (-1.05%) total estimated cycles in shared programs: 246306 -> 245762 (-0.22%) estimated cycles in affected programs: 162887 -> 162343 (-0.33%)
* vc4: Don't try the SF coalescing unless it's on a def.Eric Anholt2016-01-061-3/+3
| | | | | | If you want the SF of the value of a register produced from a series of packing MOVs or conditional MOVs, we can't just SF on the last MOV into the register.
* gallium/drivers: Remove unnecessary semicolonsEdward O'Callaghan2016-01-061-1/+1
| | | | | | Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H supportIlia Mirkin2016-01-031-0/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* vc4: Fix build from upload changes.Eric Anholt2016-01-021-1/+1
|
* u_upload_mgr: allow specifying PIPE_USAGE_* for the upload bufferMarek Olšák2016-01-021-1/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: remove alignment parameter from u_upload_createMarek Olšák2016-01-021-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_data manuallyMarek Olšák2016-01-021-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_alloc manuallyMarek Olšák2016-01-021-1/+1
| | | | | | | | | | The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_DRAW_PARAMETERSIlia Mirkin2015-12-301-0/+1
| | | | | | | | This allows the state tracker to know that the various draw parameters are available in vertex shaders. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* nir: Get rid of function overloadsJason Ekstrand2015-12-284-17/+17
| | | | | | | | | | | | | | | | | When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can hav a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]> ir3 bits are Reviewed-by: Rob Clark <[email protected]>
* vc4: Do instruction scheduling on the QIR to hide texture fetch latency.Eric Anholt2015-12-184-0/+624
| | | | | | | | | | | | | | | | | | | | This is a rewrite of vc4_opt_qpu_schedule.c to operate on QIR. Texture fetch can probably take as much as the rest of the cycles of the program, so it's important to hide our other cycles during it (which is hard to do after register allocation). Also, we can queue up multiple texture requests before collecting the resulting samples, so that we keep the texture unit busy more of the time. High-settings openarena performance +2.35849% +/- 0.221154% (n=7). Also about 2-3% on the multiarb demo. 8 piglit tests (ext_framebuffer_multisample accuracy depthstencil) go from failing in rendering to failing in register allocation, but hopefully I can fix that up with some better register pressure handling here. total instructions in shared programs: 87723 -> 88448 (0.83%) instructions in affected programs: 78411 -> 79136 (0.92%) total estimated cycles in shared programs: 276583 -> 246306 (-10.95%) estimated cycles in affected programs: 265691 -> 235414 (-11.40%)