path: root/src/gallium/drivers/v3d
Commit message | Author | Age | Files | Lines
* v3d: Use nir_remove_unused_io_vars to handle binner shader output DCE | Eric Anholt | 2018-10-30 | 1 | -1/+1
  We were doing this late after nir_lower_io, but we can just reuse the core code. By doing it at this stage, we won't even set up the VS attributes as inputs, reducing our VPM size.
* v3d: Use nir_lower_io_to_scalar_early to DCE unused VS input components. | Eric Anholt | 2018-10-30 | 1 | -1/+4
  This lets us trim unused trailing components in the vertex attributes, reducing the size of our VPM allocations.
* util: use C99 declaration in the for-loop set_foreach() macro | Eric Engestrom | 2018-10-25 | 1 | -2/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* util: use C99 declaration in the for-loop hash_table_foreach() macro | Eric Engestrom | 2018-10-25 | 3 | -9/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
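
The two util changes above move the iterator declaration into the foreach macro itself, which would explain why the v3d side of each commit only removes lines. A minimal sketch of the pattern follows. It uses a hypothetical `struct node` list and a hypothetical `node_foreach()` macro, not Mesa's actual `set_foreach()` or `hash_table_foreach()` definitions:

```c
#include <stddef.h>

/* Hypothetical list node for illustration; not Mesa's set or hash table entry. */
struct node {
   int value;
   struct node *next;
};

/* C99 style: the macro declares the iterator inside the for statement,
 * so callers no longer need a separate declaration before the loop.
 */
#define node_foreach(head, it) \
   for (struct node *it = (head); it != NULL; it = it->next)

/* Caller side: no "struct node *n;" line is needed any more. */
static int
sum_nodes(struct node *head)
{
   int sum = 0;
   node_foreach(head, n)
      sum += n->value;
   return sum;
}
```

With this shape, the iterator's scope ends with the loop, so a stale pointer cannot be reused by accident afterwards.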
* v3d: Add support for hardware pack/unpack of half floats. | Eric Anholt | 2018-10-15 | 1 | -0/+1
  Cuts the formerly 7-minute simulation time of fs-packHalf2x16.shader_test in half.
* gallium/ttn: Convert inputs and outputs to derefs of variables. | Eric Anholt | 2018-10-15 | 1 | -3/+4
  This means that TTN shaders more closely resemble GTN shaders: they have inputs and outputs as variable derefs, with the variables having their .driver_location already set up for you.
  This will be useful for v3d to do input variable DCE in NIR, which we can't do when the TTN shaders never have a pre-nir_lower_io stage.
  Acked-by: Rob Clark <[email protected]>
* gallium/u_transfer_helper: Add support for separate Z24/S8 as well. | Kenneth Graunke | 2018-10-14 | 1 | -1/+2
  u_transfer_helper already had code to handle treating packed Z32_S8 as separate Z32_FLOAT and S8_UINT resources, since some drivers can't handle that interleaved format natively.
  Other hardware needs depth and stencil as separate resources for all formats. For example, V3D3 needs this for 24-bit depth as well.
  This patch adds a new flag to lower all depth/stencil formats, and implements support for Z24_UNORM_S8_UINT. (S8_UINT_Z24_UNORM is left as an exercise to the reader, preferably someone who has access to a machine that uses that format.)
  Reviewed-by: Eric Anholt <[email protected]>
* v3d: Switch from FLUSH_ALL_STATE to FLUSH for ending our bin CLs. | Eric Anholt | 2018-09-17 | 1 | -6/+6
  The HW for FLUSH_ALL_STATE isn't validated, since the closed driver only uses FLUSH. Now that we don't have any new state at the end of our bin CLs, follow their lead.
* v3d: Stop clearing the OQ state at the end of the job. | Eric Anholt | 2018-09-17 | 3 | -18/+1
  Ever since we added OQ support, we've been clearing OQ state at the start of the job anyway. We're intentionally breaking old-and-new-driver-mix systems, because we need to stop using the unvalidated FLUSH_ALL_STATE.
* v3d: Always emit a TF disable at the start of drawing on V3D 4.x. | Eric Anholt | 2018-09-17 | 3 | -10/+8
  The HW's FLUSH_ALL_STATE is not validated, so we probably shouldn't use it, meaning that we need to reset state at the start. By doing this, we also make ourselves more resilient to another client leaving the TF state enabled at the end of their batch (as we now do, ourselves).
  However, we still need to emit a single TF disable at the end of the frame, for SWVC5-718.
* v3d: Fix setup of the VCM cache size. | Eric Anholt | 2018-09-07 | 1 | -1/+1
  There were two bugs working together to make things mostly work: I wasn't dividing the VPM output size available by the size of a batch (vertex), but I also had the size of the VPM reduced by a factor of 8.
  Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and, it seems, my intermittent varying failures as well.
  Fixes: 1561e4984eb0 ("v3d: Emit the VCM_CACHE_SIZE packet.")
* v3d: Fix SRC_ALPHA_SATURATE blending for RTs without alpha. | Eric Anholt | 2018-09-07 | 1 | -1/+3
  Fixes dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.dst.src_alpha_saturate_src_alpha_saturate and friends with --deqp-egl-config-name=rgb565d0s0
  Cc: "18.2" <[email protected]>
* v3d: Drop a bunch of duplicated gallium PIPE_CAP default code. | Eric Anholt | 2018-09-04 | 1 | -151/+0
  Now that we have the util function for the default values, we can get rid of the boilerplate.
  v2: Rebase on new gallium caps
* gallium: Add a helper for implementing PIPE_CAP_* default values. | Eric Anholt | 2018-09-04 | 1 | -2/+2
  One of the pains of implementing a gallium driver is filling in a million pipe caps you don't know about yet when you're just starting out. One of the pains of working on gallium is copy-and-pasting your new PIPE_CAP into each driver. We can fix both of these by having each driver call into the default helper from their default case, so that both sides can ignore each other until they need to.
  v2: fix i915g build, revert swr change to avoid breaking scons build (https://travis-ci.org/anholt/mesa/jobs/419739857)
  v3: Rebase on 3 new gallium caps.
  Reviewed-by: Marek Olšák <[email protected]> (v1)
  Cc: Bruce Cherniak <[email protected]>
  Cc: George Kyriazis <[email protected]>
  Cc: Kenneth Graunke <[email protected]>
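
The "call into the default helper from the default case" arrangement described above can be sketched roughly as follows. The `demo_cap` enum, its values, and both function names are hypothetical stand-ins chosen for illustration; they are not gallium's real `enum pipe_cap` entries or the actual helper's signature:

```c
/* Hypothetical caps for illustration; the real list is gallium's PIPE_CAP_* enum. */
enum demo_cap {
   DEMO_CAP_NPOT_TEXTURES,
   DEMO_CAP_MAX_RENDER_TARGETS,
   DEMO_CAP_SOME_NEW_FEATURE, /* added later by someone working on core code */
};

/* Shared helper: one place that knows a conservative default for every cap. */
static int
demo_get_param_defaults(enum demo_cap cap)
{
   switch (cap) {
   case DEMO_CAP_MAX_RENDER_TARGETS:
      return 1;
   default:
      return 0;
   }
}

/* Driver side: list only the caps the driver overrides and defer the rest. */
static int
demo_driver_get_param(enum demo_cap cap)
{
   switch (cap) {
   case DEMO_CAP_NPOT_TEXTURES:
      return 1;
   case DEMO_CAP_MAX_RENDER_TARGETS:
      return 4;
   default:
      return demo_get_param_defaults(cap);
   }
}
```

In this sketch, adding `DEMO_CAP_SOME_NEW_FEATURE` required no change to the driver function, which is the decoupling the commit message is after.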
* gallium: Split out PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE. | Kenneth Graunke | 2018-08-24 | 1 | -0/+1
  Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER. Drivers for such hardware would like to advertise support for ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp.
  This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit, changes the extension enable to be based on that, and enables it in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP (so they continue supporting this mode).
* gallium: add PIPE_CAP_MAX_SHADER_BUFFER_SIZE | Marek Olšák | 2018-08-23 | 1 | -0/+2
  Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_MAX_GS_INVOCATIONS | Marek Olšák | 2018-08-23 | 1 | -0/+1
  Tested-by: Dieter Nützel <[email protected]>
* v3d: Emit the VCM_CACHE_SIZE packet. | Eric Anholt | 2018-08-06 | 2 | -0/+9
  This is needed to ensure that we don't get blocked waiting for VPM space with bin/render overlapping.
  Cc: "18.2" <[email protected]>
* v3d: Drop "VC5" from the renderer string. | Eric Anholt | 2018-08-06 | 1 | -1/+1
  VC5 isn't a useful name any more, just stick to v3d.
* gallium: add storage_sample_count parameter into is_format_supported | Marek Olšák | 2018-07-31 | 1 | -0/+4
  Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_FRAMEBUFFER_MSAA_CONSTRAINTS | Marek Olšák | 2018-07-31 | 1 | -0/+1
  Tested-by: Dieter Nützel <[email protected]>
* v3d: Include commands to run the BCL and RCL in CLIF dumps. | Eric Anholt | 2018-07-30 | 1 | -10/+1
* v3d: Rename "configuration" and "config" in the XML to "cfg" | Eric Anholt | 2018-07-30 | 4 | -30/+33
  This matches what CLIF parsing expects, and makes TILE_BINNING_MODE_CONFIGURATION_COMMON_CONFIGURATION into a much more legible TILE_BINNING_MODE_CFG_COMMON.
* v3d: s/colour/color in the XML. | Eric Anholt | 2018-07-30 | 3 | -20/+20
  The CLIF format expects American English spelling, and so does the rest of Mesa. I was previously adhering to the spec's spelling, which is counterproductive.
* v3d: Rename primitives to prims in the XML to match CLIF names. | Eric Anholt | 2018-07-30 | 2 | -5/+5
  This makes us match up with the V3D HW team's names a bit more.
* v3d: Add a separate flag for CLIF ABI output versus human-readable CLs. | Eric Anholt | 2018-07-30 | 1 | -2/+3
  A few of the upcoming changes would make the V3D_DEBUG=cl output less readable, so let's make proper CLIF file production be under a separate V3D_DEBUG=clif flag.
* v3d: Add pack header support for f187 values. | Eric Anholt | 2018-07-30 | 2 | -15/+5
  V3D only has one of these (the top 16 bits of a float32) left in its CLs, but VC4 had many more. This gets us proper pretty-printing of the values instead of a large uint.
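
Since an f187 value is described above as the top 16 bits of a float32 (1 sign bit, 8 exponent bits, 7 mantissa bits), the conversion can be sketched in a few lines. The function names here are hypothetical and are not the ones used in Mesa's pack headers, and truncating the low mantissa bits instead of rounding is an assumption of this sketch:

```c
#include <stdint.h>
#include <string.h>

/* Keep only the top 16 bits of the float32 bit pattern.
 * Assumption: the dropped mantissa bits are truncated, not rounded.
 */
static uint16_t
demo_float_to_f187(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits)); /* type-pun without aliasing issues */
   return (uint16_t)(bits >> 16);
}

/* Expand back to float32 by zero-filling the low 16 mantissa bits. */
static float
demo_f187_to_float(uint16_t v)
{
   uint32_t bits = (uint32_t)v << 16;
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}
```

Decoding in this way is what lets a dump tool print a field as a float rather than as a large unsigned integer.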
* v3d: Move depth offset packet setup to CSO creation time. | Eric Anholt | 2018-07-30 | 4 | -33/+34
  This should be some simpler memcpying at draw time, and makes the next change easier.
* v3d: Block bin on render when doing vertex texturing. | Eric Anholt | 2018-07-29 | 1 | -0/+14
  The kernel by default serializes the BCL on previous BCLs submitted on this FD, but not RCLs. For now this fix is conservative and blocks on last RCL if any vertex texturing is done, which fails to get bin/render overlap if there was an intermediate job that doesn't draw to the BCL's buffer. I've dropped a perf_debug() in here to note that as a potential future improvement.
  Fixes intermittent failures in KHR-GLES3.copy_tex_image_conversions.required.*
* v3d: Move clif dumping to a separate step from noting where the CLs are. | Eric Anholt | 2018-07-27 | 1 | -0/+2
  Now all the printing happens from the same worklist processing.
* v3d: Move clif dump BO lookup into the clif dumper. | Eric Anholt | 2018-07-27 | 1 | -22/+15
  The clif dumper is going to need information about all of our BOs if we're going to dump them for replay purposes.
* v3d: Drop the use of the semaphores. | Eric Anholt | 2018-07-27 | 2 | -9/+0
  The kernel's scheduler doesn't rely on our emitting them, and in fact we'd get in trouble if the kernel decided to schedule too many bins in a row before getting around to scheduling the corresponding render.
* v3d: Drop the VG support from the XML. | Eric Anholt | 2018-07-27 | 1 | -2/+1
  This reflects a change on the HW/closed SW side to drop this unused HW. With it dropped on their side, the CLIF parser no longer expects to find VG fields.
* v3d: Stop using spaces in the names of our buffers. | Eric Anholt | 2018-07-27 | 3 | -2/+6
  For CLIF dumping, we need names to not have spaces. Rather than rewriting them after the fact, just change the two cases where I had put a space in.
* v3d: Avoid the GFXH-1461 workaround if we have only Z or only S. | Eric Anholt | 2018-07-26 | 1 | -4/+6
  This seems like a sensible precaution to avoid extra draws. It doesn't deal with the case of a Z24S8 buffer created by the window system for an application that happens to never use S.
* v3d: Rework the ordering of how we clear things. | Eric Anholt | 2018-07-26 | 1 | -31/+54
  First, figure out if we can just sneak the clear into the TLB clear, even if drawing has already happened (since we have job->load and job->clear to tell us), taking into account GFXH-1461. For any pieces we can't TLB clear, fall back to drawing a quad without flushing the scene.
  Fixes extra scene flushes in glmark2 due to GFXH-1461.
* v3d: Only store buffers that have been written to. | Eric Anholt | 2018-07-26 | 1 | -3/+9
  I've seen cases where a color buffer is bound, but only Z is written, and we end up storing color.
* v3d: Track the buffers being loaded separately. | Eric Anholt | 2018-07-26 | 3 | -1/+8
  We were computing this at RCL generation time, but that means you can't unflag the store for an invalidate_resource, or not flag the store if writemasking is disabled.
* v3d: Rename cleared/resolve to clear/store. | Eric Anholt | 2018-07-26 | 5 | -35/+35
  These describe what the fields mean in RCL generation. "resolve" is left over from VC4, and sounds like MSAA resolves (which may or may not be involved in the store we generate).
* v3d: Fix incorrect handling of two fences created back-to-back. | Eric Anholt | 2018-07-20 | 1 | -12/+31
  Recreating our context's syncobj with ALREADY_SIGNALED meant that if you created two fences in a row, then waiting on the second would succeed immediately. Instead, export a sync file in the gallium fence (since we don't have a syncobj clone ioctl), and just create a new syncobj to wait on whenever we need to.
  Noticed while debugging dEQP-GLES3.functional.fence_sync.client_wait_sync_finish
* v3d: Fix the timeout value passed to drmSyncobjWait(). | Eric Anholt | 2018-07-20 | 1 | -1/+6
  The API wants an absolute time, so we need to go add gallium's argument to CLOCK_MONOTONIC.
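
The conversion described above, turning a relative timeout into an absolute CLOCK_MONOTONIC deadline, can be sketched as follows. The helper names are hypothetical, and saturating at `INT64_MAX` so that a very large "wait forever" timeout does not wrap around is an assumption of this sketch rather than something the commit message states:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Add a relative timeout to a "now" timestamp, both in nanoseconds.
 * Saturates at INT64_MAX instead of overflowing (assumed policy).
 */
static int64_t
demo_abs_timeout_from(int64_t now_ns, uint64_t rel_timeout_ns)
{
   assert(now_ns >= 0);
   if (rel_timeout_ns > (uint64_t)(INT64_MAX - now_ns))
      return INT64_MAX;
   return now_ns + (int64_t)rel_timeout_ns;
}

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t
demo_monotonic_now_ns(void)
{
   struct timespec ts;
   clock_gettime(CLOCK_MONOTONIC, &ts);
   return (int64_t)ts.tv_sec * 1000000000ll + ts.tv_nsec;
}

/* Absolute deadline for a caller-supplied relative timeout. */
static int64_t
demo_abs_timeout_ns(uint64_t rel_timeout_ns)
{
   return demo_abs_timeout_from(demo_monotonic_now_ns(), rel_timeout_ns);
}
```

The result of `demo_abs_timeout_ns()` is the kind of value that would then be handed to the wait call in place of the raw relative timeout.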
* v3d: Fix drmSyncobjWait() return value checking even more. | Eric Anholt | 2018-07-20 | 1 | -1/+1
  It tends to return >0 in the success case (I think the value is something like "how much of the timeout remained").
  Fixes dEQP-GLES3.functional.fence_sync.client_wait_sync_finish
* v3d: Use the list_first_entry/list_last_entry macros. | Eric Anholt | 2018-07-20 | 1 | -8/+8
* v3d: Move BO cache counting to dump time instead of cache management. | Eric Anholt | 2018-07-20 | 2 | -9/+9
  This is one less way to get the dump stats wrong.
* v3d: Reduce the stale BO reclamation spam with dump_stats set. | Eric Anholt | 2018-07-20 | 1 | -6/+5
  This was obviously meant to be when we were actually freeing a BO, not just when there was at least one BO in the list.
* v3d: Respect a sampler view's first_layer field. | Eric Anholt | 2018-07-20 | 1 | -1/+3
  Fixes texturing from EGL images created from cubemap faces, as in dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture
* v3d: Fix tiling modifier support to use the new UIF define. | Eric Anholt | 2018-07-18 | 1 | -3/+16
  You can't use T tiled buffers on V3D 3.x and newer; the format has been replaced with a newer layout shared with other hardware blocks.
* v3d: Work around GFXH-1461 bug losing our Z/S clears. | Eric Anholt | 2018-07-13 | 1 | -0/+30
  If you load S and clear Z or vice versa, the clear may get lost. Just fall back to drawing a quad.
  Fixes KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8
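
The condition stated above ("load S and clear Z or vice versa") can be written as a small predicate. This is only a sketch of the rule as worded in the commit message; the flag values and the function name are hypothetical, and the driver's real check may consider more state than this:

```c
#include <stdbool.h>

/* Hypothetical buffer bits for illustration; not the driver's real flags. */
enum {
   DEMO_BUF_DEPTH = 1 << 0,
   DEMO_BUF_STENCIL = 1 << 1,
};

/* True when one of Z/S is loaded while the other is cleared, which is the
 * combination the commit message says can lose the clear.
 */
static bool
demo_clear_may_be_lost(unsigned load, unsigned clear)
{
   bool load_s_clear_z = (load & DEMO_BUF_STENCIL) && (clear & DEMO_BUF_DEPTH);
   bool load_z_clear_s = (load & DEMO_BUF_DEPTH) && (clear & DEMO_BUF_STENCIL);
   return load_s_clear_z || load_z_clear_s;
}
```

When the predicate is true, the fallback named in the message is to perform the clear by drawing a quad instead.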
* u_blitter: Add an option to draw the triangles using an index buffer. | Eric Anholt | 2018-07-12 | 1 | -0/+1
  For V3D, the HW will interpolate slightly differently along the shared edge of the trifan. The conformance tests manage to catch this in the nearest_consistency_* group. To get interpolation to match, we need the last vertex of the triangle to be shared.
  I first tried implementing draw_rectangle to do triangles instead, but that was quite a bit (147 lines) of code duplication from u_blitter, and this seems much simpler and less likely to break as u_blitter changes.
  Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_* on V3D.
  Reviewed-by: Marek Olšák <[email protected]>
* v3d: Don't automatically reallocate a PERSISTENT-mapped buffer. | Eric Anholt | 2018-07-12 | 1 | -1/+1
  I had mistakenly used the COHERENT flag, which can only be set when PERSISTENT is mapped, but isn't always.
  Fixes piglit bufferstorage-persistent read