Commit message (Author, Age, Files, Lines)
* iris: Add iris_resource fields for aux surfaces (Kenneth Graunke, 2019-02-21, 2 files, -0/+54)
|   But without fast clears or HiZ per-level tracking just yet.
* iris: Emit default L3 config for the render pipeline (Jordan Justen, 2019-02-21, 1 file, -23/+38)
|   Signed-off-by: Jordan Justen <[email protected]>
* iris: Always emit at least one BLEND_STATE (Kenneth Graunke, 2019-02-21, 1 file, -1/+8)
|
* iris: Add missing depth cache flushes (Kenneth Graunke, 2019-02-21, 1 file, -0/+5)
|
* iris: Simplify iris_get_depth_stencil_resources (Kenneth Graunke, 2019-02-21, 1 file, -5/+1)
|   We can safely assume that the given resource is depth, depth/stencil,
|   or stencil already. The stencil-only case is easily detectable with a
|   single format check, and all other cases are handled identically.
|
|   This saves some CPU overhead.
* iris: Make an IRIS_MAX_MIPLEVELS define (Kenneth Graunke, 2019-02-21, 2 files, -1/+3)
|
* iris: Store internal_format when getting resource from handle. (Rafael Antognolli, 2019-02-21, 1 file, -0/+1)
|
* iris: Move create and bind driver hooks to the end of iris_program.c (Kenneth Graunke, 2019-02-21, 1 file, -330/+312)
|   This just moves the code for dealing with pipe_shader_state /
|   pipe_compute_state / iris_uncompiled_shader to the end of the file.
|
|   Now that those do precompiles, they want to call the actual compile
|   functions. Putting them at the end eliminates the need for a bunch of
|   prototypes.
* iris: implement clearing render target and depth stencil (Timur Kristóf, 2019-02-21, 1 file, -107/+184)
|   v2 (Kenneth Graunke): split color/depthstencil cases, fix iris_clear
* iris: Drop XXX about checking for swizzling (Kenneth Graunke, 2019-02-21, 1 file, -2/+1)
|   Caio noted that this is not necessary on Gen8+:
|
|   "Before Gen8, there was a historical configuration control field to
|   swizzle address bit[6] for in X/Y tiling modes. This was set in three
|   different places: TILECTL[1:0], ARB_MODE[5:4], and
|   DISP_ARB_CTL[14:13].
|
|   For Gen8 and subsequent generations, the swizzle fields are all
|   reserved, and the CPU's memory controller performs all address
|   swizzling modifications."
|
|   Since we don't support earlier hardware, we can skip it entirely.
* iris: Set HasWriteableRT correctly (Kenneth Graunke, 2019-02-21, 2 files, -1/+45)
|   A bit of irritating state cross dependency here, but nothing too hard
* iris: Set 3DSTATE_WM::ForceThreadDispatchEnable (Kenneth Graunke, 2019-02-21, 1 file, -0/+4)
|   The Vulkan driver only sets this if color writes are disabled, which
|   is more conservative - but would require us to inspect blend state.
|
|   (If color writes are enabled, we don't need to force anything, because
|   the internal signal is already correct. But it shouldn't hurt to do
|   so.)
* iris: Drop XXX about alpha testing (Kenneth Graunke, 2019-02-21, 1 file, -3/+1)
|   I was misreading i965 - the 3DSTATE_WM::PixelShaderKillsPixel bit from
|   Gen < 8 needed all of this, but the 3DSTATE_PS_EXTRA bit only needs
|   prog_data->uses_kill.
* iris: improve PIPE_CAP_VIDEO_MEMORY bogus value (Andre Heider, 2019-02-21, 1 file, -1/+1)
|   -1 is a little too bogus for most games ;)
|
|   Signed-off-by: Andre Heider <[email protected]>
* iris: fix build with gallium nine (Andre Heider, 2019-02-21, 2 files, -3/+4)
|   Signed-off-by: Andre Heider <[email protected]>
* iris: Stop chopping off the first nine characters of the renderer string (Kenneth Graunke, 2019-02-21, 1 file, -1/+1)
|
* iris: rework num textures to util_lastbit (Kenneth Graunke, 2019-02-21, 2 files, -6/+10)
|
* iris: Add PIPE_CAP_MAX_VARYINGS (Kenneth Graunke, 2019-02-21, 1 file, -0/+1)
|
* iris: Make a iris_batch_reference_signal_syncpt helper function. (Kenneth Graunke, 2019-02-21, 3 files, -7/+22)
|   Suggested by Chris Wilson. More obvious what's going on.
* iris: Use READ_ONCE and WRITE_ONCE for snapshots_landed (Kenneth Graunke, 2019-02-21, 3 files, -7/+8)
|   Suggested by Chris Wilson, if only to make it obvious to the human
|   readers that these are volatile reads. It may also be necessary for
|   the compiler in a few cases.
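The volatile-access idea behind READ_ONCE and WRITE_ONCE can be sketched roughly as follows. This is a minimal illustration that assumes the GCC/Clang `__typeof__` extension; the macro bodies, `struct fake_query`, and the two helper functions are assumptions made for the example and are not copied from the driver.

```c
#include <stdbool.h>

/* Illustrative definitions, assuming the GCC/Clang __typeof__ extension.
 * Accessing the object through a volatile-qualified pointer forces a real
 * load or store at each use, so the value cannot be cached in a register. */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for a query whose result lands asynchronously. */
struct fake_query {
   bool snapshots_landed;
};

/* Records that the snapshots have been written. */
static void
fake_mark_landed(struct fake_query *q)
{
   WRITE_ONCE(q->snapshots_landed, true);
}

/* Re-reads the flag from memory every time it is called. */
static bool
fake_has_landed(struct fake_query *q)
{
   return READ_ONCE(q->snapshots_landed);
}
```

Spelling the access out this way also documents intent: a reader can see at the call site that the flag may change behind the compiler's back.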
* iris: Fix accidental busy-looping in query waits (Kenneth Graunke, 2019-02-21, 1 file, -1/+1)
|   When switching from bo_wait to sync-points, I missed that we turned an
|   if (not landed) bo_wait into a while (not landed) check_syncpt(),
|   which has a timeout of 0. This meant, rather than sleeping until the
|   batch is complete, we'd busy-loop, continually asking the kernel "is
|   the batch done yet???".
|
|   This is not what we want at all - if we wanted a busy loop, we'd just
|   loop on !snapshots_landed. We want to sleep.
|
|   Add an effectively infinite timeout so that we sleep.
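The difference between the two waiting strategies can be sketched with a small simulation. Everything here is hypothetical: `struct fake_syncpt`, `fake_wait_syncpt`, the 1000 ns per-call cost, and the 1 ms completion time are assumptions chosen for illustration, and no real kernel interface is involved.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_CALL_COST_NS 1000  /* simulated cost of one kernel round trip */

/* Hypothetical stand-in for a sync point, driven by a simulated clock. */
struct fake_syncpt {
   int64_t signal_time_ns;  /* simulated time at which the batch completes */
   int64_t now_ns;          /* simulated current time */
   unsigned wait_calls;     /* how many times we asked the "kernel" */
};

/* Simulated wait: sleeps for at most timeout_nsec and returns true once the
 * sync point has signalled. */
static bool
fake_wait_syncpt(struct fake_syncpt *sp, int64_t timeout_nsec)
{
   sp->wait_calls++;
   sp->now_ns += FAKE_CALL_COST_NS;

   if (sp->now_ns >= sp->signal_time_ns)
      return true;

   int64_t remaining = sp->signal_time_ns - sp->now_ns;
   if (timeout_nsec >= remaining) {
      sp->now_ns = sp->signal_time_ns;  /* slept until completion */
      return true;
   }

   sp->now_ns += timeout_nsec;          /* timed out early */
   return false;
}

/* Loops until the sync point signals and reports how many calls it took. */
static unsigned
fake_count_wait_calls(int64_t timeout_nsec)
{
   struct fake_syncpt sp = { .signal_time_ns = 1000000, .now_ns = 0,
                             .wait_calls = 0 };
   while (!fake_wait_syncpt(&sp, timeout_nsec))
      ;
   return sp.wait_calls;
}
```

With these simulated numbers, `fake_count_wait_calls(0)` needs 1000 round trips before the batch completes, while `fake_count_wait_calls(INT64_MAX)` needs exactly one.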
* iris: Add a timeout_nsec parameter, rename check_syncpt to wait_syncpt (Kenneth Graunke, 2019-02-21, 3 files, -6/+9)
|   I want to be able to wait with a non-zero timeout from elsewhere.
* iris: Don't allocate a BO per query object (Sagar Ghuge, 2019-02-21, 5 files, -45/+97)
|   Instead of allocating a 4K BO per query object, we can create a large
|   blob of memory and split it into pieces as required.
|
|   With one BO backing multiple query objects, we don't want to wait on
|   all of them. Instead, when we write the last snapshot, we create a
|   sync point, and we check syncpoints while waiting on a particular
|   object.
|
|   Signed-off-by: Sagar Ghuge <[email protected]>
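Splitting one large buffer into per-query pieces can be sketched as a simple bump allocator. This is only an illustration of the idea: `struct fake_query_pool` and `fake_query_pool_alloc` are hypothetical names, and the commit may well rely on an existing upload or suballocation helper instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pool backed by a single large buffer object. */
struct fake_query_pool {
   uint32_t size;  /* total bytes in the backing buffer */
   uint32_t next;  /* first unused byte */
};

/* Carves `bytes` out of the pool at the requested power-of-two alignment.
 * Returns the offset of the new piece, or -1 if the pool is exhausted. */
static int64_t
fake_query_pool_alloc(struct fake_query_pool *pool, uint32_t bytes,
                      uint32_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);

   /* Round the cursor up to the alignment, using 64-bit math so the
    * intermediate sums cannot wrap. */
   uint64_t start = ((uint64_t)pool->next + align - 1) &
                    ~((uint64_t)align - 1);
   if (start + bytes > pool->size)
      return -1;

   pool->next = (uint32_t)(start + bytes);
   return (int64_t)start;
}
```

Each query object then only needs to remember its offset into the shared buffer, which is why waiting has to be tracked per object rather than per BO.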
* iris: Implement ALT mode for ARB_{vertex,fragment}_shader (Kenneth Graunke, 2019-02-21, 1 file, -2/+4)
|   Fixes gl-1.0-spot-light
* iris: Fix bug in bound vertex buffer tracking (Kenneth Graunke, 2019-02-21, 1 file, -3/+3)
|   res might be NULL, at which point this is an unbind.
* iris: minor tidying (Kenneth Graunke, 2019-02-21, 2 files, -40/+15)
|
* iris: Unreference some more things on state module teardown (Kenneth Graunke, 2019-02-21, 1 file, -2/+21)
|
* iris: Drop dead state_size hash table (Kenneth Graunke, 2019-02-21, 2 files, -24/+2)
|   I inherited this from i965. It would be nice to track the state size
|   so INTEL_DEBUG=color,bat decoding can print the right number of e.g.
|   binding table entries or blend states, but...without a single point of
|   entry for state, it's a little tricky to get right. Punt for now, and
|   drop the dead code in the meantime.
* iris: Drop comment about ISP_DIS (Kenneth Graunke, 2019-02-21, 1 file, -2/+0)
|   i965 re-emits 3DSTATE_CONSTANT_* on every batch, so there's no point
|   in restoring the constants from the context. Iris actually re-pins the
|   constant buffers properly across the batch, and avoids re-emitting the
|   constant packets unless it's necessary. So, we don't want ISP_DIS.
* iris: Enable PIPE_CAP_COMPACT_ARRAYS (Kenneth Graunke, 2019-02-21, 1 file, -0/+1)
|
* iris: Remap stream output indexes back to VARYING_SLOT_*. (Kenneth Graunke, 2019-02-21, 1 file, -2/+25)
|   Previously I had a hack in st/mesa to make it stop remapping
|   VARYING_SLOT_* into the naively compacted slots, which aren't what we
|   want. But that wasn't very feasible, as we'd have to update all
|   drivers, or add capability bits, and it gets messy fast.
|
|   It turns out that I can map back to VARYING_SLOT_* in about 5 LOC, so
|   let's just do that. It removes the need for hacks, and is easy.
|
|   This also fixes KHR-GL46.enhanced_layouts.xfb_capture_struct, which
|   apparently with my hack was still getting the wrong slot info.
* iris: Zero the compute predicate when changing the render condition (Kenneth Graunke, 2019-02-21, 1 file, -0/+3)
|   1. Set a render condition. We emit it immediately on the render
|      engine, and stash q->bo as ice->state.compute_predicate in case
|      the compute engine needs it.
|   2. Clear the render condition. We were incorrectly leaving a stale
|      compute_predicate kicking around...
|   3. Dispatch compute. We would then read the stale compute predicate,
|      and try to load it into MI_PREDICATE_DATA. But q->bo may have been
|      freed altogether, causing us to try and use garbage memory as a
|      BO, adding it to the validation list, failing asserts, and
|      tripping EINVALs in execbuf.
|
|   Huge thanks to Mark Janes for narrowing this sporadic GL CTS failure
|   down to a list of 48 tests I could easily run to reproduce it.
|
|   Huge thanks to the Valgrind authors for the memcheck tool that
|   immediately pinpointed the problem.
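The three steps above boil down to a stale-pointer hazard, which can be sketched like this. The types and `fake_set_render_condition` are hypothetical simplifications; only the name `compute_predicate` comes from the commit message, and where exactly the real fix resets it is an assumption.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the driver's objects. */
struct fake_bo { int id; };
struct fake_query { struct fake_bo *bo; };
struct fake_state { struct fake_bo *compute_predicate; };

/* Sets or clears the render condition. Passing NULL for `q` clears it. */
static void
fake_set_render_condition(struct fake_state *state, struct fake_query *q)
{
   /* Always forget the previous predicate first, so a later compute
    * dispatch can never pick up a buffer that has since been freed. */
   state->compute_predicate = NULL;

   if (q == NULL)
      return;

   /* Stash the query's buffer in case the compute engine needs it. */
   state->compute_predicate = q->bo;
}
```

Resetting the field unconditionally means the "clear" path and the "replace" path share the same guarantee, rather than each having to remember to drop the old pointer.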
* iris: always include an extra constbuf0 if using UBOs (Caio Marcelo de Oliveira Filho, 2019-02-21, 4 files, -50/+56)
|   In st_nir_lower_uniforms_to_ubo(), every UBO access in the shader has
|   its index incremented to make room for uniforms in constbuf0. So if we
|   use UBOs, we always need to include the extra binding entry in the
|   table.
|
|   To avoid doing these checks both when compiling the shader and when
|   assigning binding tables, store the num_cbufs in iris_compiled_shader.
|
|   Fixes a bunch of tests from Piglit and CTS that use UBOs but don't use
|   uniforms or system values. Note that some tests fitting these criteria
|   were passing because the UBOs were moved to be push constants
|   (avoiding the problem).
|
|   Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Do binder address allocations per-context, not globally. (Kenneth Graunke, 2019-02-21, 2 files, -9/+25)
|   iris_bufmgr allocates addresses across the entire screen, since
|   buffers may be shared between multiple contexts. There used to be a
|   single special address, IRIS_BINDER_ADDRESS, that was per-context -
|   and all contexts used the same address. When I moved to the
|   multi-binder system, I made a separate memory zone for them. I wanted
|   there to be 2-3 binders per context, so we could cycle them to avoid
|   the stalls inherent in pinning two buffers to the same address in
|   back-to-back batches. But I figured I'd allow 100 binders just to be
|   wildly excessive/cautious.
|
|   What I didn't realize was that we need 2-3 binders per *context*, and
|   what I did was allocate 100 binders per *screen*. Web browsers, for
|   example, might have 1-2 contexts per tab, leading to hundreds of
|   contexts, and thus binders.
|
|   To fix this, we stop allocating VMA for binders in bufmgr, and let the
|   binder handle it itself. Binders are per-context, and they can assign
|   context-local addresses for the buffers by simply doing a ringbuffer
|   style approach. We only hold on to one binder BO at a time, so we
|   won't ever have a conflicting address.
|
|   This fixes dEQP-EGL.functional.multicontext.non_shared_clear.
|
|   Huge thanks to Tapani Pälli for debugging this whole mess and figuring
|   out what was going wrong.
|
|   Reviewed-by: Tapani Pälli <[email protected]>
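The "ringbuffer style approach" to handing out context-local binder addresses might look roughly like the sketch below. `struct fake_binder`, its fields, and the sizes used in the example are all hypothetical; the driver's actual zone layout is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-context binder address state. */
struct fake_binder {
   uint64_t zone_start;   /* first address of the binder zone */
   uint64_t zone_size;    /* bytes available in the zone */
   uint64_t binder_size;  /* bytes occupied by one binder */
   uint64_t next_offset;  /* where the next binder would go */
};

/* Returns the address for the next binder, wrapping back to the start of
 * the zone once the end is reached. */
static uint64_t
fake_binder_next_address(struct fake_binder *b)
{
   /* The zone must hold at least two binders, so that back-to-back
    * batches never get the same address. */
   assert(b->binder_size > 0 && 2 * b->binder_size <= b->zone_size);

   if (b->next_offset + b->binder_size > b->zone_size)
      b->next_offset = 0;

   uint64_t address = b->zone_start + b->next_offset;
   b->next_offset += b->binder_size;
   return address;
}
```

Because the state lives in the context rather than the screen, any number of contexts can reuse the same small address range without consuming shared address space.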
* iris: Fix memzone_for_address for the surface and binder zones (Kenneth Graunke, 2019-02-21, 1 file, -2/+2)
|   We use > for IRIS_MEMZONE_DYNAMIC because IRIS_BORDER_COLOR_POOL_ADDRESS
|   lives at the very start of that zone. However, IRIS_MEMZONE_SURFACE
|   and IRIS_MEMZONE_BINDER are normal zones. They used to be a single
|   zone (surface) with a single binder BO at the beginning, similar to
|   the border color pool. But when I moved us to multiple binders, I made
|   them have a real zone (if a small one). So both zones should use >=.
|
|   Reviewed-by: Tapani Pälli <[email protected]>
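The boundary handling described here can be sketched as follows. The zone start addresses, the enum, and the function are hypothetical values and names picked for the example; only the choice of > for the dynamic zone and >= for the surface and binder zones is taken from the commit message.

```c
#include <stdint.h>

/* Hypothetical zone layout; the real addresses are not reproduced here. */
#define FAKE_BINDER_START   0x10000000ull
#define FAKE_SURFACE_START  0x20000000ull
#define FAKE_DYNAMIC_START  0x30000000ull
#define FAKE_OTHER_START    0x40000000ull

/* The border color pool occupies the very first address of the dynamic
 * zone, which is why that zone needs a strict comparison. */
#define FAKE_BORDER_COLOR_POOL_ADDRESS FAKE_DYNAMIC_START

enum fake_memzone {
   FAKE_MEMZONE_SHADER,
   FAKE_MEMZONE_BINDER,
   FAKE_MEMZONE_SURFACE,
   FAKE_MEMZONE_BORDER_COLOR_POOL,
   FAKE_MEMZONE_DYNAMIC,
   FAKE_MEMZONE_OTHER,
};

static enum fake_memzone
fake_memzone_for_address(uint64_t address)
{
   if (address >= FAKE_OTHER_START)
      return FAKE_MEMZONE_OTHER;
   if (address == FAKE_BORDER_COLOR_POOL_ADDRESS)
      return FAKE_MEMZONE_BORDER_COLOR_POOL;
   if (address > FAKE_DYNAMIC_START)    /* strict: start belongs to the pool */
      return FAKE_MEMZONE_DYNAMIC;
   if (address >= FAKE_SURFACE_START)   /* normal zone: start is included */
      return FAKE_MEMZONE_SURFACE;
   if (address >= FAKE_BINDER_START)    /* normal zone: start is included */
      return FAKE_MEMZONE_BINDER;
   return FAKE_MEMZONE_SHADER;
}
```

With a strict comparison on the surface or binder zone, a buffer placed exactly at the zone's first address would be attributed to the zone below it.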
* iris: Don't whack SO dirty bits when finishing a BLORP op (Kenneth Graunke, 2019-02-21, 1 file, -0/+2)
|   Re-emitting 3DSTATE_SO_BUFFERS can be hazardous, as it could zero
|   offsets. Plus, it's just not necessary - BLORP doesn't change these.
* iris: Fix SO issue with INTEL_DEBUG=reemit, set fewer bits (Kenneth Graunke, 2019-02-21, 1 file, -2/+5)
|   INTEL_DEBUG=reemit was breaking streamout tests, by re-emitting
|   3DSTATE_SO_BUFFER commands that tell the HW to zero the SO write
|   offsets. We would need to alter them to use 0xFFFFFFFF for the offset.
|
|   Also, have each upload function only flag bits relevant to its own
|   pipeline.
* iris: CS stall on VF cache invalidate workarounds (Kenneth Graunke, 2019-02-21, 2 files, -3/+6)
|   See commit 31e4c9ce400341df9b0136419b3b3c73b8c9eb7e in i965.
* iris: Pay attention to blit masks (Kenneth Graunke, 2019-02-21, 1 file, -11/+22)
|   For combined depth/stencil formats, we may want to only blit one half.
|   If PIPE_BLIT_Z is set, blit depth; if PIPE_BLIT_S is set, blit
|   stencil.
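The mask check can be sketched in a few lines. The bit values, `struct fake_blit_plan`, and `fake_plan_depth_stencil_blit` are hypothetical and stand in for whatever flags and structures the driver actually uses.

```c
#include <stdbool.h>

/* Hypothetical mask bits; the real flag values are not reproduced here. */
enum {
   FAKE_BLIT_Z = 1 << 0,
   FAKE_BLIT_S = 1 << 1,
};

/* Which halves of a combined depth/stencil resource should be copied. */
struct fake_blit_plan {
   bool depth;
   bool stencil;
};

/* Each half is blitted only if its bit is present in the mask. */
static struct fake_blit_plan
fake_plan_depth_stencil_blit(unsigned mask)
{
   struct fake_blit_plan plan = {
      .depth = (mask & FAKE_BLIT_Z) != 0,
      .stencil = (mask & FAKE_BLIT_S) != 0,
   };
   return plan;
}
```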
* iris: Assert about blits with color masking (Kenneth Graunke, 2019-02-21, 1 file, -0/+4)
|   st/mesa never asks for this today, but in theory someone might, and we
|   don't support it.
* iris: Don't enable smooth points when point sprites are enabled (Kenneth Graunke, 2019-02-21, 1 file, -4/+3)
|   dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_*.primitives.points
* iris: Allow sample mask of 0 (Kenneth Graunke, 2019-02-21, 1 file, -1/+1)
|   I think this was an attempt to work around various sample mask bugs I
|   had early on. It's not correct. A sample mask of 0 is legal and means
|   to disable all samples.
|
|   Fixes dEQP-GLES31.functional.texture.multisample.*.*sample_mask*
* iris: fail to create screen for older unsupported HW (Kenneth Graunke, 2019-02-21, 1 file, -0/+3)
|   loader shouldn't try, but let's be paranoid
* iris: Switch to the new PIPELINE_STATISTICS_QUERY_SINGLE capability (Kenneth Graunke, 2019-02-21, 2 files, -44/+6)
|   I had a hack in place earlier to pass the query type as q->index for
|   the regular statistics query, but we ended up adjusting the interface
|   and adding a new query type. Use that instead, fixing pipeline
|   statistics queries since the rebase.
* iris: Use new PIPE_STAT_QUERY enums rather than hardcoded numbers. (Kenneth Graunke, 2019-02-21, 1 file, -2/+5)
|
* iris: Fix Broadwell WaDividePSInvocationCountBy4 (Kenneth Graunke, 2019-02-21, 1 file, -7/+7)
|   We were dividing by 4 in calculate_result_on_gpu(), and also in
|   iris_get_query_result(). We should stop doing the latter, and instead
|   divide by 4 in calculate_result_on_cpu() as well.
|
|   Otherwise, if snapshots were available, and you hit the
|   calculate_result_on_cpu() path, but requested it be written to a QBO,
|   you'd fail to get a divide.
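Putting the divide inside the result calculation, rather than in one of its callers, can be sketched as below. The function name, its parameters, and the boolean flag are hypothetical simplifications; the real function operates on the driver's query object.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU-side result calculation from two counter snapshots.
 * `divide_by_4` stands in for "this is a PS invocation count on hardware
 * that needs the workaround". */
static uint64_t
fake_calculate_result_on_cpu(uint64_t begin, uint64_t end, bool divide_by_4)
{
   uint64_t result = end - begin;

   /* Apply the workaround here so that every consumer of the result,
    * whether it returns it directly or writes it to a buffer, sees the
    * corrected value. */
   if (divide_by_4)
      result /= 4;

   return result;
}
```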
* iris: Delete genx->bound_vertex_buffers (Kenneth Graunke, 2019-02-21, 1 file, -3/+0)
|   This is actually stored in ice->state, as it isn't gen-specific
* iris: Drop a dead comment (Kenneth Graunke, 2019-02-21, 1 file, -2/+0)
|
* iris: Don't check other batches for our batch BO (Kenneth Graunke, 2019-02-21, 1 file, -25/+27)
|   This is an awkward corner case. We create batches in order, each of
|   which creates and pins a BO. The other batches may not be set up yet,
|   so it may not be safe to ask whether they reference a BO.
|
|   Just avoid this for now. We could avoid it for other context-local BOs
|   too, but we currently don't have a flag for that (and I'm not certain
|   whether it's worth it).
* iris: Handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE somewhat (Kenneth Graunke, 2019-02-21, 1 file, -3/+6)
|   Various places in the transfer code need to know whether they must
|   read the existing resource's values. Rather than checking both flags
|   everywhere, just make PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE also flag
|   PIPE_TRANSFER_DISCARD_RANGE - if we can discard everything, we can
|   discard a subrange, too.
|
|   Obviously, we can do better for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE,
|   but eventually u_threaded_context should handle swapping out buffers
|   for new idle buffers, anyway. In the meantime, this is at least
|   better.
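Folding one discard flag into the other can be sketched like this. The flag values and both function names are hypothetical; they only illustrate why later code needs to test a single bit.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real PIPE_TRANSFER_* values are not
 * reproduced here. */
enum {
   FAKE_DISCARD_RANGE          = 1u << 0,
   FAKE_DISCARD_WHOLE_RESOURCE = 1u << 1,
};

/* If the whole resource may be discarded, so may any subrange of it. */
static unsigned
fake_normalize_transfer_usage(unsigned usage)
{
   if (usage & FAKE_DISCARD_WHOLE_RESOURCE)
      usage |= FAKE_DISCARD_RANGE;
   return usage;
}

/* After normalization, one check answers "must we read the old contents?" */
static bool
fake_must_read_existing(unsigned normalized_usage)
{
   return (normalized_usage & FAKE_DISCARD_RANGE) == 0;
}
```

Normalizing once at the start of the map keeps the later checks uniform, at the cost of not yet exploiting the stronger whole-resource guarantee.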