path: root/src/gallium/drivers/freedreno
Commit message | Author | Age | Files | Lines
* nir: add lowering for gl_HelperInvocation | Rob Clark | 2018-07-18 | 2 | -0/+2
  v2: reword comment about lower_helper_invocations to be more clear that it might not work on all hardware
  v3: add special variant of load_sample_id which does not imply per-sample shading
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: re-work fd_batch_reference() locking | Rob Clark | 2018-07-17 | 2 | -23/+26
  Annoyingly we still have to briefly drop the lock to unref resources.. but push the lock down into __fd_batch_destroy() so we can invalidate the batch and reset resources before dropping the lock.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: make fd_batch a one-shot thing | Rob Clark | 2018-07-17 | 2 | -11/+36
  Re-allocate rather than re-use. Originally we had an unnecessarily complex design to avoid re-allocating cmdstream buffers. But now that support for "growable" cmdstream buffers has been in place for a couple years, I guess we can care a bit less about the extra overhead on older kernels. But making the batches one-shot removes a class of potential race conditions vs the flush_queue.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: flush immediately when reading a pending batch | Rob Clark | 2018-07-17 | 2 | -30/+32
  Instead of the reading batch setting a dependency on the writing batch, simply flush the writing batch immediately. This avoids situations where we have to flush the context's current batch later.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: get rid of noop render | Rob Clark | 2018-07-17 | 4 | -21/+6
  This was basically to avoid a zero-dword IB (indirect-branch), but instead just don't emit the IB packet in that case.
  Signed-off-by: Rob Clark <[email protected]>
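The idea in the commit above (skip the IB packet entirely when the target buffer holds zero dwords) could look roughly like the sketch below. Everything here is hypothetical: `ring_sketch`, `emit_ib_sketch()` and the opcode value are made-up stand-ins, not the freedreno ringbuffer API or the real packet encoding.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SKETCH_MAX 16
#define IB_OPCODE_SKETCH 0x3fu  /* made-up value, not a real CP opcode */

/* Hypothetical, simplified command buffer: a fixed array of dwords. */
struct ring_sketch {
    uint32_t dwords[RING_SKETCH_MAX];
    size_t count;
};

/* Emit an indirect-branch packet pointing at `target`.
 * Returns the number of dwords written to `ring`. */
static size_t emit_ib_sketch(struct ring_sketch *ring,
                             const struct ring_sketch *target)
{
    /* A zero-dword IB is what the old noop render worked around;
     * here we simply do not emit the packet at all. */
    if (target->count == 0)
        return 0;

    /* No room left in this fixed-size sketch buffer. */
    if (ring->count + 2 > RING_SKETCH_MAX)
        return 0;

    ring->dwords[ring->count++] = IB_OPCODE_SKETCH;
    ring->dwords[ring->count++] = (uint32_t)target->count;
    return 2;
}
```

With this shape, callers need no special case for empty command streams, which is what made the placeholder noop render unnecessary.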
* freedreno: fix samples=0 vs samples=1 confusion | Rob Clark | 2018-07-17 | 1 | -1/+1
  pipe_framebuffer_state can have samples=0 in various cases, which is actually the same thing as samples=1. So use the _get_num_samples() helper to populate the key, to avoid this looking like two distinct fb states to the cache.
  Signed-off-by: Rob Clark <[email protected]>
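A minimal sketch of why normalizing the sample count matters for a cache key. The struct and function names below are hypothetical stand-ins, not the real pipe_framebuffer_state or the driver's key type; only the rule "samples=0 means the same as samples=1" comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the framebuffer state that feeds the key. */
struct fb_state_sketch {
    unsigned width, height;
    unsigned samples;   /* 0 and 1 both mean "no multisampling" */
};

/* Hypothetical cache key built from the state above. */
struct fb_key_sketch {
    unsigned width, height;
    unsigned samples;   /* always >= 1 after normalization */
};

/* Treat samples=0 as samples=1, in the spirit of a _get_num_samples() helper. */
static unsigned sketch_get_num_samples(const struct fb_state_sketch *fb)
{
    return fb->samples > 1 ? fb->samples : 1;
}

static struct fb_key_sketch sketch_make_key(const struct fb_state_sketch *fb)
{
    struct fb_key_sketch key = {
        fb->width, fb->height, sketch_get_num_samples(fb)
    };
    return key;
}

static bool sketch_key_equal(const struct fb_key_sketch *a,
                             const struct fb_key_sketch *b)
{
    return a->width == b->width &&
           a->height == b->height &&
           a->samples == b->samples;
}
```

Copying `samples` straight into the key would make a 0-sample and a 1-sample framebuffer compare unequal, so the cache would hold two entries for what is really one state.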
* freedreno: comment for _invalidate_batch() | Rob Clark | 2018-07-17 | 1 | -3/+13
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: hold batch references when flushing | Rob Clark | 2018-07-17 | 1 | -32/+38
  It is possible for a batch to be freed under our feet when flushing, so it is best to hold a reference to all of them up-front.
  Signed-off-by: Rob Clark <[email protected]>
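The pattern described above (take a reference on every batch before flushing any of them) can be illustrated with a toy reference count. This is only a sketch under simplifying assumptions: `batch_sketch` and its helpers are hypothetical, there is no locking, and "destroyed" is a flag rather than a real free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified batch; not the real struct fd_batch. */
struct batch_sketch {
    int refcnt;
    bool destroyed;           /* set when the last reference is dropped */
    bool flushed;
    struct batch_sketch *dep; /* batch on which this one holds a reference */
};

static void batch_ref(struct batch_sketch *b)
{
    b->refcnt++;
}

static void batch_unref(struct batch_sketch *b)
{
    assert(b->refcnt > 0);
    if (--b->refcnt == 0)
        b->destroyed = true;  /* real code would free the batch here */
}

static void batch_flush(struct batch_sketch *b)
{
    assert(!b->destroyed);    /* would be a use-after-free in real code */
    b->flushed = true;
    if (b->dep) {
        batch_unref(b->dep);  /* flushing may drop the last outside reference */
        b->dep = NULL;
    }
}

/* Reference every batch up front, flush them all, then drop the references. */
static void flush_all(struct batch_sketch **batches, size_t n)
{
    for (size_t i = 0; i < n; i++)
        batch_ref(batches[i]);
    for (size_t i = 0; i < n; i++)
        batch_flush(batches[i]);
    for (size_t i = 0; i < n; i++)
        batch_unref(batches[i]);
}
```

Without the first loop, flushing one batch could drop the only reference to a later batch in the array, and the second loop would then touch a destroyed batch.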
* python: Use the print function | Mathieu Bridon | 2018-07-06 | 1 | -3/+5
  In Python 2, `print` was a statement, but it became a function in Python 3. Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3.
  Signed-off-by: Mathieu Bridon <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
  Acked-by: Dylan Baker <[email protected]>
* gallium/util: remove dummy function util_format_is_supported | Marek Olšák | 2018-06-29 | 4 | -8/+4
  Reviewed-by: Eric Engestrom <[email protected]>
* freedreno/ir3: fix deref conversion fallout | Rob Clark | 2018-06-23 | 1 | -13/+13
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix unused variable warning | Rob Clark | 2018-06-23 | 1 | -1/+0
  Fixes: cf0c7258ee0 freedreno/a5xx: MSAA
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix HW_ATOMIC_COUNTERS cap | Rob Clark | 2018-06-23 | 1 | -1/+1
  This was mistakenly exposed, even though we want atomic counters to be lowered to atomic ops on an SSBO like nearly every other GPU. Which somehow recently started getting segfaults due to calling a null pipe->set_hw_atomic_buffers().
  Fixes a crash in stk, and probably other things.
  Signed-off-by: Rob Clark <[email protected]>
* nir: Remove old-school deref chain support | Jason Ekstrand | 2018-06-22 | 1 | -3/+0
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: convert to deref instructions | Rob Clark | 2018-06-22 | 3 | -53/+57
  Signed-off-by: Rob Clark <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Rework lower_locals_to_regs to use deref instructions | Jason Ekstrand | 2018-06-22 | 1 | -2/+2
  This completely reworks the pass to support deref instructions and delete support for old deref chains
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel,ir3: Re-enable nir_opt_copy_prop_vars | Jason Ekstrand | 2018-06-22 | 1 | -1/+1
  Now that it's rewritten for deref instructions, we can turn it back on.
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Delete lower_io_types | Jason Ekstrand | 2018-06-22 | 1 | -1/+0
  It's only used by the ir3 stand-alone compiler and Rob said we could delete it.
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* st,ir3,radeonsi: push lower_deref_instrs back into driver | Rob Clark | 2018-06-22 | 2 | -3/+3
  vc4+vc5 is not really affected by the deref chain to deref instr conversion, so it no longer needs this pass. For others, now that all the passes mesa/st uses are using deref instructions, push the lowering to deref chains back into driver.
  Signed-off-by: Rob Clark <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_samplers: remove legacy version | Rob Clark | 2018-06-22 | 1 | -1/+1
  Signed-off-by: Rob Clark <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_samplers: split out _legacy version for deref chains | Rob Clark | 2018-06-22 | 1 | -1/+1
  To simplify the transition, and make things bisectable, split out a legacy copy of lower_samplers. This way the i965 and gallium drivers can independently switch over to deref instructions.
  Since the lower_samplers_as_deref pass is only used by gallium drivers, it can be converted in lock-step with moving the lower_deref_instrs pass, and so does not need a corresponding _legacy clone.
  This legacy pass will be removed in a future commit.
  Signed-off-by: Rob Clark <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel,ir3: Disable nir_opt_copy_prop_vars | Jason Ekstrand | 2018-06-22 | 1 | -1/+1
  This pass doesn't handle deref instructions yet. Making it handle both legacy derefs and deref instructions would be painful. Since it's not important for correctness, just disable it for now.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* anv,i965,radv,st,ir3: Call nir_lower_deref_instrs | Jason Ekstrand | 2018-06-22 | 2 | -1/+6
  This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno: a2xx: fix clear color | Jonathan Marek | 2018-06-22 | 1 | -1/+1
  the format of the CLEAR_COLOR register doesn't depend on the target format
  this fixes clear color when rendering to 32-bit RGBA and 16-bit targets
  Signed-off-by: Jonathan Marek <[email protected]>
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix crash when freeing context | Jonathan Marek | 2018-06-22 | 1 | -0/+2
  Signed-off-by: Jonathan Marek <[email protected]>
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix crash on first clear | Jonathan Marek | 2018-06-22 | 1 | -4/+4
  blend can be NULL, so check for that
  Signed-off-by: Jonathan Marek <[email protected]>
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: add a20x | Jonathan Marek | 2018-06-22 | 7 | -31/+85
  this patch adds support for a20x, which has some differences with a220:
  -no VGT_MAX_VTX_INDX register
  -no CLEAR_COLOR register
  -set RB_BC_CONTROL in restore (hangs without)
  -different CP_DRAW_INDX format
  tested with kmscube and glmark2 scenes, on par with a220
  Signed-off-by: Jonathan Marek <[email protected]>
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: increase size of the offset field in instr_fetch_vtx_t | Jonathan Marek | 2018-06-22 | 1 | -4/+2
  The offset field is 22 bits wide. At least 11 bits are necessary because MaxVertexAttribRelativeOffset = 2047
  Signed-off-by: Jonathan Marek <[email protected]>
  Signed-off-by: Rob Clark <[email protected]>
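The arithmetic behind the 11-bit requirement (2047 = 2^11 - 1) can be checked with a short sketch. The `fetch_sketch` struct is a hypothetical illustration of a 22-bit bitfield, not the real instr_fetch_vtx_t layout.

```c
#include <stdint.h>

/* Number of bits needed to represent every value in [0, max_value]. */
static unsigned bits_needed(uint32_t max_value)
{
    unsigned bits = 0;
    while (max_value) {
        bits++;
        max_value >>= 1;
    }
    return bits;
}

/* Hypothetical instruction word with a 22-bit offset field;
 * the remaining 10 bits are padding in this sketch. */
struct fetch_sketch {
    unsigned offset : 22;
    unsigned pad    : 10;
};
```

A field narrower than 11 bits would silently truncate the largest allowed relative offset, while a 22-bit field stores 2047 unchanged.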
* freedreno/a5xx: MSAA | Rob Clark | 2018-06-21 | 14 | -42/+89
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2018-06-21 | 8 | -41/+53
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: txf_ms support | Rob Clark | 2018-06-21 | 3 | -7/+51
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix gpu hangs with large compute shaders | Rob Clark | 2018-06-21 | 1 | -3/+11
  Similar to the combined limit for VS+FS, there is an upper limit for shader size to run from internal memory.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix base_vertex | Rob Clark | 2018-06-21 | 1 | -0/+1
  Fixes: c366f422f0a nir: Offset vertex_id by first_vertex instead of base_vertex
  Signed-off-by: Rob Clark <[email protected]>
* gallium: add scalar isa shader cap | Christian Gmeiner | 2018-06-20 | 1 | -1/+2
  v1 -> v2:
   - nv30 is _NOT_ scalar as suggested by Ilia Mirkin.
   - Change from a screen cap to a shader cap as suggested by Eric Anholt.
   - radeonsi is scalar as suggested by Marek Olšák.
   - Change missing ones to be scalar.
  v2 -> v3:
   - r600 prefers vec4 as suggested by Marek Olšák.
  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* freedreno/a5xx: move emit_marker5() into a5xx backend | Rob Clark | 2018-06-19 | 5 | -21/+24
  The scratch registers move again in a6xx.. so for post-a4xx let's just move this into the backend, and move the one place it used to be needed in core into fd5_emit_ib(). For a6xx we will do similar, calling emit_marker6() from fd6_emit_ib().
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix crash in dEQP-GLES31.stress.vertex_attribute_binding.buffer_bounds.bind_vertex_buffer_offset_near_wrap_10 | Rob Clark | 2018-06-19 | 3 | -1/+24
  This is kind of a hack, but really the only problem is the debug_assert() in OUT_RELOC(). But the debug_assert() is useful to catch real issues. So just add some #ifdef DEBUG code to filter things out before we hit the assert.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: don't crash if compute shader compile fails | Rob Clark | 2018-06-19 | 1 | -0/+2
  It is impolite, and a bit annoying with dEQP (all tests running in single process).
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix missing recursion into block condition | Rob Clark | 2018-06-19 | 1 | -0/+4
  Fixes a problem seen with dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat4
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: better FOUR_QUAD/TWO_QUAD decision for compute | Rob Clark | 2018-06-19 | 1 | -4/+12
  If we aren't going to get full occupancy, then use TWO_QUAD.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: bordercolor fixes | Rob Clark | 2018-06-19 | 1 | -4/+27
  Need a bit of hand-holding for stencil bordercolor, and add border color values for sRGB.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: remove per-stateobj dirty_mask's | Rob Clark | 2018-06-19 | 5 | -37/+16
  These never got updated in fd_context_all_dirty() so actually trying to rely on them (in the case of fd5_emit_images()) ends up in some cases where state is not emitted but should be. Best to just rip this out.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: remove one image stateblock | Rob Clark | 2018-06-19 | 1 | -13/+0
  I think this ends up just setting uniform/const memory. But we upload x/y/z stride differently. At best this is unneeded, at worst it could possibly clobber other uniform/const memory.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: cubemap image fixes | Rob Clark | 2018-06-19 | 2 | -2/+7
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: handle image buffer | Rob Clark | 2018-06-19 | 1 | -1/+8
  Similar to txf case, we need to insert a 2nd coordinate (zero).
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: handle arrays of images | Rob Clark | 2018-06-19 | 1 | -6/+30
  Unlike textures, this doesn't get lowered for us. (Would be nice if they were.. at least until we are ready to deal w/ indirect indexing..)
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: images can be arrays too | Rob Clark | 2018-06-19 | 2 | -22/+83
  Seems I previously totally forgot about 2d-arrays, etc..
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use move_load_const pass | Rob Clark | 2018-06-19 | 1 | -0/+3
  Signed-off-by: Rob Clark <[email protected]>
* gallium: add support for programmable sample locations | Rhys Perry | 2018-06-14 | 1 | -0/+1
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Brian Paul <[email protected]> (v2)
  Reviewed-by: Marek Olšák <[email protected]> (v2)
* freedreno/ir3: use pipe_image_view's cpp | Rob Clark | 2018-06-11 | 1 | -1/+6
  At least for PIPE_BUFFER, we could get the resource used as (for example) R32F imageBuffer. So using cpp=1 from the rsc is wrong.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix image dimensions offset | Rob Clark | 2018-06-11 | 1 | -1/+1
  copy-pasta fail from how SSBO sizes are handled.
  Signed-off-by: Rob Clark <[email protected]>