summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* vbo: Walk the VAO in print_draw_arrays.Mathias Fröhlich2016-07-311-20/+20
| | | | | | | | | Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Also make use of gl_vert_attrib_name. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: Walk the VAO in _mesa_print_arrays.Mathias Fröhlich2016-07-311-32/+20
| | | | | | | | | Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Also make use of gl_vert_attrib_name. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vbo: Walk the VAO to check for mapped buffers.Mathias Fröhlich2016-07-311-10/+23
| | | | | | | | | Similarily to _mesa_all_varyings_in_vbos walk the VAO to check if we have an illegal mapped buffer object instead of walking all gl_client_arrays. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vbo: Walk the VAO to see if all varyings are in vbos.Mathias Fröhlich2016-07-311-2/+2
| | | | | | | | | | | | | | | | In vbo_draw_transform_feedback we currently look at exec->array.inputs to determine if all varying vertex attributes reside in vbos. But the vbo_bind_arrays call only happens past the vbo_all_varyings_in_vbos query. Thus we may work on a stale set of client arrays. Using the current VAOs content for this query feels much more logical to me. Additionally with this change mesa makes more use of the information already tracked in the VAO instead of looping across VERT_ATTRIB_MAX vertex arrays. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: Implement _mesa_all_varyings_in_vbos.Mathias Fröhlich2016-07-312-0/+39
| | | | | | | | | | Implement the equivalent of vbo_all_varyings_in_vbos for vertex array objects. v2: Update comment. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: Unbind deleted vbo using _mesa_bind_vertex_buffer.Mathias Fröhlich2016-07-311-4/+7
| | | | | | | | | | | | When a vertex buffer object gets deleted, it is unbound at the VAO. To do this use _mesa_bind_vertex_buffer instead of plain unreferencing the buffer object. This keeps the VAOs internal state consistent. In this case it showed up with gl_vertex_array_object::VertexAttribBufferMask getting out of sync. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* glsl: be more strict on block qualifiersTimothy Arceri2016-07-311-11/+73
| | | | | | | V2: Add spec references and allow patch qualifier (Ken) Reviewed-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96528
* glsl: add name param to validate_flags()Timothy Arceri2016-07-313-10/+9
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add component to ast_type_qualifier::validate_flagsTimothy Arceri2016-07-311-1/+2
| | | | | | | | This was added with ARB_enhanced_layouts. V2: Add an extra format specifier for the new qualifier. Reviewed-by: Kenneth Graunke <[email protected]>
* docs: Add GL4.4 and ARB_enhanced_layouts to the release notesTimothy Arceri2016-07-311-3/+4
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* anv: Perform rasterizer discard in the SOL stage instead of the clipper.Kenneth Graunke2016-07-304-3/+12
| | | | | | | | | See commit b0629e6894513a2c49a018bc3342a4e55435a236, where we discovered that the SOL stage's "Rendering Disable" feature is a lot faster at throwing away all geometry than the clipper's "reject all" mode. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* Revert "gallium/util: fix resource leak"Roland Scheidegger2016-07-301-2/+0
| | | | | | This reverts commit d1fe26a62862f4e47a799222dca1bc1dc14ca4af. Replacing a resource leak with a segfault isn't the solution.
* gallium/util: fix resource leakEric Engestrom2016-07-301-0/+2
| | | | | | | CovID: 401540 Signed-off-by: Eric Engestrom <[email protected]> Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* freedreno/a4xx: fix comparison out of range warnings[email protected]2016-07-301-7/+7
| | | | | Signed-off-by: Francesco Ansanelli <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix comparison out of range warnings[email protected]2016-07-301-7/+7
| | | | | Signed-off-by: Francesco Ansanelli <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a2xx: fix comparison out of range warnings[email protected]2016-07-301-4/+4
| | | | | Signed-off-by: Francesco Ansanelli <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: init ir3_shader_key with memset()[email protected]2016-07-301-1/+2
| | | | | | | To silence missing initializers warning Signed-off-by: Francesco Ansanelli <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* gallium/freedreno: move cast to avoid integer overflowEric Engestrom2016-07-301-2/+2
| | | | | | | | | Previously, the bitshift would be performed on a simple int (32 bits on most systems), overflow, and then be cast to 64 bits. CovID: 1362461 Signed-off-by: Eric Engestrom <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a2xx: remove duplicate assignmentEric Engestrom2016-07-301-2/+2
| | | | | | CovID: 1362445, 1362446 Signed-off-by: Eric Engestrom <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno: defer flush_queue allocationRob Clark2016-07-302-2/+4
| | | | | | | Some apps, like warsow, create a bazillion contexts but don't render on most of them. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add some hw query tracesRob Clark2016-07-301-0/+16
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: some lockingRob Clark2016-07-309-23/+157
| | | | Signed-off-by: Rob Clark <[email protected]>
* os: add pipe_mutex_assert_locked()Rob Clark2016-07-301-0/+16
| | | | | | | Would be nice if we could also have lockdep, like in the linux kernel. But this is better than nothing. Signed-off-by: Rob Clark <[email protected]>
* freedreno: drop needs_rb_fbdRob Clark2016-07-306-31/+12
| | | | | | | | | | | We need to emit RB_FRAME_BUFFER_DIMENSION once per batch.. tracking this in fd_context is wrong when the gmem code executes asynchronously from the flush_queue worker. But in fact we don't really need to track it at all. We cannot assume previous value at the beginning of the batch (because of other processes potentially using the GPU), so just drop the tracking and emit it in _tile_init(). Signed-off-by: Rob Clark <[email protected]>
* freedreno: move needs_wfi into batchRob Clark2016-07-3019-94/+93
| | | | | | | This is also used in gmem code, which executes from the "bottom half" (ie. from the flush_queue worker thread), so it cannot be in fd_context. Signed-off-by: Rob Clark <[email protected]>
* freedreno: a bit of micro-optimizationRob Clark2016-07-302-10/+10
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: drop mem2gmem/gmem2mem query stagesRob Clark2016-07-302-17/+1
| | | | | | | | | They weren't really used, and it gets somewhat more complicated to deal with if batches are flushed asynchronously (on another thread). So just drop them, and move _query_set_state(NULL) call into batch (so it is not happening on background thread). Signed-off-by: Rob Clark <[email protected]>
* freedreno: threaded batch flushRob Clark2016-07-309-26/+99
| | | | | | | | | | | | | | | With the state accessed from GMEM+submit factored out of fd_context and into fd_batch, now it is possible to punt this off to a helper thread. And more importantly, since there are cases where one context might force the batch-cache to flush another context's batches (ie. when there are too many in-flight batches), using a per-context helper thread keeps various different flushes for a given context serialized. TODO as with batch-cache, there are a few places where we'll need a mutex to protect critical sections, which is completely missing at the moment. Signed-off-by: Rob Clark <[email protected]>
* freedreno: track batch/blit typesRob Clark2016-07-305-24/+52
| | | | | | | | | | | | Add a bit of extra book-keeping about blits and back-blits (from resource shadowing). If the app uploads all mipmap levels, as opposed to uploading the first level and then glGenerateMipmap(), we can discard the back-blit (as opposed to being naive and shadowing the resource for each mipmap level). Also, after a normal blit, we might as well flush the batch immediately, since there is not likely to be further rendering to the surface. Signed-off-by: Rob Clark <[email protected]>
* freedreno: re-order support for hw queriesRob Clark2016-07-3019-264/+288
| | | | | | | | | | | Push query state down to batch, and use the resource tracking to figure out which batch(es) need to be flushed to get the query result. This means we actually need to allocate the prsc up front, before we know the size. So we have to add a special way to allocate an un- backed resource, and then later allocate the backing storage. Signed-off-by: Rob Clark <[email protected]>
* freedreno: use prsc for hw queriesRob Clark2016-07-303-35/+45
| | | | | | | | | Switch to using a pipe_resource (rather than an fd_bo directly) for hw query result buffers. This is first step towards making queries work properly with reordered batches, since we'll need the additional dependency tracking to know which batches to flush. Signed-off-by: Rob Clark <[email protected]>
* freedreno: support discarding previous rendering in special casesRob Clark2016-07-303-5/+32
| | | | | | | | | | Basically, to "DCE" blits triggered by resource shadowing, in cases where the levels are immediately completely overwritten. For example, mid-frame texture upload to level zero triggers shadowing and back-blits to the remaining levels, which are immediately overwritten by glGenerateMipmap(). Signed-off-by: Rob Clark <[email protected]>
* freedreno: shadow textures if possible to avoid stall/flushRob Clark2016-07-303-11/+211
| | | | | | | | | | | | | | To make batch re-ordering useful, we need to be able to create shadow resources to avoid a flush/stall in transfer_map(). For example, uploading new texture contents or updating a UBO mid-batch. In these cases, we want to clone the buffer, and update the new buffer, leaving the old buffer (whose reference is held by cmdstream) as a shadow. This is done by blitting the remaining other levels (and whatever part of current level that is not discarded) from the old/shadow buffer to the new one. Signed-off-by: Rob Clark <[email protected]>
* freedreno: spiff up some debug tracesRob Clark2016-07-306-6/+18
| | | | | | | Make it easier to track batches, to ensure things happen properly when they are reordered. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add batch-cache and batch reorderingRob Clark2016-07-3015-111/+760
| | | | | | | | | | | | Note that I originally also had a entry-point that would construct a key and do lookup from a pipe_surface. I ended up not needing that (yet?) but it is easy-enough to re-introduce later if we need it for the blit path. For now, not enabled by default, but can be enabled (on a3xx/a4xx) with FD_MESA_DEBUG=reorder. Signed-off-by: Rob Clark <[email protected]>
* freedreno: move more batch related tracking to fd_batchRob Clark2016-07-3023-398/+420
| | | | | | | | | | | | | | | | To flush batches out of order, the gmem code needs to not depend on state from fd_context (since that may apply to a more recent batch). So this all moves into batch. The one exception is the gmem/pipe/tile state itself. But this is only used from gmem code (and batches are flushed serially). The alternative would be having to re-calculate GMEM layout on every batch, even if the dimensions of the render targets are the same. Note: This opens up the possibility of pushing gmem/submit into a helper thread. Signed-off-by: Rob Clark <[email protected]>
* freedreno: dynamically sized/growable cmd buffersRob Clark2016-07-302-23/+33
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: push resource tracking down into batchRob Clark2016-07-307-42/+51
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: introduce fd_batchRob Clark2016-07-3020-177/+252
| | | | | | | | | | | | | | | | | | | Introduce the batch object, to track a batch/submit's worth of ringbuffers and other bookkeeping. In this first step, just move the ringbuffers into batch, since that is mostly uninteresting churn. For now there is just a single batch at a time. Note that one outcome of this change is that rb's are allocated/freed on each use. But the expectation is that the bo pool in libdrm_freedreno will save us the GEM bo alloc/free which was the initial reason to implement a rb pool in gallium. The purpose of the batch is to eventually facilitate out-of-order rendering, with batches associated to framebuffer state, and tracking the dependencies on other batches. Signed-off-by: Rob Clark <[email protected]>
* mesa: remove dd_function_table::UseProgramMarek Olšák2016-07-303-10/+0
| | | | | | finally unused Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: update sampler states when shaders are changedMarek Olšák2016-07-301-6/+12
| | | | | | | This bug seems to have always been there. Applications changing shaders but not textures between draw calls would have gotten undefined behavior. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: don't dirty sample shading on _NEW_PROGRAMMarek Olšák2016-07-301-2/+1
| | | | | | Already done as part of ST_NEW_FRAGMENT_PROGRAM in st_validate_state. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: remove excessive shader state dirtyingMarek Olšák2016-07-307-57/+33
| | | | | | | | | This just needs to be done by st_validate_state. v2: add "shaders_may_be_dirty" flags for not skipping st_validate_state on _NEW_PROGRAM to detect real shader changes Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: unreference optional shaders when unbindingMarek Olšák2016-07-301-0/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: skip updates of states that have no effectMarek Olšák2016-07-302-9/+28
| | | | | | v2: - also don't check edge flags for GLES Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: completely rewrite state atomsMarek Olšák2016-07-3033-516/+381
| | | | | | | | | | | | | | | | | | | | The goal is to do this in st_validate_state: while (dirty) atoms[u_bit_scan(&dirty)]->update(st); That implies that atoms can't specify which flags they consume. There is exactly one ST_NEW_* flag for each atom. (58 flags in total) There are macros that combine multiple flags into one for easier use. All _NEW_* flags are translated into ST_NEW_* flags in st_invalidate_state. st/mesa doesn't keep the _NEW_* flags after that. torcs is 2% faster between the previous patch and the end of this series. v2: - add st_atom_list.h to Makefile.sources Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: remove st_tracked_state::nameMarek Olšák2016-07-3020-58/+0
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: remove atom debugging codeMarek Olšák2016-07-301-67/+3
| | | | | | This won't be needed after the rewrite. Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: Fix move_interpolation_to_top() pass.Kenneth Graunke2016-07-291-21/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | The pass I introduced in commit a2dc11a7818c04d8dc0324e8fcba98d60bae was entirely broken. A missing "break" made the load_interpolated_input case always fall through to "default" and hit a "continue", making it not actually move any load_interpolated_input intrinsics at all. It would only move the simple load_barycentric_* intrinsics, which don't emit any code anyway, making it basically useless. The initial version I sent of the pass worked, but I apparently failed to verify that the simplified version in v2 actually worked. With the obvious fix applied (so we actually tried to move load_interpolated_input intrinsics), I discovered a second bug: we weren't moving the offset SSA def to the top, breaking SSA validation. The new version of the pass actually moves load_interpolated_input intrinsics and all their dependencies, as intended. Papers over GPU hangs on Ivybridge and Baytrail caused by the recent NIR FS input rework by restoring the old behavior. (I'm not honestly sure why they hang with PLN not at the top.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97083 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* freedreno: limit non-user constant buffers to a4xxRob Clark2016-07-291-1/+1
| | | | | | | Seems to mostly work on a3xx. Except when it doesn't and kills gpu quite badly. Signed-off-by: Rob Clark <[email protected]>