aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* i965: Fix duplication of DRI imagesLouis-Francis Ratté-Boulianne2017-09-201-0/+3
| | | | | | | | | | | Some DRI image properties weren't properly duplicated in the new image. Some properties are still missing, but I'm not certain if there was a good reason to let them out in the first place. Signed-off-by: Louis-Francis Ratté-Boulianne <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Unify ir_constant::const_elements and ::componentsIan Romanick2017-09-192-2/+4
| | | | | | | | | | There was no reason to treat array types and record types differently. Unifying them saves a bunch of code and saves a few bytes in every ir_constant. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* glsl: Rename ir_constant::array_elements to ::const_elementsIan Romanick2017-09-192-2/+2
| | | | | | | | | | | The next patch will unify ::array_elements and ::components, so the name ::array_elements wouldn't be appropriate. A lot of things use the names array_elements and components, so grepping for either is pretty useless. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* Revert "i965: Reset miptree aux state on update_image_buffer"Jason Ekstrand2017-09-193-25/+1
| | | | This reverts commit e97f4b748094466567c7f3bad1a02ecee13db9c8.
* i965: Fix batch map failure check in INTEL_DEBUG=bat handling.Kenneth Graunke2017-09-181-1/+1
| | | | | | | | | | I originally wrote the code to call the maps 'batch' and 'state', until I remembered that 'batch' is the intel_batchbuffer struct pointer. The NULL check was still using the wrong variable. Caught by Coverity. CID: 1418109
* i965: Use prepare_external instead of make_shareable in setTexBuffer2Jason Ekstrand2017-09-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The setTexBuffer2 hook from GLX is used to implement glxBindTexImageEXT which has tighter restrictions than just "it's shared". In particular, it says that any rendering to the image while it is bound causes the contents to become undefined. This means that we can do whatever aux tracking we want between glxBindTexImageEXT and glxReleaseTexImageEXT so long as we always transition from external in Bind and to external in Release. The fact that we were using make_shareable before was a problem because it would resolve away 100% of the aux data and then throw away our reference to the aux buffer. If the aux data was shared with some other application (i.e. if we're using I915_FORMAT_MOD_Y_TILED_CCS) then we would forget that the aux data even existed for the rest of eternity. This is fine for the first frame but any subsequent calls to glxBindTexImageEXT would bind the texture as if it has no aux whatsoever and no resolves would happen and texturing would happen as if there is no aux. This was causing rendering corruption in mutter when running on top of X11 with modifiers. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2Jason Ekstrand2017-09-181-19/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old code made a new miptree that referenced the same BO as the renderbuffer and just trusted in the memory aliasing to work. There are only two ways in which the new miptree is liable to differ from the one in the renderbuffer and neither of them matter: 1) It may have a different target. The only targets that we can ever see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE and the difference between the two doesn't matter as far as the miptree is concerned; genX(update_sampler_state) only looks at the gl_texture_object and not the miptree when determining whether or not to use normalized coordinates. 2) It may have a very slightly different format. Again, this doesn't matter because we've supported texture views for quite some time so we always look at the gl_texture_object format instead of the miptree format for hardware setup anyway. On the other hand, because we were recreating the miptree, we were using intel_miptree_create_for_bo which doesn't understand modifiers. We really want this function to work without doing a resolve so long as you have modifiers so we need to fix that. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Reset miptree aux state on update_image_bufferJason Ekstrand2017-09-183-1/+25
| | | | | | | | | | | | | When we get a miptree in through glxBindImageEXT, we don't know the current aux state so we have to assume the worst-case. If the image gets recreated, everything is fine because miptreecreate_for_dri_image sets it to the default. However, if our miptree is recycled, then we may have stale aux_usage and we need to reset to the default otherwise our aux_state tracking will get messed up. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/isl: Add a drm_modifier_get_default_aux_state helperJason Ekstrand2017-09-181-2/+1
| | | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Warn for GTT fallbacks when mapping the batch/state buffers.Kenneth Graunke2017-09-181-2/+2
| | | | | | | | This shouldn't really happen in practice, but I hit it a couple of times when running a driver with a bad memory leak. We may as well hook up the warning, because if it ever triggers, we'll know something is wrong. Reviewed-by: Chris Wilson <[email protected]>
* i965: Plumb brw through to intel_batchbuffer_reset().Kenneth Graunke2017-09-183-11/+11
| | | | | | We'll want to pass this to brw_bo_map in a moment. Reviewed-by: Chris Wilson <[email protected]>
* gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVENicolai Hähnle2017-09-181-1/+4
| | | | | | | | | | | | | | | | | To be able to properly distinguish between GL_ANY_SAMPLES_PASSED and GL_ANY_SAMPLES_PASSED_CONSERVATIVE. This patch goes through all drivers, having them treat the two query types identically, except: 1. radeon incorrectly enabled conservative mode on PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only on PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE. 2. st/mesa uses the new query type. Fixes dEQP-GLES31.functional.fbo.no_attachments.* Reviewed-by: Marek Olšák <[email protected]>
* st/glsl_to_tgsi: fix theoretical memory leakNicolai Hähnle2017-09-181-2/+5
| | | | | | | | It can't *really* happen since we don't use subroutines. CID: 1417491 Reviewed-by: Timothy Arceri <[email protected]> Reviewed-By: Gert Wollny <[email protected]>
* i965: emit BRW_NEW_AUX_STATE on aux state changesIago Toral Quiroga2017-09-181-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes a regression introduced with b96313c0e1289b296d7, which removed BRW_NEW_BLORP for a bunch of SURFACE_STATE setup code, including render targets, on the basis that blorp invalidates binding tables but not surface states, however, at least on Broadwell, this caused a regression in a CTS test, which Ken and Jason tracked down to the fact that we are not uploading new render target surface states after allocating new CCS_D surfaces for fast clears (which allocation is deferred until an actual clear occurs). The reason this only fails in BDW is that on SKL+ we use CCS_E which is allocated up front so it exists in the initial surface state, the problem can be reproduced in these platforms too if we use INTEL_DEBUG=norcb to force the CCS_D path. This patch, together with the ones preceding it, fixes the regression by ensuring that we track and flag as dirty all aux state changes. Credit goes to Jason and Ken for figuring out the reason for the regression. Fixes: KHR-GL45.transform_feedback.draw_xfb_test Reviewed-by: Jason Ekstrand <[email protected]>
* i965: emit BRW_NEW_AUX_STATE when we change the fast clear valueIago Toral Quiroga2017-09-183-11/+31
| | | | | | | v2: rename intel_miptree_set_clear_value to intel_miptree_set_clear_color (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* i965: emit BRW_NEW_AUX_STATE if we drop the aux surfaceIago Toral Quiroga2017-09-181-0/+2
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATEIago Toral Quiroga2017-09-188-14/+14
| | | | | | | We want to use this flag to signal changes to the aux surfaces, so let's not make it about fast clearing only. Suggested by Jason. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add an INTEL_DEBUG=reemit option.Kenneth Graunke2017-09-151-1/+1
| | | | | | | | | Jason and I use this for debugging all the time. Recompiling the driver to enable it is kind of annoying. It's a great thing to try along with always_flush_batch=true and always_flush_cache=true to detect a class of problems - namely, atoms listening to an insufficient set of dirty bits. Reviewed-by: Matt Turner <[email protected]>
* i965: drop unused variablesEric Engestrom2017-09-152-2/+0
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965/tex: Unify the TexImage and TexSubImage codeJason Ekstrand2017-09-151-58/+45
| | | | | | | | | | It's nearly the same so there's no good reason why it can't be in a common function. The one difference is that _mesa_store_teximage calls AllocTextureImageBuffer for us, while _mesa_store_texsubimage doesn't, but we don't need that anyway - intelTexImage already does it. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/tex: Remove the for_glTexImage parameter from texsubimage_tiled_memcpyJason Ekstrand2017-09-151-14/+5
| | | | | | | | | It is set to false in both callers. It isn't needed for glTexImage because intelTexImage calls AllocTextureImageBuffer before calling texsubimage_tiled_memcpy. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/tex: Make a couple of helpers staticJason Ekstrand2017-09-152-22/+2
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Move TexSubImage functions to intel_tex_image.cJason Ekstrand2017-09-155-260/+210
| | | | | | | | These two paths are basically the same. There's no good reason to have them in different files. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/blorp: Set r8stencil_needs_update when writing stencilJason Ekstrand2017-09-151-0/+6
| | | | | | | | | | This fixes a crash on Haswell when we try to upload a stencil texture with blorp. It would also be a problem if someone tried to texture from stencil after glBlitFramebuffers. Cc: "17.2 17.1" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* mesa/st/tests: Correct build flags and force -std=c++11Gert Wollny2017-09-151-9/+5
| | | | | | | | | | | | Include src/gallium/Automake.inc, correct the build flags accordingly. Force -std=c++11 (extensively used by the test) as otherwise it gets defined only when building against llvm >= 3.9. Fixes: 7be6d8fe12 ("mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665 Reviewed-by: Emil Velikov <[email protected]> (v1)
* i965: fix build warning on clangTapani Pälli2017-09-151-1/+2
| | | | | | | | | | | fixes following warning: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') cast is needed to avoid this change turning in to another warning: warning: format specifies type 'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned long') Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* st/mesa: set UseSTD430AsDefaultPacking const based on CAPTimothy Arceri2017-09-151-0/+3
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa/st: add LOAD support for UBOsTimothy Arceri2017-09-151-79/+119
| | | | | | | This will allow us to use STD430 packing by default if the driver supports it. Reviewed-by: Marek Olšák <[email protected]>
* mesa/st: create add_buffer_to_load_and_stores() helperTimothy Arceri2017-09-151-19/+27
| | | | | | Will be used to add LOAD support to UBOs. Reviewed-by: Marek Olšák <[email protected]>
* st/glsl->tgsi: fix u64 to bool comparisons.Dave Airlie2017-09-151-1/+15
| | | | | | | | | | | Otherwise we end up using a 32-bit comparison which didn't end well. Timothy caught this while playing around with some opt passes. Fixes: 278580729a (st/glsl_to_tgsi: add support for 64-bit integers) Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: Print size of validation and relocation lists in INTEL_DEBUG=flushKenneth Graunke2017-09-141-3/+8
| | | | | | | | It's nice to have this information. While we're at it, tweak the formatting to try and vertically align numbers in the common case. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Disentangle batch and state buffer flushing.Kenneth Graunke2017-09-145-40/+30
| | | | | | | | | | | | | | | | | | | | | | We now flush the batch when either the batchbuffer or statebuffer reaches the original intended batch size, instead of when the sum of the two reaches a certain size (which makes no sense now that they're separate buffers). With this change, we also need to update our "are we near the end?" estimate to require separate batch and state buffer space. I obtained these estimates by looking at the size of draw calls in the Unreal 4 Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true). This will significantly impact the size of our batches. I've adjusted both down to try and be roughly similar to what we had been doing. On various benchmarks, a 20kB batch and 16kB statebuffer seemed to about right, but we may need to adjust this further. I tried a 16kB batch, but that regressed Synmark OglMultithread performance by a fair bit. 32kB for both would have significantly increased our batch sizes. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Delete BATCH_RESERVED handling.Kenneth Graunke2017-09-142-34/+3
| | | | | | | | | Now that we can grow the batchbuffer if we absolutely need the extra space, we don't need to reserve space for the final do-or-die ending commands. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Make BLORP properly avoid batch wrapping.Kenneth Graunke2017-09-141-14/+2
| | | | | | | | We need to set brw->no_batch_wrap to actually avoid flushing in the middle of our BLORP operation, and instead grow the batchbuffer. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Grow the batch/state buffers if we need space and can't flush.Kenneth Graunke2017-09-141-5/+131
| | | | | | | | | | | | | | | | | | | | | | | | | Previously, we would just assert fail and die in this case. The only safeguard is the "estimated max prim size" checks when starting a draw (or compute dispatch or BLORP operation)...which are woefully broken. Growing is fairly straightforward: 1. Allocate a new larger BO. 2. memcpy the existing contents over to the new buffer 3. Set the new BO to the same GTT offset as the old BO. When emitting relocations, we write the presumed GTT offset of the target BO. If we changed it, we'd have to update all the existing values (by walking the relocation list and looking at offsets), which is more expensive. With the old BO freed, ideally the kernel could simply place the new BO at that offset anyway. 4. Update the validation list to contain the new BO. 5. Update the relocation list to have the GEM handle for the new BO (which we can skip if using I915_EXEC_HANDLE_LUT). v2: Update to handle malloc'd shadow buffers. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Use a separate state buffer, but avoid changing flushing behavior.Kenneth Graunke2017-09-147-82/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we emitted GPU commands and indirect state into the same buffer, using a stack/heap like system where we filled in commands from the start of the buffer, and state from the end of the buffer. We then flushed before the two met in the middle. Meeting in the middle is fatal, so you have to be certain that you reserve the correct amount of space before emitting commands or state for a draw. Currently, we will assert !no_batch_wrap and die if the estimate is ever too small. This has been mercifully obscure, but has happened on a number of occasions, and could in theory happen to any application that issues a large draw at just the wrong time. Estimating the amount of batch space required is painful - it's hard to get right, and getting it right involves a lot of code that would burn CPU time, and also be painful to maintain. Rolling back to a saved state and retrying is also painful - failing to save/restore all the required state will break things, and redoing state emission burns a lot of CPU. memcpy'ing to a new batch and continuing is painful, because commands we issue for a draw depend on earlier commands as well (such as STATE_BASE_ADDRESS, or the GPU being in a pirtacular state). The best plan is to never run out of space, which is totally doable but pretty wasteful - a pessimal draw requires a huge amount of space, and rarely occurs. Instead, we'd like to grow the batch buffer if we need more space and can't safely flush. We can't grow with a meet in the middle approach - we'd have to move the state to the end, which would mean updating every offset from dynamic state base address. Using separate batch and state buffers, where both fill starting at the beginning, makes it easy to grow either as needed. This patch separates the two concepts. We create a separate state buffer, with a second relocation list, and use that for brw_state_batch. However, this patch tries to retain the original flushing behavior - it adds the amount of batch and state space together, as if they were still co-existing in a single buffer. The hope is to flush at the same time as before. This is necessary to avoid provoking bugs caused by broken batch wrap handling (which we'll fix shortly). It also avoids suddenly increasing the size of the batch (due to state not taking up space), which could have a significant performance impact. We'll tune it later. v2: - Mark the statebuffer with EXEC_OBJECT_CAPTURE when supported (caught by Chris). Unfortunately, we lose the ability to capture state data on older kernels. - Continue to support the malloc'd shadow buffers. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Pass screen to intel_batchbuffer_reset().Kenneth Graunke2017-09-141-10/+8
| | | | | | | This will let us access screen->kernel_features in the next patch. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer.Kenneth Graunke2017-09-141-56/+54
| | | | | | | | | | | We'll need to read from both buffers when decoding state. This also drops the "failed to map" fallback - it's completely useless on LLC systems where we write directly to the mapped BO. It's not that useful on non-LLC systems either. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Split brw_emit_reloc into brw_batch_reloc and brw_state_reloc.Kenneth Graunke2017-09-145-48/+73
| | | | | | | | | | | brw_batch_reloc emits a relocation from the batchbuffer to elsewhere. brw_state_reloc emits a relocation from the statebuffer to elsewhere. For now, they do the same thing, but when we actually split the two buffers, we'll change brw_state_reloc to use the state buffer. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Refactor relocs into a brw_reloc_list structure.Kenneth Graunke2017-09-142-19/+32
| | | | | | | | | | I'm planning on splitting batch and state into separate buffers, at which point we'll need two relocation lists. In preparation for that, this patch refactors the relocation stuff into a structure we can replicate...which looks a lot like anv_reloc_list. Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Move brw_state_batch code to intel_batchbuffer.cKenneth Graunke2017-09-144-97/+47
| | | | | | | | | | | | The batch buffer and state buffer code is fairly tied together, and having it in one .c file will make refactoring easier. Also, drop some commentary above brw_state_batch. The "aperture checking performance hacks" are long since gone, so that paragraph makes little sense at this point. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Drop a useless ret == 0 check.Kenneth Graunke2017-09-141-22/+20
| | | | | | | | | Prior to the previous patch, we would pwrite the batchbuffer contents, and wanted to skip the execbuffer if that failed. Now that we memcpy, we don't set ret != 0 on failure anymore, so it will always be 0. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Use a WC map and memcpy for the batch instead of pwrite.Kenneth Graunke2017-09-141-10/+8
| | | | | | | | | | | | We'd like to eliminate the malloc'd shadow copy eventually, but there are still unresolved performance problems. In the meantime, let's at least get rid of pwrite. On Apollolake, improves Synmark OglBatch6 performance by: 1.53581% +/- 0.269589% (n=108). Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Use batch->bo->size in brw_emit_reloc assertion.Kenneth Graunke2017-09-141-1/+1
| | | | | | | This makes the assertion safe against batchbuffers growing. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Delete a batch size assertion that isn't very useful.Kenneth Graunke2017-09-141-3/+0
| | | | | | | | | | | This assertion prevents you from doing intel_batchbuffer_require_space with a size so huge it won't fit in the batchbuffer. This doesn't seem like a common mistake, and I've never seen the assert to be useful. Soon, I hope to have batches grow, at which point this won't make sense. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965/screen: Implement queryDmaBufFormatModifierAttirbsJason Ekstrand2017-09-141-1/+23
| | | | | Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* i965/screen: Report the correct number of image planesJason Ekstrand2017-09-141-1/+8
| | | | | | | | | For non-CCS images, we were reporting just one plane even though they may have multiple in the case of YUV. Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* dri/radeon: use ARRAY_SIZE macroEric Engestrom2017-09-141-1/+3
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Deal with size differences between GLuint and GLhandleARB in ↵Jeremy Huddleston Sequoia2017-09-131-7/+15
| | | | | | | GetAttachedObjectsARB Signed-off-by: Jeremy Huddleston Sequoia <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add an INTEL_DEBUG=submit option for printing batch statistics.Kenneth Graunke2017-09-131-1/+1
| | | | | | | | | | | | | | | | | | | When a batch is submitted, INTEL_DEBUG=bat prints a message indicating which part of the code triggered the flush, and some statistics about the batch/state buffer utilization. It also decodes the batchbuffer in debug builds...which is so much output that it drowns out the utilization messages, if that's all you care about. INTEL_DEBUG=submit now just does the utilization messages. INTEL_DEBUG=bat continues to do both (as the message is a good indicator that we're starting decode of a new batch). v2: Rename from "flush" to "submit" (suggested by Chris) because we might want "flush" for PIPE_CONTROL debugging someday. Reviewed-by: Chris Wilson <[email protected]>