path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* radeonsi/gfx9: don't overallocate shader binaries | Marek Olšák | 2017-06-24 | 1 | -6/+0
  It's not needed. The hw doesn't fetch ahead over page boundaries.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* llvmpipe: initialize default fb correctly in setup | Roland Scheidegger | 2017-06-24 | 1 | -0/+4
  If lp_setup_bind_framebuffer() is never called, then setup fb x1/y1 was not correctly initialized. This can happen if there's never a fb set: both cso and llvmpipe would consider setting this with no cbufs and no zsbuf a redundant change, and therefore it would never get set.

  We rely on this setup fb rect being initialized correctly for the tri intersect tests, which throw away tris that don't intersect. Not initializing it meant we'd then say it intersected, and we'd try to bin that even though we have no actual tiles to bin it to, leading to assertion failures (pretty harmless, since tile 0/0 always exists nevertheless as tiles are statically allocated, albeit that should change at some point).

  (Note: probably not an issue with the gl state tracker.)

  Reviewed-by: Jose Fonseca <[email protected]>
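The failure mode described in the commit above can be sketched in a few lines. This is only an illustration, not the llvmpipe code: `struct rect`, `rect_init_empty()` and `rect_intersects()` are hypothetical names, and encoding "no framebuffer" as x1 = y1 = -1 is an assumption about how an inclusive rectangle with no pixels could be represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inclusive rectangle: covers x0..x1 and y0..y1. */
struct rect {
   int x0, y0, x1, y1;
};

/* Assumed encoding of "no framebuffer": x1 < x0 and y1 < y0, so that
 * no triangle bounding box can intersect it. */
static void rect_init_empty(struct rect *r)
{
   assert(r != NULL);
   r->x0 = 0;
   r->y0 = 0;
   r->x1 = -1;
   r->y1 = -1;
}

/* True if the two inclusive rectangles share at least one pixel. */
static bool rect_intersects(const struct rect *a, const struct rect *b)
{
   return a->x0 <= b->x1 && b->x0 <= a->x1 &&
          a->y0 <= b->y1 && b->y0 <= a->y1;
}
```

With this encoding, a rectangle left at all zeros still covers pixel (0, 0) and so intersects any triangle touching the origin, whereas the explicitly emptied rectangle rejects every triangle.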
* radeonsi: unreference vertex buffers when destroying the context | Marek Olšák | 2017-06-23 | 1 | -0/+2
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: implement the workaround for Rocket League - postponed TGSI kill | Marek Olšák | 2017-06-23 | 5 | -0/+37
  Do KILL at the end of shaders so as not to break WQM.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100070
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: pass create_screen flags to r600_common_screen_init | Marek Olšák | 2017-06-23 | 8 | -10/+12
  Reviewed-by: Nicolai Hähnle <[email protected]>
* llvmpipe: fix using 32bit rasterization mistakenly, causing overflows | Roland Scheidegger | 2017-06-23 | 4 | -31/+43
  We use the bounding box (triangle extents) to figure out if 32bit rasterization could potentially overflow. However, for this check we used the bounding box whose negative coords had already been clamped up to 0, which is incorrect, leading to overflows and hence bogus rendering in some of our private use.

  It might be possible to simplify this somehow (we're now using 3 different boxes for binning) but I don't quite see how.

  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
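To make the bug in the commit above concrete, here is a minimal sketch of why the overflow check has to see the unclamped box. Everything in it is hypothetical: `struct bbox`, `clamp_to_origin()`, `fits_32bit()` and the `MAX_EXTENT_32BIT` limit are illustrative names and values, not the ones llvmpipe uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limit on triangle extent (in pixels) that a 32-bit
 * rasterizer path can handle without overflow. Illustrative only. */
#define MAX_EXTENT_32BIT 2048

struct bbox {
   int x0, y0, x1, y1;
};

static int max_int(int a, int b)
{
   return a > b ? a : b;
}

/* Clamp negative minimum coordinates to 0, as one would for binning. */
static struct bbox clamp_to_origin(struct bbox b)
{
   b.x0 = max_int(b.x0, 0);
   b.y0 = max_int(b.y0, 0);
   return b;
}

/* Decide whether the 32-bit path is safe. This must be given the
 * unclamped box, because clamping hides the true extent. */
static bool fits_32bit(struct bbox b)
{
   assert(b.x1 >= b.x0 && b.y1 >= b.y0);
   return (b.x1 - b.x0) <= MAX_EXTENT_32BIT &&
          (b.y1 - b.y0) <= MAX_EXTENT_32BIT;
}
```

For a triangle spanning x = -5000..100, the real extent is 5100 pixels, but the clamped box reports only 100, so a check on the clamped box would wrongly pick the 32-bit path.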
* llvmpipe: fill in debug vertex info for tri rasterization | Roland Scheidegger | 2017-06-23 | 1 | -1/+1
  This is pretty useful for debugging rasterization issues, so turn it on based on DEBUG (the actual existence of the fields is also conditionalized on DEBUG, lines fill it out the same too).

  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* Revert "radeonsi: don't emit partial flushes at the end of IBs (v2)" | Marek Olšák | 2017-06-23 | 1 | -9/+5
  This reverts commit c9040dc9e75c81024f88f3f1bab821ad2bc73db3.

  People have reported it causes corruption on VI, and I see GPU hangs on GFX9.
* svga: minor whitespace fixes in svga_pipe_vertex.c | Brian Paul | 2017-06-22 | 1 | -6/+10
* svga: check return value from svga_set_shader( SVGA3D_SHADERTYPE_GS, NULL) | Brian Paul | 2017-06-22 | 1 | -0/+2
  If the call fails we need to flush the command buffer and retry. In this case, we were failing to unbind the GS which led to subsequent errors.

  This fixes a bug replaying a Cinebench R15 apitrace in a Linux guest.

  VMware bug 1894451

  cc: [email protected]
  Reviewed-by: Charmaine Lee <[email protected]>
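The "flush and retry" pattern mentioned in the commit above can be sketched as follows. This is a toy model under stated assumptions, not the svga driver: `struct cmd_buffer`, `cmd_emit()`, `cmd_flush()` and `cmd_emit_with_retry()` are hypothetical stand-ins for a fixed-capacity command buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical command buffer with a fixed capacity. */
struct cmd_buffer {
   unsigned used;
   unsigned capacity;
   unsigned flush_count;
};

/* Try to append one command; fails when the buffer is full. */
static bool cmd_emit(struct cmd_buffer *cb)
{
   if (cb->used >= cb->capacity)
      return false;
   cb->used++;
   return true;
}

/* Submit pending commands and make the buffer empty again. */
static void cmd_flush(struct cmd_buffer *cb)
{
   cb->used = 0;
   cb->flush_count++;
}

/* Emit a command, flushing and retrying once if the first try fails.
 * Ignoring the first failure would silently drop the command. */
static bool cmd_emit_with_retry(struct cmd_buffer *cb)
{
   assert(cb->capacity > 0);
   if (cmd_emit(cb))
      return true;
   cmd_flush(cb);
   return cmd_emit(cb);
}
```

The point of checking the return value is visible in the last function: without the check, a command emitted into a full buffer (here, the GS unbind) is simply lost.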
* svga: fix premature flushing of the command buffer | Charmaine Lee | 2017-06-22 | 3 | -3/+21
  When surface_invalidate is called to invalidate a newly created surface in svga_validate_surface_view(), it is possible that the command buffer is already full. In this case, the associated wddm winsys function currently flushes the command buffer and resends the invalidate surface command. However, this can prematurely flush the command buffer if there are still pending image updates to be patched.

  To fix the problem, this patch adds a return status to the surface_invalidate interface; if it returns FALSE, the caller calls svga_context_flush() to do the proper context flush. Note that we don't call svga_context_flush() if surface_invalidate() fails when flushing the screen surface cache, because a context flush is already in progress and all the image updates are already patched, so calling svga_context_flush() can trigger a deadlock. In this case, we call the winsys context flush interface directly to flush the command buffer.

  Fixes driver errors and graphics corruption running Tropics. VMware bug 1891975.

  Also tested with MTT glretrace, piglit and various OpenGL apps such as Heaven, CinebenchR15, NobelClinicianViewer, Lightsmark, GoogleEarth.

  cc: [email protected]
  Reviewed-by: Brian Paul <[email protected]>
* swr: invalidate attachment on transition change | George Kyriazis | 2017-06-22 | 3 | -0/+28
  Consider the following RT attachment order:

  1. Attach surfaces attachments 0 & 1, and render with them
  2. Detach 0 & 1
  3. Re-attach 0 & 1 to different surfaces
  4. Render with the new attachment

  The definition of a tile being resolved is that local changes have been flushed out to the surface, hence there is no need to reload the tile before it's written to. For an invalid tile, the tile has to be reloaded from the surface before rendering.

  Stage (2) was marking hot tiles for attachments 0 & 1 as RESOLVED, which means that the hot tiles can be written out to memory with no need to read them back in (they are "clean"). They need to be marked as resolved here, because a surface may be destroyed after a detach, and we don't want to have un-resolved tiles that may force a readback from a NULL (destroyed) surface. (Part of a destroy is to detach all attachments first.)

  In stage (3), during the no att -> att transition, we need to realize that the "new" surface tiles need to be fetched fresh from the new surface, instead of using the resolved tiles, which belong to a stale attachment. This is done by marking the hot tiles as invalid in stage (3), when we realize that a new attachment is being made, so that they are re-fetched during rendering in stage (4).

  Also note that hot tiles are indexed by attachment.

  - Fixes VTK dual depth-peeling tests.
  - No piglit changes

  Reviewed-by: Tim Rowley <[email protected]>
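The four stages in the commit above can be modelled as a small state machine. The sketch below is an assumption-laden illustration, not the swr code: the `tile_state` enum, `struct hot_tiles` and the `attach()`, `detach()` and `render()` helpers are hypothetical, and a single state per attachment stands in for the real per-tile bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

enum tile_state {
   TILE_INVALID,   /* must be loaded from the surface before use */
   TILE_DIRTY,     /* has local changes not yet written out */
   TILE_RESOLVED   /* local changes flushed; no reload needed */
};

#define NUM_ATTACHMENTS 2

/* Hot tiles are indexed by attachment, as the commit message notes. */
struct hot_tiles {
   enum tile_state state[NUM_ATTACHMENTS];
   const void *surface[NUM_ATTACHMENTS];
};

/* Stage 2: mark as resolved so that nothing later forces a readback
 * from a surface that may have been destroyed. */
static void detach(struct hot_tiles *ht, unsigned att)
{
   assert(att < NUM_ATTACHMENTS);
   ht->surface[att] = NULL;
   ht->state[att] = TILE_RESOLVED;
}

/* Stage 3: on a "no attachment -> attachment" transition the cached
 * tile belongs to a stale surface, so invalidate it. */
static void attach(struct hot_tiles *ht, unsigned att, const void *surf)
{
   assert(att < NUM_ATTACHMENTS);
   if (ht->surface[att] == NULL && surf != NULL)
      ht->state[att] = TILE_INVALID;
   ht->surface[att] = surf;
}

/* Stages 1 and 4: returns 1 if the tile had to be loaded first. */
static int render(struct hot_tiles *ht, unsigned att)
{
   int loaded;
   assert(att < NUM_ATTACHMENTS);
   loaded = (ht->state[att] == TILE_INVALID);
   ht->state[att] = TILE_DIRTY;
   return loaded;
}
```

Without the invalidation in `attach()`, the render in stage (4) would see a RESOLVED tile and skip the load, reusing contents that came from the previous surface.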
* radeonsi/gfx9: enable DCC fast clear | Marek Olšák | 2017-06-22 | 1 | -4/+0
  It seems to work now.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: don't ever flush the TC metadata cache | Marek Olšák | 2017-06-22 | 1 | -10/+3
  The closed Vulkan driver doesn't do it either.

  Also remove some old comments that aren't useful.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: use TC L2 for fast color clear with CP DMA | Marek Olšák | 2017-06-22 | 1 | -2/+5
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix DCC fast clear for luminance and alpha formats | Marek Olšák | 2017-06-22 | 1 | -1/+10
  I reproduced this bug on Polaris11 and Raven. I can't get this bug on Fiji. The reason might be that Fiji doesn't use 2D tiling for the test due to higher 2D tiling alignment requirements.

  Fixes piglit: spec@ext_framebuffer_object@fbo-fast-clear

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't emit partial flushes at the end of IBs (v2) | Marek Olšák | 2017-06-22 | 1 | -5/+9
  The kernel sort of does the same thing with fences.

  v2: do emit partial flushes on SI

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use the correct LLVMTargetMachineRef in si_build_shader_variant | Nicolai Hähnle | 2017-06-22 | 1 | -6/+22
  si_build_shader_variant can actually be called directly from one of the normal-priority compiler threads. In that case, the thread_index is only valid for the normal tm array.

  v2:
  - use the correct sel/shader->compiler_ctx_state

  Fixes: 86cc8097266c ("radeonsi: use a compiler queue with a low priority for optimized shaders")
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: keep reusing the same buffer/address for the gfx9 flush fence | Marek Olšák | 2017-06-22 | 3 | -8/+28
  instead of using a monotonic suballocator

  v2: initialize the memory at context creation

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: enable the constant engine | Marek Olšák | 2017-06-22 | 1 | -4/+1
  I think this kernel commit fixes it:
  drm/amdgpu:use FRAME_CNTL for new GFX ucode

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: indirect buffers and all CP packets use TC L2 | Marek Olšák | 2017-06-22 | 4 | -13/+21
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: flush CB after MSAA only when transitioning from CB to textures | Marek Olšák | 2017-06-22 | 2 | -14/+60
  The main flush before texturing is done after the FMASK decompress pass. CB after MSAA rendering is not flushed in set_framebuffer_state and also not in memory_barrier if the current color buffer is MSAA. We fully rely on the FMASK decompress pass for the flushing.

  Some CB decompress and resolve passes need an explicit flush before and after.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: unify CB_RESOLVE blitter invocation code | Marek Olšák | 2017-06-22 | 1 | -17/+18
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: flush DB caches only when transitioning from DB to texturing | Marek Olšák | 2017-06-22 | 5 | -25/+56
  Use the mechanism of si_decompress_textures, but instead of doing the actual decompression, just flag the DB cache flush there. This removes a lot of unnecessary DB cache flushes.

  Reviewed-by: Nicolai Hähnle <[email protected]>
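The idea of flushing only on the transition from depth rendering to texturing can be sketched with a per-texture dirty flag. This is a simplified model and not the radeonsi code: `struct depth_texture`, `struct context`, `render_to_depth()` and `prepare_for_texturing()` are hypothetical names, and the counter merely stands in for the actual flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct depth_texture {
   bool db_dirty;   /* written by depth rendering since the last flush */
};

struct context {
   unsigned num_db_cache_flushes;   /* the kind of counter a HUD could show */
};

/* Depth rendering only records that the DB cache may hold new data. */
static void render_to_depth(struct depth_texture *tex)
{
   assert(tex != NULL);
   tex->db_dirty = true;
}

/* The flush is flagged once, at the point where the texture is about
 * to be sampled, and only if depth rendering happened in between. */
static void prepare_for_texturing(struct context *ctx,
                                  struct depth_texture *tex)
{
   assert(ctx != NULL && tex != NULL);
   if (tex->db_dirty) {
      ctx->num_db_cache_flushes++;
      tex->db_dirty = false;
   }
}
```

In this model, three depth passes followed by two texturing passes cost a single flush, rather than one flush per framebuffer change.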
* radeonsi: add separate HUD counters for CB and DB cache flushes | Marek Olšák | 2017-06-22 | 4 | -10/+20
  Reviewed-by: Nicolai Hähnle <[email protected]>
* etnaviv: fix blend color for RB swapped rendertargets | Lucas Stach | 2017-06-21 | 4 | -14/+45
  Same as with the colormasks, the blend color needs to be swizzled according to the rendertarget format.

  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
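As a rough illustration of the swizzle described in the commit above, the sketch below swaps the red and blue components of an RGBA blend color when the render target stores them in swapped order. The function name `blend_color_for_rt()` and the `rb_swap` flag are hypothetical; how etnaviv actually tracks the swap is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Copy an RGBA blend color, swapping red and blue when the render
 * target stores those two channels in swapped order. */
static void blend_color_for_rt(const float in[4], bool rb_swap, float out[4])
{
   assert(in != out);   /* in-place use would read an overwritten value */
   out[0] = rb_swap ? in[2] : in[0];
   out[1] = in[1];
   out[2] = rb_swap ? in[0] : in[2];
   out[3] = in[3];
}
```

Green and alpha pass through unchanged; only the two channels whose positions differ in the target format are exchanged.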
* nvc0: fix transfer of larger rectangles with DmaCopy on gk104 and up | Ben Skeggs | 2017-06-20 | 1 | -9/+32
  By treating the rectangles as 1cpp, we can run up against some internal copy engine limits and trigger a MEM2MEM_RECT_OUT_OF_BOUNDS error check at launch time.

  This commit enables the REMAP hardware, which allows us to specify both the component size and number of components for a transfer. We're then able to pass in the real width/nblocksx values and not hit the limits.

  There's a couple of "supported" CPPs in the list that we can't actually hit, but are there simply because they're possible.

  Signed-off-by: Ben Skeggs <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
* nvc0: copy engine surface params are only relevant for tiled surfaces | Ben Skeggs | 2017-06-20 | 1 | -18/+19
  Aside from reducing pushbuf usage in some situations, this commit should have no other effect, and is just to make it somewhat obvious that those methods have zero effect on linear surfaces.

  Signed-off-by: Ben Skeggs <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
* swr: Include definition of missing function | George Kyriazis | 2017-06-20 | 1 | -0/+1
  The inline function SWR_MULTISAMPLE_POS::PrecalcSampleData() was missing its definition. Include the definition in core/state_funcs.h.

  Fixes windows build.

  Reviewed-by: Tim Rowley <[email protected]>
* vc4: Clean up release build warnings using MAYBE_UNUSED. | Eric Anholt | 2017-06-20 | 2 | -6/+5
  These variables are all used in an assert(), so release builds see no usages.
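The warning addressed by the commit above arises whenever a variable is read only inside assert(). A minimal sketch follows; the macro definition is an assumption (a GCC/Clang attribute used as a stand-in for Mesa's MAYBE_UNUSED), and `halve_even()` is a hypothetical function made up for illustration.

```c
#include <assert.h>

/* Assumed stand-in for Mesa's MAYBE_UNUSED; relies on the GCC/Clang
 * "unused" attribute. */
#define MAYBE_UNUSED __attribute__((unused))

static int halve_even(int value)
{
   /* Only read inside assert(), so a build with NDEBUG defined sees
    * no uses and would otherwise warn about an unused variable. */
   MAYBE_UNUSED int remainder = value % 2;

   assert(remainder == 0);
   return value / 2;
}
```

The annotation leaves debug builds unchanged and only silences the unused-variable warning in builds where assert() expands to nothing.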
* vc4: Allow VBOs to be mapped during execution. | Eric Anholt | 2017-06-20 | 1 | -1/+1
  There's no reason we can't -- the mappings we expose are basically equivalent to persistent/coherent, already.

  Improves mesa-demos drawoverhead (no state change) performance by 5.21362% +/- 1.25078% (n=11).
* softpipe: remove unused softpipe_context::line_stipple_counter | Brian Paul | 2017-06-20 | 1 | -2/+0
  Trivial.
* radeonsi: set correct usage flag according to image access type | Samuel Pitoiset | 2017-06-20 | 1 | -1/+3
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: update all resident texture descriptors when needed | Samuel Pitoiset | 2017-06-20 | 1 | -57/+104
  To avoid useless DCC fetches when DCC is disabled, descriptors have to be updated in order to reflect this change. This is quite similar to how we update descriptors of bound textures.

  As a side effect, this should also prevent VM faults when bindless textures are invalidated, because the VA in the descriptor has to be updated accordingly as well.

  I don't see any performance improvements with DOW3.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: keep track of the sampler state for texture handles | Samuel Pitoiset | 2017-06-20 | 2 | -0/+2
  Needed for updating all resident texture descriptors when dirty_tex_counter changes.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix dumping shader descriptors into ddebug logs | Marek Olšák | 2017-06-19 | 1 | -35/+41
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a workaround for inexact SNORM8 blitting again | Marek Olšák | 2017-06-19 | 1 | -0/+37
  GFX9 is affected. We only have tests for GL_x_SNORM where x is R8, RG8, RGB8, and RGBA8.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: fix TC-compatible stencil compression | Marek Olšák | 2017-06-19 | 1 | -0/+6
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: fix TXF_LZ with 1D textures | Marek Olšák | 2017-06-19 | 1 | -1/+2
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: disable sparse buffers | Marek Olšák | 2017-06-19 | 1 | -0/+3
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon/gfx9: fix PBO texture uploads to compressed textures | Nicolai Hähnle | 2017-06-19 | 1 | -1/+6
  st/mesa creates a surface that reinterprets the compressed blocks as RGBA16UI or RGBA32UI. We have to adjust width0 & height0 accordingly to avoid out-of-bounds memory accesses by CB.

  Cc: 17.1 <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
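One plausible reading of "adjust width0 & height0 accordingly" in the commit above is that, once each compressed block is treated as a single RGBA16UI or RGBA32UI texel, the surface dimensions must be expressed in blocks rather than pixels. The sketch below shows that arithmetic only; `struct extent`, `div_round_up()` and `extent_in_blocks()` are hypothetical helpers, and the actual adjustment made by the driver may differ.

```c
#include <assert.h>

struct extent {
   unsigned width, height;
};

/* Integer division rounding up, so partial blocks at the edge count. */
static unsigned div_round_up(unsigned n, unsigned d)
{
   assert(d != 0);
   return (n + d - 1) / d;
}

/* Size of a block-compressed image measured in blocks. When each block
 * is reinterpreted as one uint texel, this is the size of the surface. */
static struct extent extent_in_blocks(unsigned width0, unsigned height0,
                                      unsigned block_w, unsigned block_h)
{
   struct extent e;
   e.width = div_round_up(width0, block_w);
   e.height = div_round_up(height0, block_h);
   return e;
}
```

For example, a 10x7 pixel image with 4x4 blocks occupies 3x2 blocks; using 10x7 as the size of the reinterpreted surface would address memory well past the end of the texture.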
* r600: fix off-by-one in egd_tables.py | Nicolai Hähnle | 2017-06-19 | 1 | -1/+1
  Port of the corresponding fix in sid_tables.py.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: reduce overhead for resident textures which need color decompression | Samuel Pitoiset | 2017-06-18 | 4 | -34/+58
  This is done by introducing a separate list. si_decompress_textures() is now 5x faster.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
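The "separate list" optimization in the commit above can be sketched as keeping a second, shorter array alongside the list of all resident textures. This is a simplified model, not the radeonsi data structures: `struct texture`, `struct context`, `make_resident()`, `decompress_resident()` and the fixed `MAX_TEXTURES` capacity are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TEXTURES 64   /* illustrative fixed capacity */

struct texture {
   bool needs_color_decompress;
};

struct context {
   struct texture *resident[MAX_TEXTURES];
   unsigned num_resident;
   /* Subset of resident[] that needs decompression before use. */
   struct texture *to_decompress[MAX_TEXTURES];
   unsigned num_to_decompress;
};

/* Record a texture as resident; also track it in the short list if it
 * is one that needs decompression. */
static void make_resident(struct context *ctx, struct texture *tex)
{
   assert(ctx->num_resident < MAX_TEXTURES);
   ctx->resident[ctx->num_resident++] = tex;
   if (tex->needs_color_decompress)
      ctx->to_decompress[ctx->num_to_decompress++] = tex;
}

/* Walk only the short list instead of every resident texture.
 * Returns how many textures were visited. */
static unsigned decompress_resident(const struct context *ctx)
{
   unsigned visited = 0;
   for (unsigned i = 0; i < ctx->num_to_decompress; i++) {
      /* A real driver would run its decompress pass on
       * ctx->to_decompress[i] here. */
      visited++;
   }
   return visited;
}
```

The per-draw cost then scales with the number of textures that actually need work rather than with the total number of resident handles.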
* radeonsi: reduce overhead for resident textures which need depth decompression | Samuel Pitoiset | 2017-06-18 | 4 | -8/+29
  This is done by introducing a separate list.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use util_dynarray_foreach for bindless resources | Samuel Pitoiset | 2017-06-18 | 2 | -129/+46
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add a new HUD query for the number of resident handles | Samuel Pitoiset | 2017-06-18 | 4 | -0/+12
  Useful for debugging performance issues when ARB_bindless_texture is enabled. This query doesn't make a distinction between texture and image handles.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* r600: include libelf headers only as needed | Emil Velikov | 2017-06-17 | 1 | -0/+2
  The headers are required only when building with OpenCL. When building without it, libelf may be missing, in which case we error out as below:

  src/gallium/drivers/r600/evergreen_compute.c:27:10: fatal error: 'gelf.h' file not found
           ^
  1 error generated.

  Fixes: d96a210842 ("r600g,compute: provide local copy of functions from ac_binary.c")
  Reviewed-by: Jan Vesely <[email protected]>
  Reported-by: Mauro Rossi <[email protected]>
  Tested-by: Mauro Rossi <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
* radeonsi: include ac_binary.h for struct ac_shader_binary | Emil Velikov | 2017-06-17 | 1 | -2/+2
  The header embeds the struct so it needs the header inclusion instead of the dummy forward declaration.

  Cc: Nicolai Hähnle <[email protected]>
  Cc: Marek Olšák <[email protected]>
  Cc: Tom Stellard <[email protected]>
  Fixes: 32206c5e560 ("radeonsi: Add radeon_shader_binary member to struct si_shader")
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Bas Nieuwenhuizen <[email protected]>
* r600, radeon: move radeon_shader_binary_{init,clean} back to radeon | Emil Velikov | 2017-06-17 | 3 | -23/+28
  Those are used by r600 and radeonsi, so moving them within the former was a bad idea.

  Fixes: d96a210842b ("r600g,compute: provide local copy of functions from ac_binary.c")
  Cc: Jan Vesely <[email protected]>
  Cc: Aaron Watry <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Bas Nieuwenhuizen <[email protected]>
* svga: add new num-failed-allocations HUD query | Brian Paul | 2017-06-16 | 5 | -2/+26
  This counter is incremented if we fail to allocate memory for vertex/index/const buffers, textures, etc.

  Reviewed-by: Neha Bhende <[email protected]>