path: root/src/gallium
Commit message | Author | Age | Files | Lines
* etnaviv: only flush resource to self if no scanout buffer exists | Lucas Stach | 2017-06-26 | 1 | -4/+5
| Currently a resource flush may trigger a self resolve, even if a scanout buffer exists, but is up to date. If a scanout buffer exists we only ever want to flush the resource to the scanout buffer.
|
| This fixes a performance regression.
|
| Fixes: dda956340ce9 (etnaviv: resolve tile status when flushing resource)
| Cc: [email protected]
| Signed-off-by: Lucas Stach <[email protected]>
| Reviewed-by: Philipp Zabel <[email protected]>
| Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: add support for snorm textures | Christian Gmeiner | 2017-06-26 | 2 | -3/+7
| Based on a patch from Wladimir J. van der Laan and untested due to lack of hardware. Binary blob emits those formats if GPU supports HALTI1 (faked with ibvivhook).
|
| Signed-off-by: Christian Gmeiner <[email protected]>
| Reviewed-by: Wladimir J. van der Laan <[email protected]>
* etnaviv: add R8G8 texture support | Christian Gmeiner | 2017-06-26 | 1 | -1/+1
| Passes texwrap GL_ARB_texture_rg piglit (with faked full texture rg support).
|
| Signed-off-by: Christian Gmeiner <[email protected]>
| Reviewed-by: Wladimir J. van der Laan <[email protected]>
* etnaviv: add support for swizzled texture formats | Christian Gmeiner | 2017-06-26 | 4 | -39/+99
| Passes all ext_texture_swizzle piglits.
|
| Signed-off-by: Christian Gmeiner <[email protected]>
| Reviewed-By: Wladimir J. van der Laan <[email protected]>
* etnaviv: add support for extended texture formats | Christian Gmeiner | 2017-06-26 | 4 | -4/+10
| Signed-off-by: Christian Gmeiner <[email protected]>
| Reviewed-by: Wladimir J. van der Laan <[email protected]>
* swr: set an explicit clear_rect if scissor is not enabled. | Bruce Cherniak | 2017-06-26 | 1 | -1/+9
| Fix regression of "no rendering" on simple apps like glxgears by setting an explicit full surface clear_rect when scissor is not enabled.
|
| This regressed with commit 00173d91 "st/mesa: don't set 16 scissors and 16 viewports if they're unused" due to an assumption that a default scissor rect is always set, which was the case prior to this optimization.
|
| Reviewed-by: Tim Rowley <[email protected]>
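The fallback described in this commit can be sketched roughly as follows. This is only an illustration of the idea, not the swr driver's code: `struct rect` and `choose_clear_rect` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle type; not the actual swr or gallium struct. */
struct rect {
   int xmin, ymin, xmax, ymax;
};

/*
 * Pick the rectangle a clear should cover. With scissoring disabled there
 * is no guarantee that a default scissor rect was ever set, so fall back
 * to the full surface instead of reading possibly unset scissor state.
 */
static struct rect
choose_clear_rect(bool scissor_enabled, struct rect scissor,
                  int surf_width, int surf_height)
{
   assert(surf_width >= 0 && surf_height >= 0);

   if (scissor_enabled)
      return scissor;

   struct rect full = { 0, 0, surf_width, surf_height };
   return full;
}
```

With scissoring disabled, the scissor argument is ignored entirely, so an all-zero scissor rect no longer produces an empty clear.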
* swr/rast: adjust std::string usage to fix build | Tim Rowley | 2017-06-26 | 1 | -3/+9
| Some combinations of c++ compilers and standard libraries had problems with the string::replace code we were using previously.
|
| This should fix the travis-ci system.
|
| Tested-by: Eric Engestrom <[email protected]>
| Reviewed-by: Bruce Cherniak <[email protected]>
* radeonsi: support indirect indexing in INTERP_* opcodes | Nicolai Hähnle | 2017-06-26 | 1 | -20/+58
| The hardware doesn't support it, so we just interpolate all array elements and then use indirect indexing on the resulting vector.
|
| Clearly, this is not very efficient. There is an argument to be had for adding if/else, or perhaps even pulling the data out of LDS directly. Both don't really seem worth the effort, considering that it seems nobody actually uses this feature.
|
| Reviewed-by: Marek Olšák <[email protected]>
* r600g: fix crash when file in R600_TRACE doesn't exist | Constantine Charlamov | 2017-06-26 | 1 | -4/+5
| …and print error in such case. Which probably is not a rare event btw because fopen doesn't expand ~ to $HOME.
|
| Also get rid of unused "bool ret" variable.
|
| Signed-off-by: Constantine Kharlamov <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
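A minimal sketch of the kind of check this commit describes, assuming a hypothetical helper `read_trace_file` (the real r600g function and its error text are not shown in this log). `fopen()` performs no shell-style expansion, so a path such as `~/trace.txt` is looked up literally and typically fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: open a trace file and report failure instead of
 * dereferencing a NULL FILE pointer. The path is used exactly as given;
 * "~" is not expanded to $HOME.
 */
static bool
read_trace_file(const char *path)
{
   assert(path != NULL);

   FILE *f = fopen(path, "r");
   if (!f) {
      fprintf(stderr, "failed to open trace file '%s'\n", path);
      return false;
   }

   /* ... parse the file here ... */
   fclose(f);
   return true;
}
```

A caller that wants `~` to work would have to expand it before calling, for example by prepending the value of `getenv("HOME")`.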
* r600g: take into account offset to system inputs at tgsi_interp_egcm() | Constantine Charlamov | 2017-06-26 | 2 | -6/+7
| Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=100785
|
| v2: I was too much twiddling whether to initialize nsys_inputs at the beginning of shader initialization or for allocation of system values, and by the time I decided to go with the first one, I forgot to change it back.
|
| Signed-off-by: Constantine Kharlamov <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: get rid of trailing whitespace | Constantine Charlamov | 2017-06-26 | 1 | -22/+22
| Signed-off-by: Constantine Kharlamov <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600/asm: add support for other GDS operations. | Dave Airlie | 2017-06-26 | 3 | -4/+26
| This adds support for the GDS operations needed to do atomic counters.
|
| Signed-off-by: Dave Airlie <[email protected]>
* r600: don't merge GDS into VTX | Dave Airlie | 2017-06-26 | 1 | -2/+3
| We don't want vtx/tex instructions ending up in GDS sections.
|
| Signed-off-by: Dave Airlie <[email protected]>
* r600: for memory instructions dump index gpr for read indirects also. | Dave Airlie | 2017-06-26 | 1 | -1/+2
| This just makes sure we can see the index gpr in the asm dumps.
|
| Signed-off-by: Dave Airlie <[email protected]>
* r600: add support for vertex fetches via texture cache | Dave Airlie | 2017-06-26 | 2 | -2/+20
| On evergreen we can route vertex fetches via the texture cache, and this is required for some images support. So add support to the asm builder for it.
|
| Signed-off-by: Dave Airlie <[email protected]>
* r600: route indirect address register correctly for vtx fetches. | Dave Airlie | 2017-06-26 | 1 | -1/+1
| This was found during writing the images code, we need to make sure we route the correct index register.
|
| Signed-off-by: Dave Airlie <[email protected]>
* gallium/hud: add glthread counters | Marek Olšák | 2017-06-26 | 3 | -0/+91
| Reviewed-by: Timothy Arceri <[email protected]>
* gallium/hud: add API-thread-busy for monitoring the thread load | Marek Olšák | 2017-06-26 | 3 | -4/+22
| Reviewed-by: Timothy Arceri <[email protected]>
* gallium/hud: add hud_pane::hud pointer | Marek Olšák | 2017-06-26 | 2 | -3/+6
| for later use
|
| Reviewed-by: Timothy Arceri <[email protected]>
* mesa/glthread: add glthread "perf" counters and pass them to gallium HUD | Marek Olšák | 2017-06-26 | 5 | -2/+23
| for HUD integration in following commits. This valuable profiling data will allow us to see on the HUD how well glthread is able to utilize parallelism. This is better than benchmarking, because you can see exactly what's happening and you don't have to be CPU-bound.
|
| u_threaded_context has the same counters.
|
| Reviewed-by: Timothy Arceri <[email protected]>
* gallium/hud: move struct hud_context to hud_private.h | Marek Olšák | 2017-06-26 | 2 | -46/+48
| Reviewed-by: Timothy Arceri <[email protected]>
* gallium/hud: rename API-thread-busy to main-thread-busy | Marek Olšák | 2017-06-26 | 3 | -5/+5
| Reviewed-by: Timothy Arceri <[email protected]>
* util: move pipe_thread_is_self from gallium to src/util | Marek Olšák | 2017-06-26 | 2 | -12/+1
| Reviewed-by: Timothy Arceri <[email protected]>
* nv50/ir: Properly fold constants in SPLIT operation | Pierre Moreau | 2017-06-25 | 1 | -3/+4
| Fixes: b7d9677d ("nv50/ir: constant fold OP_SPLIT")
| Cc: [email protected]
| Signed-off-by: Pierre Moreau <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi/gfx9: don't overallocate shader binaries | Marek Olšák | 2017-06-24 | 1 | -6/+0
| It's not needed. The hw doesn't fetch ahead over page boundaries.
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* st/dri2: implement image offset query | Lucas Stach | 2017-06-24 | 1 | -0/+6
| This trivially adds support for the image offset query, which is needed for the zwp_linux_dmabuf based EGL platform wayland implementation.
|
| Signed-off-by: Lucas Stach <[email protected]>
| Reviewed-by: Daniel Stone <[email protected]>
* llvmpipe: initialize default fb correctly in setup | Roland Scheidegger | 2017-06-24 | 1 | -0/+4
| If lp_setup_bind_framebuffer() is never called, then setup fb x1/y1 was not correctly initialized. This can happen if there's never a fb set - both cso and llvmpipe would consider setting this with no cbufs and no zsbuf a redundant change and therefore it would never get set.
|
| We rely on this setup fb rect being initialized correctly for the tri intersect tests, throwing away tris which don't intersect. Not initializing it meant we'd then say it intersected, and we'd try to bin that despite that we have no actual tiles to bin it to, leading to assertion failures (pretty harmless since tile 0/0 always exists nevertheless as tiles are statically allocated, albeit that should change at some point).
|
| (Note probably not an issue with gl state tracker)
|
| Reviewed-by: Jose Fonseca <[email protected]>
* radeonsi: unreference vertex buffers when destroying the context | Marek Olšák | 2017-06-23 | 1 | -0/+2
| Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: implement the workaround for Rocket League - postponed TGSI kill | Marek Olšák | 2017-06-23 | 5 | -0/+37
| Do KILL at the end of shaders so as not to break WQM.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100070
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: pass create_screen flags to r600_common_screen_init | Marek Olšák | 2017-06-23 | 16 | -26/+34
| Reviewed-by: Nicolai Hähnle <[email protected]>
* st/dri: add a drirc workaround for Rocket League | Marek Olšák | 2017-06-23 | 2 | -0/+11
| This needs to be passed to gallium drivers.
|
| No game fix is planned at this time. The addition of glsl_correct_derivatives_after_discard is generally a good thing for mesa compatibility with the broader GL driver ecosystem.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100070
| Reviewed-by: Nicolai Hähnle <[email protected]>
* st/dri: get drirc options before creating pipe_screen | Marek Olšák | 2017-06-23 | 4 | -20/+38
| dri_init_options_get_screen_flags will return the flags for create_screen().
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: allow passing 'unsigned flags' to create_screen() | Marek Olšák | 2017-06-23 | 25 | -64/+65
| for drirc options
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* llvmpipe: fix using 32bit rasterization mistakenly, causing overflows | Roland Scheidegger | 2017-06-23 | 4 | -31/+43
| We use the bounding box (triangle extents) to figure out if 32bit rasterization could potentially overflow. However, we used the bounding box which already got rounded up to 0 for negative coords for this, which is incorrect, leading to overflows and hence bogus rendering in some of our private use.
|
| It might be possible to simplify this somehow (we're now using 3 different boxes for binning) but I don't quite see how.
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
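To illustrate why the clamped box is the wrong input for the overflow check, here is a small sketch. The types, the helper names and the `MAX_32BIT_RAST_EXTENT` threshold are assumptions made for this example and are not taken from llvmpipe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical box type and limit; values are illustrative only. */
struct box {
   int x0, y0, x1, y1;
};

#define MAX_32BIT_RAST_EXTENT 4096  /* assumed threshold, not llvmpipe's */

/* Clamp negative minimum coordinates to 0, as done before binning. */
static struct box
clamp_to_origin(struct box b)
{
   if (b.x0 < 0) b.x0 = 0;
   if (b.y0 < 0) b.y0 = 0;
   return b;
}

/*
 * Decide whether the 32-bit path is safe. This must look at the
 * unclamped extents: clamping hides how far the triangle reaches into
 * negative coordinates and so underestimates its real size.
 */
static bool
fits_32bit_rasterization(struct box unclamped)
{
   assert(unclamped.x0 <= unclamped.x1 && unclamped.y0 <= unclamped.y1);

   return unclamped.x1 - unclamped.x0 <= MAX_32BIT_RAST_EXTENT &&
          unclamped.y1 - unclamped.y0 <= MAX_32BIT_RAST_EXTENT;
}
```

For a triangle spanning x = -10000..100, the unclamped box correctly fails the check, while the box returned by `clamp_to_origin` would pass it, which is the mistake the commit message describes.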
* llvmpipe: fill in debug vertex info for tri rasterization | Roland Scheidegger | 2017-06-23 | 1 | -1/+1
| This is pretty useful for debugging rasterization issues, so turn it on based on DEBUG (the actual existence of the fields is also conditionalized on DEBUG, lines fill it out the same too).
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* Revert "radeonsi: don't emit partial flushes at the end of IBs (v2)" | Marek Olšák | 2017-06-23 | 1 | -9/+5
| This reverts commit c9040dc9e75c81024f88f3f1bab821ad2bc73db3.
|
| People have reported it causes corruption on VI, and I see GPU hangs on GFX9.
* svga: minor whitespace fixes in svga_pipe_vertex.c | Brian Paul | 2017-06-22 | 1 | -6/+10
|
* svga: check return value from svga_set_shader( SVGA3D_SHADERTYPE_GS, NULL) | Brian Paul | 2017-06-22 | 1 | -0/+2
| If the call fails we need to flush the command buffer and retry. In this case, we were failing to unbind the GS which led to subsequent errors.
|
| This fixes a bug replaying a Cinebench R15 apitrace in a Linux guest. VMware bug 1894451
|
| cc: [email protected]
| Reviewed-by: Charmaine Lee <[email protected]>
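The flush-and-retry pattern mentioned in this commit can be sketched generically as follows. The `struct cmdbuf` type and all function names are hypothetical stand-ins, not the svga driver's API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical command buffer with a fixed capacity. */
struct cmdbuf {
   int used;
   int capacity;
   int flushes;
};

/* Try to append one command; fails when the buffer is full. */
static bool
cmdbuf_emit(struct cmdbuf *cb)
{
   if (cb->used >= cb->capacity)
      return false;
   cb->used++;
   return true;
}

/* Submit pending commands and empty the buffer. */
static void
cmdbuf_flush(struct cmdbuf *cb)
{
   cb->used = 0;
   cb->flushes++;
}

/*
 * Emit a command, flushing and retrying once if the first attempt fails.
 * Ignoring the first return value would silently drop the command.
 */
static bool
emit_with_retry(struct cmdbuf *cb)
{
   assert(cb->capacity > 0);

   if (cmdbuf_emit(cb))
      return true;

   cmdbuf_flush(cb);
   return cmdbuf_emit(cb);
}
```

The key point is that the return value of the first attempt is inspected; a caller that drops it never learns that the command (here, the unbind) was not recorded.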
* svga: fix premature flushing of the command buffer | Charmaine Lee | 2017-06-22 | 5 | -5/+24
| When surface_invalidate is called to invalidate a newly created surface in svga_validate_surface_view(), it is possible that the command buffer is already full, and in this case, currently, the associated wddm winsys function will flush the command buffer and resend the invalidate surface command. However, this can prematurely flush the command buffer if there are still pending image updates to be patched.
|
| To fix the problem, this patch will add a return status to the surface_invalidate interface and if it returns FALSE, the caller will call svga_context_flush() to do the proper context flush.
|
| Note, we don't call svga_context_flush() if surface_invalidate() fails when flushing the screen surface cache though, because it is already in the process of context flush, all the image updates are already patched, calling svga_context_flush() can trigger a deadlock. So in this case, we call the winsys context flush interface directly to flush the command buffer.
|
| Fixes driver errors and graphics corruption running Tropics. VMware bug 1891975.
|
| Also tested with MTT glretrace, piglit and various OpenGL apps such as Heaven, CinebenchR15, NobelClinicianViewer, Lightsmark, GoogleEarth.
|
| cc: [email protected]
| Reviewed-by: Brian Paul <[email protected]>
* swr: invalidate attachment on transition change | George Kyriazis | 2017-06-22 | 3 | -0/+28
| Consider the following RT attachment order:
| 1. Attach surfaces attachments 0 & 1, and render with them
| 2. Detach 0 & 1
| 3. Re-attach 0 & 1 to different surfaces
| 4. Render with the new attachment
|
| The definition of a tile being resolved is that local changes have been flushed out to the surface, hence there is no need to reload the tile before it's written to. For an invalid tile, the tile has to be reloaded from the surface before rendering.
|
| Stage (2) was marking hot tiles for attachments 0 & 1 as RESOLVED, which means that the hot tiles can be written out to memory with no need to read them back in (they are "clean"). They need to be marked as resolved here, because a surface may be destroyed after a detach, and we don't want to have un-resolved tiles that may force a readback from a NULL (destroyed) surface. (Part of a destroy is to detach all attachments first.)
|
| Stage (3), during the no att -> att transition, we need to realize that the "new" surface tiles need to be fetched fresh from the new surface, instead of using the resolved tiles, that belong to a stale attachment.
|
| This is done by marking the hot tiles as invalid in stage (3), when we realize that a new attachment is being made, so that they are re-fetched during rendering in stage (4).
|
| Also note that hot tiles are indexed by attachment.
|
| - Fixes VTK dual depth-peeling tests.
| - No piglit changes
|
| Reviewed-by: Tim Rowley <[email protected]>
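The four stages in this commit message can be modelled as a small state machine. The sketch below is a simplified illustration with hypothetical names (`enum tile_state`, `struct attachment_slot`, `slot_*`); it tracks one state per attachment slot rather than per hot tile, and it is not the swr implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tile states, loosely following the commit description. */
enum tile_state {
   TILE_INVALID,   /* must be reloaded from the surface before rendering */
   TILE_DIRTY,     /* holds local changes not yet written to the surface */
   TILE_RESOLVED   /* local changes have been flushed to the surface */
};

struct attachment_slot {
   bool attached;
   enum tile_state tiles;
};

/* Detach: resolve tiles so nothing later reads back from a destroyed surface. */
static void
slot_detach(struct attachment_slot *s)
{
   s->attached = false;
   s->tiles = TILE_RESOLVED;
}

/*
 * Attach: on a "no attachment -> attachment" transition the cached tiles
 * belong to the previous surface, so mark them invalid to force a reload.
 */
static void
slot_attach(struct attachment_slot *s)
{
   if (!s->attached)
      s->tiles = TILE_INVALID;
   s->attached = true;
}

/* Render: report whether a reload was needed, then mark tiles modified. */
static bool
slot_render(struct attachment_slot *s)
{
   assert(s->attached);

   bool reloaded = (s->tiles == TILE_INVALID);
   s->tiles = TILE_DIRTY;
   return reloaded;
}
```

Running attach, render, detach, attach, render through this model yields a reload on the second render, which corresponds to the re-fetch in stage (4).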
* radeonsi/gfx9: enable DCC fast clear | Marek Olšák | 2017-06-22 | 1 | -4/+0
| It seems to work now.
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: don't ever flush the TC metadata cache | Marek Olšák | 2017-06-22 | 1 | -10/+3
| The closed Vulkan driver doesn't do it either.
|
| Also remove some old comments that aren't useful.
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: use TC L2 for fast color clear with CP DMA | Marek Olšák | 2017-06-22 | 1 | -2/+5
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix DCC fast clear for luminance and alpha formats | Marek Olšák | 2017-06-22 | 1 | -1/+10
| I reproduced this bug on Polaris11 and Raven.
|
| I can't get this bug on Fiji. The reason might be that Fiji doesn't use 2D tiling for the test due to higher 2D tiling alignment requirements.
|
| Fixes piglit: spec@ext_framebuffer_object@fbo-fast-clear
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't emit partial flushes at the end of IBs (v2) | Marek Olšák | 2017-06-22 | 1 | -5/+9
| The kernel sort of does the same thing with fences.
|
| v2: do emit partial flushes on SI
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* change va max_entrypoints | Chandu Babu N | 2017-06-22 | 2 | -1/+3
| As encode support is added along with decode, increase max_entrypoints to two. vaMaxNumEntrypoints was returning an incorrect value and causing memory corruption before this commit.
|
| v2: assert when max_entrypoints needs to be bigger
|
| CC: [email protected]
| Reviewed-by: Christian König <[email protected]>
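A rough sketch of why an undersized maximum leads to memory corruption, and of the assertion added in v2. The enum values, `MAX_ENTRYPOINTS` and `query_entrypoints` are hypothetical and do not mirror libva or the st/va state tracker.

```c
#include <assert.h>

/* Hypothetical entrypoint identifiers; not the libva enum values. */
enum entrypoint { ENTRY_DECODE = 1, ENTRY_ENCODE = 2 };

#define MAX_ENTRYPOINTS 2  /* decode + encode */

/*
 * Fill a caller-provided list that was sized from MAX_ENTRYPOINTS. If the
 * advertised maximum is smaller than what gets written, the caller's
 * buffer overflows, so assert that the count stays within the maximum.
 */
static int
query_entrypoints(int can_decode, int can_encode, enum entrypoint *list)
{
   int n = 0;

   if (can_decode)
      list[n++] = ENTRY_DECODE;
   if (can_encode)
      list[n++] = ENTRY_ENCODE;

   assert(n <= MAX_ENTRYPOINTS);
   return n;
}
```

A caller would allocate `MAX_ENTRYPOINTS` elements before calling; with the maximum left at one, the second write would land outside that allocation.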
* st/va: Fix leak in VAAPI subpictures | Chandu Babu N | 2017-06-22 | 1 | -0/+1
| sampler view allocated in vaAssociateSubpicture is not cleared in vaDeassociateSubpicture.
|
| Reviewed-by: Christian König <[email protected]>
* radeonsi: use the correct LLVMTargetMachineRef in si_build_shader_variant | Nicolai Hähnle | 2017-06-22 | 1 | -6/+22
| si_build_shader_variant can actually be called directly from one of the normal-priority compiler threads. In that case, the thread_index is only valid for the normal tm array.
|
| v2:
| - use the correct sel/shader->compiler_ctx_state
|
| Fixes: 86cc8097266c ("radeonsi: use a compiler queue with a low priority for optimized shaders")
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: keep reusing the same buffer/address for the gfx9 flush fence | Marek Olšák | 2017-06-22 | 3 | -8/+28
| instead of using a monotonic suballocator
|
| v2: initialize the memory at context creation
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: enable the constant engine | Marek Olšák | 2017-06-22 | 1 | -4/+1
| I think this kernel commit fixes it:
| drm/amdgpu:use FRAME_CNTL for new GFX ucode
|
| Reviewed-by: Nicolai Hähnle <[email protected]>