summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi/uvd: clean up si_video_buffer_createBenedikt Schemmer2017-09-301-30/+17
| | | | | | V2: remove code duplication and one unnessecary variable, minor whitespace fix Signed-off-by: Marek Olšák <[email protected]>
* radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bbMarek Olšák2017-09-301-3/+8
| | | | | Tested-by: Benedikt Schemmer <[email protected]> Reviewed-by: Christian König <[email protected]>
* radeonsi: emit DLDEXP and DFRACEXP TGSI opcodesNicolai Hähnle2017-09-292-1/+26
| | | | | | | | | Note: this causes spurious regressions in some current piglit tests, because the tests incorrectly assume that there is no denorm support for doubles. I'm going to send out a fix for those tests as well. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: emit LDEXP opcodeNicolai Hähnle2017-09-292-1/+3
| | | | | | | | The LLVM intrinsic has existed for a long time. The current name was established in LLVM 3.9. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium: add LDEXP TGSI instruction and corresponding capNicolai Hähnle2017-09-2911-0/+15
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* tgsi: infer that dst[1] of DFRACEXP is an integerNicolai Hähnle2017-09-291-1/+1
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallivm: add dst register index to lp_build_tgsi_context::emit_storeNicolai Hähnle2017-09-293-11/+18
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* tgsi: clarify the semantics of DFRACEXPNicolai Hähnle2017-09-291-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The status quo is quite the mess: 1. tgsi_exec will do a per-channel computation, and store the dst[0] result (significand) correctly for each channel. The dst[1] result (exponent) will be written to the first bit set in the writemask. So per-component calculation only works partially. 2. r600 will only do a single computation. It will replicate the exponent but not the significand. 3. The docs pretend that there's per-component calculation, but even get dst[0] and dst[1] confused. 4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions, and kind-of assumes that everything is replicated, generating this for the dvec4 case: DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw Settle on the simplest behavior, which is single-component calculation with replication, document it, and adjust tgsi_exec and r600. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600: cleanup set_occlusion_query_stateNicolai Hähnle2017-09-293-14/+3
| | | | | | | | | | | This fixes a warning caused by the fork (note the change in the function signature): ../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c: In function ‘r600_init_common_state_functions’: ../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c:2974:36: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types] rctx->b.set_occlusion_query_state = r600_set_occlusion_query_state; Reviewed-by: Marek Olšák <[email protected]>
* r300: add missing case PIPE_SHADER_CAP_INT64_ATOMICSNicolai Hähnle2017-09-291-0/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix border color translation for integer texturesNicolai Hähnle2017-09-293-29/+60
| | | | | | | | | | This fixes the extremely unlikely case that an application uses 0x80000000 or 0x3f800000 as border color for an integer texture and helps in the also, but perhaps slightly less, unlikely case that 1 is used as a border color. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: clamp border colors for upgraded depth texturesNicolai Hähnle2017-09-291-59/+60
| | | | | | | | | | | | | The hardware does this automatically for unorm formats, but we need to do it manually for unorm depth formats that have been upgraded to Z32_FLOAT. Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth and others. Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: clamp depth comparison value only for fixed point formatsNicolai Hähnle2017-09-296-14/+53
| | | | | | | | | | | | | | | | | | | The hardware usually does this automatically. However, we upgrade depth to Z32_FLOAT to enable TC-compatible HTILE, which means the hardware no longer clamps the comparison value for us. The only way to tell in the shader whether a clamp is required seems to be to communicate an additional bit in the descriptor table. While VI has some unused bits in the resource descriptor, those bits have unfortunately all been used in gfx9. So we use an unused bit in the sampler state instead. Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f and many other tests in dEQP-GLES3.functional.texture.shadow.* Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: fix geometry shaders without output verticesNicolai Hähnle2017-09-291-3/+5
| | | | | | | | | | | Not that those are super common or useful, but hey! Fun corner cases of the API... Fixes dEQP-GLES31.functional.geometry_shading.emit.* Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: move descriptor logs to after corresponding draw/compute packetNicolai Hähnle2017-09-292-8/+6
| | | | | | | | It has to happen after descriptor uploads since otherwise we'll print out the wrong GPU list / incorrectly claim descriptor corruption. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: remove ac_shader_abi::chip_classNicolai Hähnle2017-09-291-2/+0
| | | | | | | Redundant with the recently added ac_llvm_context::chip_class. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: fix a commentNicolai Hähnle2017-09-291-1/+1
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* svga: add missing PIPE_SHADER_CAP_INT64_ATOMICS switch casesBrian Paul2017-09-281-0/+2
| | | | | | Silences a compiler warning. Reviewed-by: Roland Scheidegger <[email protected]>
* svga: trivial whitespace clean-ups in svga_screen.cBrian Paul2017-09-281-11/+13
|
* svga: start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTIONNeha Bhende2017-09-281-1/+3
| | | | | | | | | | | Since our driver support arb_provoking_vertex, we can start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION Fixes ./clipflat & ./arb-provoking-vertex-render piglit tests Tested piglit, glretrace on Hw 11 and Hw 13 Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* etnaviv: optimize RS transfersLucas Stach2017-09-281-4/+25
| | | | | | | | | | | | | Currently we are blitting the whole resource when the RS is used to de-/tile a resource. This can be very inefficient for large resources where the transfer is only changing a small part of the resource (happens a lot with glTexSubImage2D). Optimize this by only blitting the tile aligned subregion of the resource, which the transfer is going to change. Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]>
* etnaviv: add resource subregion copyLucas Stach2017-09-282-0/+32
| | | | | | | | This is useful if we only need to copy part of a larger resource, mostly when using the RS engine to de-/tile on pipe transfers. Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]>
* etnaviv: support tile aligned RS blitsLucas Stach2017-09-281-8/+78
| | | | | | | | The RS can blit abitrary tile aligned subregions of a resource by adjusting the buffer offset. Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]>
* broadcom/vc4: Fix release buildEric Anholt2017-09-271-1/+1
| | | | | | | | | I remember thinking "gosh, it would be nice if I could do a kernel-style 'if (!IS_ENABLED(DEBUG))' instead of using an #ifdef, so the code was compiled on both builds", and then forgot to test a release build anyway. Fixes: a8fd58eae596 ("vc4: Add labels to BOs for debug builds or with VC4_DEBUG=surf set.") Reported-by: Derek Foreman <[email protected]>
* vc4: Add labels to BOs for debug builds or with VC4_DEBUG=surf set.Eric Anholt2017-09-274-0/+41
| | | | | | | This has proven to be incredibly useful for debugging CMA allocation failures and driving memory management improvements. However, we don't want to burden entry and exit from the BO cache with the labeling ioctl's overhead on release builds.
* gallium/radeon: consolidate PIPE_BIND_SHARED/SCANOUT handlingMarek Olšák2017-09-272-13/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove useless check in si_blit_decompress_color()Samuel Pitoiset2017-09-271-1/+3
| | | | | | | | That's unnecessary to double-check that dcc_offset is not 0 because all callers already check that. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: more use of vi_dcc_formats_are_incompatible()Samuel Pitoiset2017-09-271-2/+1
| | | | | | | Found by inspection. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* swr: Remove unneeeded comparisonGeorge Kyriazis2017-09-261-2/+1
| | | | | | No need to check if screen->pipe != pipe, so we can just assign it. Just do it. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: Handle resource across context changesGeorge Kyriazis2017-09-264-10/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Swr caches fb contents in tiles. Those tiles are stored on a per-context basis. When switching contexts that share resources we need to make sure that the tiles of the old context are being stored and the tiles of the new context are being invalidated (marked as invalid, hence contents need to be reloaded). The context does not get any dirty bits to identify this case. This has to be, then, coordinated by the resources that are being shared between the contexts. Add a "curr_pipe" hook in swr_resource that will allow us to identify a MakeCurrent of the above form during swr_update_derived(). At that time, we invalidate the tiles of the new context. The old context, will need to have already store its tiles by that time, which happens during glFlush(). glFlush() is being called at the beginning of MakeCurrent. So, the sequence of operations is: - At the beginning of glXMakeCurrent(), glFlush() will store the tiles of all bound surfaces of the old context. - After the store, a fence will guarantee that the all tile store make it to the surface - During swr_update_derived(), when we validate the new context, we check all resources to see what changed, and if so, we invalidate the current tiles. Fixes rendering problems with CEI/Ensight. Reviewed-by: Bruce Cherniak <[email protected]>
* broadcom/vc4: Fix infinite retry in vc4_bo_alloc()Boris Brezillon2017-09-261-6/+4
| | | | | | | | | | | | | | | cleared_and_retried is always reset to false when jumping to the retry label, thus leading to an infinite retry loop. Fix that by moving the cleared_and_retried variable definitions at the beginning of the function. While we're at it, move the create variable with the other local variables and explicitly reset its content in the retry path. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Fixes: 78087676c98aa8884ba92 "vc4: Restructure the simulator mode."
* broadcom/vc4: Keep pipe_sampler_view->texture matching the original texture.Eric Anholt2017-09-266-39/+42
| | | | | | | | | | | I was overwriting view->texture with the shadow resource when we need to do shadow copies (retiling or baselevel rebase), but that tripped up some critical new sanity checking in state_tracker (making sure that stObj->pt hasn't changed from view->texture through TexImage-related paths). To avoid that, move the shadow resource to the vc4_sampler_view struct. Fixes: f0ecd36ef8e1 ("st/mesa: add an entirely separate codepath for setting up buffer views")
* svga: silence unused var warning in optimized build with MAYBE_UNUSEDBrian Paul2017-09-261-1/+1
| | | | Trivial
* r600: fork and import gallium/radeonMarek Olšák2017-09-2663-975/+15238
| | | | | | | | | | | This marks the end of code sharing between r600 and radeonsi. It's getting difficult to work on radeonsi without breaking r600. A lot of functions had to be renamed to prevent linker conflicts. There are also minor cleanups. Acked-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* swr/rast: Handle instanceID offset / Instance Stride enableTim Rowley2017-09-251-7/+39
| | | | | | | | | | Supported in JitGatherVertices(); FetchJit::JitLoadVertices() may require similar changes, will need address this if it is determined that this path is still in use. Handle Force Sequential Access in FetchJit::Create. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove code supporting legacy llvm (<3.9)Tim Rowley2017-09-253-105/+15
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix allocation of DS output data for USE_SIMD16_FRONTENDTim Rowley2017-09-251-10/+6
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Slightly more efficient blend jitTim Rowley2017-09-251-20/+10
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Properly sized null GS bufferTim Rowley2017-09-251-1/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Move SWR_GS_CONTEXT from thread local storage to stackTim Rowley2017-09-251-12/+11
| | | | | | | Move structure, as the size is significantly reduced due to dynamic allocation of the GS buffers. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fetch compile state changesTim Rowley2017-09-252-1/+12
| | | | | | | Add ForceSequentialAccessEnable and InstanceIDOffsetEnable bools to FETCH_COMPILE_STATE. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: New GS state/context APITim Rowley2017-09-253-212/+253
| | | | | | | One piglit regression, which was a false pass: [email protected]@execution@geometry@dynamic_input_array_index Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add support for R10G10B10_FLOAT_A2_UNORM pixel formatTim Rowley2017-09-253-17/+28
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* scons: use python3-compatible generatorEric Engestrom2017-09-251-4/+2
| | | | | | | These changes were generated using python's `2to3` tool. Suggested-by: Ilia Mirkin <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
* scons: use python3-compatible print()Eric Engestrom2017-09-253-5/+5
| | | | | | | | | These changes were generated using python's `2to3` tool. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102852 Reported-by: Alex Granni <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* etnaviv: Add missing includes after 6ace0b8Wladimir J. van der Laan2017-09-221-0/+2
| | | | | | | | | | Add missing includes after 6ace0b8 (etnaviv: don't enable RT full-overwrite when logicop is enabled), otherwise the etnaviv driver won't build because of missing macros. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Tested-by: Andres Gomez <[email protected]>
* etnaviv: fix 16bpp clearsLucas Stach2017-09-221-1/+1
| | | | | | | | | | | | | | | util_pack_color may leave undefined values in the upper half of the packed integer. As our hardware needs the upper 16 bits to mirror the lower 16bits, this breaks clears of those formats if the undefined values aren't masked off. I've only observed the issue with R5G6B5_UNORM surfaces, other 16bpp formats seem to work fine. Fixes: d6aa2ba2b2 (etnaviv: replace translate_clear_color with util_pack_color) Cc: [email protected] Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* swr/rast: remove llvm fence/atomics from generated filesTim Rowley2017-09-221-0/+8
| | | | | | | | | | | We currently don't use these instructions, and since their API changed in llvm-5.0 having them in the autogen files broke the mesa release tarballs which ship with generated autogen files. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102847 CC: [email protected] Tested-by: Laurent Carlier <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* etnaviv: don't enable RT full-overwrite when logicop is enabledLucas Stach2017-09-222-6/+14
| | | | | | | | | Logicop is a form of blending with the framebuffer, so we must allow framebuffer reads when logicop is enabled. Fixes: piglit gl-1.0-logicop on GC3000, which has logicop support Signed-off-by: Lucas Stach <[email protected]>
* gallium: Add PIPE_SHADER_CAP_INT64_ATOMICSJan Vesely2017-09-2112-0/+13
| | | | | | | Denotes availability of 64bit int atomic instructions Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Marek Olšák <[email protected]>