aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* r600: fork and import gallium/radeonMarek Olšák2017-09-2663-975/+15238
| | | | | | | | | | | This marks the end of code sharing between r600 and radeonsi. It's getting difficult to work on radeonsi without breaking r600. A lot of functions had to be renamed to prevent linker conflicts. There are also minor cleanups. Acked-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* swr/rast: Handle instanceID offset / Instance Stride enableTim Rowley2017-09-251-7/+39
| | | | | | | | | | Supported in JitGatherVertices(); FetchJit::JitLoadVertices() may require similar changes, will need address this if it is determined that this path is still in use. Handle Force Sequential Access in FetchJit::Create. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove code supporting legacy llvm (<3.9)Tim Rowley2017-09-253-105/+15
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix allocation of DS output data for USE_SIMD16_FRONTENDTim Rowley2017-09-251-10/+6
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Slightly more efficient blend jitTim Rowley2017-09-251-20/+10
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Properly sized null GS bufferTim Rowley2017-09-251-1/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Move SWR_GS_CONTEXT from thread local storage to stackTim Rowley2017-09-251-12/+11
| | | | | | | Move structure, as the size is significantly reduced due to dynamic allocation of the GS buffers. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fetch compile state changesTim Rowley2017-09-252-1/+12
| | | | | | | Add ForceSequentialAccessEnable and InstanceIDOffsetEnable bools to FETCH_COMPILE_STATE. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: New GS state/context APITim Rowley2017-09-253-212/+253
| | | | | | | One piglit regression, which was a false pass: [email protected]@execution@geometry@dynamic_input_array_index Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add support for R10G10B10_FLOAT_A2_UNORM pixel formatTim Rowley2017-09-253-17/+28
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* scons: use python3-compatible generatorEric Engestrom2017-09-251-4/+2
| | | | | | | These changes were generated using python's `2to3` tool. Suggested-by: Ilia Mirkin <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
* scons: use python3-compatible print()Eric Engestrom2017-09-253-5/+5
| | | | | | | | | These changes were generated using python's `2to3` tool. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102852 Reported-by: Alex Granni <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* etnaviv: Add missing includes after 6ace0b8Wladimir J. van der Laan2017-09-221-0/+2
| | | | | | | | | | Add missing includes after 6ace0b8 (etnaviv: don't enable RT full-overwrite when logicop is enabled), otherwise the etnaviv driver won't build because of missing macros. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Tested-by: Andres Gomez <[email protected]>
* etnaviv: fix 16bpp clearsLucas Stach2017-09-221-1/+1
| | | | | | | | | | | | | | | util_pack_color may leave undefined values in the upper half of the packed integer. As our hardware needs the upper 16 bits to mirror the lower 16bits, this breaks clears of those formats if the undefined values aren't masked off. I've only observed the issue with R5G6B5_UNORM surfaces, other 16bpp formats seem to work fine. Fixes: d6aa2ba2b2 (etnaviv: replace translate_clear_color with util_pack_color) Cc: [email protected] Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* swr/rast: remove llvm fence/atomics from generated filesTim Rowley2017-09-221-0/+8
| | | | | | | | | | | We currently don't use these instructions, and since their API changed in llvm-5.0 having them in the autogen files broke the mesa release tarballs which ship with generated autogen files. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102847 CC: [email protected] Tested-by: Laurent Carlier <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* etnaviv: don't enable RT full-overwrite when logicop is enabledLucas Stach2017-09-222-6/+14
| | | | | | | | | Logicop is a form of blending with the framebuffer, so we must allow framebuffer reads when logicop is enabled. Fixes: piglit gl-1.0-logicop on GC3000, which has logicop support Signed-off-by: Lucas Stach <[email protected]>
* gallium: Add PIPE_SHADER_CAP_INT64_ATOMICSJan Vesely2017-09-2112-0/+13
| | | | | | | Denotes availability of 64bit int atomic instructions Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe, gallivm: implement lod queries (LODQ opcode)Roland Scheidegger2017-09-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | This uses all the existing code to calculate lod values for mip linear filtering. Though we'll have to disable the simplifications (if we know some parts of the lod calculation won't actually matter for filtering purposes due to mip clamps etc.). For better or worse, we'll also disable lod calculation hacks (mostly should make a difference for cube maps) always - the issue with per-pixel lod being difficult is mostly because we then have different mipmaps needed for the actual texel fetch, which isn't a problem with lodq. We still use approximation for the log2 - for that reason I believe the float part of the lod is only accurate to about 4-5 bits (and one bit less with 1d textures actually) which is hopefully good enough (though d3d10 technically requires 6 bits - could use quadratic interpolation instead of linear to get 8 bits or so). Since lodq requires unclamped lod, we also have to move some sampler key calculations to texture sampling code - even if we know we're going to access mipmap 0 we still have to calculate lod and apply lod_bias for lodq. Passes piglit ARB_texture_query_lod tests (after having fixed the test). Reviewed-by: Jose Fonseca <[email protected]>
* radeonsi: set MIP_POINT_PRECLAMP to 0Nicolai Hähnle2017-09-201-1/+1
| | | | | | | | | | | | | | | | | This fixes a bug with nearest ("point") mip selection when the fractional part of max_lod is in (0.5,1). In this case, the spec mandates that we still select the mip level ceil(max_lod) in the clamping case. However, MIP_POINT_PRECLAMP will clamp before the mip selection, which is wrong. Supposedly this setting was originally copied from the closed Vulkan driver, but as far as I can tell, closed Vulkan was actually changed back recently :) Fixes dEQP-GLES3.functional.texture.mipmap.2d.max_lod.{nearest,linear}_nearest Fixes: f7420ef5b464 ("radeonsi: enable some sampler fields to match the closed driver") Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: fix array textures layer coordinateNicolai Hähnle2017-09-201-1/+10
| | | | | | | | | Like for cube map (array) gather, we need to round to nearest on <= VI. Fixes tests in dEQP-GLES3.functional.shaders.texture_functions.texture.* Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* etnaviv: move sw query defines to etnaviv_query_sw.hChristian Gmeiner2017-09-202-2/+4
| | | | | | | Also add new define ETNA_SW_QUERY_BASE. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Wladimir J. van der Laan <[email protected]>
* etnaviv: move sw get_driver_query_info(..)Christian Gmeiner2017-09-203-12/+28
| | | | | | | | | This change makes etna_get_driver_query_info(..) more generic and puts the knowledge of supported queries directly besides the implementation. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Wladimir J. van der Laan <[email protected]>
* broadcom/vc4: Fix use-after-free when deleting a program.Eric Anholt2017-09-181-6/+15
| | | | | | | | | | | By leaving the compiled shader in the context's stage state, the next compile of a new FS would look in the old compiled FS for figuring out whether to set various dirty flags for the VS compile. Clear out the pointer when deleting the program, and make sure that we always mark the state as dirty if the previous program had been lost. Fixes valgrind warnings on glsl-max-varyings. Fixes: 2350569a78c6 ("vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far.")
* broadcom/vc4: Fix crashes since the gallium blitter reworks.Eric Anholt2017-09-181-1/+3
| | | | | Even if we're not clearing color, the blitter has started dereferencing the color value.
* broadcom/vc4: Fix use-after-free trying to mix a quad and tile clear.Eric Anholt2017-09-181-24/+32
| | | | | | | | | | | | | The blitter will bind just the depth buffer, which flushes the current job if we had both a color and depth/stencil. If the clear was doing partial depth/stencil (quad-based) and color (tile-based), we'd go on to try to set up the rest of the tile clear in the now flushed job. Instead, move the partial clear up before we start setting up the job for the current FBO state, and re-fetch the job if we're continuing on to a tile-based clear. Fixes valgrind failures in fbo-depthtex. Fixes: 9421a6065c4e ("vc4: Fix fallback to quad clears of depth in GLX.")
* broadcom/vc4: Fix use-after-free for flushing when writing to a texture.Eric Anholt2017-09-181-2/+7
| | | | | | | | I was trying to continue the hash table loop, not the inner loop. This tended to work out, because we would have *just* freed the job struct. Fixes some valgrind failures in fbo-depthtex. Fixes: f597ac396640 ("vc4: Implement job shuffling")
* radeonsi: reallocate if a non-sharable textures is being sharedMarek Olšák2017-09-181-1/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: PIPE_BIND_SHARED should allow inter-process sharingMarek Olšák2017-09-181-5/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* freedreno: compile fixNicolai Hähnle2017-09-181-1/+1
| | | | | Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE") Reported-by: Jan Vesely <[email protected]>
* gallium: Add PIPE_SHADER_CAP_FP16Jan Vesely2017-09-1812-0/+20
| | | | | | | | | Denotes native half precision float operations capability v2: PIPE_CAP_HALFS -> PIPE_SHADER_CAP_FP16 fix indentation Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0: fix compile errorBenedikt Schemmer2017-09-181-1/+1
| | | | | | | Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE") Signed-off-by: Benedikt Schemmer <[email protected]> Previously-pointed-out-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow out-of-order rasterization in commutative blending casesNicolai Hähnle2017-09-185-4/+68
| | | | | | | | | | | | We do not enable this by default for additive blending, since it slightly breaks OpenGL invariance guarantees due to non-determinism. Still, there may be some applications can benefit from white-listing via the radeonsi_commutative_blend_add drirc setting without any real visible artifacts. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: add drirc option "radeonsi_assume_no_z_fights"Nicolai Hähnle2017-09-184-4/+8
| | | | | | | | | | | | | | This option enables a performance optimization where typical non-blending draws with depth buffer may be rasterized out-of-order (on VI+, multi-SE chips). This optimization can lead to incorrect results when an applications renders multiple objects with the same Z value at the same pixel, so we will never enable it by default. But there may be applications that could benefit from white-listing. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: enable out-of-order rasterization when possible on VI and GFX9 dGPUsNicolai Hähnle2017-09-187-6/+193
| | | | | | | | | This does not take commutative blending into account yet. R600_DEBUG=nooutoforder disables it. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium/radeon: pass old_(perfect_)enable to set_occlusion_query_stateNicolai Hähnle2017-09-184-4/+11
| | | | | | | The callee can derive the current enable state itself. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVENicolai Hähnle2017-09-1819-11/+83
| | | | | | | | | | | | | | | | | To be able to properly distinguish between GL_ANY_SAMPLES_PASSED and GL_ANY_SAMPLES_PASSED_CONSERVATIVE. This patch goes through all drivers, having them treat the two query types identically, except: 1. radeon incorrectly enabled conservative mode on PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only on PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE. 2. st/mesa uses the new query type. Fixes dEQP-GLES31.functional.fbo.no_attachments.* Reviewed-by: Marek Olšák <[email protected]>
* amd/common: remove has_ds_bpermute argument from ac_build_ddxyNicolai Hähnle2017-09-183-4/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* amd/common: add chip_class to ac_llvm_contextNicolai Hähnle2017-09-181-1/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* amd/common: round cube array slice in ac_prepare_cube_coordsNicolai Hähnle2017-09-181-0/+1
| | | | | | | | | | | | The NIR-to-LLVM pass already does this; now the same fix covers radeonsi as well. Fixes various tests of dEQP-GLES31.functional.texture.filtering.cube_array.combinations.* Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: workaround for gather4 on integer cube mapsNicolai Hähnle2017-09-181-6/+100
| | | | | | | | | | This is the same workaround that radv already applied in commit 3ece76f03dc0 ("radv/ac: gather4 cube workaround integer"). Fixes dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i/ui.* Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* r600: add .gitignore for egd_tables.hDave Airlie2017-09-151-0/+1
|
* radeonsi: enable STD430 packing of UBOs by defaultTimothy Arceri2017-09-151-1/+1
| | | | | | | Before this change we were defaulting to STD140 which is slightly less efficient at packing arrays. Reviewed-by: Marek Olšák <[email protected]>
* gallium: introduce PIPE_CAP_LOAD_CONSTBUFTimothy Arceri2017-09-1515-0/+15
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: make use of LOAD for UBOsTimothy Arceri2017-09-151-10/+21
| | | | | | v2: always set can_speculate and allow_smem to true Reviewed-by: Marek Olšák <[email protected]>
* virgl: drop const dimensions on first block.Dave Airlie2017-09-151-0/+27
| | | | | | | | | | | | The virgl protocol version of tgsi doesn't handle this yet, transform it back to the old ways. Thanks to Nicolai Hähnle <[email protected]> for also writing nearly the same patch. Fixes: 41e342d5 tgsi/ureg: always emit constants (and their decls) as 2D Tested-by: Rob Herring <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: move si_get_wave_info() to AMD common codeSamuel Pitoiset2017-09-141-93/+3
| | | | | | | | This will allow us to use it from radv. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* swr: use ARRAY_SIZE macroEric Engestrom2017-09-141-4/+6
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* gallium/{r600, radeonsi}: Fix segfault with color format (v2)Denis Pauk2017-09-142-1/+13
| | | | | | | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552 v2: Patch cleanup proposed by Nicolai Hähnle. * deleted changes in si_translate_texformat. Cc: Nicolai Hähnle <[email protected]> Cc: Ilia Mirkin <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: hard-code pixel center for interpolateAtSample without multisample ↵Nicolai Hähnle2017-09-133-1/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | buffers The GLSL rules for interpolateAtSample are unfortunate: "Returns the value of the input interpolant variable at the location of sample number sample. If multisample buffers are not available, the input variable will be evaluated at the center of the pixel. If sample sample does not exist, the position used to interpolate the input variable is undefined." This fix will fallback to monolithic shader compilation when interpolateAtSample is used without multisampling. One alternative would be to always upload 16 sample positions, filling the buffer up with repetition when the actual number of samples is less, and then ANDing the sample ID with 0xf. However, that punishes all well-behaving users of interpolateAtSample, when in reality, only conformance tests should be affected by the issue. Fixes dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.* Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: apply a mask to gl_SampleMaskIn in the PS prologNicolai Hähnle2017-09-133-5/+76
| | | | | | | | | | | | | gl_SampleMaskIn is supposed to contain set bits only for the samples that are covered by the current fragment shader invocation, but the VGPR initialization hardware loads the set of all bits that are covered at the current pixel. Fixes various tests in dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.* Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>