path: root/src/gallium
Commit message | Author | Age | Files | Lines
* v3d: Don't forget to bump the number of writes when doing TFU ops. | Eric Anholt | 2018-12-14 | 1 | -0/+2
    generatemipmap is just filling out the rest of the mipmap that's already been written (by a mapping or a draw call), so it didn't matter. As I reuse the TFU code for linear-to-UIF conversions, it'll start mattering.
* v3d: Set up the right stride for raster TFU. | Eric Anholt | 2018-12-14 | 1 | -1/+1
    I didn't have any raster images in the generatemipmap path, so the pixels-vs-bytes mixup didn't matter here.
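The pixels-vs-bytes distinction mentioned above can be sketched as follows. This is only an illustration, not the driver's code: `stride_bytes_to_pixels` and its parameters are hypothetical names, and this log does not say which unit the TFU stride field actually expects.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a row stride given in bytes into a stride
 * given in pixels, where cpp is the number of bytes per pixel.
 * Assumes the byte stride is a whole multiple of cpp. */
static uint32_t
stride_bytes_to_pixels(uint32_t stride_bytes, uint32_t cpp)
{
        assert(cpp != 0 && stride_bytes % cpp == 0);
        return stride_bytes / cpp;
}
```

For a 64-pixel-wide row at 4 bytes per pixel, the byte stride is 256 and `stride_bytes_to_pixels(256, 4)` yields 64; the two units differ by a factor of `cpp`.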
* v3d: Don't forget to wait for our TFU job before rendering from it. | Eric Anholt | 2018-12-14 | 1 | -0/+8
    Otherwise we may race to read old contents. This didn't show up in the CTS and piglit for me, but it did once I started using the TFU to do linear->UIF blits for X11.
    Fixes: 2ebca177dc18 ("v3d: Use the TFU to do generatemipmap.")
* nvc0: always keep TSC slot 0 bound to fix TXF | Ilia Mirkin | 2018-12-14 | 2 | -0/+21
    Same as on nv50, the TXF op always uses the TSC bound to slot 0, returning blank values if nothing is bound. An earlier change arranges for the TSC entries list to always have valid data at entry 0, so here we just make use of it.
    Fixes arb_texture_buffer_object-subdata-sync among others.
    Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: replace use of explicit default_tsc with entry 0 | Ilia Mirkin | 2018-12-14 | 6 | -22/+25
    This was used for implementing FBFETCH. However that uses TXF, which doesn't do much with a TSC. The only important bit is that sRGB-decoding works as expected, which we can achieve since all samplers we ever generate enable sRGB-decoding.
    Always point to entry 0 in the TSC table, and ensure that even before it ever gets initialized, the sRGB-decoding enable bit is set.
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a6xx: fix corrupted uniforms | Rob Clark | 2018-12-14 | 1 | -1/+2
    For older gens, fd_wfi() is used to conditionally insert a WFI if there hasn't already been one since the last draw. But this doesn't work out well with stateobj, since the order in which the stateobj is evaluated might not be what you expect. (I.e., the stateobj might not be evaluated until a later draw if there is no geometry from the current draw in a given tile.)
    Signed-off-by: Rob Clark <[email protected]>
* etnaviv: drop redundant ctx function parameter | Christian Gmeiner | 2018-12-14 | 1 | -4/+3
    There is no need to have an extra ctx parameter, as the other parameters carry all the needed information.
    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Lucas Stach <[email protected]>
* freedreno/a6xx: fix resource_copy_region() | Rob Clark | 2018-12-13 | 1 | -9/+24
    pctx->resource_copy_region() needs to fall back to sw copy for non-renderable formats. But previously, for things that we could not use the blitter for, we would fall back to 3d, which won't work if 3d can't render to the dst format either.
    Instead, rework things to fall back to fd_resource_copy_region(), which will try the 3d core and then fall back to memcpy().
    Fixes (for example) dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: move fd_resource_copy_region() | Rob Clark | 2018-12-13 | 3 | -62/+73
    Code-motion prep for next patch.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: more blitter fixes | Rob Clark | 2018-12-13 | 1 | -10/+22
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2018-12-13 | 8 | -30/+39
    Signed-off-by: Rob Clark <[email protected]>
* gallium/aux: add is_unorm() helper | Rob Clark | 2018-12-13 | 2 | -0/+24
    We already had one for is_snorm() but not unorm.
    Signed-off-by: Rob Clark <[email protected]>
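As a rough sketch of what a paired unorm/snorm check might look like: the types and names below (`chan_desc`, `fmt_is_unorm`, `fmt_is_snorm`) are hypothetical simplifications for illustration, not gallium's actual format-description layout or the helper added by this commit.

```c
#include <stdbool.h>

/* Hypothetical, simplified channel description. */
enum chan_type { CHAN_VOID, CHAN_UNSIGNED, CHAN_SIGNED, CHAN_FLOAT };

struct chan_desc {
        enum chan_type type;
        bool normalized;
};

/* Return the index of the first non-void channel, or -1 if there is none. */
static int
first_non_void(const struct chan_desc *chan, int n)
{
        for (int i = 0; i < n; i++)
                if (chan[i].type != CHAN_VOID)
                        return i;
        return -1;
}

/* Unsigned normalized: first real channel is unsigned and normalized. */
static bool
fmt_is_unorm(const struct chan_desc *chan, int n)
{
        int i = first_non_void(chan, n);
        return i >= 0 && chan[i].type == CHAN_UNSIGNED && chan[i].normalized;
}

/* Signed normalized: first real channel is signed and normalized. */
static bool
fmt_is_snorm(const struct chan_desc *chan, int n)
{
        int i = first_non_void(chan, n);
        return i >= 0 && chan[i].type == CHAN_SIGNED && chan[i].normalized;
}
```

The sketch classifies a format by its first non-void channel only; that is a simplifying assumption and would misclassify mixed formats.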
* freedreno/a6xx: fix blitter crash | Rob Clark | 2018-12-13 | 1 | -0/+17
    Fixes a crash with unsupported formats in dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot
    Also fixes gpu hangs with some formats that are supported, but for which we don't know what internal format to use for the blitter, e.g. dEQP-GLES3.functional.texture.format.sized.2d_array.rgb10_a2_pot
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: also set DUMP flag on shaders | Rob Clark | 2018-12-13 | 5 | -20/+22
    If we emit a shader as a pointer to a GEM object, also set the RELOC_DUMP flag as a hint to the kernel that this is a useful buffer to snapshot for debug dumps.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: debug GEM obj names | Rob Clark | 2018-12-13 | 8 | -17/+19
    With a recent enough kernel, set debug names for GEM BOs, which will show up in $debugfs/gem
    Signed-off-by: Rob Clark <[email protected]>
* mesa/st: Expose compute shaders when NIR support is advertised. | Eric Anholt | 2018-12-13 | 1 | -1/+2
    We have a NIR path, and V3D doesn't have TGSI input for compute (only what TTN can handle for the various gallium-internal shaders).
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* virgl: work around bad assumptions in virglrenderer | Erik Faye-Lund | 2018-12-13 | 1 | -1/+32
    Virglrenderer does the wrong thing when given an instance divisor; it tries to use the element-index rather than the binding-index as the argument to glVertexBindingDivisor(). This worked fine as long as there was a 1:1 relationship between elements and bindings, which was the case until 19a91841c34 "st/mesa: Use Array._DrawVAO in st_atom_array.c.".
    So let's detect instance divisors, and restore a 1:1 relationship in that case. This will make old versions of virglrenderer behave correctly. For newer versions, we can consider making a better interface, where the instance divisor isn't specified per element, but rather per binding. But let's save that for another day.
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Fixes: 19a91841c34 "st/mesa: Use Array._DrawVAO in st_atom_array.c."
    Reviewed-by: Mathias Fröhlich <[email protected]>
    Tested-By: Gert Wollny <[email protected]>
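One way to picture "detect instance divisors, and restore a 1:1 relationship" is the sketch below. It is an assumption-laden illustration, not the code from this commit: `vertex_element`, `remap_one_to_one`, `binding_map`, and `MAX_ELEMENTS` are hypothetical names, and the struct carries only the two fields the idea needs.

```c
#include <stdbool.h>

#define MAX_ELEMENTS 32

/* Hypothetical, cut-down vertex element. */
struct vertex_element {
        unsigned vertex_buffer_index; /* binding the element reads from */
        unsigned instance_divisor;    /* 0 means per-vertex data */
};

/* If any element uses an instance divisor, give every element its own
 * binding (element i reads binding i) and record in binding_map[i] which
 * original binding that slot stands for. Returns true if a remap happened.
 * The caller must ensure n <= MAX_ELEMENTS. */
static bool
remap_one_to_one(struct vertex_element *elems, unsigned n,
                 unsigned binding_map[MAX_ELEMENTS])
{
        bool has_divisor = false;

        for (unsigned i = 0; i < n; i++)
                if (elems[i].instance_divisor)
                        has_divisor = true;
        if (!has_divisor)
                return false;

        for (unsigned i = 0; i < n; i++) {
                binding_map[i] = elems[i].vertex_buffer_index;
                elems[i].vertex_buffer_index = i;
        }
        return true;
}
```

After a remap, element index and binding index coincide, so a consumer that mistakenly passes the element index where the binding index belongs still addresses the intended binding. The cost is that buffers shared by several elements must be bound once per element, using `binding_map` to find the original buffer.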
* virgl: wrap vertex element state in a struct | Erik Faye-Lund | 2018-12-13 | 2 | -9/+21
    This just has one member for now; the handle. But this is about to change.
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Mathias Fröhlich <[email protected]>
    Tested-By: Gert Wollny <[email protected]>
* virgl: simplify virgl_hw_set_index_buffer | Erik Faye-Lund | 2018-12-13 | 1 | -3/+2
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Mathias Fröhlich <[email protected]>
    Tested-By: Gert Wollny <[email protected]>
* virgl: simplify virgl_hw_set_vertex_buffers | Erik Faye-Lund | 2018-12-13 | 1 | -4/+2
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Mathias Fröhlich <[email protected]>
    Tested-By: Gert Wollny <[email protected]>
* meson: libfreedreno depends upon libdrm (for fence support) | Rhys Kidd | 2018-12-12 | 1 | -3/+1
    Error message building freedreno Gallium driver with meson:
    ../src/gallium/drivers/freedreno/freedreno_fence.c:27:21: fatal error: libsync.h: No such file or directory
    #include <libsync.h>
    Fixes: 4aa69cc4257 ("meson: build freedreno")
    Signed-off-by: Rhys Kidd <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* virgl: force linear texturing support | Erik Faye-Lund | 2018-12-12 | 1 | -2/+3
    When I made sure that half-float texture-filtering was required for ES3, I didn't realize that virgl doesn't report support for this correctly. This regressed the GLES version available on top of several drivers, including i965, from 3.2 to 2.0.
    This is going to need protocol changes to fix properly, so let's just restore the previous behavior by enabling floating-point filtering unconditionally for now.
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Fixes: fcf9fcee3c8 "mesa/main: do not require float-texture filtering for es3"
    Reviewed-by: Gurchetan Singh <[email protected]>
* gallivm: remove unused float coord wrapping for aos sampling | Roland Scheidegger | 2018-12-12 | 1 | -507/+23
    AoS sampling tries to use integers for coord wrapping when possible, as it should be faster. However, for AVX, this was suboptimal, because only floats can use 8x32bit vectors, whereas integers have to be split into 4x32bit vectors. (I believe part of why it was slower was also that at least earlier llvm versions had trouble optimizing it properly, since you can still do simple bit ops with 8x32bit vectors, so a sequence of int add / and / int add / and with such vectors would actually end up doing 128bit inserts/extracts between the operations instead of just doing the cheap 128bit ands.)
    Hence, a special float coord wrapping path was added to AoS sampling. But this path was actually disabled for a long time already, since we found that just splitting everything before entering the AoS path was usually still slightly faster, so none of this float coord wrapping code was used anymore (AoS sampling code, when avx2 isn't supported, never sees vectors with length > 4).
    I thought it might be useful some day again, but I'm not interested anymore in optimizing for very weird instruction sets which have support for 256bit vectors for floats but not for ints, so just drop it.
    Reviewed-by: Jose Fonseca <[email protected]>
* nv50/ir: fix use-after-free in ConstantFolding::visit | Karol Herbst | 2018-12-09 | 1 | -33/+49
    opnd() might delete the passed-in instruction, but it's used through i->srcExists() later in visit
    v2: use continue instead of return
    v3: use brackets for the outer if/else chain
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
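The general hazard described above (a helper may delete the node the loop is standing on, so the loop must not touch it afterwards) can be sketched in plain C. This is a generic illustration under assumed names (`insn`, `try_fold`, `visit_all`, `push`), not the nv50 IR code, which is C++.

```c
#include <stdbool.h>
#include <stdlib.h>

struct insn {
        struct insn *next;
        int op;
};

/* Hypothetical helper: prepend a node with the given op to the list. */
static struct insn *
push(struct insn *head, int op)
{
        struct insn *i = malloc(sizeof(*i));
        if (!i)
                abort();
        i->next = head;
        i->op = op;
        return i;
}

/* Hypothetical folding step: deletes the node at *link when its op is 0.
 * Returns true if the node was unlinked and freed. */
static bool
try_fold(struct insn **link)
{
        struct insn *i = *link;
        if (i->op != 0)
                return false;
        *link = i->next;
        free(i);
        return true;
}

/* Walk the list, summing ops of surviving nodes. After a deletion the
 * loop uses `continue`, so the freed node is never dereferenced. */
static int
visit_all(struct insn **head)
{
        int sum = 0;
        struct insn **link = head;

        while (*link) {
                if (try_fold(link))
                        continue; /* *link already names the successor */
                sum += (*link)->op; /* safe: node is still alive */
                link = &(*link)->next;
        }
        return sum;
}
```

The helper reports deletion through its return value, and the caller skips every later use of the node in that iteration instead of returning from the whole walk.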
* nouveau: use atomic operations for driver statistics | Karol Herbst | 2018-12-09 | 1 | -3/+4
    multiple threads can write to those at the same time
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
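A minimal sketch of counters that several threads may bump concurrently, using C11 `<stdatomic.h>` as a stand-in for whatever atomic helper the driver actually uses. The struct and function names (`driver_stats`, `stats_add`, `stats_read`) and the field names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics block; field names are illustrative only. */
struct driver_stats {
        atomic_uint_fast64_t buf_obj_count;
        atomic_uint_fast64_t buf_obj_bytes;
};

/* Add n to a counter as one indivisible read-modify-write, so that
 * concurrent increments from different threads are not lost. */
static void
stats_add(atomic_uint_fast64_t *counter, uint64_t n)
{
        atomic_fetch_add_explicit(counter, n, memory_order_relaxed);
}

/* Read the current value of a counter. */
static uint64_t
stats_read(atomic_uint_fast64_t *counter)
{
        return atomic_load_explicit(counter, memory_order_relaxed);
}
```

Relaxed ordering is chosen here on the assumption that the counters are pure statistics and nothing else is synchronized through them; a plain `counter += n` on a non-atomic variable would instead be a data race.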
* nv50/ir: initialize relDegree staticly | Karol Herbst | 2018-12-09 | 1 | -7/+16
    this race condition is pretty harmless, but also pretty trivial to fix
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* shader-packing | Eric Anholt | 2018-12-07 | 1 | -1/+2
* tfu | Eric Anholt | 2018-12-07 | 1 | -1/+1
* vc4: Fix a leak of the transfer helper on screen destroy. | Eric Anholt | 2018-12-07 | 1 | -0/+3
    Fixes: d009463a6549 ("vc4: Switch to using u_transfer_helper for MSAA maps.")
* v3d: Fix a leak of the transfer helper on screen destroy. | Eric Anholt | 2018-12-07 | 1 | -0/+2
    Fixes: 7a30517cce8f ("broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT.")
* v3d: Add VIR dumping of TMU config p0/p1. | Eric Anholt | 2018-12-07 | 1 | -14/+6
    I had a bit of it for V3D 3.x, but didn't update it for 4.x.
* v3d: Garbage collect unused uniforms code. | Eric Anholt | 2018-12-07 | 1 | -88/+0
* v3d: Split most of TEXTURE_SHADER_STATE setup out of sampler views. | Eric Anholt | 2018-12-07 | 1 | -58/+69
    For shader image load/store, we want most of this logic to be shared.
* v3d: Avoid confusing auto-indenting in TEXTURE_SHADER_STATE packing | Eric Anholt | 2018-12-07 | 1 | -4/+4
    Having "v3dx_pack() {" under each #if branch would confuse emacs's indenter.
* v3d: Fix handling of texture first_layer offsets for 3D textures. | Eric Anholt | 2018-12-07 | 1 | -5/+5
    I think this bug predated adding v3d_layer_offset(). Noticed during an unrelated refactor.
* v3d: Return the right gl_SampleMaskIn[] value. | Eric Anholt | 2018-12-07 | 1 | -8/+0
    It's supposed to be the dispatched sample mask for this pixel, not the GL state's sample mask.
* v3d: Don't forget to flush writes to UBOs. | Eric Anholt | 2018-12-07 | 2 | -5/+16
    If someone did TF into a UBO, we might have left the TF job un-flushed at the point of reading.
* v3d: Make an array for frag/vert texture state in the context. | Eric Anholt | 2018-12-07 | 7 | -42/+21
    This simplifies a bunch of our texture handling, while introducing the slots necessary for adding new shader stages.
* v3d: Put default vertex attribute values into the state uploader as well. | Eric Anholt | 2018-12-07 | 3 | -8/+12
    The default attributes are long-lived (the state struct is cached), and only 256 bytes each.
* v3d: Create a state uploader for packing our shaders together. | Eric Anholt | 2018-12-07 | 4 | -13/+35
    Shaders are usually quite short, and are private to the context. We can save memory and reduce the work the kernel needs to do at exec time by packing them together in a stream uploader for long-lived state.
* v3d: Update simulator cache flushing code to match the kernel better. | Eric Anholt | 2018-12-07 | 1 | -13/+19
    We were missing the invalidate between bin and render (possibly relevant for SSBOs), and still trying to flush the nonexistent L2C on 3.3+.
* v3d: Use the TFU to do generatemipmap. | Eric Anholt | 2018-12-07 | 7 | -1/+175
    This is a separate, dedicated hardware unit for texture layout conversions and mipmap generation.
* v3d: Add the V3D TFU submit interface to the simulator. | Eric Anholt | 2018-12-07 | 3 | -20/+90
    The TFU lets us format raster and SAND images into formats that can be read by the texture engine, and do mipmap generation.
    The UAPI comes from drm-next e69aa5f9b97f ("Merge tag 'drm-misc-next-2018-12-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")
* v3d: Use combined input/output segments. | Eric Anholt | 2018-12-07 | 1 | -4/+7
    The HW apparently has some issues (or at least a much more complicated VCM calculation) with non-combined segments, and the closed source driver also uses combined I/O. Until I get the last CTS failure resolved (which does look plausibly like some VPM stomping), let's use combined I/O too.
* v3d: Add missing OES_half_float_linear support. | Eric Anholt | 2018-12-07 | 1 | -0/+1
    We were exposing ARB_texture_float, but apparently not the OES subset flag. Fixes regression from GLES3 support to GLES2.
    Fixes: fcf9fcee3c8a ("mesa/main: do not require float-texture filtering for es3")
* v3d: Add support for RGBA_SRGB along with BGRA_SRGB. | Eric Anholt | 2018-12-07 | 1 | -0/+2
    This is the actual native format for the hardware, without swizzling. Noticed while debugging why GLES3 disappeared.
* freedreno/ir3: track max flow control depth for a5xx/a6xx | Rob Clark | 2018-12-07 | 2 | -4/+4
    Rather than just hard-coding BRANCHSTACK size.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: sync instr/disasm | Rob Clark | 2018-12-07 | 1 | -1/+1
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: blitter fixes | Rob Clark | 2018-12-07 | 2 | -3/+80
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2018-12-07 | 7 | -35/+56
    Signed-off-by: Rob Clark <[email protected]>