summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* vc4: Fix vc4_fence_server_sync() on pre-syncobj kernels.Eric Anholt2018-08-071-1/+2
| | | | | | | | | We won't have an FD if we're just having the server wait on a fence created by eglCreateSyncKHR(). Our seqno fences will happen in order, so server-side waits are no-ops in that case. Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.buffers.gen_delete Fixes: b0acc3a5628c ("broadcom/vc4: Native fence fd support")
* vc4: Ignore samplers for finding uniform offsets.Eric Anholt2018-08-071-3/+14
| | | | | | | | | | Fixes: dEQP-GLES2.shaders.struct.uniform.sampler_array_fragment dEQP-GLES2.shaders.struct.uniform.sampler_array_vertex dEQP-GLES2.shaders.struct.uniform.sampler_nested_fragment dEQP-GLES2.shaders.struct.uniform.sampler_nested_vertex Cc: [email protected]
* vc4: Extend dumping of uniforms in QIR and in the command stream.Eric Anholt2018-08-073-13/+68
| | | | Similar to what I did for V3D, provide some description of the uniforms.
* vc4: Pull uinfo->data[i] dereference out to the top of the loop.Eric Anholt2018-08-071-20/+18
| | | | | | Reduces the size of vc4_uniforms.o by about 10%. We would basically always end up loading the cachline of uinfo->data[i] anyway, so it should be good for performance as well as making the code a bit cleaner.
* vc4: Make sure to emit a tile coordinates between two MSAA loads.Eric Anholt2018-08-071-12/+11
| | | | | | | | | | The HW only executes a load once the tile coordinates packet happens, and only tracks one at a time, so by emitting our two MSAA loads back to back we would end up with an undefined color or Z buffer. The simulator doesn't seem to care, but sync up the RCL generation with the kernel anyway. Fixes dEQP-EGL.functional.render.multi_context.gles2.rgb888_window
* vc4: Respect a sampler view's first_layer field.Eric Anholt2018-08-071-1/+3
| | | | | | | Fixes texturing from EGL images created from cubemap faces, as in dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture Cc: [email protected]
* virgl: add ARB_shader_clock supportDave Airlie2018-08-082-1/+3
| | | | Reviewed-by: Erik Faye-Lund <[email protected]>
* radeonsi: set GLC=1 for all write-only shader resourcesMarek Olšák2018-08-071-2/+19
|
* radeonsi: don't load block dimensions into SGPRs if they are not variableMarek Olšák2018-08-073-7/+7
|
* swr: don't export swr_create_screen_internalEmil Velikov2018-08-072-2/+1
| | | | | | | | | | | | | | | | With earlier rework the user and provider of the symbol are within the same binary. Thus there's no point in exporting the function. Spotted while reviewing patch from Chuck, that nearly added another unneeded PUBLIC function. Cc: Chuck Atkins <[email protected]> Cc: Tim Rowley <[email protected]> Fixes: f50aa21456d "(swr: build driver proper separate from rasterizer") Signed-off-by: Emil Velikov <[email protected]> Tested-by: Chuck Atkins <[email protected]> Reviewed-By: George Kyriazis <[email protected]<mailto:[email protected]>> Tested-by: Chuck Atkins <[email protected]<mailto:[email protected]>>
* virgl: update virgl_hw.h from virglrendererErik Faye-Lund2018-08-071-1/+26
| | | | | | | | This just makes sure we're currently up-to-date with what virglrenderer has. Signed-off-by: Erik Faye-Lund <[email protected]> Acked-by: Dave Airlie <[email protected]>
* virgl: rename msaa_sample_positions -> sample_locationsErik Faye-Lund2018-08-072-5/+5
| | | | | | | | | | | | This matches what this field is called in virglrenderer's copy of this. This reduces the diff between the two different versions of virgl_hw.h, and should make it easier to upgrade the file in the future. Signed-off-by: Erik Faye-Lund <[email protected]> Acked-by: Dave Airlie <[email protected]>
* vc4: Fix a leak of the no-vertex-elements workaround BO.Eric Anholt2018-08-061-0/+2
| | | | Fixes: bd1925562ad1 ("vc4: Convert the driver to emitting the shader record using pack macros.")
* vc4: Fix context creation when syncobjs aren't supported.Eric Anholt2018-08-061-2/+6
| | | | | | Noticed when trying to run current Mesa on rpi's downstream kernel. Fixes: b0acc3a5628c ("broadcom/vc4: Native fence fd support")
* v3d: Emit the VCM_CACHE_SIZE packet.Eric Anholt2018-08-062-0/+9
| | | | | | | This is needed to ensure that we don't get blocked waiting for VPM space with bin/render overlapping. Cc: "18.2" <[email protected]>
* v3d: Drop "VC5" from the renderer string.Eric Anholt2018-08-061-1/+1
| | | | VC5 isn't a useful name any more, just stick to v3d.
* nvc0/ir: return 0 in imageLoad on incomplete texturesKarol Herbst2018-08-042-3/+31
| | | | | | | | | | | | | | We already guarded all OP_SULDP against out of bound accesses, but we ended up just reusing whatever value was stored in the dest registers. Fixes CTS test shader_image_load_store.incomplete_textures v2: fix for loads not ending up with predicates (bindless_texture) v3: fix replacing the def Cc: <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* gm200/ir: optimize rcp(sqrt) to rsqKarol Herbst2018-08-041-1/+10
| | | | | | | | | | | | | | | | mitigates hurt shaders after adding sqrt: total instructions in shared programs : 5456166 -> 5454825 (-0.02%) total gprs used in shared programs : 647522 -> 647551 (0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58288696 -> 58274448 (-0.02%) local shared gpr inst bytes helped 0 0 0 516 516 hurt 0 0 27 2 2 Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* gm200/ir: add native OP_SQRT supportKarol Herbst2018-08-044-2/+14
| | | | | | | | | | | | | | | | | | | | | ./GpuTest /test=pixmark_piano 1024x640 30sec: 301 -> 327 points shader-db: total instructions in shared programs : 5472103 -> 5456166 (-0.29%) total gprs used in shared programs : 647530 -> 647522 (-0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58459304 -> 58288696 (-0.29%) local shared gpr inst bytes helped 0 0 27 8281 8281 hurt 0 0 21 431 431 v2: use NVISA_GM200_CHIPSET Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* radeonsi: cosmetic changesMarek Olšák2018-08-045-6/+5
|
* amd: remove support for LLVM 5.0Marek Olšák2018-08-033-36/+3
| | | | | | Users are encouraged to switch to LLVM 6.0 released in March 2018. Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: add new R600_DEBUG test "testclearbufperf"Darren Powell2018-08-028-11/+170
| | | | | Signed-off-by: Darren Powell <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* swr: Remove unnecessary memset callVlad Golovkin2018-08-021-1/+0
| | | | | | | | | | Zeroing memory after calloc is not necessary. This also allows to avoid possible crash when allocation fails, because memset is called before checking screen for NULL. Fixes: a29d63ecf71546c4798c6 "swr: refactor swr_create_screen to allow for proper cleanup on error" Reviewed-by: Eric Engestrom <[email protected]>
* ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2)Marek Olšák2018-08-014-9/+43
| | | | | | | | To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 finish sooner on the older CPUs. (otherwise it gets killed and we fail the test) Acked-by: Dave Airlie <[email protected]>
* vc4: Fix automake linking error.Juan A. Suarez Romero2018-08-011-0/+9
| | | | | | | | | | | | | | | CXXLD gallium_dri.la ../../../../src/gallium/drivers/vc4/.libs/libvc4.a(vc4_cl_dump.o): In function `vc4_dump_cl': src/gallium/drivers/vc4/vc4_cl_dump.c:45: undefined reference to `clif_dump_init' src/gallium/drivers/vc4/vc4_cl_dump.c:82: undefined reference to `clif_dump_destroy' ../../../../src/broadcom/cle/.libs/libbroadcom_cle.a(cle_libbroadcom_cle_la-v3d_decoder.o): In function `v3d_field_iterator_next': src/broadcom/cle/v3d_decoder.c:902: undefined reference to `clif_lookup_bo' Fixes: e92959c4e0 ("v3d: Pass the whole clif_dump structure to v3d_print_group().") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107423 CC: Eric Anholt <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Andres Gomez <[email protected]>
* python: Use the unicode_escape codecMathieu Bridon2018-08-011-1/+1
| | | | | | | | | | | | Python 2 had string_escape and unicode_escape codecs. Python 3 only has the latter. These work the same as far as we're concerned, so let's use the future-proof one. However, the reste of the code expects unicode strings, so we need to decode them again. Signed-off-by: Mathieu Bridon <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* virgl: enable FBFETCH if virglrenderer supports itErik Faye-Lund2018-08-012-1/+3
| | | | | | | | | | | | | | | | | This fixes the following dEQP-GLES31 cases from NotSupported to Pass for me: - dEQP-GLES31.functional.blend_equation_advanced.state_query.* - dEQP-GLES31.functional.blend_equation_advanced.basic.* - dEQP-GLES31.functional.blend_equation_advanced.srgb.* - dEQP-GLES31.functional.blend_equation_advanced.msaa.* - dEQP-GLES31.functional.blend_equation_advanced.barrier.* - dEQP-GLES31.functional.draw_buffers_indexed.overwrite_*advanced_blend_eq* - dEQP-GLES31.functional.state_query.indexed.blend_equation_advanced_* - dEQP-GLES31.functional.debug.negative_coverage.*.advanced_blend.* Signed-off-by: Erik Faye-Lund <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* virgl: add texture_barrier stubErik Faye-Lund2018-08-011-0/+7
| | | | | | | | | | | | | | | | In gallium, supporting FBFETCH means supporting non-coherent fetches, but in virglrenderer, due to technical reasons this is backed by coherent fetches instead. This means we don't need to do anything for the barriers. However, if we don't have a texture_barrier implementation, we get crashes because the non-coherent extensions is exposed. So, let's leave this as a NOP for now. [airlied: I've got a more complete impl of this somewhere, once we land the host side]. Reviewed-by: Dave Airlie <[email protected]> Signed-off-by: Erik Faye-Lund <[email protected]>
* virgl: enable robustness if the host exposes itDave Airlie2018-08-012-1/+3
| | | | Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: Support ARB_framebuffer_no_attachmentsDave Airlie2018-08-014-1/+23
| | | | | | This uses new protocol to send the default sizes to the host. Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: add initial ARB_compute_shader supportDave Airlie2018-08-017-7/+153
| | | | | | This hooks up compute shader creation and launch grid support. Reviewed-by: Gurchetan Singh <[email protected]>
* radeonsi: report supported EQAA combinations from is_format_supportedMarek Olšák2018-07-311-16/+20
| | | | | | Framebuffer without attachments now supports 16 samples. Tested-by: Dieter Nützel <[email protected]>
* radeonsi: use storage_samples instead of color_samples in most placesMarek Olšák2018-07-316-41/+26
| | | | | | | and use pipe_resource::nr_storage_samples instead of r600_texture::num_color_samples. Tested-by: Dieter Nützel <[email protected]>
* gallium: add storage_sample_count parameter into is_format_supportedMarek Olšák2018-07-3131-6/+101
| | | | Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_FRAMEBUFFER_MSAA_CONSTRAINTSMarek Olšák2018-07-3116-0/+17
| | | | Tested-by: Dieter Nützel <[email protected]>
* virgl: also mark sampler views as dirtyGurchetan Singh2018-08-011-1/+2
| | | | | | | | | | | | | | | | When texture buffers are used as images in compute shaders, the guest never sees the modified data since the TBO is always marked as clean. Fixes most dEQP-GLES31.functional.image_load_store.buffer.* tests. Example test cases: dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui dEQP-GLES31.functional.image_load_store.buffer.qualifiers.coherent_r32f dEQP-GLES31.functional.image_load_store.buffer.format_reinterpret.rgba8_rgba8ui Note: virglrenderer side patch also needed to bind TBOs correctly Reviewed-by: Dave Airlie <[email protected]>
* virgl: add memory barrier supportDave Airlie2018-08-015-0/+29
| | | | Reviwed-by: Gert Wollny <[email protected]>
* virgl: add TXQS supportDave Airlie2018-08-012-1/+3
| | | | Reviwed-by: Gert Wollny <[email protected]>
* virgl: add initial images support (v2)Dave Airlie2018-08-018-0/+105
| | | | | | v2: add max image samples support Reviwed-by: Gert Wollny <[email protected]>
* etnaviv: fix typo in query namesChristian Gmeiner2018-07-311-2/+2
| | | | | | | Fixes: d0bed0b4944d ("etnaviv: support HI performance counters") Cc: [email protected] Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Chris Healy <[email protected]>
* v3d: Include commands to run the BCL and RCL in CLIF dumps.Eric Anholt2018-07-301-10/+1
|
* v3d: Rename "configuration" and "config" in the XML to "cfg"Eric Anholt2018-07-304-30/+33
| | | | | | This matches what CLIF parsing expects, and makes TILE_BINNING_MODE_CONFIGURATION_COMMON_CONFIGURATION into a much more legible TILE_BINNING_MODE_CFG_COMMON.
* v3d: s/colour/color in the XML.Eric Anholt2018-07-303-20/+20
| | | | | | The CLIF format expects american english spelling, and the rest of Mesa is too. I was previously adhering to the spec's spelling, which is counterproductive.
* v3d: Rename primitives to prims in the XML to match CLIF names.Eric Anholt2018-07-302-5/+5
| | | | This makes us match up with the V3D HW team's names a bit more.
* v3d: Add a separate flag for CLIF ABI output versus human-readable CLs.Eric Anholt2018-07-302-3/+4
| | | | | | A few of the upcoming changes would make the V3D_DEBUG=cl output less readable, so let's make proper CLIF file production be under a separate V3D_DEBUG=clif flag.
* v3d: Add pack header support for f187 values.Eric Anholt2018-07-302-15/+5
| | | | | | V3D only has one of these (the top 16 bits of a float32) left in its CLs, but VC4 had many more. This gets us proper pretty-printing of the values instead of a large uint.
* v3d: Move depth offset packet setup to CSO creation time.Eric Anholt2018-07-304-33/+34
| | | | | This should be some simpler memcpying at draw time, and makes the next change easier.
* r600: reduce num compute threads to 1024.Dave Airlie2018-07-311-1/+1
| | | | | | | | | | I copied this value from radeonsi, but it was wrong, 1024 seems to be correct answer from looking at gpuinfo. This should fix a few compute shader related hangs. (at least in CTS) Cc: <[email protected]> (airlied: pushed because it avoids hangs)
* freedreno/a5xx: fix txf_msRob Clark2018-07-303-0/+12
| | | | | | Somehow this got lost from the initial MSAA patch. Signed-off-by: Rob Clark <[email protected]>
* nvc0: serialize before updating some constant buffer bindings on Maxwell+Rhys Perry2018-07-304-47/+81
| | | | | | | | | | | | | | | | | To avoid serializing, this has the user constant buffer always be 65536 bytes and enabled unless it's required that something else is used for constant buffer 0. Fixes artifacts with at least XCOM: Enemy Within, 0 A.D. and Unigine Valley, Heaven and Superposition. v2: changed uniform_buffer_bound to be bool instead of a uint32_t v3: remove magic constants v3: remove pointless code in nvc0_validate_driverconst Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100177 Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>