summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* util: move os_time.[ch] to src/utilNicolai Hähnle2017-11-0955-380/+53
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: always use async compiles when creating shader/compute statesNicolai Hähnle2017-11-092-34/+50
| | | | | | | | | | | | | | With Gallium threaded contexts, creating shader/compute states is effectively a screen operation, so we should not use context state. In particular, this allows us to avoid using the context's LLVM TargetMachine. This isn't an issue yet because u_threaded_context filters out non-async debug callbacks, and we disable threaded contexts for debug contexts. However, we may want to change that in the future. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix potential use-after-free of debug callbacksNicolai Hähnle2017-11-091-0/+4
| | | | | | Found by inspection. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move pipe debug callback to si_contextNicolai Hähnle2017-11-096-19/+19
| | | | Reviewed-by: Marek Olšák <[email protected]>
* util: move pipe_barrier into src/util and rename to util_barrierNicolai Hähnle2017-11-094-88/+13
| | | | | | | | | The #if guard is probably not 100% equivalent to the previous PIPE_OS check, but if anything it should be an over-approximation (are there pthread implementations without barriers?), so people will get either a good implementation or compile errors that are easy to fix. Reviewed-by: Marek Olšák <[email protected]>
* gallium: add async debug message forwarding helperNicolai Hähnle2017-11-094-0/+192
| | | | | | v2: use util_vasprintf for Windows portability Reviewed-by: Marek Olšák <[email protected]> (v1)
* gallium: clarify the constraints on sampler_view_destroyNicolai Hähnle2017-11-092-6/+20
| | | | | | | | | | | | | | | | | r600 expects the context that created the sampler view to still be alive (there is a per-context list of sampler views). svga currently bails when the context of destruction is not the same as creation. The GL state tracker, which is the only one that runs into the multi-context subtleties (due to share groups), already guarantees that sampler views are destroyed before their context of creation is destroyed. Most drivers are context-agnostic, so the warning message in pipe_sampler_view_release doesn't really make sense. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: reduce the scope of sel->mutex in si_shader_select_with_keyNicolai Hähnle2017-11-091-4/+4
| | | | | | | | | | We only need the lock to guard changes in the variant linked list. The actual compilation can happen outside the lock, since we use the ready fence as a guard. v2: fix double-unlock Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use ready fences on all shaders, not just optimized onesNicolai Hähnle2017-11-093-26/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a race condition between si_shader_select_with_key and si_bind_XX_shader: Thread 1 Thread 2 -------- -------- si_shader_select_with_key begin compiling the first variant (guarded by sel->mutex) si_bind_XX_shader select first_variant by default as state->current si_shader_select_with_key match state->current and early-out Since thread 2 never takes sel->mutex, it may go on rendering without a PM4 for that shader, for example. The solution taken by this patch is to broaden the scope of shader->optimized_ready to a fence shader->ready that applies to all shaders. This does not hurt the fast path (if anything it makes it faster, because we don't explicitly check is_optimized). It will also allow reducing the scope of sel->mutex locks, but this is deferred to a later commit for better bisectability. Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render Reviewed-by: Marek Olšák <[email protected]>
* r600g: use SIMPLE_FLOAT for blending to enable some optimizationsIlia Mirkin2017-11-082-0/+2
| | | | | | | | | | | Radeonsi also sets this flag. Seems to avoid pulling up the desintation RT value when the dst blend factor is zero if it's not otherwise being loaded. Among other things, it allows blending to overwrite infinity/NaN values in the destination RT. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50: make blending work so that zero wins in a multiplicationIlia Mirkin2017-11-081-0/+5
| | | | | | | This matches nvc0 behavior, tested with the fbo-float-nan piglit. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann<[email protected]>
* amdgpu: use simple mtxTimothy Arceri2017-11-095-44/+45
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: use simple mtx in core mesaTimothy Arceri2017-11-091-11/+11
| | | | | | | | | Results from x11perf -copywinwin10 on Eric's SKL: 4.33338% ± 0.905054% (n=40) Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Yogesh Marathe <[email protected]>
* broadcom/vc5: Add vc5_drm.h to the release tarballAndreas Boll2017-11-081-0/+1
| | | | | | | | | | | Fixes: 45bb8f29571 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.") Cc: 17.3 <[email protected]> Signed-off-by: Andreas Boll <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* clover: use the unified check for c++11 instead of the gcc version numberGert Wollny2017-11-081-3/+3
| | | | | | | | So far clover based its test for compiler support on the version of gcc, while in reality support for c++11 is required. This patch replaces the version check by the check unified for all modules that require c++11. Reviewed-by: Emil Velikov <[email protected]>
* swr: Replace the check for c++11 by the unified versionGert Wollny2017-11-081-2/+2
| | | | Reviewed-by: Emil Velikov <[email protected]>
* targets/opencl: don't hardcode the icd file install to /etc/...Emil Velikov2017-11-081-1/+1
| | | | | | | | | | | | | | | | Use $(sysconfdir) instead of hardcoding /etc. While the OpenCL spec expects the file in /etc, people building their stack can override that, esp. !Linux users. Furthermore this removes a fundamental violation, which results in the system file being overwritten even as one explicitly sets --prefix and/or DESTDIR. Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-By: Aaron Watry <[email protected]>
* gallivm: Use new LLVM fast-math-flags APITobias Droste2017-11-081-0/+4
| | | | | | | | | | | | | LLVM 6 changed the API on the fast-math-flags: https://reviews.llvm.org/rL317488 NOTE: This also enables the new flag 'ApproxFunc' to allow for approximations for library functions (sin, cos, ...). I'm not completly convinced, that this is something mesa should do. Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
* amd/addrlib: update to latest versionMarek Olšák2017-11-081-1/+0
| | | | | | | | | | | | This uses C++11 initializer lists. I just overwrote all Mesa files with internal addrlib and discarded hunks that we should probably keep, but I might have missed something. The code depending on ADDR_AM_BUILD is removed. We can add it back next time if needed. Acked-by: Nicolai Hähnle <[email protected]>
* braodcom/vc5: Flush the job when it grows over 1GB.Eric Anholt2017-11-073-0/+10
| | | | | Fixes GL_OUT_OF_MEMORY from streaming-texture-leak (and will hopefully keep piglit from ooming on my no-swap platform, as well).
* broadcom/vc5: Fix pausing of transform feedback.Eric Anholt2017-11-071-1/+1
| | | | | | Gallium disables it by removing the streamout buffers, not by binding a program that doesn't have TF outputs. Fixes piglit "ext_transform_feedback2/counting with pause"
* broadcom/vc5: Add support for GL_RASTERIZER_DISCARDEric Anholt2017-11-071-0/+2
| | | | Fixes piglit discard-drawarrays.
* broadcom/vc5: Add partial transform feedback query support.Eric Anholt2017-11-073-17/+64
| | | | | | We have to compute the queries in software, so we're counting the primitives by hand. We still need to make sure to not increment the PRIMITIVES_EMITTED if we overflowed, but leave that for later.
* broadcom/vc5: Add occlusion query support.Eric Anholt2017-11-076-20/+121
| | | | Fixes all of piglit's OQ tests.
* broadcom/vc5: Skip emitting textures that aren't used.Eric Anholt2017-11-071-2/+4
| | | | | Fixes crashes when ARB_fp uses texture[1] but not 0, as in piglit's fp-fragment-position.
* broadcom/vc5: Add missing SRGBA8 ETC2 support.Eric Anholt2017-11-071-0/+1
| | | | Fixes piglit oes_compressed_etc2_texture-miptree srgb8-alpha8.
* broadcom/vc5: Disable early Z test when the FS writes Z.Eric Anholt2017-11-071-1/+2
| | | | Fixes piglit early-z.
* broadcom/vc5: Shift the min/max lod fields by the BASE_LEVEL.Eric Anholt2017-11-072-4/+15
| | | | | | | | The lod clamping is what limits you between base and last level, and the base level field is just there to help decide where the min/mag change happens. Fixes tex-miplevel-selection GL2:texture()
* broadcom/vc5: Add support for anisotropic filtering.Eric Anholt2017-11-071-0/+9
|
* broadcom/vc5: Fix mipmap filtering enums.Eric Anholt2017-11-071-6/+8
| | | | | | | | The ordering of the values was even less obvious than I thought, with both the mip filter and the min filter being in different bits depending on whether the mip filter is none. Fixes piglit fs-textureLod-miplevels.shader_test
* broadcom/vc5: Fix height padding of small UIF slices.Eric Anholt2017-11-071-1/+5
| | | | | | | | The HW doesn't pad the slice's height to make a full 4x4 group of UIF blocks. We just need to pad to columns, and the start of the next column appears in the bottom of the previous column's last block. Fixes piglit fs-textureOffset-2D.
* broadcom/vc5: Print the actual offsets in HW for our resource layout debug.Eric Anholt2017-11-071-34/+55
| | | | | The alignment of level 0 is non-obvious, so it's hard to turn a faulting address into a slice without this.
* broadcom/vc5: Set the available VS outputs to match the FS inputs.Eric Anholt2017-11-071-1/+4
| | | | Fixes piglit glsl-es-3.00/minimum-maximums.txt.
* broadcom/vc5: Set the max texture LOD bias.Eric Anholt2017-11-071-1/+1
| | | | | The field is signed 8.8, so the usual 16.0f fits. Fixes piglit gl-2.1-minmax.
* broadcom/vc5: Fix translation of stencil ops.Eric Anholt2017-11-072-8/+30
| | | | | They aren't quite in the same order as the gallium defines. Fixes piglit gl-2.0-two-sided-stencil.
* broadcom/vc5: Move stencil state packing to the CSO.Eric Anholt2017-11-073-27/+47
| | | | Only the stencil ref comes in as dynamic state at emit time.
* broadcom/vc5: Introduce a helper for pre-packing our V3DXX structs.Eric Anholt2017-11-072-165/+155
| | | | | | This is so much more pleasant to write than the manual V3D33_whatever_pack() calls, and will be useful for when we start doing actual per-V3D compiles.
* broadcom/vc5: Add a cl_emit() variant for merging with a pre-packed struct.Eric Anholt2017-11-072-19/+29
| | | | Cleans up the hand-written code, at the cost of another ugly macro.
* broadcom/vc5: Skip emitting depth offset while disabled.Eric Anholt2017-11-071-1/+4
| | | | | The enable flag is also in the rasterizer state, so it will be emitted once it's needed.
* broadcom/vc5: Don't emit stencil config if not doing stencil test.Eric Anholt2017-11-071-1/+2
| | | | | | As with blending, we'll have the bit flagged again when it gets reenabled in CONFIGURATION_BITS, so there's no need to emit test state if we're not testing.
* broadcom/vc5: Don't emit updated blend factors/funcs while disabled.Eric Anholt2017-11-071-1/+5
| | | | | The dirty bit will be flagged again when re-enbaled. Keeps us from emitting blend state in CLs that never do blending.
* broadcom/vc5: Make sure the TMU indirect struct is appropriately aligned.Eric Anholt2017-11-071-0/+2
| | | | | I was hoping that this would help with fbo-generatemipmap hangs, but no luck.
* broadcom/vc5: Use DEPTH24_STENCIL8 for rendering to depth-only textures.Eric Anholt2017-11-071-1/+1
| | | | | | | | | The HW puts the pad bits at the top for DEPTH_COMPONENT24, but we need it at the bottom for texturing. Using the format with stencil probably means we won't be able to do Z24 and separate S8, but I wasn't planning on supporting that anyway. Fixes hiz-depth-read-fbo-d24-s0
* radeonsi: add si_screen::has_ls_vgpr_init_bugMarek Olšák2017-11-074-3/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use ac_create_target_machineMarek Olšák2017-11-071-15/+7
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use ac_get_llvm_processor_nameMarek Olšák2017-11-073-38/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: don't set gs_table_depthMarek Olšák2017-11-071-2/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: limit the scissor bug workaround to Vega10 and Raven onlyMarek Olšák2017-11-071-4/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove unused field in the PCI ID tableMarek Olšák2017-11-071-1/+1
| | | | Reviewed-by: Alex Deucher <[email protected]>
* gallium: Guard assertions by NDEBUG instead of DEBUGMichel Dänzer2017-11-071-1/+1
| | | | | | | This matches the standard assert.h header. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]>