path: root/src/gallium
Commit message | Author | Age | Files | Lines
* haiku: build fixes around debug defines | Jerome Duval | 2017-02-24 | 1 | -2/+2
* swr: fix index buffers with non-zero indices | George Kyriazis | 2017-02-23 | 5 | -6/+62
    Fix an issue with index buffers that do not contain a 0 index. Index 0 can be an invalid index if the (copied) vertex buffers are a subset of the user's (which happens because we only copy the range between min & max). Core will use an index passed in from the driver to replace invalid indices.

    Only do this for calls that contain non-zero indices, to minimize performance cost.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: add fetch shader cache | George Kyriazis | 2017-02-23 | 6 | -15/+50
    For now, the cache key is all of FETCH_COMPILE_STATE.

    Use new/delete for swr_vertex_element_state, since we have to call the constructors/destructors of the struct elements.

    Reviewed-by: Bruce Cherniak <[email protected]>
* st/wgl: flush with ST_FLUSH_WAIT before releasing shared contexts | Charmaine Lee | 2017-02-18 | 2 | -2/+15
    Before releasing a shared context, flush the context with ST_FLUSH_WAIT to make sure all commands are executed. This ensures that rendering to any shared resources is completed before they will be referenced by another context.

    Fixes an intermittent flickering with Photoshop. (VMware bug# 1779340)

    Reviewed-by: Brian Paul <[email protected]>
* st: add ST_FLUSH_WAIT to st_context_flush() | Charmaine Lee | 2017-02-18 | 1 | -0/+1
    When st_context_flush() is called with ST_FLUSH_WAIT, the function will return after the fence is completed.

    Reviewed-by: Brian Paul <[email protected]>
* radeon: fix r600 builds when old version of llvm is present | Timothy Arceri | 2017-02-23 | 1 | -2/+2
    Reviewed-by: Edward O'Callaghan <[email protected]>
* r600/radeonsi: enable glsl/tgsi on-disk cache | Timothy Arceri | 2017-02-23 | 2 | -0/+46
    For gpu generations that use LLVM we create a timestamp string containing both the LLVM and Mesa build times, otherwise we just use the Mesa build time.

    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug/rbug/trace: add get_disk_shader_cache() to pass-throughs | Timothy Arceri | 2017-02-23 | 3 | -0/+39
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add get_disk_shader_cache() callback | Timothy Arceri | 2017-02-23 | 2 | -0/+19
    V2: Provide more detail in callback description and add description to screen.rst

    Reviewed-by: Nicolai Hähnle <[email protected]>
* Revert "st/vdpau: Fix multithreading" | Thomas Hellstrom | 2017-02-22 | 1 | -21/+1
    This reverts commit f1e5dfbe3c8951a6c8acf41bf5e6c2d090098b2c.

    For a detailed discussion see https://lists.freedesktop.org/archives/mesa-dev/2017-February/145283.html

    Acked-by: Christian König <[email protected]>
    Signed-off-by: Thomas Hellstrom <[email protected]>
* vl: u_upload_alloc might fail to allocate buffer in bicubic filter | Nayan Deshmukh | 2017-02-22 | 1 | -3/+5
    Signed-off-by: Nayan Deshmukh <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* gallium: reorder fields in pipe_draw_info | Marek Olšák | 2017-02-22 | 1 | -23/+26
    sizeof(struct pipe_draw_info) = 104 -> 88

    Also, vertices_per_patch is switched to ubyte, because it can't be more than 32.

    Seemed-reasonable-to: Roland Scheidegger
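The size reduction reported above comes from alignment padding. A minimal sketch of the effect, assuming two hypothetical structs whose member names are only loosely modelled on pipe_draw_info (the real field list and order are not reproduced here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: small members interleaved with pointers and 32-bit
 * integers, so the compiler has to insert padding before each wider member. */
struct draw_info_interleaved {
    bool indexed;
    void *indirect;
    uint8_t mode;
    uint32_t start;
    bool primitive_restart;
    uint32_t count;
    uint8_t vertices_per_patch;
    void *count_from_stream_output;
};

/* Same members, sorted from widest to narrowest alignment. */
struct draw_info_sorted {
    void *indirect;
    void *count_from_stream_output;
    uint32_t start;
    uint32_t count;
    uint8_t mode;
    uint8_t vertices_per_patch;
    bool indexed;
    bool primitive_restart;
};

/* Bytes saved by the reordering on the current ABI. */
size_t draw_info_bytes_saved(void)
{
    assert(sizeof(struct draw_info_sorted) <= sizeof(struct draw_info_interleaved));
    return sizeof(struct draw_info_interleaved) - sizeof(struct draw_info_sorted);
}
```

The exact sizes depend on the target ABI, so the sketch only reports the difference rather than asserting specific numbers.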
* gallium/hud: handle a thread switch for API-thread-busy monitoring | Marek Olšák | 2017-02-22 | 1 | -4/+10
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/hud: prevent an infinite loop | Marek Olšák | 2017-02-22 | 1 | -2/+3
    v2: use UINT64_MAX / 11

    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: isolate util_queue_fence implementation | Marek Olšák | 2017-02-22 | 6 | -26/+30
    It's cleaner this way.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: fix random crashes when the app calls exit() | Marek Olšák | 2017-02-22 | 2 | -2/+78
    This fixes:
        vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI && "Pass ID not registered!"' failed.

    v2: use list_head, switch the call order in destroy

    Cc: 13.0 17.0 <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/vl: Simplify the matrix filter fragment shader | Thomas Hellstrom | 2017-02-22 | 1 | -40/+16
    It looks like it was partly copied from the median filter fragment shader and unnecessarily saved a lot of temporary values.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* st/vdpau: Fix multithreading | Thomas Hellstrom | 2017-02-22 | 1 | -1/+21
    The vdpau state tracker allows multiple threads access to the same gallium context simultaneously. We can fix this either by locking the same mutex each time the context is used or by using a different gallium context for each mutex domain. Here we do the latter, although I'm not sure that's really the best option.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Acked-by: Sinclair Yeh <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* gallium/vl: Parameter substitution in the csc matrix computation | Thomas Hellstrom | 2017-02-22 | 1 | -12/+17
    Makes the code significantly more readable.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Sinclair Yeh <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* gallium/vl: Simplify usage of full range matrices | Thomas Hellstrom | 2017-02-22 | 1 | -38/+17
    When looking at the full range matrices, it becomes obvious that the difference between the standard matrices and the full range matrices is that the full range matrices are multiplied by 1.164. Together with offsetting the y value with -16/255, this will scale and offset RGB with the desired quantities.

    However, the standard SMPTE 240M matrix seems to differ a bit since the U and V coefficients are only multiplied with 1.138 to get the full range matrix. This would actually alter the color somewhat so I figure that's an error. The full range matrix is consistent with Nvidia's VDPAU implementation.

    We can also incorporate the ybias in the brightness, simplifying the calculation somewhat.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Sinclair Yeh <[email protected]>
    Reviewed-by: Christian König <[email protected]>
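The scale-and-offset relationship described above can be sketched numerically. This is a minimal illustration, not the vl_csc code: the function name, the `struct rgb` type and the textbook BT.601 coefficients are assumptions, and, following the commit message, every term is scaled by the same 1.164 factor.

```c
#include <assert.h>

struct rgb { float r, g, b; };

/* Convert one studio-range Y'CbCr sample (each component normalised to
 * [0, 1]) to RGB. Hypothetical helper: the coefficients are the textbook
 * BT.601 values, not copied from Mesa's tables. */
struct rgb ycbcr_studio_to_rgb(float y, float cb, float cr)
{
    const float scale = 1.164f;            /* factor named in the commit message */
    const float ybias = -16.0f / 255.0f;   /* luma offset named in the commit message */
    const float cbias = -128.0f / 255.0f;  /* assumption: centre chroma on zero */

    assert(y >= 0.0f && y <= 1.0f);

    /* "Standard" matrix applied to biased inputs, then scaled as a whole. */
    float yy = scale * (y + ybias);
    float u = scale * (cb + cbias);
    float v = scale * (cr + cbias);

    struct rgb out;
    out.r = yy + 1.402f * v;
    out.g = yy - 0.344136f * u - 0.714136f * v;
    out.b = yy + 1.772f * u;
    return out;
}
```

With neutral chroma, luma 16/255 maps to RGB 0 and luma 235/255 maps to about 1, since 1.164 is approximately 255/219.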
* gallium/vl Fix brightness matrix description | Thomas Hellstrom | 2017-02-22 | 1 | -4/+4
    The brightness matrix doesn't actually match the procamp matrix and what's calculated in vl_csc_get_matrix.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Sinclair Yeh <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* gallium/vl: Don't map vertex buffers on creation | Thomas Hellstrom | 2017-02-22 | 1 | -1/+0
    It will cause multiple simultaneous maps of the same vertex buffer and flushed-while-mapped warnings.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* gallium/vl: Add sampler views to video filter fragment shaders | Thomas Hellstrom | 2017-02-22 | 3 | -0/+15
    Needed for at least the svga driver.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* gallium/vl: declare sampler views in compositor shaders | Thomas Hellstrom | 2017-02-22 | 1 | -5/+32
    The svga driver relies on the existence of these sampler views.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* gallium/docs: use imgmath instead of pngmath | Eric Engestrom | 2017-02-22 | 1 | -1/+1
    WARNING: sphinx.ext.pngmath has been deprecated. Please use sphinx.ext.imgmath instead.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/docs: fix section title formatting | Eric Engestrom | 2017-02-22 | 1 | -2/+2
    src/gallium/docs/source/tgsi.rst:3488: WARNING: Title underline too short.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/docs: add missing newlines | Eric Engestrom | 2017-02-22 | 1 | -0/+33
    Without these, mathjax treats these lines as the continuation of the previous line.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/docs: add missing math formatting | Eric Engestrom | 2017-02-22 | 1 | -0/+4
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/docs: fix sublist formatting | Eric Engestrom | 2017-02-22 | 1 | -0/+2
    src/gallium/docs/source/context.rst:95: ERROR: Unexpected indentation.

    Sub lists need to be surrounded by a blank line.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi: fix issues with monolithic shaders | Marek Olšák | 2017-02-21 | 1 | -1/+2
    R600_DEBUG=mono has had no effect since:

        commit 1fabb297177069e95ec1bb7053acb32f8ec3e092
        Author: Marek Olšák <[email protected]>
        Date: Tue Feb 14 22:08:32 2017 +0100

            radeonsi: have separate LS and ES main shader parts in the shader selector

    Also, this assertion was failing:
        si_state_shaders.c:1307: si_shader_select_with_key: Assertion `!shader->is_optimized' failed.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set no-signed-zeros-fp-math | Marek Olšák | 2017-02-21 | 2 | -1/+5
    Recommended by Matt Arsenault.

    46757 shaders in 28742 tests
    Totals:
    SGPRS: 2068851 -> 2066907 (-0.09 %)
    VGPRS: 1604056 -> 1602676 (-0.09 %)
    Spilled SGPRs: 1402 -> 1382 (-1.43 %)
    Spilled VGPRs: 113 -> 113 (0.00 %)
    Private memory VGPRs: 1332 -> 1332 (0.00 %)
    Scratch size: 3224 -> 3188 (-1.12 %) dwords per thread
    Code Size: 58815520 -> 58716788 (-0.17 %) bytes
    LDS: 1162 -> 1162 (0.00 %) blocks
    Max Waves: 354616 -> 354905 (0.08 %)
    Wait states: 0 -> 0 (0.00 %)

    Totals from affected shaders:
    SGPRS: 786452 -> 784508 (-0.25 %)
    VGPRS: 530000 -> 528620 (-0.26 %)
    Spilled SGPRs: 958 -> 938 (-2.09 %)
    Spilled VGPRs: 85 -> 85 (0.00 %)
    Private memory VGPRs: 636 -> 636 (0.00 %)
    Scratch size: 1880 -> 1844 (-1.91 %) dwords per thread
    Code Size: 26349936 -> 26251204 (-0.37 %) bytes
    LDS: 304 -> 304 (0.00 %) blocks
    Max Waves: 108962 -> 109251 (0.27 %)
    Wait states: 0 -> 0 (0.00 %)

    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallivm: add no-signed-zeros-fp-math option to lp_create_builder (v2) | Marek Olšák | 2017-02-21 | 3 | -5/+24
    v2: define lp_float_mode

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: skip TESSINNER/OUTER offchip stores if TES doesn't read them | Marek Olšák | 2017-02-21 | 3 | -15/+77
    We were unconditionally storing these outputs, sometimes even one component at a time, but apps never read them in TES.

    Move the TESSINNER/OUTER buffer stores into the TCS epilog where we can easily disable them on demand.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: skip LDS stores in TCS if there are no LDS output reads | Marek Olšák | 2017-02-21 | 1 | -1/+16
    This removes a lot of useless LDS stores.

    A few games read TESSINNER/OUTER, but not any other outputs. Most games don't read any outputs. The only app doing LDS output reads is UE4 Lightsroom Interior.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: add basic info about tessellation OUT and IN uses | Marek Olšák | 2017-02-21 | 2 | -0/+34
    Not all of them will be used immediately.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* etnaviv: remove number of pixel pipes validation | Christian Gmeiner | 2017-02-21 | 1 | -10/+0
    This validation was added before the etnaviv drm driver landed in the linux kernel. Due to some pre-merge API changes we had to fix up this value, but with a mainline kernel this is not a problem anymore.

    Let's remove that validation, which also gets rid of a problem caught by Coverity, reported to me by imirkin.

    Cc: "17.0" <[email protected]>
    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* etnaviv: move pctx initialisation to avoid a null dereference | Christian Gmeiner | 2017-02-21 | 1 | -6/+6
    In case ctx->stream == NULL the fail label gets executed where pctx gets dereferenced - too bad pctx is NULL in that case.

    Caught by Coverity, reported to me by imirkin.

    Cc: "17.0" <[email protected]>
    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
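The ordering problem described above is a common pitfall with `goto fail` cleanup paths. A minimal sketch of the fixed ordering, assuming hypothetical `pipe_ctx` and `my_ctx` types and a fake allocation-failure switch (none of these names come from the etnaviv driver):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's context types. */
struct pipe_ctx {
    void (*destroy)(struct pipe_ctx *pctx);
};

struct my_ctx {
    struct pipe_ctx base;   /* first member, so the cast in my_destroy is valid */
    void *stream;
};

static int destroy_calls;   /* instrumentation for the sketch only */

static void my_destroy(struct pipe_ctx *pctx)
{
    assert(pctx != NULL);
    struct my_ctx *ctx = (struct my_ctx *)pctx;
    free(ctx->stream);      /* free(NULL) is a no-op */
    free(ctx);
    destroy_calls++;
}

struct pipe_ctx *my_ctx_create(bool stream_alloc_fails)
{
    struct my_ctx *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;

    /* Initialise pctx BEFORE the first statement that can jump to fail,
     * so that the cleanup path never dereferences a NULL pointer. */
    struct pipe_ctx *pctx = &ctx->base;
    pctx->destroy = my_destroy;

    ctx->stream = stream_alloc_fails ? NULL : malloc(64);
    if (!ctx->stream)
        goto fail;

    return pctx;

fail:
    pctx->destroy(pctx);
    return NULL;
}
```

If the `pctx` assignment were placed after the stream check instead, the `fail` path would call through a pointer that had not been set yet, which is the situation the commit message describes.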
* etnaviv: add missing fallthrough annotation | Christian Gmeiner | 2017-02-21 | 1 | -0/+1
    Caught by Coverity, reported to me by imirkin.

    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* gallium: do not #include foo.h within extern C {} | Emil Velikov | 2017-02-21 | 1 | -2/+2
    Analogous to previous commit.

    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIK | Nicolai Hähnle | 2017-02-21 | 6 | -19/+43
    The same PS epilog workaround as for 8-bit integer formats is required, since the CB doesn't do clamping.

    Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.

    Cc: [email protected]
    Reviewed-by: Marek Olšák <[email protected]>
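The arithmetic behind such a clamp can be sketched on the CPU. This is only an illustration of the value ranges, assuming a 10-10-10-2 channel layout; the helper names are hypothetical and the driver applies its workaround in the generated pixel shader epilog, not in C code like this.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an unsigned value to what fits in `bits` bits, e.g. 1023 for 10 bits. */
uint32_t clamp_uint_bits(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    uint32_t max = (1u << bits) - 1u;
    return value > max ? max : value;
}

/* Clamp a signed value to the two's-complement range of `bits` bits,
 * e.g. [-512, 511] for 10 bits. */
int32_t clamp_sint_bits(int32_t value, unsigned bits)
{
    assert(bits >= 2 && bits <= 31);
    int32_t max = (int32_t)((1u << (bits - 1)) - 1u);
    int32_t min = -max - 1;
    if (value > max)
        return max;
    if (value < min)
        return min;
    return value;
}
```

Under the assumed layout, the colour channels would use `bits = 10` and the alpha channel `bits = 2`.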
* radeonsi: handle MultiDrawIndirect in si_get_draw_start_count | Nicolai Hähnle | 2017-02-21 | 1 | -7/+53
    Also handle the GL_ARB_indirect_parameters case where the count itself is in a buffer.

    Use transfers rather than mapping the buffers directly. This anticipates the possibility that the buffers are sparse (once ARB_sparse_buffer is implemented), in which case they cannot be mapped directly.

    Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type on <= CIK.

    v2:
    - unmap the indirect buffer correctly
    - handle the corner case where we have indirect draws, but all of them have count 0.

    Cc: [email protected]
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Edward O'Callaghan <[email protected]>
* winsys/amdgpu: reduce max_alloc_size based on GTT limits | Nicolai Hähnle | 2017-02-21 | 1 | -2/+4
    Allocating huge buffers in VRAM is not a problem, but when those buffers start being migrated, the kernel runs into errors because it cannot split those buffers up for moving through GTT.

    This should fix intermittent failures of GL45-CTS.texture_buffer.texture_buffer_max_size

    Cc: [email protected]
    Reviewed-by: Marek Olšák <[email protected]>
* nvc0: use PascalB for most Pascal boards | Ben Skeggs | 2017-02-21 | 2 | -1/+9
    Signed-off-by: Ben Skeggs <[email protected]>
* r300g: only allow byteswapped formats on big endian | Grazvydas Ignotas | 2017-02-21 | 1 | -0/+5
    They cause regressions on little endian.

    Fixes: 172bfdaa9e ("r300g: add support for PIPE_FORMAT_x8R8G8B8_*")
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98869
    Signed-off-by: Grazvydas Ignotas <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* gallivm: Reenable PPC VSX (v3) | Ben Crocker | 2017-02-20 | 1 | -1/+13
    Reenable the PPC64LE Vector-Scalar Extension for LLVM versions >= 3.8.1, now that LLVM bug 26775 and its corollary, 25503, are fixed.

    Amendment: remove extraneous spaces in macro def & invocations.

    We would prefer a runtime check, e.g. via an LLVMQueryString (analogous to glGetString, eglQueryString) or LLVMGetVersion API, but no such API exists at this time.

    Signed-off-by: Ben Crocker <[email protected]>
    [Emil Velikov: remove LLVM_VERSION macro]
    Signed-off-by: Emil Velikov <[email protected]>
* gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4) | Ben Crocker | 2017-02-20 | 1 | -0/+13
    If llvm::sys::getHostCPUName() returns "generic", override it with "pwr8" (on PPC64LE).

    This is a work-around for a bug in LLVM: a table entry for "POWER8NVL" is missing, resulting in (big-endian) "generic" being returned on little-endian Power8NVL systems. The result is that code that attempts to load the least significant 32 bits of a 64-bit quantity in memory loads the wrong half.

    This omission should be fixed in the next version of LLVM (4.0), but this work-around should be left in place in case some future version of POWER<n> also ends up unrepresented in LLVM's table.

    This workaround fixes failures in the Piglit arb_gpu_shader_fp64 conversion tests on POWER8NVL processors.

    (V4: add similar comment in the code.)

    Signed-off-by: Ben Crocker <[email protected]>
    Cc: 12.0 13.0 17.0 <[email protected]>
    Acked-by: Emil Velikov <[email protected]>
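The override rule described above amounts to a small string substitution. A minimal sketch in plain C, assuming a hypothetical `select_mcpu` helper that receives the detected name and a platform flag as parameters (the real code queries llvm::sys::getHostCPUName() from C++, which this sketch does not call):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return the CPU name to hand to the code generator. On PPC64LE a
 * detected name of "generic" is replaced by "pwr8"; every other name,
 * and every other platform, passes through unchanged. */
const char *select_mcpu(const char *host_cpu_name, bool is_ppc64le)
{
    assert(host_cpu_name != NULL);
    if (is_ppc64le && strcmp(host_cpu_name, "generic") == 0)
        return "pwr8";
    return host_cpu_name;
}
```

Keeping the check keyed on the literal "generic" result, rather than on a specific missing CPU model, matches the commit's stated reason for leaving the work-around in place for future unlisted POWER parts.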
* gallivm: Improve debug output (V2) | Ben Crocker | 2017-02-20 | 2 | -1/+18
    Improve debug output from gallivm_compile_module and lp_build_create_jit_compiler_for_module, printing the -mcpu and -mattr options passed to LLC.

    V2: enclose MAttrs debug_printf block and llc -mcpu debug_printf in "if (gallivm_debug & <flags>)..."

    Signed-off-by: Ben Crocker <[email protected]>
    Cc: 12.0 13.0 17.0 <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]> (v2)
    [Emil Velikov: rebase]
    Signed-off-by: Emil Velikov <[email protected]>
* gallium/u_suballoc: update comments | Marek Olšák | 2017-02-20 | 1 | -3/+5
    As requested by Brian. Trivial.
* android: radeonsi: fix sid_table.h generated header include path | Mauro Rossi | 2017-02-20 | 1 | -1/+3
    generated-sources-dir-for macro replaces intermediates-dir-for and LOCAL_MODULE_CLASS is defined as required by new macro, in order to avoid the following building error:

        external/mesa/src/gallium/drivers/radeonsi/si_debug.c:29:10: fatal error: 'sid_tables.h' file not found
        ^
        1 error generated.

    Fixes: 730574c58e8 ("android: ac/debug: move sid_tables.h generation and IB decode to amd/common")
    Acked-by: Nicolai Hähnle <[email protected]>
    Acked-by: Emil Velikov <[email protected]>
* gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally | Marek Olšák | 2017-02-19 | 5 | -6/+14
    It's OK for r300g (because r300g can't write to buffers via the GPU), but not later hardware. This issue was spotted randomly.

    Cc: [email protected]
    Reviewed-by: Nicolai Hähnle <[email protected]>