summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* gallivm,ac: add function attributes at call sites instead of declarationsMarek Olšák2017-03-014-54/+91
| | | | | | | | | | | | | | | | They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic. We need this to force readnone or inaccessiblememonly on some amdgcn intrinsics. This is only used with LLVM 4.0 and later. Intrinsics only used with LLVM <= 3.9 don't need the LEGACY flag. gallivm and ac code is in the same patch, because splitting would be more complicated with all the LEGACY uses all over the place. v2: don't change the prototype of lp_add_function_attr. Reviewed-by: Jose Fonseca <[email protected]> (v1)
* gallivm,ac: remove unused FUNC_ATTR_LAST enumsMarek Olšák2017-03-011-1/+0
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* automake: r600: radeonsi: correctly manage libamd_common.la linkingEmil Velikov2017-02-282-4/+5
| | | | | | | | | | | | | | | | | Since both r600 and radeonsi use code from libamd_common they need to static link it. At the same time, adding a common library to LIB_DEPS is fragile [can lean to multiple symbol definitions] and non-obvious - I had to do a double-take how things work atm. So follow the libradeon.la approach and put common libraries in TARGET_RADEON_COMMON Fixes: 936f5407a7d ("gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600") Cc: Timothy Arceri <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Acked-by: Marek Olšák <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Tested-by: Michel Dänzer <[email protected]>
* gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600Michel Dänzer2017-02-282-1/+5
| | | | | | | | | | | | | | | | | | Fixes build failure with --enable-opencl --enable-xvmc: make[4]: Entering directory '/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/targets/xvmc' CXXLD libXvMCgallium.la ../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `evergreen_create_compute_state': /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:254: undefined reference to `ac_elf_read' ../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `r600_shader_binary_read_config': /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start' /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start' collect2: error: ld returned 1 exit status Makefile:760: recipe for target 'libXvMCgallium.la' failed Fixes: dc4c551a345d ("radeon/ac: switch from radeon_elf_read() to ac_elf_read()") Acked-by: Timothy Arceri <[email protected]> Tested-by: Timothy Arceri <[email protected]>
* gallium/r600: fix r600 build when OpenCL is enabledTimothy Arceri2017-02-281-0/+5
| | | | Fixes build regression caused by d90bf4ef3e1db7
* radeon: remove unused radeon_elf_util.{c,h}Timothy Arceri2017-02-287-256/+0
| | | | | | We now use the shared code in AMD common instead. Reviewed-by: Marek Olšák <[email protected]>
* radeon/ac: switch to ac_shader_binary_config_start()Timothy Arceri2017-02-282-3/+4
| | | | | | | | For radeonsi we could probably switch to ac_shader_binary_read_config(). However the functions have diverged so just share this helper for now. Reviewed-by: Marek Olšák <[email protected]>
* radeon/ac: switch from radeon_elf_read() to ac_elf_read()Timothy Arceri2017-02-284-6/+4
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeon/ac: switch from radeon_shader_binary to ac_shader_binaryTimothy Arceri2017-02-2812-73/+36
| | | | Reviewed-by: Marek Olšák <[email protected]>
* clover: Dump linked binary to a different fileJan Vesely2017-02-271-2/+6
| | | | | | | | | | this allows to pass the generated files directly to llc or bugpoint v2: add atomic counter ID v3: remove extra scope operator, constify Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* gallium/u_queue: set num_threads correctly if not all threads startGrazvydas Ignotas2017-02-271-1/+1
| | | | | | | | | If i-th thread could not be created it means we have i threads, not i+1, because we start from 0. Fixes: 404d0d5 "gallium/u_queue: add an option to have multiple worker threads" Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium/u_queue: fix a crash with atexit handlersGrazvydas Ignotas2017-02-271-0/+1
| | | | | | | | | | | | | | | | Commit 4aea8fe ("gallium/u_queue: fix random crashes when the app calls exit()") added a atexit handler which calls util_queue_killall_and_wait() for each queue to stop the threads. However the app is also free to use atexit handlers to clean up things, leading to util_queue_destroy() call which will also call util_queue_killall_and_wait() for the same queue again, causing threads being joined twice, and that is undefined. This happens with libglut, for example. A simple fix is to just set num_threads to 0 as there are no more valid threads after util_queue_killall_and_wait() returns. Fixes: 4aea8fe "gallium/u_queue: fix random crashes when the app calls exit()" Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* st/nine: Drop USER_INDEX_BUFFERS checkMike Lothian2017-02-252-3/+1
| | | | | | | | | | | | | | | | | | | | | | | This fixes 4a883966c1f74f43afc145d2c3d27af7b8c5e01a where the PIPE_CAP was removed. Now USER_INDEX_BUFFERS are always enabled remove the check and only check for cmst_active directly. v2: Axel pointed out the code was still needed when cmst was inactive, Rebase on master too v3: Drop struct member user_ibufs also && fixup shortlog (Edward). v4: Fix negation v5: Use the right variable name csmt != cmst Fixes: 4a883966c1f7 ("gallium: remove PIPE_CAP_USER_INDEX_BUFFERS") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99953 Reported-and-tested-by: Vinson Lee <[email protected]> (v1) Cc: Marek Olšák <[email protected]> Cc: Axel Davy <[email protected]> Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Mike Lothian <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* st/nine: make use of common uploaders v4Constantine Charlamov2017-02-254-74/+37
| | | | | | | | | | Make use of common uploaders that landed recently to Mesa v2: fixed formatting, broken due to thunderbird configuration v3: per Axel comment: added a comment into NineDevice9_DrawPrimitiveUP v4: per Axel comment: changed style of the comment
* svga: fix MSVC build error after PIPE_CAP_USER_INDEX_BUFFERS removalBrian Paul2017-02-241-1/+1
| | | | | | | Need to specify the zero for the struct initializer. My earlier test of the patch series was with MinGW, not MSVC. Trivial.
* vc4: Lazily emit our FS/VS input loads.Eric Anholt2017-02-244-75/+93
| | | | | | | | | | | | | | | | | | | This reduces register pressure in both types of shaders, by reordering the input loads from the var->data.driver_location order to whatever order they appear first in the NIR shader. These instructions aren't reorderable at our QIR scheduling level because the FS takes two in lockstep to do an interpolation, and the VS takes multiple read instructions in a row to get a whole vec4-level attribute read. shader-db impact: total instructions in shared programs: 76666 -> 76590 (-0.10%) instructions in affected programs: 42945 -> 42869 (-0.18%) total max temps in shared programs: 9395 -> 9208 (-1.99%) max temps in affected programs: 2951 -> 2764 (-6.34%) Some programs get their max temps hurt, depending on the order that the load_input intrinsics appear, because we end up being unable to copy propagate an older VPM read into its only use.
* vc4: Refactor the load_input code out of the intrinsic code.Eric Anholt2017-02-241-25/+42
| | | | It's going gain most of ntq_setup_inputs(), so simplify it first.
* vc4: Track the last block we emitted at the top level.Eric Anholt2017-02-243-5/+10
| | | | | This will be used for delaying our VPM reads (which must be unconditional) until just before they're used.
* vc4: Emit max number of temps in the shader-db output.Eric Anholt2017-02-241-0/+23
| | | | | | | We need to be paying attention to optimization's impact on this -- even if we reduce instruction count, increasing max temps in general is likely to cause us to fail to register allocate on some shaders, which means that those won't run at all.
* radeonsi: fix broken tessellation on Carrizo and StoneyMarek Olšák2017-02-251-1/+3
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850 Cc: 13.0 17.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* trace: remove pipe_resource wrappingMarek Olšák2017-02-257-260/+40
| | | | | | | | | | Not needed. ddebug does the same thing. The limitation is that drivers can only use pipe_resource::screen through pipe_resource_reference. This unbreaks trace, because pipe_context uploaders aren't wrapped, so trace doesn't understand buffers returned by them. Reviewed-by: Brian Paul <[email protected]>
* gallium: remove PIPE_CAP_USER_INDEX_BUFFERSMarek Olšák2017-02-2517-20/+0
| | | | | | | | all drivers support it Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]> (VMware driver only)
* svga: implement user index buffersMarek Olšák2017-02-252-2/+13
| | | | | Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]> (VMware driver only)
* freedreno: add support for user index buffersMarek Olšák2017-02-252-1/+13
| | | | Reviewed-by: Brian Paul <[email protected]>
* etnaviv: add support for user index buffersMarek Olšák2017-02-252-1/+13
| | | | Reviewed-by: Brian Paul <[email protected]>
* gallium/util: add new helpers for user index buffer uploadingMarek Olšák2017-02-252-0/+35
| | | | | | | v3: split from the etnaviv patch; fix new_ib.buffer leak Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]> (VMware driver only)
* gallium/util: (trivial) fix util_clear_render_targetRoland Scheidegger2017-02-241-7/+8
| | | | | | the format of the rt can be different than the one of the texture, so must propagate the format explicitly to the helper. Broken since 3f9c5d62441eba38e8b1592aba965ed5db6fd89b (but unused by st/mesa).
* vc4: automake: add the kernel/README to the tarballEmil Velikov2017-02-241-0/+2
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Andreas Boll <[email protected]>
* st/va: Fix up YV12 to NV12 putImage conversionThomas Hellstrom2017-02-242-57/+28
| | | | | | | | | | Use the utility u_copy_nv12_from_yv12 to implement this similarly to how it's been done in the VPAU state tracker. The old code mixed up planes and fields and didn't correctly handle video surfaces in interlaced format. Acked-by: Christian König <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
* st/vdpau: Provide YV12 to NV12 putBits conversion v2Thomas Hellstrom2017-02-243-19/+130
| | | | | | | | | | | | | | | | | mplayer likes putting YV12 data, and if there is a buffer format mismatch, the vdpau state tracker would try to reallocate the video surface as an YV12 surface. A virtual driver doesn't like reallocating and doesn't like YV12 surfaces, so if we can't support YV12, try an YV12 to NV12 conversion instead. Also advertize that we actually can do the getBits and putBits conversion. v2: A previous version of this patch prioritized conversion before reallocating. This has been changed to prioritize reallocating in this version. Cc: Christian König <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
* softpipe: enable clear_texture with util_clear_textureLars Hamre2017-02-242-1/+4
| | | | | | | | Passes all corresponding piglit tests. Signed-off-by: Lars Hamre <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* llvmpipe: enable clear_texture with util_clear_textureLars Hamre2017-02-242-2/+4
| | | | | | | | Passes all corresponding piglit tests. Signed-off-by: Lars Hamre <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: implement util_clear_textureLars Hamre2017-02-242-145/+248
| | | | | | | | | | | | | v3: have util_clear_texture mirror the pipe function (Roland Scheidegger) v2: rework util clear functions such that they operate on a resource instead of a surface (Roland Scheidegger) Creates a util_clear_texture function for implementing the GL_ARB_clear_texture in softpipe and llvmpipe. Signed-off-by: Lars Hamre <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* haiku/winsys: fix dt prototype argsJerome Duval2017-02-241-1/+2
|
* haiku: build fixes around debug definesJerome Duval2017-02-241-2/+2
|
* swr: fix index buffers with non-zero indicesGeorge Kyriazis2017-02-235-6/+62
| | | | | | | | | | | | | Fix issue with index buffers that do not contain a 0 index. 0 index can be a non-valid index if the (copied) vertex buffers are a subset of the user's (which happens because we only copy the range between min & max). Core will use an index passed in from the driver to replace invalid indices. Only do this for calls that contain non-zero indices, to minimize performance Reviewed-by: Bruce Cherniak <[email protected]> cost.
* swr: add fetch shader cacheGeorge Kyriazis2017-02-236-15/+50
| | | | | | | | | For now, the cache key is all of FETCH_COMPILE_STATE. Use new/delete for swr_vertex_element_state, since we have to call the constructors/destructors of the struct elements. Reviewed-by: Bruce Cherniak <[email protected]>
* st/wgl: flush with ST_FLUSH_WAIT before releasing shared contextsCharmaine Lee2017-02-182-2/+15
| | | | | | | | | | | Before releasing a shared context, flush the context with ST_FLUSH_WAIT to make sure all commands are executed. This ensures that rendering to any shared resources is completed before they will be referenced by another context. Fixes an intermittent flickering with Photoshop. (VMware bug# 1779340) Reviewed-by: Brian Paul <[email protected]>
* st: add ST_FLUSH_WAIT to st_context_flush()Charmaine Lee2017-02-181-0/+1
| | | | | | | When st_context_flush() is called with ST_FLUSH_WAIT, the function will return after the fence is completed. Reviewed-by: Brian Paul <[email protected]>
* radeon: fix r600 builds when old version of llvm is presentTimothy Arceri2017-02-231-2/+2
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* r600/radeonsi: enable glsl/tgsi on-disk cacheTimothy Arceri2017-02-232-0/+46
| | | | | | | | | | For gpu generations that use LLVM we create a timestamp string containing both the LLVM and Mesa build times, otherwise we just use the Mesa build time. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug/rbug/trace: add get_disk_shader_cache() to pass-throughsTimothy Arceri2017-02-233-0/+39
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add get_disk_shader_cache() callbackTimothy Arceri2017-02-232-0/+19
| | | | | | | V2: Provide more detail in callback description and add description to screen.rst Reviewed-by: Nicolai Hähnle <[email protected]>
* Revert "st/vdpau: Fix multithreading"Thomas Hellstrom2017-02-221-21/+1
| | | | | | | | | | This reverts commit f1e5dfbe3c8951a6c8acf41bf5e6c2d090098b2c. For a detailed discussion see https://lists.freedesktop.org/archives/mesa-dev/2017-February/145283.html Acked-by: Christian König <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
* vl: u_upload_alloc might fail to allocate buffer in bicubic filterNayan Deshmukh2017-02-221-3/+5
| | | | | Signed-off-by: Nayan Deshmukh <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium: reorder fields in pipe_draw_infoMarek Olšák2017-02-221-23/+26
| | | | | | | | | sizeof(struct pipe_draw_info) = 104 -> 88 Also, vertices_per_patch is switched to ubyte, because it can't be more than 32. Seemed-reasonable-to: Roland Scheidegger
* gallium/hud: handle a thread switch for API-thread-busy monitoringMarek Olšák2017-02-221-4/+10
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/hud: prevent an infinite loopMarek Olšák2017-02-221-2/+3
| | | | | | v2: use UINT64_MAX / 11 Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: isolate util_queue_fence implementationMarek Olšák2017-02-226-26/+30
| | | | | | it's cleaner this way. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: fix random crashes when the app calls exit()Marek Olšák2017-02-222-2/+78
| | | | | | | | | | | | This fixes: vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI && "Pass ID not registered!"' failed. v2: use list_head, switch the call order in destroy Cc: 13.0 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>