aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtzMarek Olšák2017-03-031-5/+1
|
* radeonsi: remove last use of llvm.SI.resinfoMarek Olšák2017-03-031-48/+49
| | | | and move one function up to reuse the code.
* radeonsi: move image intrinsic building to amd/commonMarek Olšák2017-03-031-92/+62
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac: replace SI.export with amdgcn.exp.*Marek Olšák2017-03-031-3/+5
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move llvm.SI.export building to amd/commonMarek Olšák2017-03-031-162/+144
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac: unify build_type_name_for_intr functionsMarek Olšák2017-03-031-44/+5
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set unorm=1 for TGSI_TEXTURE_SHADOWRECT as wellMarek Olšák2017-03-031-1/+2
| | | | | | It was harmless, because we also set unorm in the sampler state. Reviewed-by: Dave Airlie <[email protected]>
* gallivm, ac: add writeonly and inaccessiblememonly attributesMarek Olšák2017-03-032-0/+4
| | | | Reviewed-by: Dave Airlie <[email protected]>
* tgsi/scan: record load/store/atomic image usageMarek Olšák2017-03-033-11/+16
| | | | Reviewed-by: Dave Airlie <[email protected]>
* tgsi/ureg: return correct token count in ureg_get_tokensGrazvydas Ignotas2017-03-031-1/+1
| | | | | | | | | | | | | Valgrind reports that the shader cache writes uninitialized data to disk. Turns out ureg_get_tokens() is returning the count of allocated tokens instead of how many are actually used, so the cache writes out unused space at the end. Use the real count instead. This change should not cause regressions elsewhere because the only ureg_get_tokens() user that cares about token count is the shader cache. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: add support for an on-disk shader cacheTimothy Arceri2017-03-031-7/+60
| | | | | | | | | | | | | V2: - when loading from disk cache also binary insert into memory cache. - check that the binary loaded from disk is the correct size. If not delete the cache item and skip loading from cache. V3: - remove unrequired variable Reviewed-by: Grigori Goronzy <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* clover: Work around build failure with AltiVec.Matt Turner2017-03-021-0/+3
| | | | | | Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504 Acked-by: Francisco Jerez <[email protected]>
* swr: fix crash in swr_update_derived following st/mesa state changesBruce Cherniak2017-03-022-3/+46
| | | | | | | | | | | | | | | | | | | | Recent change to st/mesa state update logic caused major regressions to swr validation code. swr uses the same validation logic (swr_update_derived) for both draw and Clear calls. New st/mesa state update logic results in certain state objects not being set/bound during Clear. This was causing null ptr exceptions. Creation of static dummy state objects allows setting these pointers during Clear validation, without interfering with relevant state validation. Once fixed, new logic also highlighted an error in dirty bit checking for fragment shader and clip validation. (The alternative is to have a simplified validation routine for Clear. Which may do that at some point.) Reviewed-by: Tim Rowley <[email protected]>
* swr: enable clear_texture with util_clear_textureBruce Cherniak2017-03-022-1/+2
| | | | | | Passes corresponding piglit tests. Reviewed-by: Edward O'Callaghan <[email protected]>
* svga: fix crash regression since e027935a795Brian Paul2017-03-022-4/+5
| | | | | | | | | | | | During the first update of the hw_clear_state atoms, we may not yet have a current rasterizer state object. So, svga->curr.rast may be NULL and we crash. Add a few null pointer checks to work around this. Note that these are only needed in the state update functions which are called for 'clear' validation. Reviewed-by: Charmaine Lee <[email protected]>
* svga: s/unsigned/pipe_prim_type/Brian Paul2017-03-024-2/+8
| | | | | | And add some default switch cases to silence compiler warnings. Reviewed-by: Charmaine Lee <[email protected]>
* svga: whitespace fixes in svga_context.hBrian Paul2017-03-021-10/+9
| | | | Trivial.
* svga: whitespace and formatting fixes in svga_stage.cBrian Paul2017-03-021-44/+40
| | | | Trivial.
* gallivm,ac: add function attributes at call sites instead of declarationsMarek Olšák2017-03-014-54/+91
| | | | | | | | | | | | | | | | They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic. We need this to force readnone or inaccessiblememonly on some amdgcn intrinsics. This is only used with LLVM 4.0 and later. Intrinsics only used with LLVM <= 3.9 don't need the LEGACY flag. gallivm and ac code is in the same patch, because splitting would be more complicated with all the LEGACY uses all over the place. v2: don't change the prototype of lp_add_function_attr. Reviewed-by: Jose Fonseca <[email protected]> (v1)
* gallivm,ac: remove unused FUNC_ATTR_LAST enumsMarek Olšák2017-03-011-1/+0
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* automake: r600: radeonsi: correctly manage libamd_common.la linkingEmil Velikov2017-02-282-4/+5
| | | | | | | | | | | | | | | | | Since both r600 and radeonsi use code from libamd_common they need to static link it. At the same time, adding a common library to LIB_DEPS is fragile [can lean to multiple symbol definitions] and non-obvious - I had to do a double-take how things work atm. So follow the libradeon.la approach and put common libraries in TARGET_RADEON_COMMON Fixes: 936f5407a7d ("gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600") Cc: Timothy Arceri <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Acked-by: Marek Olšák <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Tested-by: Michel Dänzer <[email protected]>
* gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600Michel Dänzer2017-02-282-1/+5
| | | | | | | | | | | | | | | | | | Fixes build failure with --enable-opencl --enable-xvmc: make[4]: Entering directory '/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/targets/xvmc' CXXLD libXvMCgallium.la ../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `evergreen_create_compute_state': /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:254: undefined reference to `ac_elf_read' ../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `r600_shader_binary_read_config': /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start' /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start' collect2: error: ld returned 1 exit status Makefile:760: recipe for target 'libXvMCgallium.la' failed Fixes: dc4c551a345d ("radeon/ac: switch from radeon_elf_read() to ac_elf_read()") Acked-by: Timothy Arceri <[email protected]> Tested-by: Timothy Arceri <[email protected]>
* gallium/r600: fix r600 build when OpenCL is enabledTimothy Arceri2017-02-281-0/+5
| | | | Fixes build regression caused by d90bf4ef3e1db7
* radeon: remove unused radeon_elf_util.{c,h}Timothy Arceri2017-02-287-256/+0
| | | | | | We now use the shared code in AMD common instead. Reviewed-by: Marek Olšák <[email protected]>
* radeon/ac: switch to ac_shader_binary_config_start()Timothy Arceri2017-02-282-3/+4
| | | | | | | | For radeonsi we could probably switch to ac_shader_binary_read_config(). However the functions have diverged so just share this helper for now. Reviewed-by: Marek Olšák <[email protected]>
* radeon/ac: switch from radeon_elf_read() to ac_elf_read()Timothy Arceri2017-02-284-6/+4
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeon/ac: switch from radeon_shader_binary to ac_shader_binaryTimothy Arceri2017-02-2812-73/+36
| | | | Reviewed-by: Marek Olšák <[email protected]>
* clover: Dump linked binary to a different fileJan Vesely2017-02-271-2/+6
| | | | | | | | | | this allows to pass the generated files directly to llc or bugpoint v2: add atomic counter ID v3: remove extra scope operator, constify Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* gallium/u_queue: set num_threads correctly if not all threads startGrazvydas Ignotas2017-02-271-1/+1
| | | | | | | | | If i-th thread could not be created it means we have i threads, not i+1, because we start from 0. Fixes: 404d0d5 "gallium/u_queue: add an option to have multiple worker threads" Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium/u_queue: fix a crash with atexit handlersGrazvydas Ignotas2017-02-271-0/+1
| | | | | | | | | | | | | | | | Commit 4aea8fe ("gallium/u_queue: fix random crashes when the app calls exit()") added a atexit handler which calls util_queue_killall_and_wait() for each queue to stop the threads. However the app is also free to use atexit handlers to clean up things, leading to util_queue_destroy() call which will also call util_queue_killall_and_wait() for the same queue again, causing threads being joined twice, and that is undefined. This happens with libglut, for example. A simple fix is to just set num_threads to 0 as there are no more valid threads after util_queue_killall_and_wait() returns. Fixes: 4aea8fe "gallium/u_queue: fix random crashes when the app calls exit()" Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* st/nine: Drop USER_INDEX_BUFFERS checkMike Lothian2017-02-252-3/+1
| | | | | | | | | | | | | | | | | | | | | | | This fixes 4a883966c1f74f43afc145d2c3d27af7b8c5e01a where the PIPE_CAP was removed. Now USER_INDEX_BUFFERS are always enabled remove the check and only check for cmst_active directly. v2: Axel pointed out the code was still needed when cmst was inactive, Rebase on master too v3: Drop struct member user_ibufs also && fixup shortlog (Edward). v4: Fix negation v5: Use the right variable name csmt != cmst Fixes: 4a883966c1f7 ("gallium: remove PIPE_CAP_USER_INDEX_BUFFERS") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99953 Reported-and-tested-by: Vinson Lee <[email protected]> (v1) Cc: Marek Olšák <[email protected]> Cc: Axel Davy <[email protected]> Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Mike Lothian <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* st/nine: make use of common uploaders v4Constantine Charlamov2017-02-254-74/+37
| | | | | | | | | | Make use of common uploaders that landed recently to Mesa v2: fixed formatting, broken due to thunderbird configuration v3: per Axel comment: added a comment into NineDevice9_DrawPrimitiveUP v4: per Axel comment: changed style of the comment
* svga: fix MSVC build error after PIPE_CAP_USER_INDEX_BUFFERS removalBrian Paul2017-02-241-1/+1
| | | | | | | Need to specify the zero for the struct initializer. My earlier test of the patch series was with MinGW, not MSVC. Trivial.
* vc4: Lazily emit our FS/VS input loads.Eric Anholt2017-02-244-75/+93
| | | | | | | | | | | | | | | | | | | This reduces register pressure in both types of shaders, by reordering the input loads from the var->data.driver_location order to whatever order they appear first in the NIR shader. These instructions aren't reorderable at our QIR scheduling level because the FS takes two in lockstep to do an interpolation, and the VS takes multiple read instructions in a row to get a whole vec4-level attribute read. shader-db impact: total instructions in shared programs: 76666 -> 76590 (-0.10%) instructions in affected programs: 42945 -> 42869 (-0.18%) total max temps in shared programs: 9395 -> 9208 (-1.99%) max temps in affected programs: 2951 -> 2764 (-6.34%) Some programs get their max temps hurt, depending on the order that the load_input intrinsics appear, because we end up being unable to copy propagate an older VPM read into its only use.
* vc4: Refactor the load_input code out of the intrinsic code.Eric Anholt2017-02-241-25/+42
| | | | It's going gain most of ntq_setup_inputs(), so simplify it first.
* vc4: Track the last block we emitted at the top level.Eric Anholt2017-02-243-5/+10
| | | | | This will be used for delaying our VPM reads (which must be unconditional) until just before they're used.
* vc4: Emit max number of temps in the shader-db output.Eric Anholt2017-02-241-0/+23
| | | | | | | We need to be paying attention to optimization's impact on this -- even if we reduce instruction count, increasing max temps in general is likely to cause us to fail to register allocate on some shaders, which means that those won't run at all.
* radeonsi: fix broken tessellation on Carrizo and StoneyMarek Olšák2017-02-251-1/+3
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850 Cc: 13.0 17.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* trace: remove pipe_resource wrappingMarek Olšák2017-02-257-260/+40
| | | | | | | | | | Not needed. ddebug does the same thing. The limitation is that drivers can only use pipe_resource::screen through pipe_resource_reference. This unbreaks trace, because pipe_context uploaders aren't wrapped, so trace doesn't understand buffers returned by them. Reviewed-by: Brian Paul <[email protected]>
* gallium: remove PIPE_CAP_USER_INDEX_BUFFERSMarek Olšák2017-02-2517-20/+0
| | | | | | | | all drivers support it Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]> (VMware driver only)
* svga: implement user index buffersMarek Olšák2017-02-252-2/+13
| | | | | Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]> (VMware driver only)
* freedreno: add support for user index buffersMarek Olšák2017-02-252-1/+13
| | | | Reviewed-by: Brian Paul <[email protected]>
* etnaviv: add support for user index buffersMarek Olšák2017-02-252-1/+13
| | | | Reviewed-by: Brian Paul <[email protected]>
* gallium/util: add new helpers for user index buffer uploadingMarek Olšák2017-02-252-0/+35
| | | | | | | v3: split from the etnaviv patch; fix new_ib.buffer leak Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]> (VMware driver only)
* gallium/util: (trivial) fix util_clear_render_targetRoland Scheidegger2017-02-241-7/+8
| | | | | | the format of the rt can be different than the one of the texture, so must propagate the format explicitly to the helper. Broken since 3f9c5d62441eba38e8b1592aba965ed5db6fd89b (but unused by st/mesa).
* vc4: automake: add the kernel/README to the tarballEmil Velikov2017-02-241-0/+2
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Andreas Boll <[email protected]>
* st/va: Fix up YV12 to NV12 putImage conversionThomas Hellstrom2017-02-242-57/+28
| | | | | | | | | | Use the utility u_copy_nv12_from_yv12 to implement this similarly to how it's been done in the VPAU state tracker. The old code mixed up planes and fields and didn't correctly handle video surfaces in interlaced format. Acked-by: Christian König <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
* st/vdpau: Provide YV12 to NV12 putBits conversion v2Thomas Hellstrom2017-02-243-19/+130
| | | | | | | | | | | | | | | | | mplayer likes putting YV12 data, and if there is a buffer format mismatch, the vdpau state tracker would try to reallocate the video surface as an YV12 surface. A virtual driver doesn't like reallocating and doesn't like YV12 surfaces, so if we can't support YV12, try an YV12 to NV12 conversion instead. Also advertize that we actually can do the getBits and putBits conversion. v2: A previous version of this patch prioritized conversion before reallocating. This has been changed to prioritize reallocating in this version. Cc: Christian König <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
* softpipe: enable clear_texture with util_clear_textureLars Hamre2017-02-242-1/+4
| | | | | | | | Passes all corresponding piglit tests. Signed-off-by: Lars Hamre <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* llvmpipe: enable clear_texture with util_clear_textureLars Hamre2017-02-242-2/+4
| | | | | | | | Passes all corresponding piglit tests. Signed-off-by: Lars Hamre <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>