summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: disable sinking common instructions down to the end blockSamuel Pitoiset2017-03-151-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Initially this was a workaround for a bug introduced in LLVM 4.0 in the SimplifyCFG pass that caused image instrinsics to disappear (because they were badly sunk). Finally, this is a win because it decreases SGPR spilling and increases the number of waves a bit. Although, shader-db results are good I think we might want to remove it in the future once the issue is fixed. For now, enable it for LLVM >= 4.0. This also fixes a rendering issue with the speedometer in Dirt Rally. More information can be found here https://reviews.llvm.org/D26348. Thanks to Dave Airlie for the patch. v2: - add a FIXME comment - use if (HAVE_LLVM >= 0x0400) instead Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988 Signed-off-by: Samuel Pitoiset <[email protected]> Cc: 17.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add missing compute shader entry in tgsi_get_processor_name()Samuel Pitoiset2017-03-151-0/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: clean up tex_fetch_ptrs()Samuel Pitoiset2017-03-151-6/+4
| | | | | | | | Will also help when the src sampler register will be TGSI_FILE_CONSTANT for bindless. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: use drmGetDevice2 APIEmil Velikov2017-03-151-2/+2
| | | | | | | | | | | Analogous to previous commit Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98502 Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Mike Lothian <[email protected]>
* r600: refactor binding code for attach buffer to CB.Dave Airlie2017-03-151-33/+78
| | | | | | | | | This refactors out the code and fixes it up to be used for images later. It uses the code in the current RAT binding for compute. Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: refactor out CB setup.Dave Airlie2017-03-151-104/+143
| | | | | | | | | This moves the code to create CB info out into a separate function so it can be reused in images code to create RATs. Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: refactor texture resource words setup code.Dave Airlie2017-03-151-88/+131
| | | | | | | | This refactors out the code to setup a texture resource so we can reuse it later from the images code. Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: factor out the code to initialise a buffer resource.Dave Airlie2017-03-151-29/+51
| | | | | | | | | | This takes the code required to initialise a buffer resource out of the texture buffer code, into it's own function. This is going to be used for the image support later. Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g: make framebuffer atom rely on dual src blend state.Dave Airlie2017-03-154-2/+7
| | | | | | | | In order to make ARB_shader_image_load_store, we have to share the CB space with RATs, so we should only steal the dual src space if we have dual src enabled. Signed-off-by: Dave Airlie <[email protected]>
* nir: Rework conversion opcodesJason Ekstrand2017-03-145-22/+22
| | | | | | | | | | | | | | | | | | | | | | | | The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent: - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>. - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting. - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>. Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes. Reviewed-by: Eric Anholt <[email protected]>
* gallium/radeon: disable the shader cache if dumping shadersMarek Olšák2017-03-131-0/+5
| | | | | | otherwise, cached shaders aren't dumped. Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: mark all bound shader buffer ranges as initializedMarek Olšák2017-03-131-0/+3
| | | | | | | | This should prevent cases when a buffer was incorrectly mapped without synchronization just because this wasn't done. Cc: 13.0 17.0 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium/hud: check NULL return from u_upload_allocJulien Isorce2017-03-131-0/+5
| | | | | | | | | | | | | | | | | | | | Fixes the following segmentation fault: signal SIGSEGV: invalid address (fault address: 0x0) frame #0: 0x00007fffe718e117 radeonsi_dri.so hud_draw_background_quad hud_context.c:170 167 168 assert(hud->bg.num_vertices + 4 <= hud->bg.max_num_vertices); 169 -> 170 vertices[num++] = (float) x1; 171 vertices[num++] = (float) y1; 172 173 vertices[num++] = (float) x1; (lldb) bt * frame #0: 0x00007fffe718e117 radeonsi_dri.so`hud_draw_background_quad frame #1: 0x00007fffe718f458 radeonsi_dri.so`hud_draw frame #2: 0x00007fffe712967f radeonsi_dri.so`dri_flush Signed-off-by: Marek Olšák <[email protected]>
* winsys/radeon: check null return from radeon_cs_create_fence in cs_flushJulien Isorce2017-03-131-11/+13
| | | | | | | | | | | | Follow-up of patch: "radeon_cs_create_fence: check null return from radeon_winsys_bo_create" radeon_drm_cs_flush radeon_cs_create_fence radeon_winsys_bo_create Signed-off-by: Julien Isorce <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* winsys/radeon: check null in radeon_cs_create_fenceJulien Isorce2017-03-131-0/+3
| | | | | | | | | | | | | | Fixes the following segmentation fault: radeon_drm_cs_add_buffer (bo=0x0) at radeon_drm_cs.c -> if (!bo->handle) (gdb) bt 0 radeon_drm_cs_add_buffer (bo=0x0) at radeon_drm_cs.c 1 0x00007fffe73575de in radeon_cs_create_fence radeon_drm_cs.c 2 0x00007fffe7358c48 in radeon_drm_cs_flush radeon_drm_cs.c Signed-off-by: Julien Isorce <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* freedreno/ir3: fragz cannot be half precisionRob Clark2017-03-131-0/+6
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: optimize less in glslRob Clark2017-03-131-1/+1
| | | | | | | | | | | | | | | | | | | Rely on nir for optimization, to reduce compile times. Very minimal impact on shader-db: total instructions in shared programs: 104170 -> 104199 (0.03%) total dwords in shared programs: 209664 -> 209728 (0.03%) total full registers used in shared programs: 7156 -> 7161 (0.07%) total half registers used in shader programs: 109 -> 109 (0.00%) total const registers used in shared programs: 24222 -> 24224 (0.01%) half full const instr dwords helped 12 107 103 112 98 hurt 11 104 105 115 102 But shader db runtime dropped from ~29.3s user to ~20.4s user. Signed-off-by: Rob Clark <[email protected]>
* svga: handle P016 format as wellChristian König2017-03-131-0/+1
| | | | | | Fixes: 62cff793785 ("gallium: add P016 format") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100180 Reviewed-by: Emil Velikov <[email protected]>
* st/va: add config support for 10bit decoding v2Christian König2017-03-132-4/+21
| | | | | | | | | Advertise 10bpp support if the driver supports decoding to a P016 surface. v2: Advertise 10bpp for the decoder as well. Signed-off-by: Christian König <[email protected]> Signed-off-by: Mark Thompson <[email protected]>
* st/va: add support for allocating 10bpp surfacesChristian König2017-03-131-9/+15
| | | | | | | We support P010 and P016 as targets for 10bpp video decoding. Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* st/va: add support for P010 and P016 formats v3Christian König2017-03-134-3/+22
| | | | | | | | | | | | No hardware I know off can actually support P010 natively. But we can easily support P016 and as long as nobody decodes anything into the lower 6bits it doesn't make any difference to P010. v2: allow P0160 for post processing as well v3: fix post processing once more Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* st/va: clear the video surface on allocationChristian König2017-03-131-4/+35
| | | | | | | This makes debugging of decoding problems quite a bit easier. Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* st/va: cleanup error handling in vlVaCreateSurfaces2Christian König2017-03-131-25/+27
| | | | | | | No need to have that twice. Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* radeon/uvd: enable 10bit HEVC decode v2Christian König2017-03-132-8/+20
| | | | | | | | | Just use whatever the state tracker allocated. v2: fix msb mode Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* radeon/UVD: fix the decoding target pitch calculationChristian König2017-03-131-1/+1
| | | | | | | | The firmware expects the value in pixel not bytes. Didn't made a difference so far because we only used 8bpp surfaces. Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* vl/video_buffer: add support for P016Christian König2017-03-131-0/+10
| | | | | | | Just simply the description of the planes. Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* gallium: add P016 formatChristian König2017-03-134-0/+42
| | | | | | | Same layout as NV12, but 16bit per channel instead of 8. Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* gallium/util: replace pipe_thread_setname() with u_thread_setname()Timothy Arceri2017-03-123-15/+3
| | | | | | | They do the same thing we just moved the function to be accessible to all of Mesa. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: replace pipe_thread_get_time_nano() with u_thread_get_time_nano()Timothy Arceri2017-03-121-17/+1
| | | | | | | They do the same thing we just moved the function to be accessible to all of Mesa. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: replace pipe_thread_create() with u_thread_create()Timothy Arceri2017-03-127-33/+7
| | | | | | | They do the same thing we just moved the function to be accessible to all of Mesa. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: move u_queue.{c,h} to src/utilTimothy Arceri2017-03-123-442/+0
| | | | | | | This will allow us to use it outside of gallium for things like compressing shader cache entries. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: make use of new u_thread.h in u_queue.{c,h}Timothy Arceri2017-03-122-4/+6
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: use standard malloc/calloc/free in u_queue.cTimothy Arceri2017-03-121-10/+10
| | | | | | This will help us moving the file to the shared src/util dir. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: move u_string.h to src/util/u_string.hTimothy Arceri2017-03-124-207/+2
| | | | | | | | This will help us move u_queue.c here eventually and also provide string function wrappers for anyone wishing to port disk_cache.c to windows. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: remove unused header from u_string.hTimothy Arceri2017-03-121-1/+0
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: remove unused util_strbuf*Timothy Arceri2017-03-121-37/+0
| | | | | | Looks like they have been unused since 2008 b8a7eef242f6bb97d90f. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: remove unused util_memmove()Timothy Arceri2017-03-121-20/+0
| | | | | | | This is not used anywhere and Visual Studio looks to have supported memmove() for a long time if not always. Reviewed-by: Marek Olšák <[email protected]>
* st/xa: suffix xa-indent{,.sh} and add a shebang lineEmil Velikov2017-03-102-2/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* gallium/tools: use correct shebang for python scriptsEmil Velikov2017-03-106-6/+6
| | | | | | | | These are python2 scripts and the generic "python" may point to python3. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* gallium/tools: do not hardcode bash locationEmil Velikov2017-03-102-2/+2
| | | | | | | | It is not guaranteed to be in /bin Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Andreas Boll <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* gallium/tests: remove execute bit from TGSI shader - vert-uadd.shEmil Velikov2017-03-101-0/+0
| | | | | | | | | Just like the the dozens of other shaders, the file is parsed by separate tool and not executed. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* gallium: remove shebang from python scriptsEmil Velikov2017-03-105-5/+0
| | | | | | | Analogous to earlier commit(s). Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* gallium: remove execute bit from the python script(s)Emil Velikov2017-03-102-0/+0
| | | | | | | Analogous to earlier commit(s). Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* svga: remove shebang from svgadump/svga_dump.pyEmil Velikov2017-03-101-1/+0
| | | | | | | Analogous to earlier commit(s). Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* svga: remove execute bit from svga_dump.pyEmil Velikov2017-03-101-0/+0
| | | | | | | | | | | | The file is used to generate svgadump/svga_dump.c... in theory at least. Atm. the file is checked in-tree but that is about to change later commits. As we get to that we'll use $PYTHON2 or equivalent as used throughout the tree. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* freedreno: remove shebang from ir3_nir_trig.pyEmil Velikov2017-03-101-1/+0
| | | | | | | Analogous to earlier commit(s). Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* freedreno: remove execute bit from ir3_nir_trig.pyEmil Velikov2017-03-101-0/+0
| | | | | | | | The file is meant to be called with $(PYTHON2) and not executed directly. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nv50/ir: check for origin insn in findOriginForTestWithZeroPierre Moreau2017-03-091-0/+2
| | | | | | | | Function arguments do not have an "origin" instruction, causing a NULL-pointer dereference without this check. Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* swr: s/uint/enum pipe_render_cond_flag/Vinson Lee2017-03-081-1/+1
| | | | | | | | | | | | | | Fix build error. swr_context.cpp: In function ‘void swr_blit(pipe_context*, const pipe_blit_info*)’: swr_context.cpp:336:44: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_render_cond_flag’ [-fpermissive] ctx->render_cond_mode); ~~~~~^~~~~~~~~~~~~~~~ Fixes: b0d39384307d ("gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100133 Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* st/nine: pass NULL to ureg_get_tokens()Timothy Arceri2017-03-092-4/+2
| | | | | | | The number of tokens in never used and the pointer is NULL checked so just pass NULL. Reviewed-by: Axel Davy <[email protected]>