path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* radeonsi: check the IR type before waiting for a compute compilation fence | Marek Olšák | 2017-03-20 | 1 | -1/+3
| This should fix OpenCL getting stuck.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100288
| Reviewed-by: Samuel Pitoiset <[email protected]>
* si_descriptor: move velems nullity check before dereference | Julien Isorce | 2017-03-20 | 1 | -4/+11
| CID 1399479: Dereference before null check (REVERSE_INULL)
| check_after_deref: Null-checking velems suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
|
| Signed-off-by: Julien Isorce <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* si_pipe: remove nullity check after dereference | Julien Isorce | 2017-03-20 | 1 | -3/+0
| sscreen cannot be NULL
|
| CID 1354483
|
| Signed-off-by: Julien Isorce <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* r600g: Fix out of bounds access | Bartosz Tomczyk | 2017-03-20 | 2 | -20/+22
| The fc_sp variable should indicate the number of elements in the fc_stack array, but fc_sp was incremented at the beginning of the fc_pushlevel function. As a result, idx=0 was never used and the last (32nd) element was stored outside the fc_stack array.
|
| Signed-off-by: Marek Olšák <[email protected]>
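The off-by-one described above can be illustrated with a small model. This is a Python sketch, not the driver's C code: the class name, the method names, and the capacity of 32 (inferred from the "32" in the commit message) are all assumptions made for illustration.

```python
FC_STACK_SIZE = 32  # assumed capacity, inferred from the commit message


class FlowControlStack:
    """Toy model of a fixed-size stack whose pointer counts stored entries."""

    def __init__(self, size=FC_STACK_SIZE):
        self.slots = [None] * size
        self.sp = 0  # number of valid entries; slots[sp] is the next free slot

    def push_buggy(self, entry):
        # Increment first: slot 0 is skipped, and the final push targets
        # index `size`, which is one past the end of the array.
        self.sp += 1
        self.slots[self.sp] = entry  # IndexError once sp == len(slots)

    def push_fixed(self, entry):
        # Store first, then bump the count, so indices 0..size-1 are all used.
        if self.sp >= len(self.slots):
            raise OverflowError("flow-control stack is full")
        self.slots[self.sp] = entry
        self.sp += 1
```

In C the buggy variant would not raise anything; it would silently write past the array, which is why the model uses Python's IndexError to make the overflow visible.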
* r600g: update sb documentation | Constantine Kharlamov | 2017-03-20 | 1 | -3/+6
| v2: s/r600/r600g in the title
|
| Signed-off-by: Constantine Kharlamov <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* r600g: make condition clearer | Constantine Kharlamov | 2017-03-20 | 1 | -6/+8
| The second check in the old code looked pretty much unreachable, esp. because it's not obvious that "max_entries" could be zero. To find out that it was intentional I had to run some checks, and to dig into the old versions of the file. So, rewrite the check to make the intention clear.
|
| v2: s/r600/r600g in the title, and per Dieter Nützel's comment wrap lines of condition.
|
| Signed-off-by: Constantine Kharlamov <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
| Acked-by: Dieter Nützel <[email protected]>
| Tested-by: Dieter Nützel <[email protected]>
* nv30: create uploader after pipe->screen is set | Ilia Mirkin | 2017-03-19 | 1 | -6/+6
| Fixes crashes after recent upload rework.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: enable TEX_LZ and TXF_LZ | Ilia Mirkin | 2017-03-18 | 3 | -4/+17
| There should be minimal gain, if any, for nvc0, but nv50 may end up noticing more often that the lod argument is uniform. This, in turn, will remove the need for some unnecessary transformations, which were being hit due to the checks being done pre-ssa.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: treat FMA like MAD for operand propagation | Karol Herbst | 2017-03-18 | 1 | -0/+1
| Helps mainly Feral-ported games, due to their use of fma()
|
| shader-db changes:
| total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
| total gprs used in shared programs    : 471258 -> 467359 (-0.83%)
| total local used in shared programs   : 27405 -> 27361 (-0.16%)
| total bytes used in shared programs   : 35749888 -> 35214176 (-1.50%)
|
|              local    gpr   inst  bytes
|     helped      17   1829   4091   4091
|       hurt       4     44      3      3
|
| Signed-off-by: Karol Herbst <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
| Cc: [email protected]
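The percentages in the shader-db totals above are plain relative changes. The following Python sketch recomputes them from the before/after counts quoted in the commit message; the function and dictionary names are hypothetical, and this is not part of shader-db itself.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Before/after totals copied from the commit message above.
SHADER_DB_TOTALS = {
    "instructions": (3901147, 3842505),
    "gprs": (471258, 467359),
    "local": (27405, 27361),
    "bytes": (35749888, 35214176),
}

for name, (before, after) in SHADER_DB_TOTALS.items():
    print(f"{name}: {percent_change(before, after):+.2f}%")
```

A negative value means the total shrank, which is the desired direction for every metric listed here.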
* gallium/radeon: formalize that create_batch_query doesn't need pipe_context | Marek Olšák | 2017-03-17 | 3 | -13/+12
| Reviewed-by: Timothy Arceri <[email protected]>
* gallium/radeon: formalize that create_query doesn't need pipe_context | Marek Olšák | 2017-03-17 | 3 | -32/+32
| for threaded gallium
|
| Reviewed-by: Timothy Arceri <[email protected]>
* gallium/radeon: reference pipe_resource in pipe_transfer | Marek Olšák | 2017-03-17 | 2 | -2/+5
| for threaded gallium
|
| Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: compile all TGSI compute shaders asynchronously | Marek Olšák | 2017-03-17 | 1 | -44/+81
| required by threaded gallium
|
| Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: require that compiler threads are enabled | Marek Olšák | 2017-03-17 | 2 | -11/+13
| threaded gallium can't use pipe_context's LLVM target machine, because create_shader_selector can be called from a non-driver thread.
|
| Reviewed-by: Timothy Arceri <[email protected]>
* trace: remove leftover assertions after pipe_resource wrapping removal | Marek Olšák | 2017-03-17 | 1 | -6/+0
|
* swr: support layer output in geometry shaders | Ilia Mirkin | 2017-03-15 | 1 | -0/+2
| This makes bin/gl-3.2-layered-rendering-gl-layer-render fail only with 2DMS_ARRAY, which is expected given the lackluster MSAA support. However all the regular types pass.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Bruce Cherniak <[email protected]>
* swr: validate backend state numAttributes | Tim Rowley | 2017-03-15 | 1 | -0/+2
| General protection and prevents us from smashing the stack on the first clear state validation (a7b8d50bcb). Fixes crash using icc.
|
| Reviewed-by: Bruce Cherniak <[email protected]>
* radeonsi: implement TGSI opcodes TEX_LZ and TXF_LZ | Marek Olšák | 2017-03-15 | 2 | -6/+16
| This massively decreases VGPR spilling for DiRT Showdown, because we no longer have to use v4i32 for 2D fetches when level == 0. We now use v2i32 for those cases.
|
| DiRT Showdown - Spilled VGPRs: -26 (-81%)
|
| This surprisingly doesn't have any useful effect on performance (+ 0.05%).
* gallium: add PIPE_CAP_TGSI_TEX_TXF_LZ | Marek Olšák | 2017-03-15 | 15 | -0/+15
|
* radeonsi: disable sinking common instructions down to the end block | Samuel Pitoiset | 2017-03-15 | 1 | -0/+11
| Initially this was a workaround for a bug introduced in LLVM 4.0 in the SimplifyCFG pass that caused image intrinsics to disappear (because they were badly sunk). Finally, this is a win because it decreases SGPR spilling and increases the number of waves a bit.
|
| Although shader-db results are good, I think we might want to remove it in the future once the issue is fixed. For now, enable it for LLVM >= 4.0.
|
| This also fixes a rendering issue with the speedometer in Dirt Rally.
|
| More information can be found here: https://reviews.llvm.org/D26348.
|
| Thanks to Dave Airlie for the patch.
|
| v2: - add a FIXME comment
|     - use if (HAVE_LLVM >= 0x0400) instead
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Cc: 17.0 <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
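The `HAVE_LLVM >= 0x0400` test in the v2 note compares a packed version number. The Python sketch below assumes the major version sits in the high byte and the minor version in the low byte, which is consistent with 0x0400 standing for LLVM 4.0; the function names are hypothetical and the real check lives in the C driver.

```python
def llvm_version_code(major, minor):
    """Pack a version as 0xMMmm (assumed encoding of HAVE_LLVM)."""
    return (major << 8) | minor


def sinking_disabled(have_llvm):
    """Mirror of the gate quoted in the commit message: LLVM 4.0 or newer."""
    return have_llvm >= 0x0400
```

With this encoding, versions order correctly under integer comparison as long as the minor number stays below 256, so a single `>=` is enough to express "4.0 or newer".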
* radeonsi: clean up tex_fetch_ptrs() | Samuel Pitoiset | 2017-03-15 | 1 | -6/+4
| Will also help when the src sampler register will be TGSI_FILE_CONSTANT for bindless.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* r600: refactor binding code for attach buffer to CB. | Dave Airlie | 2017-03-15 | 1 | -33/+78
| This refactors out the code and fixes it up to be used for images later. It uses the code in the current RAT binding for compute.
|
| Reviewed-by: Edward O'Callaghan <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600: refactor out CB setup. | Dave Airlie | 2017-03-15 | 1 | -104/+143
| This moves the code to create CB info out into a separate function so it can be reused in images code to create RATs.
|
| Reviewed-by: Edward O'Callaghan <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600: refactor texture resource words setup code. | Dave Airlie | 2017-03-15 | 1 | -88/+131
| This refactors out the code to setup a texture resource so we can reuse it later from the images code.
|
| Reviewed-by: Edward O'Callaghan <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600: factor out the code to initialise a buffer resource. | Dave Airlie | 2017-03-15 | 1 | -29/+51
| This takes the code required to initialise a buffer resource out of the texture buffer code, into its own function. This is going to be used for the image support later.
|
| Reviewed-by: Edward O'Callaghan <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: make framebuffer atom rely on dual src blend state. | Dave Airlie | 2017-03-15 | 4 | -2/+7
| In order to support ARB_shader_image_load_store, we have to share the CB space with RATs, so we should only steal the dual src space if we have dual src enabled.
|
| Signed-off-by: Dave Airlie <[email protected]>
* nir: Rework conversion opcodes | Jason Ekstrand | 2017-03-14 | 4 | -17/+17
| The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent:
|
|  - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>.
|
|  - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting.
|
|  - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>.
|
| Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes.
|
| Reviewed-by: Eric Anholt <[email protected]>
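The naming rules listed above are mechanical enough to sketch. The Python function below is a hypothetical illustration, not the generated nir_type_conversion_op: the single-letter type codes and the choice to let the source signedness select between i2i and u2u for mixed integer conversions are assumptions.

```python
def conversion_op_name(src_type, dst_type, dst_bit_size=None):
    """Build an opcode name from type codes 'f', 'i', 'u' or 'b'."""
    valid = ("f", "i", "u", "b")
    if src_type not in valid or dst_type not in valid:
        raise ValueError("unknown type code")
    if src_type == "b" and dst_type == "b":
        raise ValueError("bool-to-bool is not a conversion")
    if "b" in (src_type, dst_type):
        # Boolean conversions are named <src_type>2<dst_type>, no size suffix.
        return f"{src_type}2{dst_type}"
    if dst_bit_size is None:
        raise ValueError("non-bool conversions need a destination size")
    if src_type in ("i", "u") and dst_type in ("i", "u"):
        # Only i2i and u2u exist. Assumption: the source type decides whether
        # the value is sign-extended, so it also picks the opcode.
        dst_type = src_type
    return f"{src_type}2{dst_type}{dst_bit_size}"
```

For example, `conversion_op_name("f", "i", 32)` yields `f2i32`, while a signed-to-unsigned 64-bit conversion collapses to `i2i64` under the stated assumption.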
* gallium/radeon: disable the shader cache if dumping shaders | Marek Olšák | 2017-03-13 | 1 | -0/+5
| otherwise, cached shaders aren't dumped.
|
| Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: mark all bound shader buffer ranges as initialized | Marek Olšák | 2017-03-13 | 1 | -0/+3
| This should prevent cases when a buffer was incorrectly mapped without synchronization just because this wasn't done.
|
| Cc: 13.0 17.0 <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
* freedreno/ir3: fragz cannot be half precision | Rob Clark | 2017-03-13 | 1 | -0/+6
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: optimize less in glsl | Rob Clark | 2017-03-13 | 1 | -1/+1
| Rely on nir for optimization, to reduce compile times. Very minimal impact on shader-db:
|
| total instructions in shared programs: 104170 -> 104199 (0.03%)
| total dwords in shared programs: 209664 -> 209728 (0.03%)
| total full registers used in shared programs: 7156 -> 7161 (0.07%)
| total half registers used in shader programs: 109 -> 109 (0.00%)
| total const registers used in shared programs: 24222 -> 24224 (0.01%)
|
|            half   full  const  instr dwords
|   helped     12    107    103    112     98
|     hurt     11    104    105    115    102
|
| But shader db runtime dropped from ~29.3s user to ~20.4s user.
|
| Signed-off-by: Rob Clark <[email protected]>
* svga: handle P016 format as well | Christian König | 2017-03-13 | 1 | -0/+1
| Fixes: 62cff793785 ("gallium: add P016 format")
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100180
| Reviewed-by: Emil Velikov <[email protected]>
* radeon/uvd: enable 10bit HEVC decode v2 | Christian König | 2017-03-13 | 2 | -8/+20
| Just use whatever the state tracker allocated.
|
| v2: fix msb mode
|
| Signed-off-by: Christian König <[email protected]>
| Reviewed-by: Mark Thompson <[email protected]>
* radeon/UVD: fix the decoding target pitch calculation | Christian König | 2017-03-13 | 1 | -1/+1
| The firmware expects the value in pixels, not bytes. This didn't make a difference so far because we only used 8bpp surfaces.
|
| Signed-off-by: Christian König <[email protected]>
| Reviewed-by: Mark Thompson <[email protected]>
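The distinction between a pitch in pixels and a pitch in bytes only shows up once a sample occupies more than one byte. The Python sketch below illustrates the unit conversion; the function name is hypothetical, and the two-bytes-per-sample figure used for 10-bit surfaces is an assumption, not something stated in the commit.

```python
def pitch_in_pixels(pitch_bytes, bytes_per_pixel):
    """Convert a row pitch from bytes to pixels."""
    if bytes_per_pixel <= 0 or pitch_bytes % bytes_per_pixel:
        raise ValueError("pitch is not a whole number of pixels")
    return pitch_bytes // bytes_per_pixel


# With one byte per pixel, both units give the same number, so a pitch
# passed in bytes goes unnoticed.
same = pitch_in_pixels(1920, 1)

# Assuming two bytes per sample, passing bytes would double the pitch.
halved = pitch_in_pixels(3840, 2)
```

Both calls describe a row of 1920 pixels, which is why the bug stayed hidden while only one-byte-per-pixel surfaces were in use.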
* gallium/util: replace pipe_thread_setname() with u_thread_setname() | Timothy Arceri | 2017-03-12 | 1 | -1/+1
| They do the same thing; we just moved the function to be accessible to all of Mesa.
|
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: replace pipe_thread_create() with u_thread_create() | Timothy Arceri | 2017-03-12 | 4 | -4/+4
| They do the same thing; we just moved the function to be accessible to all of Mesa.
|
| Reviewed-by: Marek Olšák <[email protected]>
* svga: remove shebang from svgadump/svga_dump.py | Emil Velikov | 2017-03-10 | 1 | -1/+0
| Analogous to earlier commit(s).
|
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
* svga: remove execute bit from svga_dump.py | Emil Velikov | 2017-03-10 | 1 | -0/+0
| The file is used to generate svgadump/svga_dump.c... in theory at least.
|
| Atm. the file is checked in-tree but that is about to change in later commits. As we get to that we'll use $PYTHON2 or equivalent as used throughout the tree.
|
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
* freedreno: remove shebang from ir3_nir_trig.py | Emil Velikov | 2017-03-10 | 1 | -1/+0
| Analogous to earlier commit(s).
|
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
* freedreno: remove execute bit from ir3_nir_trig.py | Emil Velikov | 2017-03-10 | 1 | -0/+0
| The file is meant to be called with $(PYTHON2) and not executed directly.
|
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
* nv50/ir: check for origin insn in findOriginForTestWithZero | Pierre Moreau | 2017-03-09 | 1 | -0/+2
| Function arguments do not have an "origin" instruction, causing a NULL-pointer dereference without this check.
|
| Signed-off-by: Pierre Moreau <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* swr: s/uint/enum pipe_render_cond_flag/ | Vinson Lee | 2017-03-08 | 1 | -1/+1
| Fix build error.
|
| swr_context.cpp: In function ‘void swr_blit(pipe_context*, const pipe_blit_info*)’:
| swr_context.cpp:336:44: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_render_cond_flag’ [-fpermissive]
|     ctx->render_cond_mode);
|     ~~~~~^~~~~~~~~~~~~~~~
|
| Fixes: b0d39384307d ("gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()")
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100133
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* vc4: Fix math with a condition flag set. | Eric Anholt | 2017-03-08 | 2 | -3/+18
| Math results land in r4, regardless of the condition. To implement them, we just need to ensure that the results are moved out of r4 (as often happens anyway, when the value is live across another math instruction), so that we can attach the condition to the MOV.
|
| Fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.93 and a couple others, that were assertion failing that their conditions hadn't been handled during the QIR->QPU stage.
* vc4: Fix register pressure cost estimates when a src appears twice. | Eric Anholt | 2017-03-08 | 1 | -3/+13
| This ended up confusing the scheduler for things like fabs (implemented as fmaxabs x, x) or squaring a number, and it would try to avoid scheduling them because it appeared more expensive than other instructions.
|
| Fixes failure to register allocate in dEQP-GLES2.functional.uniform_api.random.3 with almost no shader-db effects (+.35% max temps)
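The effect of counting a repeated source twice can be shown with a toy estimate. The Python sketch below is not vc4's scheduler: it assumes a bottom-up scheduler in which an instruction's cost is the number of temporaries it makes newly live minus the destination it retires, and every name in it is hypothetical.

```python
def pressure_cost_naive(dst, srcs, live):
    """Counts every source operand, so `op x, x` pays for x twice."""
    cost = -1 if dst in live else 0  # scheduling the def ends dst's live range
    for src in srcs:
        if src not in live:
            cost += 1  # this source becomes live above the instruction
    return cost


def pressure_cost(dst, srcs, live):
    """Counts each distinct source temporary once."""
    cost = -1 if dst in live else 0
    for src in set(srcs):
        if src not in live:
            cost += 1
    return cost
```

Under this model, squaring a value (`mul t2, t1, t1`) with `t2` live is neutral once duplicates are merged, whereas the naive count rates it the same as an instruction that brings two different temporaries to life.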
* vc4: Report to shader-db how many threads a fragment shader has. | Eric Anholt | 2017-03-08 | 1 | -0/+7
| Doing instruction count analysis when we emit the thread switches that will save us from tons of stalls is kind of missing the point.
* Revert "vc4: Lazily emit our FS/VS input loads." | Eric Anholt | 2017-03-08 | 4 | -93/+75
| This reverts commit 292c24ddac5acc35676424f05291c101fcd47b3e. It broke a lot of GLES2 deqp, and I see at least one problem that will require some serious rework to fix.
* radeonsi: fix elimination of literal VS outputs | Marek Olšák | 2017-03-08 | 1 | -4/+7
| broken when switched to the new intrinsics.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
* android: r600: fix libmesa_amd_common dependency | Mauro Rossi | 2017-03-08 | 1 | -0/+1
| Adding the libmesa_amd_common dependency and exporting its headers avoids the following build error:
|
| external/mesa/src/gallium/drivers/r600/evergreen_compute.c:29:10: fatal error: 'ac_binary.h' file not found
|          ^
| 1 error generated.
|
| Fixes: 3bbbb63 "automake: r600: radeonsi: correctly manage libamd_common.la linking"
| Fixes: 503fb13 "radeon/ac: switch to ac_shader_binary_config_start()"
|
| v2 [Emil Velikov: drop unneeded LOCAL_EXPORT_C_INCLUDE_DIRS]
|
| Signed-off-by: Emil Velikov <[email protected]>
* radeonsi: s/uint/enum pipe_shader_type/ | Brian Paul | 2017-03-08 | 2 | -2/+4
| This can probably be done in more places in the driver.
|
| Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition() | Brian Paul | 2017-03-08 | 13 | -13/+13
| Reviewed-by: Edward O'Callaghan <[email protected]>