summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINTMarek Olšák2015-10-2015-1/+42
| | | | | | | | | | | | | | This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720 Cc: 11.0 10.6 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* vc4: Switch our vertex attr lowering to being NIR-based.Eric Anholt2015-10-202-143/+200
| | | | | | | | | | This exposes more information to NIR's optimization, and should be particularly useful when we do range-based optimization. total uniforms in shared programs: 32066 -> 32065 (-0.00%) uniforms in affected programs: 21 -> 20 (-4.76%) total instructions in shared programs: 93104 -> 92630 (-0.51%) instructions in affected programs: 31901 -> 31427 (-1.49%)
* vc4: Add limited support for ibfe/ubfe.Eric Anholt2015-10-201-0/+42
| | | | | This is just enough to cover our unpack modes, which will be used by some new NIR-based lowering in the next commit.
* tgsi/scan: use properties for clip/cull distance writemasksMarek Olšák2015-10-201-14/+14
| | | | | | No changes needed for drivers already relying on tgsi_shader_info. Reviewed-by: Brian Paul <[email protected]>
* gallium: add new properties for clip and cull distance usageMarek Olšák2015-10-203-1/+15
| | | | | | | | The TGSI usage mask can't be used, because these are declared as an output array of 2 elements. Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* radeonsi: enable BC_OPTIMIZE if centroid isn't usedMarek Olšák2015-10-201-1/+5
| | | | | | This solution was recommended by a Catalyst developer. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix the export_prim_id field size in the shader keyMarek Olšák2015-10-201-2/+2
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: support thread-safe shaders shared by multiple contextsMarek Olšák2015-10-209-199/+224
| | | | | | | | | | | | The "current" shader pointer is moved from the CSO to the context, so that the CSO is mostly immutable. The only drawback is that the "current" pointer isn't saved when unbinding a shader and it must be looked up when the shader is bound again. This is also a prerequisite for multithreaded shader compilation. Reviewed-by: Michel Dänzer <[email protected]>
* gallium: add PIPE_CAP_SHAREABLE_SHADERSMarek Olšák2015-10-2015-0/+16
| | | | | | I'll let drivers figure out how to do it. Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi: add support for ARB_texture_viewMarek Olšák2015-10-202-7/+22
| | | | | | | | | | | | All tests pass. We don't need to do much - just set CUBE if the view target is CUBE or CUBE_ARRAY, otherwise set the resource target. The reason this can be so simple is that texture instructions have a greater effect on the target than the sampler view. Thanks Glenn for the piglit test. Reviewed-by: Michel Dänzer <[email protected]>
* vc4: Use nir_foreach_variableBoyan Ding2015-10-203-7/+7
| | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* st/omx/dec/h264: fix field picture type 0 poc disorderLeo Liu2015-10-191-4/+8
| | | | | | Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Cc: "10.6 11.0" <[email protected]>
* scons: Build nir/glsl_types.cpp once.Jose Fonseca2015-10-196-27/+2
| | | | | | | | | | | | Undoes early hacks, and ensures nir/glsl_types.cpp is built once, and only once. The root problem is that SCons doesn't know about NIR nor any source file in the NIR_FILES source list. Tested with libgl-gdi and libgl-xlib scons targets. Reviewed-by: Brian Paul <[email protected]>
* svga: fix incorrect round-down arithmeticBrian Paul2015-10-191-1/+1
| | | | | | | | | Spotted by Roland. Luckily, this code should never really be hit since the const buffer size and offset should already be multiples of 16. I could probably add more assertions to that effect, but let's just fix the arithmetic for now. Reviewed-by: Roland Scheidegger <[email protected]>
* st/va: Added support for NV12 to IYUV conversion in vlVaGetImageIndrajit Das2015-10-191-3/+5
| | | | Reviewed-by: Christian König <[email protected]>
* st/va: Used correct parameter to derive the value of the "h" variable in ↵Indrajit Das2015-10-191-1/+1
| | | | | | | | vlVaCreateImage Cc: "11.0" <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* ilo: set VME for 3DSTATE_PSChia-I Wu2015-10-181-1/+6
| | | | | When the bit is not set, we can see sampling artifacts on triangle edges when the mip filter is not GEN6_MIPFILTER_NONE.
* ilo: ignore prefer_linear_threshold when zeroChia-I Wu2015-10-182-3/+3
| | | | This was the intended behavior but it did not work as intended until now.
* ilo: remove some unused kernel paramsChia-I Wu2015-10-182-22/+0
|
* ilo: remove unused ilo_shader_get_type()Chia-I Wu2015-10-182-12/+0
|
* ilo: remove u_debug.h inclusion from ilo_core.hChia-I Wu2015-10-182-1/+2
| | | | Move it to ilo_debug.h.
* ilo: remove u_memory.h inclusion from ilo_core.hChia-I Wu2015-10-183-1/+3
| | | | We do not make allocations generally in the core.
* nvc0: do not bind input params at compute state init on FermiSamuel Pitoiset2015-10-181-8/+0
| | | | | | | | | | | | | | | | | | It looks like binding a constant buffer on compute overwrites the 3D state. To avoid that, we already re-bind all the 3D constant buffers after launching a compute grid but this is not enough. Binding the constant buffer of input parameters for the compute state at initialization corrupts the 3D constant buffers, and it's just useless to bind it because this is not needed until we really launch a grid. This fixes some piglit regressions related to interpolation tests introduced in "nvc0: enable compute support by default on Fermi". Fixes: 00d6186 (nvc0: enable compute support by default on Fermi) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* radeonsi: don't use the AMDGPU intrinsic for CMPMarek Olšák2015-10-171-9/+22
| | | | | | | No difference according to shader-db. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeonsi: use LRP from gallivmMarek Olšák2015-10-171-2/+0
| | | | | | | | | | | | | | | | | | Totals: SGPRS: 344552 -> 344368 (-0.05 %) VGPRS: 197132 -> 197552 (0.21 %) Code Size: 7375376 -> 7366304 (-0.12 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1679360 -> 1615872 (-3.78 %) bytes per wave Totals from affected shaders: SGPRS: 47736 -> 47552 (-0.39 %) VGPRS: 27952 -> 28372 (1.50 %) Code Size: 1392724 -> 1383652 (-0.65 %) bytes LDS: 39 -> 39 (0.00 %) blocks Scratch: 513024 -> 449536 (-12.38 %) bytes per wave Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't emit AMDGPU intrinsics for integer abs, min, maxMarek Olšák2015-10-171-10/+50
| | | | | | | No difference according to shader-db. (with the new S_ABS_I32 pattern) Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeonsi: don't emit AMDGPU intrinsics for EX2, ROUND, TRUNCMarek Olšák2015-10-171-3/+3
| | | | | | | No difference according to shader-db. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeonsi: initialize output, temp, and address registers to "undef"Marek Olšák2015-10-171-4/+15
| | | | | | | | | | | | | | | | | | | | | This removes "v_mov v0, 0" which typically occurs before exports. Totals: SGPRS: 345216 -> 344552 (-0.19 %) VGPRS: 197684 -> 197132 (-0.28 %) Code Size: 7390408 -> 7375376 (-0.20 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1842176 -> 1679360 (-8.84 %) bytes per wave Totals from affected shaders: SGPRS: 101336 -> 100672 (-0.66 %) VGPRS: 53920 -> 53368 (-1.02 %) Code Size: 2170176 -> 2155144 (-0.69 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 1015808 -> 852992 (-16.03 %) bytes per wave Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* gallivm: implement the correct version of LRPMarek Olšák2015-10-171-6/+13
| | | | | | | | | The previous version has precision issues. This can be a problem with tessellation. Sadly, I can't find the article where I read it anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert this. v2: added the comment
* gallivm: set correct opcode info from unary/binary/ternary emitsMarek Olšák2015-10-171-3/+6
| | | | | | | | and clear the emit_data structure. The new radeonsi min/max opcode implementation requires this. (it looks good according to Roland S.)
* radeonsi: implement vertex color clampingMarek Olšák2015-10-175-4/+52
| | | | | | This is only supported in the compatibility profile (without GS and tess). Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement fragment color clampingMarek Olšák2015-10-176-2/+18
| | | | | | using the shader key for now. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: clean up other scratch buffer functionsMarek Olšák2015-10-171-15/+8
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: clean up copy-pasted scratch buffer updatesMarek Olšák2015-10-171-26/+13
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: unify shader create functionsMarek Olšák2015-10-171-40/+9
| | | | | | The shader specifies the processor type, so use that instead. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: unify shader delete functionsMarek Olšák2015-10-171-67/+17
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix a GS copy shader leakMarek Olšák2015-10-171-1/+3
| | | | | Cc: [email protected] Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove an unused ctx parameter in si_shader_destroyMarek Olšák2015-10-174-6/+6
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: print export_prim_id from the shader keyMarek Olšák2015-10-171-0/+2
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: disable NaNs for LS and HSMarek Olšák2015-10-171-2/+4
| | | | | | | They're disabled for all other shaders except compute, but I forgot to do this for tess stages. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: clean up si_llvm_init_export_argsMarek Olšák2015-10-171-42/+35
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* tgsi: move pipe_shader_from_tgsi_processor function to utilMarek Olšák2015-10-172-24/+24
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* gallium/hud: fix possible NULL pointer dereferenceMarek Olšák2015-10-171-0/+3
| | | | Trivial.
* scons: fix MSVC, MinGW buildBrian Paul2015-10-174-2/+21
| | | | Duplicate the glsl_types_hack.cpp work-around from the libgl-xlib target.
* nvc0: add support for performance monitoring metrics on FermiSamuel Pitoiset2015-10-174-3/+500
| | | | | | | | | As explained in the CUDA toolkit documentation, "a metric is a characteristic of an application that is calculated from one or more event values." Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: (mostly) remove libglsl_utilRob Clark2015-10-164-5/+1
| | | | | | | | | | | Now that NIR does not depend on glsl, we can (mostly[*]) get rid of the libglsl_util hack. [*] glsl_compiler is the one remaining user of libglsl_util Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* nir: remove dependency on glslRob Clark2015-10-162-0/+6
| | | | | | | | | | | | | | | Move glsl_types into NIR, now that the dependency on glsl_symbol_table has been split out. Possibly makes sense to rename things at this point, but if we do that I'd like to keep it split out into a separate patch to make git history easier to follow (IMHO). v2: fix android build v3: I f***ing hate scons.. but at least it builds Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* tgsi: initialize ctx.file in tgsi_dump_instruction()Brian Paul2015-10-161-0/+1
| | | | | Fixes segfault because of uninitialized file pointer. Trivial.
* nvc0: add a note about MP counters on GF100/GF110Samuel Pitoiset2015-10-161-0/+5
| | | | | | | | | MP counters on GF100/GF110 (compute capability 2.0) are buggy because there is a context-switch problem that we need to fix. Results might be wrong sometimes, be careful! Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add MP counters variants for GF100/GF110Samuel Pitoiset2015-10-162-77/+483
| | | | | | | | | GF100 and GF110 chipsets are compute capability 2.0, while the other Fermi chipsets are compute capability 2.1. That's why, some MP counters are different between these chipsets and we need to handle variants. Signed-off-by: Samuel Pitoiet <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>