summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/a3xx: LAYERSZ2 appears to have no effect on arraysIlia Mirkin2015-03-281-2/+1
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* llvmpipe: simplify address calculation for 4x4 blocksRoland Scheidegger2015-03-284-76/+35
| | | | | | | | | | | | | | These functions looked quite complicated, even though what they actually did was trivial (ever since we dropped swizzled rendering). Also drop lookup of format block per bytes done for each block, and do it once per scene instead. This improves everybody's favorite "benchmark" by 3% or so, though lp_rast_shade_quads_all() which calls this shows up still quite high for a function which does little more than call the jit function. (This would most likely be much better handled by the jit function itself, the strides are passed through anyway already, though for being able to handle layers it would definitely add some complexity.) Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix texture function name (key) when using txf/ldRoland Scheidegger2015-03-281-2/+5
| | | | | | | | | | When using the texel fetch functions rather than ordinary texturing, the arguments are all int vecs instead of float vecs, not to mention the actual function would look completely different. Hence this must be included in the texture function name (which serves as the key) otherwise things crash badly when a shader accesses the same texture and sampler unit with both txf/ld and ordinary texturing instructions with otherwise matching keys.
* nv50/ir/gk110: fix offset flag position for TXD opcodeIlia Mirkin2015-03-271-0/+1
| | | | | Cc: "10.4 10.5" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: take postFactor into account when doing peephole optimizationsIlia Mirkin2015-03-271-4/+8
| | | | | | | | | | Multiply operations can have a post-factor on them, which other ops don't support. Only perform the peephole optimizations when there is no post-factor involved. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89758 Cc: "10.4 10.5" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
* gallivm: Fix build since llvm r233411Jan Vesely2015-03-271-0/+4
| | | | | Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* gallivm: use llvm function calls for texturing instead of inliningRoland Scheidegger2015-03-272-26/+438
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are issues with inlining everything, most notably llvm will use much more memory (and be slower) when compiling. Ideally we'd probably use functions for shader functions too but texture sampling usually is responsible for quite some IR (it can easily reach 80% of total IR instructions) so this seems like a good start. This still generates a different function for all different combinations just like before, however it is possible llvm is missing some optimization opportunities - it is believed though such opportunities should be somewhat rare, but at least for now it can still be switched off (at compile time only). It should probably make compiled code also smaller because the same function should be used for different variants in the same module (so for the opaque/partial or linear/elts variants). No piglit change (though it does indeed speed up unrealistic tests like fp-indirections2 by a factor of 30 or so). Has a small negative performance impact in openarena - I suspect this could be fixed by running some IPO passes (despite the private linkage, llvm right now does NO optimization at all wrt anything going past the call, even if there's just one caller - so things like values stored before the call and then always written by the function etc. will not be optimized away, nor will dead arguments (which we mostly shouldn't have) be eliminated, always constant arguments promoted etc.). v2: use proper return values instead of pointer function arguments. llvm supports aggregate return values, which do wonders here eliminating unnecessary stack variables - everything in fact will be returned in registers even without any IPO optimizations. It makes the code simpler too. With this I could not measure a peformance impact in openarena any longer (though since there's still no constant value propagation etc. into the tex functions this does not mean it couldn't have a negative impact elsewhere). v3: fix some minor issues suggested by Jose, and do disassembly (and the profiling) without hacks. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: pass jit_context pointer through to samplingRoland Scheidegger2015-03-2711-112/+171
| | | | | | | | | | | | | | The callbacks used for getting the dynamic texture/sampler state were using the jit_context from the generated jit function. This works just fine, however that way it's impossible to generate separate functions for texture sampling, as will be done in the next commit. Hence, pass this pointer through all interfaces so it can be passed to a separate function (technically, it would probably be possible to extract this pointer from the current function instead, but this feels hacky and would probably require some more hacks if we'd use real functions instead of inlining all shader functions at some point). There should be no difference in the generated code for now. Reviewed-by: Jose Fonseca <[email protected]>
* gallium/vl: partially revert "Use util_cpu_to_le{16,32} in many more places."Christian König2015-03-271-1/+5
| | | | | | | | The data in memory is in big endian format and needs to be converted into CPU byte order. So the patch actually reversed what needs to be done. Signed-off-by: Christian König <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* tgsi: fix out-of-bounds access for cube arraysIlia Mirkin2015-03-261-1/+1
| | | | | | | | | | The CUBE_ARRAY case uses r[4]. Make sure that the stack variable is there. Noticed by Coverity. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/util: remove u_linkageIlia Mirkin2015-03-265-220/+0
| | | | | | | | Does not appear to be used in tree. Coverity spotted some errors in the bitmask stuff, but the whole thing appears to be unused. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/hud: avoid overflowing hud graph name sizeIlia Mirkin2015-03-261-1/+2
| | | | | | | Spotted by Coverity. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: Use HAVE___BUILTIN_FFS* macros.Jonathan Gray2015-03-241-2/+16
| | | | | | | | | Make use of the builtin ffs macros and split out ffsll to a seperate block. Needed for at least OpenBSD which does not have ffsll in libc. Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* vc4: Add a dump-the-surface-contents routine.Eric Anholt2015-03-242-0/+101
| | | | | This has been useful once again while trying to debug stride issues between render targets and texturing.
* vc4: Allow DRI3 on simulation, as well.Eric Anholt2015-03-241-0/+5
| | | | The problem I'd seen before seems to be gone.
* vc4: Fix pitch alignment of linear textures.Eric Anholt2015-03-241-1/+1
| | | | | Fixes some non-power-of-two texture rendering when I force ARGB8888 to raster.
* vc4: Write the alignment of level width consistently in validation.Eric Anholt2015-03-241-2/+2
| | | | | | 16 / cpp happens to be the same as utile_w on the only raster format supported (4 bytes per pixel), but simulator/hw source code generally talks in terms of utiles.
* vc4: Fix use of a bool as an enum.Eric Anholt2015-03-241-1/+1
| | | | The enum compared to was 0, so it worked out, but it sure looked wrong.
* vc4: Decide the HW's format before laying out the miptree.Eric Anholt2015-03-241-3/+3
| | | | | | I'm experimenting with a workaround for raster texture misrendering on hardware, and this lets me look at the format chosen when computing strides.
* vc4: Use our device-specific ioctls for create/mmap.Eric Anholt2015-03-241-15/+36
| | | | | | They don't do anything special for us, but I've been told by kernel maintainers that relying on dumb for my acceleration-capable buffers is not OK.
* vc4: Make a new #define for making code conditional on the simulator.Eric Anholt2015-03-243-15/+25
| | | | | | I'd like to compile as much of the device-specific code as possible when building for simulator, and using if (using_simulator) instead of ifdefs helps.
* vc4: Add some useful debug printfs for miptrees.Eric Anholt2015-03-241-0/+37
| | | | I keep rewriting these.
* Revert "nv50,nvc0: remove bogus 64_FLOAT formats"Ilia Mirkin2015-03-231-0/+5
| | | | | | | | | | | | | | | This reverts commit 20346808cf4f1ee4f320afaf18f94043fb146f2e. The conversion is actually done since these are the *B macro variants and no vtx format is supplied, which makes them go through the translate module. This restores the following piglit tests to passing: draw-vertices user gl-2.0-vertexattribpointer Signed-off-by: Ilia Mirkin <[email protected]>
* clover: Return 0 as storage size for local kernel args that are not set v2Tom Stellard2015-03-231-1/+1
| | | | | | | | | | | | | | | | | The storage size for local kernel args can be queried before the arguments are set by using the CL_KERNEL_LOCAL_MEM_SIZE param of clGetKernelWorkGroupInfo(). The spec says that if local kernel arguments have not been specified, then we should assume their size is 0. v2: - Implement using c++11 member initialization. Reviewed-by: Jan Vesely <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Cc: 10.5 10.4 <[email protected]>
* gallivm: Use MCInstrInfo in the disassembler for querying instruction infoTom Stellard2015-03-231-7/+1
| | | | This fixes the build since llvm r232885 and also simplifies the code.
* clover: use get_device_vendor instead of get_vendorGiuseppe Bilotta2015-03-231-1/+1
| | | | | | | | | | | | The pipe's get_vendor method returns something more akin to a driver vendor string in most cases, instead of the actual device vendor. Use get_device_vendor instead, which was introduced specifically for this purpose. Signed-off-by: Giuseppe Bilotta <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* gallium: implement get_device_vendor() for existing driversGiuseppe Bilotta2015-03-2313-0/+83
| | | | | | | | | The only hackish ones are llvmpipe and softpipe, which currently return the same string as for get_vendor(), while ideally they should return the CPU vendor. Signed-off-by: Giuseppe Bilotta <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* gallium: introduce get_device_vendor() entrypoint for pipesGiuseppe Bilotta2015-03-232-0/+14
| | | | | | | | | This will be needed by Clover to return the correct information to CL_DEVICE_VENDOR info queries. Signed-off-by: Giuseppe Bilotta <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* gallium: remove trailing whitespace in p_screen.hGiuseppe Bilotta2015-03-231-1/+1
| | | | | | Signed-off-by: Giuseppe Bilotta <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* clover: The unit for CL_DEVICE_MEM_BASE_ADDR_ALIGN is bits not bytesTom Stellard2015-03-231-0/+3
| | | | | Reviewed-by: Jan Vesely <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* clover: Add all the mandatory 1.1 extensions to the extension stringTom Stellard2015-03-231-1/+7
| | | | | Reviewed-by: Jan Vesely <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* clover: Add a space at the end of CL_DEVICE_OPENCL_C_VERSIONTom Stellard2015-03-231-1/+1
| | | | | | | This is required by the spec. Reviewed-by: Jan Vesely <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* gallivm: Silence unused variable warnings on release builds.Jose Fonseca2015-03-222-0/+4
| | | | Reviewed-by: Brian Paul <[email protected]>
* galahad: actually remove the driverEmil Velikov2015-03-2110-1998/+0
| | | | | | | | Should have been part of 429a4355259(galahad: remove driver). Seems like I've erroneously committed the trimmed patch. Reported-by: Marek Olšák <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* galahad: remove driverEmil Velikov2015-03-2112-40/+12
| | | | | Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]>
* gallium/docs: remove information about identity driverEmil Velikov2015-03-211-7/+0
| | | | | | | Removed from tree. Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]>
* winsys/sw/fbdev: remove unused software winsysEmil Velikov2015-03-216-343/+0
| | | | | | | st/egl was its only user. Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]>
* winsys/sw/wayland: remove unused winsysEmil Velikov2015-03-215-367/+0
| | | | | | | st/egl was its only user. Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]>
* st/gbm: remove state-trackerEmil Velikov2015-03-219-543/+0
| | | | | | | st/egl was its only user. Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]>
* llvmpipe: use global llvm context for PIPE_SUBSYSTEM_EMBEDDEDRoland Scheidegger2015-03-211-0/+11
| | | | | | | | | | | | | | | | | There's 2 reasons why we'd want to use the global context: 1) There still seems to be one memory "leak" left when using multiple llvm contexts (it is not a true leak as the memory disappears into some still addressable pool but nevertheless the memory consumption grows). See http://cgit.freedesktop.org/~jrfonseca/llvm-jitstress/ 2) These contexts get kinda big - even when disposing modules etc. after compiling a shader the LLVMContext can easily be over 100kB. So when there's lots of llvm contexts arounds it adds up. The downside is that at least right now this is absolutely not thread safe, so this only works safely in environments where multiple pipe contexts are not used concurrently. Reviewed-by: Jose Fonseca <[email protected]>
* u_primconvert: add primitive restart supportDave Airlie2015-03-207-87/+203
| | | | | | | | | | | | | | | | | | | | This add primitive restart support to the prim conversion. This involves changing the API for the translate functions as we need to pass the prim restart index and the original number of indices into the translate functions. primitive restart is support for quads, quad strips and polygons. This deal with the case where the actual number of output primitives is less than the initially calculated number, by filling the rest of the output primitives with the restart index, the other option is to reduce the output prim number, but that will make the generator code a bit messier. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallivm: remove unused 'builder' variableBrian Paul2015-03-191-1/+0
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Use INFINITY directly.Jose Fonseca2015-03-181-8/+1
| | | | | | Already done below. Reviewed-by: Brian Paul <[email protected]>
* freedreno/ir3: fix infinite recursion in schedRob Clark2015-03-181-1/+1
| | | | | | | One more case we need to handle. One of the src instructions for the indirect could also end up being ourself. Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix spellingRob Clark2015-03-181-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coordsMarek Olšák2015-03-182-2/+2
| | | | | | | | | radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.) Discovered by Coverity. Reported by Ilia Mirkin. Cc: 10.5 10.4 <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* r600g: constify r600_shader_tgsi_instruction lists.Emil Velikov2015-03-171-5/+5
| | | | | | | Massive list of constant data. Annotate it as such. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* r600g: kill off r600_shader_tgsi_instruction::{tgsi_opcode,is_op3}Emil Velikov2015-03-171-591/+589
| | | | | | | | Both of which are no longer used. Use designated initializer to make things obvious as people add/remove TGSI_OPCODEs. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* r600g: use the tgsi opcode from parse.FullToken.FullInstructionEmil Velikov2015-03-171-5/+8
| | | | | | | | | ... rather than the local one in inst_info->tgsi_opcode. This will allow us to simplify struct r600_shader_tgsi_instruction. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallivm: abort properly when running out of buffer space in lp_disassemblyRoland Scheidegger2015-03-171-4/+8
| | | | | | | Before this actually ran into an infinite loop printing out "invalid"... Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>