summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* st/nine: Fix use of D3DSP_NOSWIZZLEAxel Davy2015-02-061-1/+1
| | | | | | | | | D3DSP_NOSWIZZLE already contains the shift. Detected with Clang. Reviewed-by: Tiziano Bacocco <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* st/nine: Check for the correct number of constants.Axel Davy2015-02-062-4/+5
| | | | | | | | | | | | | | | | | This removes unneeded hack for Anno 1404. This app is not checking the number of supporting constants, and rely on the shader compilation to fail if it puts too many constants. This patch also checks for the correct number of constants for ps. Note that we don't check the official limitations for old vs and ps versions. The restrictions were fixed, unlike for the number of vertex shader constants for later versions. Likely apps use the correct number, and it's not a problem for us if it wants use more. Reviewed-by: Tiziano Bacocco <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* st/nine: Introduce failure handling for shader parsing.Axel Davy2015-02-061-8/+30
| | | | | | | | | Instead of crashing on buggy shaders, we should return an error. This patch introduces this behaviour in the case of invalid constant access Reviewed-by: Tiziano Bacocco <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* st/nine: Print warnings for r500 when shader is likely to go wrongAxel Davy2015-02-061-0/+6
| | | | | | | | | | r500 hasn't enough float constants for vs to fill all needs. Overlapping issues can happen with complex shaders. The fix would be to recompile shaders to include the integer and boolean constants, instead of reserving slots for them. Reviewed-by: Tiziano Bacocco <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* st/nine: Declare constants only up to the maximum needed.Axel Davy2015-02-061-27/+11
| | | | | | | | | | | | | | | Previously 276 constants were declared everytime. This patch makes shaders declare constants up to the maximum constant needed and moves the moment we print the TGSI shader after the moment we declare the constants. This is needed for r500, since when indirect addressing is used, it cannot reduce the amount of constants needed, and that it is restricted to 256 constant slots. Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* st/nine: Refactor how user constbufs sizes are calculatedAxel Davy2015-02-065-23/+24
| | | | | | | | Count explicitly the slots for float, int and bool constants, and deduce the constbuf size in nine_shader. Reviewed-by: Tiziano Bacocco <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* st/nine: Explicit nine requirementsAxel Davy2015-02-062-49/+66
| | | | | | | | | | | | | | | | | | | | This patch raises nine requirements and disables nine for old hw that don't match them. Currently for these cards only games that don't have tight requirements would work well with nine. However nine is missing several checks regarding these limitations. To make code and future patches less heavy, dropping support for these old card seems a good solution. That makes r500 the only dx9 generation cards supported by nine. It seems the one with the less limitations for nine. Still not everything is ok, and we'll have for example to implement shader recompilation for these cards to include integer and boolean constants in the shader. Eventually when this is done, we can reintroduce support for older cards. Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* gallium: Add MULTISAMPLE_Z_RESOLVE capAxel Davy2015-02-0615-0/+20
| | | | | | | | | | | | | | | | Resolving a multisampled depth texture into a single sampled texture is supported on >= SM4.1 hw. It is possible some previous hw support it. The ability was tested on radeonsi and nvc0. Apparently is is also supported for radeon >= r700. This patch adds the MULTISAMPLE_Z_RESOLVE cap and add it to the drivers. It is advertised for drivers for which it is sure the ability is supported. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* llvmpipe: Trivially advertise PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT.Jose Fonseca2015-02-051-1/+2
| | | | | | | | | | | Nothing special needs to be done. Even though llvmpipe copies constant (ie uniform) buffers internally, the application is supposed to flush and sync, so all should work. All bufferstorage piglit tests pass. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/util: Don't implement u_bit_scan64 on MSVC.Jose Fonseca2015-02-041-0/+2
| | | | | | | | | | As ffsll doesn't exist in MSVC yet, and u_bit_scan64 is only used by radeonsi which is never built with MSVC. This is just a stop-gap fix to unbreak MSVC build until we refactor these mathematical portability wrappers into src/util. Trivial.
* gallium/util: Define ffsll on MinGW.Jose Fonseca2015-02-041-0/+1
| | | | | | Trivial. (Fixing MSVC will be far less so, as _BitScanForward64 is only supported on x64.)
* radeonsi: implement polygon stipplingMarek Olšák2015-02-047-5/+79
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add polygon stipple texture slotMarek Olšák2015-02-041-5/+8
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: deduce rasterizer primitive type at the beginning of draw_vboMarek Olšák2015-02-042-13/+17
| | | | | | I will need this for polygon stippling. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: allow 64 descriptors per arrayMarek Olšák2015-02-042-34/+34
| | | | | | | We need a slot for the stipple texture and the pixel shader already uses 32 textures (16 API slots + 16 FMASK slots). Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add support for sampler views where resource = NULLMarek Olšák2015-02-042-6/+22
| | | | | | | The hardware obeys swizzles even if the resource is NULL. This will be used by set_polygon_stipple. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add support for NULL texture sampler views that return (0,0,0,1)Marek Olšák2015-02-041-2/+28
| | | | | | This used to hang. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix a crash when binding a NULL sampler view listMarek Olšák2015-02-041-1/+1
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move the buffer descriptor to the end of the image descriptorMarek Olšák2015-02-044-7/+9
| | | | | | This will allow supporting NULL textures. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't use tgsi_parse_context to get processor typeMarek Olšák2015-02-041-7/+1
| | | | | | Also remove unused "tokens". Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix instanced arrays with non-zero start instanceMarek Olšák2015-02-041-3/+3
| | | | | | | Fixes piglit ARB_base_instance/arb_base_instance-drawarrays. Cc: 10.3 10.4 <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: don't append to streamout buffers that haven't been used yetMarek Olšák2015-02-042-1/+4
| | | | | | | | | | | | The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it. Instead, use offset = 0, which is what we always do when not appending. This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*. Yes, the test does use transform feedback. Cc: 10.3 10.4 <[email protected]> Reviewed-by: Glenn Kennard <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* gallium: set PIPE_MAX_SAMPLERS to 18Marek Olšák2015-02-041-1/+1
| | | | | | | For drivers that use higher slots not to crash in tgsi_shader_info. Reviewed-by: Glenn Kennard <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/u_pstipple: add ability to specify a fixed texture unitMarek Olšák2015-02-043-9/+21
| | | | | | | E.g. r600g can use slot 17, which is outside of the API range. Reviewed-by: Glenn Kennard <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/util: add u_bit_scan64Marek Olšák2015-02-041-0/+7
| | | | | | | Same as u_bit_scan, but for uint64_t. Reviewed-by: Glenn Kennard <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* tgsi: add tgsi_get_processor_type helper from radeonMarek Olšák2015-02-043-11/+14
| | | | | Reviewed-by: Glenn Kennard <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* st/osmesa: Fix osbuffer->textures indexingPark, Jeongmin2015-02-031-1/+1
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930 Cc: 10.4 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/util: Don't use __builtin_clrsb in util_last_bit().Matt Turner2015-02-031-4/+0
| | | | | | | | Unclear circumstances lead to undefined symbols on x86. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916 Cc: [email protected] Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: add a cap to determine whether the driver supports offset_clampIlia Mirkin2015-02-0215-1/+20
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Glenn Kennard <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* dir-locals.el: Don't set variables for non-programming modesNeil Roberts2015-02-027-7/+7
| | | | | | | | | | | | | | This limits the style changes to modes inherited from prog-mode. The main reason to do this is to avoid setting fill-column for people using Emacs to edit commit messages because 78 characters is too many to make it wrap properly in git log. Note that makefile-mode also inherits from prog-mode so the fill column should continue to apply there. v2: Apply to all the .dir-locals.el files, not just the one in the root directory. Acked-by: Michel Dänzer <[email protected]>
* vc4: Kill a bunch of color write calculation when colormask is all off.Eric Anholt2015-02-011-8/+35
| | | | | | | | | | | | | I could have done this in the bit that generates the ANDs and ORs, but it's probably generally useful. Sadly, I still need this even if I move to NIR, because I can't yet express my read of the destination color in NIR, which I would need to move my blend/logicop/colormask handling into NIR. total uniforms in shared programs: 13497 -> 13455 (-0.31%) uniforms in affected programs: 101 -> 59 (-41.58%) total instructions in shared programs: 40797 -> 40296 (-1.23%) instructions in affected programs: 1639 -> 1138 (-30.57%)
* vc4: Dump the VPM read index in QIR disasm.Eric Anholt2015-02-011-4/+9
| | | | | Since the VPM reads have to be in order, it's useful to see their indices in the dump.
* gallium/docs: fix docs wrt ARL/ARR/FLRRoland Scheidegger2015-01-291-10/+8
| | | | | | | | | since the address reg holds integer values, ARL/ARR do an implicit float-to-int conversion, so clarify that. Thus it is also incorrect to say that FLR really does the same as ARL. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* vc4: Fix point size handling when it's the first output.Eric Anholt2015-01-291-1/+1
|
* gallium: Replace u_simple_list.h with util/simple_list.hEric Anholt2015-01-2825-228/+23
| | | | | | | The code was exactly the same, except util/ has c++ guards and a struct simple_node declaration. Reviewed-by: Marek Olšák <[email protected]>
* mesa: Move simple_list.h to src/util.Eric Anholt2015-01-281-0/+1
| | | | | | We have two copies of it in the tree, I'm going to delete one. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Enable VGPR spilling for all shader types v5Tom Stellard2015-01-288-52/+217
| | | | | | | | | | | | | | | | | | | | v2: - Only emit write SPI_TMPRING_SIZE once per packet. - Use context global scratch buffer. v3: - Patch shaders using WRITE_DATA packet instead of map/unmap. - Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and VS_PARTIAL_FLUSH when patching shaders. v4: - Code cleanups. - Remove unnecessary multiplies. v5: - Patch shaders in system memory and re-upload to vram. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi/compute: Allocate the scratch buffer during state creationTom Stellard2015-01-282-24/+62
| | | | | | | | | | This moves scratch buffer allocation from si_launch_grid() to si_create_compute_state(). This helps to reduce the overhead of launching a kernel and also fixes a bug in the code that would cause the scratch buffer to be too small if a kernel with smaller scratch size was launched before a kernel with a larger scratch size. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Add radeon_shader_binary member to struct si_shaderTom Stellard2015-01-282-6/+6
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi/compute: Rename si_compute::program to si_compute::shaderTom Stellard2015-01-281-5/+5
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Avoid leaking memory when rebuilding shader statesMarek Olšák2015-01-283-4/+13
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* clover/llvm: Dump the OpenCL C code earlier.EdB2015-01-281-3/+3
| | | | | | | | [ Francisco Jerez: As discussed on the mailing list, this is intended to produce more useful debug output in cases where the compilation terminates unexpectedly. ] Reviewed-by: Francisco Jerez <[email protected]>
* clover/llvm: Move CLOVER_DEBUG stuff into anonymous namespace.EdB2015-01-281-13/+20
| | | | | | | [ Francisco Jerez: As we're at it make debug_options[] local to its only user and remove temporary. ] Reviewed-by: Francisco Jerez <[email protected]>
* r600g: add support for primitive id without geom shader (v2)Dave Airlie2015-01-286-1/+51
| | | | | | | | | | | | | | | | | | | GLSL 1.50 specifies a fragment shader may have a primitive id input without a geometry shader present. On r600 hw there is a special GS scenario for this, you have to enable GS_SCENARIO_A and pass the primitive id through the vertex shader which operates in GS_A mode. This is a first pass attempt at this, and passes the piglit tests that test for this. v1.1: clean up debug print + no need to assign key value to setup output. v2: add r600 support Reviewed-by: Glenn Kennard <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g: move selecting the pixel shader earlier.Dave Airlie2015-01-281-3/+4
| | | | | | | | | | | In order to detect that a pixel shader has a prim id input when we have no geometry shader we need to reorder the shader selection so the pixel shader is selected first, then the vertex shader key can take into account the primitive id input requirement and lack of geom shader. Reviewed-by: Glenn Kennard <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* st/clover: Pass target instead of target.begin() to std::string()Michel Dänzer2015-01-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | Fixes reading beyond allocated memory: ==1936== Invalid read of size 1 ==1936== at 0x4C2C1B4: strlen (vg_replace_strmem.c:412) ==1936== by 0x9E00C30: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20) ==1936== by 0x5B44FAE: clover::compile_program_llvm(clover::compat::string const&, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&, pipe_shader_ir, clover::compat::string const&, clover::compat::string const&, clover::compat::string&) (invocation.cpp:698) ==1936== by 0x5B39A20: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63) ==1936== by 0x5B20152: clBuildProgram (program.cpp:182) ==1936== by 0x400F41: main (hello_world.c:109) ==1936== Address 0x56fee1f is 0 bytes after a block of size 15 alloc'd ==1936== at 0x4C28C20: malloc (vg_replace_malloc.c:296) ==1936== by 0x5B398F0: alloc (compat.hpp:59) ==1936== by 0x5B398F0: vector<std::basic_string<char> > (compat.hpp:98) ==1936== by 0x5B398F0: string<std::basic_string<char> > (compat.hpp:327) ==1936== by 0x5B398F0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63) ==1936== by 0x5B20152: clBuildProgram (program.cpp:182) ==1936== by 0x400F41: main (hello_world.c:109) Reviewed-by: Francisco Jerez <[email protected]>
* r600g,radeonsi: Fix calculation of IR target cap string buffer sizeMichel Dänzer2015-01-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes writing beyond the allocated buffer: ==31855== Invalid write of size 1 ==31855== at 0x50AB2A9: vsprintf (iovsprintf.c:43) ==31855== by 0x508F6F6: sprintf (sprintf.c:32) ==31855== by 0xB59C7EC: r600_get_compute_param (r600_pipe_common.c:526) ==31855== by 0x5B2B7DE: get_compute_param<char> (device.cpp:37) ==31855== by 0x5B2B7DE: clover::device::ir_target() const (device.cpp:201) ==31855== by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63) ==31855== by 0x5B20152: clBuildProgram (program.cpp:182) ==31855== by 0x400F41: main (hello_world.c:109) ==31855== Address 0x56fed5f is 0 bytes after a block of size 15 alloc'd ==31855== at 0x4C29180: operator new(unsigned long) (vg_replace_malloc.c:324) ==31855== by 0x5B2B7C2: allocate (new_allocator.h:104) ==31855== by 0x5B2B7C2: allocate (alloc_traits.h:357) ==31855== by 0x5B2B7C2: _M_allocate (stl_vector.h:170) ==31855== by 0x5B2B7C2: _M_create_storage (stl_vector.h:185) ==31855== by 0x5B2B7C2: _Vector_base (stl_vector.h:136) ==31855== by 0x5B2B7C2: vector (stl_vector.h:278) ==31855== by 0x5B2B7C2: get_compute_param<char> (device.cpp:35) ==31855== by 0x5B2B7C2: clover::device::ir_target() const (device.cpp:201) ==31855== by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63) ==31855== by 0x5B20152: clBuildProgram (program.cpp:182) ==31855== by 0x400F41: main (hello_world.c:109) Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* clover: Fix build with llvm after r226981Jan Vesely2015-01-261-0/+4
| | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88783 Signed-off-by: Jan Vesely <[email protected]>
* st/nine: Correctly handle when ff vs should have no texture coord input/outputAxel Davy2015-01-221-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previous code semantic was: . if ff ps will not run a ff stage, then do not output texture coords for this stage for vs . if XYZRHW is used (position_t), use only the mode where input coordinates are copied to the outputs. Problem is when apps don't give texture inputs. When apps precise PASSTHRU, it means copy texture coord input to texture coord output if there is such input. The case where there is no texture coord input wasn't handled correctly. Drivers like r300 dislike when vs has inputs that are not fed. Moreover if the app uses ff vs with a programmable ps, we shouldn't look at what are the parameters of the ff ps to decide to output or not texture coordinates. The new code semantic is: . if XYZRHW is used, restrict to PASSTHRU . if PASSTHRU is used and no texture input is declared, then do not output texture coords for this stage The case where ff ps needs a texture coord input and ff vs doesn't output it is not handled, and should probably be a runtime error. This fixes 3Dmark05, which uses ff vs with programmable ps. Reviewed-by: Tiziano Bacocco <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* st/nine: Change comment relating to vertex shader inputs not matching ↵Axel Davy2015-01-221-5/+6
| | | | | | | declaration Reviewed-by: Tiziano Bacocco <[email protected]> Signed-off-by: Axel Davy <[email protected]>