path: root/src/gallium
Commit message | Author | Age | Files | Lines
* gallium/docs: fix docs wrt ARL/ARR/FLRRoland Scheidegger2015-01-291-10/+8
    Since the address register holds integer values, ARL/ARR do an implicit
    float-to-int conversion, so clarify that. It is therefore also incorrect
    to say that FLR does the same as ARL.

    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* vc4: Fix point size handling when it's the first output.Eric Anholt2015-01-291-1/+1
* gallium: Replace u_simple_list.h with util/simple_list.hEric Anholt2015-01-2825-228/+23
    The code was exactly the same, except util/ has c++ guards and a struct
    simple_node declaration.

    Reviewed-by: Marek Olšák <[email protected]>
* mesa: Move simple_list.h to src/util.Eric Anholt2015-01-281-0/+1
    We have two copies of it in the tree, and I'm going to delete one.

    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Enable VGPR spilling for all shader types v5Tom Stellard2015-01-288-52/+217
    v2:
      - Only emit write SPI_TMPRING_SIZE once per packet.
      - Use context global scratch buffer.

    v3:
      - Patch shaders using WRITE_DATA packet instead of map/unmap.
      - Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and
        VS_PARTIAL_FLUSH when patching shaders.

    v4:
      - Code cleanups.
      - Remove unnecessary multiplies.

    v5:
      - Patch shaders in system memory and re-upload to vram.

    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi/compute: Allocate the scratch buffer during state creationTom Stellard2015-01-282-24/+62
    This moves scratch buffer allocation from si_launch_grid() to
    si_create_compute_state(). This reduces the overhead of launching a
    kernel and also fixes a bug that left the scratch buffer too small when a
    kernel with a smaller scratch size was launched before a kernel with a
    larger scratch size.

    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Add radeon_shader_binary member to struct si_shaderTom Stellard2015-01-282-6/+6
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi/compute: Rename si_compute::program to si_compute::shaderTom Stellard2015-01-281-5/+5
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Avoid leaking memory when rebuilding shader statesMarek Olšák2015-01-283-4/+13
    Reviewed-by: Michel Dänzer <[email protected]>
* clover/llvm: Dump the OpenCL C code earlier.EdB2015-01-281-3/+3
    [ Francisco Jerez: As discussed on the mailing list, this is intended to
      produce more useful debug output in cases where the compilation
      terminates unexpectedly. ]

    Reviewed-by: Francisco Jerez <[email protected]>
* clover/llvm: Move CLOVER_DEBUG stuff into anonymous namespace.EdB2015-01-281-13/+20
    [ Francisco Jerez: While we're at it, make debug_options[] local to its
      only user and remove the temporary. ]

    Reviewed-by: Francisco Jerez <[email protected]>
* r600g: add support for primitive id without geom shader (v2)Dave Airlie2015-01-286-1/+51
    GLSL 1.50 specifies that a fragment shader may have a primitive id input
    without a geometry shader present. On r600 hardware there is a special GS
    scenario for this: you have to enable GS_SCENARIO_A and pass the
    primitive id through the vertex shader, which operates in GS_A mode.

    This is a first-pass attempt, and it passes the piglit tests that cover
    this case.

    v1.1: clean up debug print + no need to assign key value to setup output.
    v2: add r600 support

    Reviewed-by: Glenn Kennard <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600g: move selecting the pixel shader earlier.Dave Airlie2015-01-281-3/+4
    To detect that a pixel shader has a prim id input when there is no
    geometry shader, we need to reorder the shader selection so that the
    pixel shader is selected first. The vertex shader key can then take into
    account the primitive id input requirement and the lack of a geometry
    shader.

    Reviewed-by: Glenn Kennard <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* st/clover: Pass target instead of target.begin() to std::string()Michel Dänzer2015-01-271-3/+3
    Fixes reading beyond allocated memory:

    ==1936== Invalid read of size 1
    ==1936==    at 0x4C2C1B4: strlen (vg_replace_strmem.c:412)
    ==1936==    by 0x9E00C30: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
    ==1936==    by 0x5B44FAE: clover::compile_program_llvm(clover::compat::string const&, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&, pipe_shader_ir, clover::compat::string const&, clover::compat::string const&, clover::compat::string&) (invocation.cpp:698)
    ==1936==    by 0x5B39A20: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
    ==1936==    by 0x5B20152: clBuildProgram (program.cpp:182)
    ==1936==    by 0x400F41: main (hello_world.c:109)
    ==1936==  Address 0x56fee1f is 0 bytes after a block of size 15 alloc'd
    ==1936==    at 0x4C28C20: malloc (vg_replace_malloc.c:296)
    ==1936==    by 0x5B398F0: alloc (compat.hpp:59)
    ==1936==    by 0x5B398F0: vector<std::basic_string<char> > (compat.hpp:98)
    ==1936==    by 0x5B398F0: string<std::basic_string<char> > (compat.hpp:327)
    ==1936==    by 0x5B398F0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
    ==1936==    by 0x5B20152: clBuildProgram (program.cpp:182)
    ==1936==    by 0x400F41: main (hello_world.c:109)

    Reviewed-by: Francisco Jerez <[email protected]>
* r600g,radeonsi: Fix calculation of IR target cap string buffer sizeMichel Dänzer2015-01-271-2/+2
    Fixes writing beyond the allocated buffer:

    ==31855== Invalid write of size 1
    ==31855==    at 0x50AB2A9: vsprintf (iovsprintf.c:43)
    ==31855==    by 0x508F6F6: sprintf (sprintf.c:32)
    ==31855==    by 0xB59C7EC: r600_get_compute_param (r600_pipe_common.c:526)
    ==31855==    by 0x5B2B7DE: get_compute_param<char> (device.cpp:37)
    ==31855==    by 0x5B2B7DE: clover::device::ir_target() const (device.cpp:201)
    ==31855==    by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
    ==31855==    by 0x5B20152: clBuildProgram (program.cpp:182)
    ==31855==    by 0x400F41: main (hello_world.c:109)
    ==31855==  Address 0x56fed5f is 0 bytes after a block of size 15 alloc'd
    ==31855==    at 0x4C29180: operator new(unsigned long) (vg_replace_malloc.c:324)
    ==31855==    by 0x5B2B7C2: allocate (new_allocator.h:104)
    ==31855==    by 0x5B2B7C2: allocate (alloc_traits.h:357)
    ==31855==    by 0x5B2B7C2: _M_allocate (stl_vector.h:170)
    ==31855==    by 0x5B2B7C2: _M_create_storage (stl_vector.h:185)
    ==31855==    by 0x5B2B7C2: _Vector_base (stl_vector.h:136)
    ==31855==    by 0x5B2B7C2: vector (stl_vector.h:278)
    ==31855==    by 0x5B2B7C2: get_compute_param<char> (device.cpp:35)
    ==31855==    by 0x5B2B7C2: clover::device::ir_target() const (device.cpp:201)
    ==31855==    by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
    ==31855==    by 0x5B20152: clBuildProgram (program.cpp:182)
    ==31855==    by 0x400F41: main (hello_world.c:109)

    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Tom Stellard <[email protected]>
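    A write that lands "0 bytes after" a block typically means the reported
    size left out the string's NUL terminator. The following is a minimal
    sketch of a size query that accounts for it. The function name, its
    parameters and the "triple-gpu" format are hypothetical, not the driver's
    actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical two-step query: call with out == NULL to learn the size,
 * then call again with a buffer of at least that many bytes.
 * Returns the number of bytes needed, including the NUL terminator. */
size_t sketch_query_ir_target(char *out, const char *triple, const char *gpu)
{
    /* strlen() does not count the terminator, so add it explicitly. */
    size_t needed = strlen(triple) + 1 /* '-' */ + strlen(gpu) + 1 /* '\0' */;

    if (out)
        sprintf(out, "%s-%s", triple, gpu);
    return needed;
}

/* Small usage example. */
void sketch_query_example(void)
{
    char buf[16];
    size_t n = sketch_query_ir_target(NULL, "abc", "de");

    assert(n == 7 && n <= sizeof(buf)); /* "abc-de" plus terminator */
    sketch_query_ir_target(buf, "abc", "de");
    assert(strcmp(buf, "abc-de") == 0);
}
```

    Returning strlen() of the formatted string without the final "+ 1" would
    make the caller allocate one byte too few, which matches the kind of
    overrun valgrind reports above.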
* clover: Fix build with llvm after r226981Jan Vesely2015-01-261-0/+4
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88783
    Signed-off-by: Jan Vesely <[email protected]>
* st/nine: Correctly handle when ff vs should have no texture coord input/outputAxel Davy2015-01-221-11/+20
    The previous code semantics were:
    . if the ff ps will not run a ff stage, then do not output texture
      coords for this stage from the vs
    . if XYZRHW is used (position_t), use only the mode where input
      coordinates are copied to the outputs.

    The problem arises when apps don't give texture inputs. When an app
    specifies PASSTHRU, it means: copy the texture coord input to the texture
    coord output if there is such an input. The case where there is no
    texture coord input wasn't handled correctly. Drivers like r300 dislike
    it when the vs has inputs that are not fed.

    Moreover, if the app uses ff vs with a programmable ps, we shouldn't look
    at the parameters of the ff ps to decide whether to output texture
    coordinates.

    The new code semantics are:
    . if XYZRHW is used, restrict to PASSTHRU
    . if PASSTHRU is used and no texture input is declared, then do not
      output texture coords for this stage

    The case where the ff ps needs a texture coord input and the ff vs
    doesn't output it is not handled, and should probably be a runtime error.

    This fixes 3Dmark05, which uses ff vs with programmable ps.

    Reviewed-by: Tiziano Bacocco <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
* st/nine: Change comment relating to vertex shader inputs not matching declarationAxel Davy2015-01-221-5/+6
    Reviewed-by: Tiziano Bacocco <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
* st/nine: Allocate vs constbuf buffer for indirect addressing once.Axel Davy2015-01-223-5/+6
    When the shader does indirect addressing on the constants, we allocate a
    temporary constant buffer into which we copy the app-provided user
    constants and the constants defined in the shader.

    This patch allocates that buffer only once.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Signed-off-by: Tiziano Bacocco <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Allocate the correct size for the user constant bufferAxel Davy2015-01-223-7/+8
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Add variables containing the size of the constant buffersAxel Davy2015-01-223-6/+10
    Reviewed-by: Tiziano Bacocco <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Fix sm3 relative addressing for non-debug buildAxel Davy2015-01-221-4/+0
    Relative addressing needs the constant buffer to hold all the correct
    constants, even those defined by the shader. The code that copies the
    shader constants to the constant buffer was enabled only for debug
    builds. Enable it always.

    Cc: "10.4" <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
* st/nine: Remove unused code for psAxel Davy2015-01-223-40/+15
    Since constant indirect addressing is not allowed for ps, we can remove
    the code that handles it.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Correct rules for relative addressing and constants.Axel Davy2015-01-221-6/+8
    Relative addressing for constants is possible only for vs float
    constants.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGBAxel Davy2015-01-221-3/+36
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Implement TEXDP3TEXAxel Davy2015-01-221-1/+19
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Implement TEXDP3Axel Davy2015-01-221-1/+11
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Implement TEXDEPTHAxel Davy2015-01-221-1/+22
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Implement TEXM3x3SPECAxel Davy2015-01-221-1/+38
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Implement TEXM3x2TEXAxel Davy2015-01-221-1/+19
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: implement TEXM3x2DEPTHAxel Davy2015-01-221-1/+26
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Fix TEXM3x3 and implement TEXM3x3VSPECAxel Davy2015-01-221-17/+36
    The line "src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0.
    Instead, access tx->regs.vT directly when needed.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Fill missing dst and src number for some instructions.Axel Davy2015-01-221-23/+23
    Not filling them correctly results in bad padding and a later crash.

    Reviewed-by: David Heidelberg <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Implement TEXCOORD special behavioursAxel Davy2015-01-221-5/+26
    texcoord for ps < 1_4 should clamp the values between 0 and 1.

    texcrd (texcoord ps 1_4) does not clamp and can be used with two
    modifiers, _dw and _dz, which mean the channels are divided by w or z.
    Implement those in shared code, since the same modifiers can be used for
    texld ps 1_4.

    v2: replace DIV by RCP + MUL
    v3: Remove a useless MOV

    Reviewed-by: Tiziano Bacocco <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
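    The two behaviours can be sketched on the CPU as follows. This is an
    illustration only: the type and function names are hypothetical, and
    scaling only x and y by the reciprocal is an assumption about which
    channels the modifiers affect.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } sketch_vec4;

/* Clamp one value to [0, 1]. */
float sketch_saturate(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* texcoord for ps < 1_4: every channel is clamped to [0, 1]. */
sketch_vec4 sketch_texcoord_clamped(sketch_vec4 t)
{
    sketch_vec4 r = { sketch_saturate(t.x), sketch_saturate(t.y),
                      sketch_saturate(t.z), sketch_saturate(t.w) };
    return r;
}

/* _dw / _dz style projection: one reciprocal (RCP) followed by
 * multiplies (MUL), instead of a divide per channel. */
sketch_vec4 sketch_texcrd_div(sketch_vec4 t, float divisor)
{
    float rcp = 1.0f / divisor;  /* RCP */
    sketch_vec4 r = t;
    r.x = t.x * rcp;             /* MUL */
    r.y = t.y * rcp;             /* MUL */
    return r;
}

/* Small usage example. */
void sketch_texcoord_example(void)
{
    sketch_vec4 in = { -0.5f, 0.25f, 1.5f, 2.0f };
    sketch_vec4 c = sketch_texcoord_clamped(in);
    sketch_vec4 p = sketch_texcrd_div(in, in.w); /* _dw */

    assert(c.x == 0.0f && c.y == 0.25f && c.z == 1.0f && c.w == 1.0f);
    assert(p.x == -0.25f && p.y == 0.125f);
}
```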
* st/nine: Fix CALLNZ implementationAxel Davy2015-01-221-9/+4
    Nothing seems to indicate that the negation modifier would be stored in
    the instruction flags instead of the source modifier. tx_src_param has
    already handled it if it is in the source modifier.

    In addition, when the card supports native integers, booleans are stored
    as 32-bit ints and are equal to 0 or 0xFFFFFFFF. Given that 0xFFFFFFFF is
    NaN when interpreted as a float, it is better to use UIF than IF.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
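    The NaN observation can be checked directly. The sketch below, with
    hypothetical helper names, reinterprets the all-ones boolean as an IEEE
    754 single-precision float and contrasts that with a plain integer test,
    which is the kind of comparison UIF is described as making.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without violating aliasing rules. */
float sketch_bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Integer-style truth test: any non-zero pattern is true. */
int sketch_bool_is_true(uint32_t b)
{
    return b != 0u;
}

/* Small usage example. */
void sketch_callnz_example(void)
{
    /* Exponent bits all ones and a non-zero mantissa: a NaN. */
    assert(isnan(sketch_bits_to_float(0xFFFFFFFFu)));
    /* The integer test is unambiguous for both boolean values. */
    assert(sketch_bool_is_true(0xFFFFFFFFu));
    assert(!sketch_bool_is_true(0u));
}
```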
* st/nine: Fix some fixed function pipeline operationAxel Davy2015-01-221-2/+4
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Clamp ps 1.X constantsAxel Davy2015-01-221-0/+7
    This matches Wine (and Windows) behaviour.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Remove duplicated code for ps texcoord input declarationAxel Davy2015-01-221-8/+4
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
* st/nine: Fix CND implementationAxel Davy2015-01-221-9/+13
    Signed-off-by: Axel Davy <[email protected]>
    Signed-off-by: Tiziano Bacocco <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Match REP implementation to LOOPAxel Davy2015-01-221-19/+30
    The previous implementation behaved correctly, but improve it by:
    . Improving the documentation
    . Decrementing the counter (comparing to 0 is likely to be faster than
      comparing to a constant)
    . Moving the counter update to the end, for better performance in
      shaders that break out of the loop before the count is reached.

    Reviewed-by: Tiziano Bacocco <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
* st/nine: Rewrite LOOP implementation, and a0 aL handlingAxel Davy2015-01-221-63/+100
    The previous implementation didn't work well with nested loops. Instead
    of using several address registers, put a0 and aL into normal registers,
    and copy them to one address register when we need to use them.

    Wine tests loop_index_test() and nested_loop_test() now pass correctly.

    Fixes r600g crash while loading Bioshock - bug
    https://bugs.freedesktop.org/show_bug.cgi?id=85696

    Tested-by: David Heidelberg <[email protected]>
    Reviewed-by: Tiziano Bacocco <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Correct LOG on negative valuesAxel Davy2015-01-221-2/+13
    We should take the absolute value of the input. Also return -FLT_MAX
    instead of -Inf for an input of 0.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Handle NRM with input of null normAxel Davy2015-01-221-1/+3
    When the input's xyz are 0.0, the output should be 0.0. This is due to
    the fact that Inf * 0 = 0 for dx9. To handle this case, cap the result
    of RSQ to FLT_MAX. We have FLT_MAX * 0 = 0.

    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Handle RSQ special casesAxel Davy2015-01-221-1/+12
    We should use the absolute value of the input as input to ureg_RSQ.

    Moreover, an input of 0.0 should return FLT_MAX.

    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Fix POW implementationAxel Davy2015-01-221-1/+12
    POW doesn't map directly to TGSI, since we should take the absolute
    value of src0.

    Fixes black textures in some games.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Fix typo for M4x4Axel Davy2015-01-221-1/+1
    Cc: "10.4" <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
* st/nine: Correctly declare NineTranslateInstruction_Mkxn inputsAxel Davy2015-01-221-2/+5
    Say c1 and c2 are declared in the shader and c0 is given by the app.
    Here we would have read c0, c1 and c2 as given by the app, instead of
    c0 from the app and c1, c2 as declared in the shader.

    This correction fixes several issues in some games.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Saturate oFog and oPts vs outputsAxel Davy2015-01-221-2/+2
    According to docs and Wine, these two vs outputs have to be saturated.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: David Heidelberg <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Remove some shader unused codeAxel Davy2015-01-221-22/+1
    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>
* st/nine: Convert integer constants to floats before storing them when cards don't support integersAxel Davy2015-01-221-13/+52
    The shader code already behaves as if they are floats when the card
    doesn't support integers.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Axel Davy <[email protected]>
    Cc: "10.4" <[email protected]>