path: root/src/gallium/drivers/r600
Commit message | Author | Age | Files | Lines
* vl/buffer: use 2D_ARRAY instead of 3D textures | Christian König | 2013-05-01 | 1 | -7/+7
    Signed-off-by: Christian König <[email protected]>
* r600g/sb: remove unused code | Vadim Girlin | 2013-04-30 | 2 | -34/+0
    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: collect shader statistics | Vadim Girlin | 2013-04-30 | 5 | -8/+162
    Collects various statistical information for each shader and total stats for contexts. Printed with R600_DEBUG=sb,sbstat

    Signed-off-by: Vadim Girlin <[email protected]>
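How a comma-separated value such as `R600_DEBUG=sb,sbstat` can map to independent switches is sketched below. This is a minimal standalone illustration, not the driver's own parser: the flag names `sb` and `sbstat` come from the commit messages in this log, while `DBG_SB`, `DBG_SBSTAT`, their bit values, and `parse_debug_flags` are hypothetical names introduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit assignments, chosen for this sketch only. */
#define DBG_SB     (1u << 0)  /* "sb": enable the optimizing backend */
#define DBG_SBSTAT (1u << 1)  /* "sbstat": print shader statistics */

/* Turn a comma-separated list such as "sb,sbstat" into a bitmask.
 * Unknown names are ignored; NULL (variable unset) yields 0. */
static unsigned parse_debug_flags(const char *s)
{
    static const struct { const char *name; unsigned bit; } table[] = {
        { "sb", DBG_SB },
        { "sbstat", DBG_SBSTAT },
    };
    unsigned flags = 0;

    while (s && *s) {
        size_t len = strcspn(s, ",");  /* length of the current token */
        for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
            if (strlen(table[i].name) == len &&
                strncmp(s, table[i].name, len) == 0)
                flags |= table[i].bit;
        }
        s += len;
        if (*s == ',')
            s++;  /* skip the separator */
    }
    return flags;
}
```

A caller would pass the result of `getenv("R600_DEBUG")` (from `<stdlib.h>`) and test individual bits, for example `flags & DBG_SBSTAT`.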
* r600g/sb: don't propagate dead values in GVN pass | Vadim Girlin | 2013-04-30 | 1 | -0/+3
    In some cases we use the value::gvn_source field to link values that are known to be equal before the GVN pass (e.g. results of DOT4 in different slots of the same ALU group), but the source value may become dead later, and this confuses further passes.

    This patch resets value::gvn_source to NULL in the dce_cleanup pass if it points to a dead value.

    Fixes segfault during shader optimization with ETQW.

    Signed-off-by: Vadim Girlin <[email protected]>
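The cleanup described in this message can be sketched as a single walk over the values that drops any link to a dead source. This is a simplified illustration in C, not the backend's code (which is C++): `struct value` here is a hypothetical stand-in that keeps only the two pieces of state the message mentions, and `reset_dead_gvn_sources` is a name made up for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the backend's value type. */
struct value {
    bool dead;                 /* set once dead-code elimination drops it */
    struct value *gvn_source;  /* value known to be equal before GVN, or NULL */
};

/* Reset gvn_source wherever it points at a dead value, so later passes
 * never follow a link to something that no longer exists in the shader.
 * Returns the number of links that were reset. */
static size_t reset_dead_gvn_sources(struct value **vals, size_t n)
{
    size_t reset = 0;

    for (size_t i = 0; i < n; i++) {
        struct value *v = vals[i];
        if (v && v->gvn_source && v->gvn_source->dead) {
            v->gvn_source = NULL;
            reset++;
        }
    }
    return reset;
}
```

Links to live values are left alone, so the equality information that GVN relies on is kept wherever it is still valid.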
* r600g/sb: use simple heuristic to limit register pressure | Vadim Girlin | 2013-04-30 | 2 | -3/+33
    It's not complete register pressure tracking, yet it helps to prevent register allocation problems in some cases where they were observed.

    The problems are uncovered by false dependencies between fetch instructions introduced by some recent changes in TGSI and/or the default backend.

    Sometimes we have code like this:

        ...
        SAMPLE R5.xyzw, R5.xyzw
        ... store R5.xyzw somewhere
        MOV R5.x, <next x coord>
        MOV R5.y, <next y coord>
        SAMPLE R5.xyzw, R5.xyzw
        ...
        <may be repeated a lot of times>

    With 2D resources, z and w in the SAMPLE src reg aren't used and can simply be masked, but the shader backend doesn't have this information, so the optimization algorithms treat it as a data dependency.
* r600g/sb: improve error checking in ra_coalesce pass | Vadim Girlin | 2013-04-30 | 2 | -14/+27
* r600g/sb: use source bytecode in case of optimization errors | Vadim Girlin | 2013-04-30 | 5 | -11/+25
* r600g: plug in optimizing backend | Vadim Girlin | 2013-04-30 | 8 | -3/+155
    Optimization is enabled with "R600_DEBUG=sb".

    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: initial commit of the optimizing shader backend | Vadim Girlin | 2013-04-30 | 35 | -0/+17498
* r600g: use enum type for domains field in struct r600_resource | Vadim Girlin | 2013-04-30 | 1 | -1/+1
    This prevents problems when the header is included in C++ code.
* r600g: add new flags to isa instruction tables | Vadim Girlin | 2013-04-30 | 1 | -116/+127
* r600g: always create reverse lookup isa tables | Vadim Girlin | 2013-04-30 | 1 | -10/+2
* r600g: mask unused source components for SAMPLE | Vadim Girlin | 2013-04-30 | 1 | -0/+20
    This results in cleaner shader code and may improve the quality of optimized code produced by r600-sb, because it eliminates false dependencies in some cases.

    Signed-off-by: Vadim Girlin <[email protected]>
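The idea of masking unused SAMPLE source components can be sketched as follows. This is an illustrative model only: `enum tex_kind`, `SEL_MASKED`, and `mask_unused_src` are hypothetical names, and the sketch assumes plain textures (no arrays, no shadow comparison) where the number of coordinates read equals the texture dimensionality. Only the 2D case (z and w unused) is stated in this log.

```c
#include <stdint.h>

/* Hypothetical texture kinds; the value is the number of coordinates read.
 * Assumes plain (non-array, non-shadow) targets. */
enum tex_kind { TEX_1D = 1, TEX_2D = 2, TEX_3D = 3 };

#define SEL_MASKED 7u  /* hypothetical "component is not read" selector */

/* sel[0..3] are the x, y, z, w source selectors of a SAMPLE instruction.
 * Components beyond the coordinate count are replaced by SEL_MASKED, so an
 * optimizer no longer sees them as reads of the source register.
 * Returns a bitmask (bit 0 = x ... bit 3 = w) of components still read. */
static unsigned mask_unused_src(enum tex_kind kind, uint8_t sel[4])
{
    unsigned used = 0;

    for (unsigned c = 0; c < 4; c++) {
        if (c < (unsigned)kind)
            used |= 1u << c;
        else
            sel[c] = SEL_MASKED;
    }
    return used;
}
```

With the z and w reads removed for a 2D lookup, a later SAMPLE that only rewrites x and y of the same register no longer appears to depend on the previous result's z and w.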
* r600g/llvm: Fix opencl build | Vincent Lejeune | 2013-04-30 | 1 | -1/+1
* r600g/llvm: get use_kill from compiler shader | Vincent Lejeune | 2013-04-30 | 3 | -1/+8
* r600g: force full cache for hyperz | Jerome Glisse | 2013-04-29 | 2 | -0/+2
    It seems that in some cases allowing half cache usage confuses the GPU and triggers a lockup. Force full cache use.

    Should fix:
    https://bugs.freedesktop.org/show_bug.cgi?id=59592
    https://bugs.freedesktop.org/show_bug.cgi?id=60848
    https://bugs.freedesktop.org/show_bug.cgi?id=60969
    https://bugs.freedesktop.org/show_bug.cgi?id=61747
    https://bugs.freedesktop.org/show_bug.cgi?id=62466
    https://bugs.freedesktop.org/show_bug.cgi?id=62669
    https://bugs.freedesktop.org/show_bug.cgi?id=62721
    https://bugs.freedesktop.org/show_bug.cgi?id=63124

    Signed-off-by: Jerome Glisse <[email protected]>
* r600/uvd: stop advertising MPEG4 on UVD 2.x chips v2 | Christian König | 2013-04-26 | 3 | -1/+18
    That is just not supported by the hardware.

    v2: fix compare

    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* radeon/uvd: stop using anonymous unions | Christian König | 2013-04-26 | 1 | -2/+2
    Signed-off-by: Christian König <[email protected]>
* winsys/radeon: consolidate tracing into winsys v2 | Jerome Glisse | 2013-04-25 | 5 | -57/+13
    This moves the tracing timeout and printing into winsys and adds a debug environment variable for it (R600_DEBUG=trace_cs). A lot of files are touched because of winsys API changes.

    v2: Do not write the lockup file if the IB unique id does not match the last one

    Signed-off-by: Jerome Glisse <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* r600g/compute: Removed unused and untested code | Tom Stellard | 2013-04-25 | 5 | -776/+66
    There was a lot of code in evergreen_compute_internal.c that was not being used at all and most of it was duplicating code from other parts of the driver.

    Reviewed-by: Alex Deucher <[email protected]>
* r600g/compute: Use a constant buffer to store kernel parameters v2 | Tom Stellard | 2013-04-25 | 2 | -16/+30
    v2:
    - Fix usage of set_constant_buffer()
    - Fix typo in comment

    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* r600g: Add evergreen_emit_cs_constant_buffers() v2 | Tom Stellard | 2013-04-25 | 3 | -11/+36
    v2:
    - Bump R600_NUM_ATOMS

    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* r600g/compute: Don't use radeon_winsys::buffer_wait() after dispatching a kernel | Tom Stellard | 2013-04-25 | 1 | -6/+0
    The state tracker should be responsible for waiting for the kernel to finish.

    Reviewed-by: Alex Deucher <[email protected]>
* r600g/compute: Fix input buffer size calculation | Tom Stellard | 2013-04-25 | 1 | -1/+1
    Buffer size should be in bytes, not dwords.

    Reviewed-by: Alex Deucher <[email protected]>
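The unit mix-up this fix refers to is easy to reproduce: a count of 32-bit dwords passed where a byte size is expected under-allocates by a factor of four. A minimal sketch of the conversion, with `dwords_to_bytes` as a hypothetical helper name that is not taken from the driver:

```c
#include <stddef.h>
#include <stdint.h>

/* One dword is 32 bits, i.e. 4 bytes, so a buffer holding num_dwords
 * dwords needs num_dwords * 4 bytes. */
static size_t dwords_to_bytes(size_t num_dwords)
{
    return num_dwords * sizeof(uint32_t);
}
```

For example, an input buffer of 9 dwords needs 36 bytes; sizing it as 9 bytes would leave room for only two complete dwords.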
* r600g: use CP DMA for buffer clears on evergreen+ | Alex Deucher | 2013-04-24 | 4 | -2/+119
    Lighter weight than using streamout. Only evergreen and newer ASICs support embedded data as src with CP DMA.

    Reviewed-by: Jerome Glisse <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* r600g/llvm: Pass struct r600_bytecode to r600_llvm_compile | Tom Stellard | 2013-04-24 | 3 | -8/+7
    This way we don't need to update the function signature every time we emit a new config value. This also fixes the build with --enable-opencl.
* gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center. | José Fonseca | 2013-04-23 | 2 | -3/+3
    Squashed commit of the following:

    commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852
    Author: José Fonseca <[email protected]>
    Date: Tue Apr 23 17:37:18 2013 +0100

        gallium: s/lower_left_origin/bottom_edge_rule/

    commit 4dff4f64fa83b9737def136fffd161d55e4f1722
    Author: José Fonseca <[email protected]>
    Date: Tue Apr 23 17:35:04 2013 +0100

        gallium: Move diagram to docs.

    commit 442a63012c8c3c3797f45e03f2ca20ad5f399832
    Author: James Benton <[email protected]>
    Date: Fri May 11 17:50:55 2012 +0100

        gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.

        This change is necessary to achieve correct results when using OpenGL FBOs.

    Reviewed-by: Marek Olšák <[email protected]>
* r600g: initialize CMASK and HTILE with the GPU using streamout | Marek Olšák | 2013-04-23 | 4 | -7/+80
    This fixes a crash when a resource cannot be mapped to the CPU's address space because it's too big.

    This puts a global pipe_context in r600_screen, which is guarded by a mutex, so that we can use pipe_context when there isn't one around. Hopefully our multi-context support is solid.

    Reviewed-by: Alex Deucher <[email protected]>

    NOTE: This is a candidate for the 9.1 branch.
* r600/llvm: Read stacksize from config header | Vincent Lejeune | 2013-04-23 | 3 | -2/+4
* /bin/bash: q : commande introuvable | Vincent Lejeune | 2013-04-23 | 1 | -1/+1
* st/mesa: optionally apply texture swizzle to border color v2 | Christoph Bumiller | 2013-04-18 | 1 | -0/+3
    This is the only sane solution for nv50 and nvc0 (really, trust me), but since on other hardware the border colour is tightly coupled with texture state they'd have to undo the swizzle, so I've added a cap.

    The dependency of update_sampler on the texture updates was introduced to avoid doing the apply_depthmode to the swizzle twice.

    v2: Moved swizzling helper to u_format.c, extended the CAP to provide more accurate information.
* r600g: Fix build with --enable-opencl | Tom Stellard | 2013-04-18 | 1 | -1/+2
* r600g/llvm: Use gprcount from llvm | Vincent Lejeune | 2013-04-17 | 3 | -1/+4
* gallium: Disambiguate TGSI_OPCODE_IF. | José Fonseca | 2013-04-17 | 1 | -8/+15
    TGSI_OPCODE_IF condition had two possible interpretations:

    - src.x != 0.0f
      - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false for either vertex or fragment shaders
      - gallivm/llvmpipe
      - postprocess
      - vl state tracker
      - vega state tracker
      - most old drivers
      - old internal state trackers
      - many graw examples
    - src.x != 0U
      - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both vertex and fragment shaders
      - tgsi_exec/softpipe
      - r600
      - radeonsi
      - nv50

    Drivers that use the draw module were also a mess (because Mesa would emit float IFs, but the draw module supports native integers, so it would interpret the IF argument as an integer...)

    This sort of works if the source argument is limited to float +0.0f or +1.0f, or integer 0, but would fail if the source is float -0.0f, or an integer in the float NaN range. It could also fail if the source is integer 1 and the hardware flushes denormalized numbers to zero.

    But with this change there are now two opcodes, IF and UIF, with clear meaning.

    Drivers that do not support native integers do not need to worry about UIF. However, for backwards compatibility with old state trackers and examples, it is advisable that native integer capable drivers also support the float IF opcode.

    I tried to implement this for r600 and radeonsi based on the surrounding code. I couldn't do this for nouveau, so I just shunted IF/UIF together, which matches the current behavior.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>

    v2:
    - Incorporate Roland's feedback.
    - Fix r600_shader.c merge conflict.
    - Fix typo in radeon, spotted by Michel Dänzer.
    - Incorporate Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float) properly in nv50/ir.
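The two interpretations of the IF condition can be compared directly on a 32-bit register value. The sketch below is a CPU-side model written for illustration, not driver code; `cond_if` and `cond_uif` are hypothetical names for the float test (`src.x != 0.0f`) and the integer test (`src.x != 0U`).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Float interpretation: reinterpret the 32 register bits as an IEEE-754
 * float and compare against 0.0f. */
static bool cond_if(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);  /* bit-exact reinterpretation */
    return f != 0.0f;
}

/* Integer interpretation: any set bit makes the condition true. */
static bool cond_uif(uint32_t bits)
{
    return bits != 0u;
}
```

For the bit pattern `0x80000000` (float -0.0f) the two tests disagree: `cond_if` is false because -0.0f compares equal to 0.0f, while `cond_uif` is true because the sign bit is set. For `0x00000000` and `0x3f800000` (+1.0f) they agree, which matches the observation above that the ambiguity only "sort of works" for a restricted set of inputs.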
* gallium: Eliminate TGSI_OPCODE_IFC. | José Fonseca | 2013-04-17 | 1 | -3/+6
    Never used or implemented.

    Reviewed-by: Roland Scheidegger <[email protected]>
* r600/uvd: cleanup disabling tiling on pre EG asics | Christian König | 2013-04-16 | 1 | -5/+6
    Set transfer flag instead of fiddling with the tiling params directly.

    Signed-off-by: Christian König <[email protected]>
* r600g: Workaround for a hardware bug with nested loops on Cayman | Martin Andersson | 2013-04-16 | 1 | -3/+15
    There is a hardware bug on Cayman where a BREAK/CONTINUE followed by LOOP_STARTxxx for nested loops may put the branch stack into a state such that ALU_PUSH_BEFORE doesn't work as expected. Work around this by replacing the ALU_PUSH_BEFORE with a PUSH + ALU.

    Fixes piglit tests EXT_transform_feedback/order*

    v2: Use existing loop count and improve comment
    v3: [Vadim Girlin] Set jump address for PUSH instructions

    NOTE: This is a candidate for the 9.1 branch

    Signed-off-by: Vadim Girlin <[email protected]>
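The shape of this workaround, choosing between one combined instruction and an equivalent two-instruction sequence, can be sketched as below. Everything here is hypothetical scaffolding: `enum cf_op`, `emit_alu_with_push`, and the use of `loop_depth > 1` as the test for "nested loops" are assumptions made for illustration, and the sketch leaves out details such as the PUSH jump address mentioned in v3.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical control-flow opcodes for this sketch. */
enum cf_op { CF_ALU, CF_PUSH, CF_ALU_PUSH_BEFORE };

/* Write the control-flow ops needed for "push, then run an ALU clause"
 * into out[] and return how many were written (1 or 2).
 * Assumption: "nested" is modelled as loop_depth > 1. */
static size_t emit_alu_with_push(bool is_cayman, unsigned loop_depth,
                                 enum cf_op out[2])
{
    if (is_cayman && loop_depth > 1) {
        /* Avoid ALU_PUSH_BEFORE: issue an explicit PUSH, then a plain ALU. */
        out[0] = CF_PUSH;
        out[1] = CF_ALU;
        return 2;
    }
    out[0] = CF_ALU_PUSH_BEFORE;
    return 1;
}
```

Other chips, and Cayman outside nested loops, keep the single combined instruction, so the extra control-flow slot is only spent where the bug can trigger.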
* r600g: implement pipeline statistics query | Marek Olšák | 2013-04-16 | 4 | -1/+100
* winsys/radeon: use query_value for timestamp, remove query_timestamp | Marek Olšák | 2013-04-16 | 1 | -1/+1
* r600g: add a debug flag for printing virtual addresses of resources | Marek Olšák | 2013-04-16 | 4 | -0/+17
* r600g: add a query returning the amount of time spent during bo_map sync. | Marek Olšák | 2013-04-16 | 3 | -0/+11
* radeon/llvm: Use a struct for storing compiled code | Tom Stellard | 2013-04-15 | 1 | -2/+6
* r600g: add get_sample_position support (v3) | Dave Airlie | 2013-04-11 | 2 | -122/+240
    v2: I rewrote this to use the sample positions properly.
    v3: rewrite properly to use bitfield to cast back to signed ints

    Signed-off-by: Dave Airlie <[email protected]>
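The v3 note mentions using a bitfield to cast packed values back to signed ints. A minimal sketch of that trick follows, assuming 4-bit two's-complement fields packed into a 32-bit word; the field width, `sext4`, and `packed_nibble` are assumptions for illustration and are not taken from the patch. Storing a value of 8 to 15 into a signed 4-bit bitfield is an implementation-defined conversion in C; GCC and Clang wrap it to the expected negative value.

```c
#include <stdint.h>

/* Sign-extend the low 4 bits of v by storing them in a signed 4-bit
 * bitfield and reading the field back as an int. */
static int sext4(unsigned v)
{
    struct { signed int x : 4; } s;
    s.x = (int)(v & 0xfu);
    return s.x;
}

/* Extract the signed 4-bit field at position index (0..7) of a packed word. */
static int packed_nibble(uint32_t word, unsigned index)
{
    return sext4(word >> (4 * index));
}
```

A portable alternative that avoids the implementation-defined conversion is `(int)((v & 0xf) ^ 8) - 8`, at the cost of being less self-explanatory than the bitfield.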
* r600g: fix two issues in compressed msaa reading code | Dave Airlie | 2013-04-11 | 1 | -2/+2
    I've no idea when sample_chan would ever be 4 here, but 4 is most definitely wrong, array textures have it as 3 as well.

    Also the cayman code though unused is obviously wrong.

    Signed-off-by: Dave Airlie <[email protected]>
* radeon/uvd: add UVD implementation v5 | Christian König | 2013-04-11 | 5 | -6/+212
    Just everything you need for UVD with r600g and radeonsi.

    v2: move UVD code to radeon subdir, clean up build system additions, remove an unused SI function, disable tiling on SI for now.
    v3: some minor indentation fix and rebased
    v4: dpb size calculation fixed
    v5: implement proper fall-back in case the kernel doesn't support UVD, based on patches from Andreas Boll but cleaned up a bit more.

    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: Add support for GL_ARB_texture_buffer_range | Fredrik Höglund | 2013-04-11 | 3 | -5/+11
    Reviewed-by: Marek Olšák <[email protected]>
* r600g: fix valgrind warning on Cayman | Marek Olšák | 2013-04-10 | 1 | -1/+1
    Warning: "Conditional jump or move depends on uninitialised value(s)".
* r600g: Fix UMAD on Cayman | Martin Andersson | 2013-04-09 | 1 | -13/+32
    The multiplication part of tgsi_umad did not work on Cayman, because it did not populate the correct vector slots.

    This fixed hardlocks in the EXT_transform_feedback/order tests.

    NOTE: This is a candidate for the stable branches. (might not be easy to cherry-pick though)

    Signed-off-by: Marek Olšák <[email protected]>
* r600g/llvm: Add support for native isa for pre EG | Vincent Lejeune | 2013-04-08 | 1 | -1/+5
    This fixes bug 62756: https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12
* gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2 | Tom Stellard | 2013-04-05 | 4 | -69/+72
    This target string now contains four values instead of three. The old processor field (which was really being interpreted as arch) has been split into two fields: processor and arch.

    This allows drivers to pass a more detailed description of the hardware to compiler frontends.

    v2:
    - Adapt to libclc changes

    Reviewed-by: Francisco Jerez <[email protected]>
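A consumer of a four-field target string could split it as sketched below. The message above only states that there are four values and that the first two are processor and arch; the `-` separator, the names of the last two fields (`manufacturer`, `os`), the 31-character field limit, and the `parse_ir_target` helper are all assumptions made for this illustration.

```c
#include <stdbool.h>
#include <string.h>

#define FIELD_MAX 32  /* assumed limit: 31 characters plus terminator */

/* Field names beyond processor and arch are assumptions of this sketch. */
struct ir_target {
    char processor[FIELD_MAX];
    char arch[FIELD_MAX];
    char manufacturer[FIELD_MAX];
    char os[FIELD_MAX];
};

/* Split "processor-arch-manufacturer-os" into its four fields. Empty
 * fields are allowed; returns false if fewer than four fields are present
 * or a field does not fit. */
static bool parse_ir_target(const char *s, struct ir_target *out)
{
    char *fields[4] = { out->processor, out->arch,
                        out->manufacturer, out->os };

    for (int i = 0; i < 4; i++) {
        /* The first three fields end at a '-', the last at end of string. */
        const char *end = (i < 3) ? strchr(s, '-') : s + strlen(s);
        if (!end)
            return false;
        size_t len = (size_t)(end - s);
        if (len >= FIELD_MAX)
            return false;
        memcpy(fields[i], s, len);
        fields[i][len] = '\0';
        s = (i < 3) ? end + 1 : end;
    }
    return true;
}
```

The parser scans for separators by hand instead of using `strtok`, because `strtok` collapses consecutive separators and would silently drop empty fields.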