path: root/src/gallium
Commit message | Author | Age | Files | Lines
* ilo: allow bo format to differ from that requested | Chia-I Wu | 2013-05-09 | 2 | -14/+22
    For a separate stencil buffer or formats not supported natively, the real format of the bo may differ from that requested.
* draw/llvm: Add additional llvm optimization passes | Stéphane Marchesin | 2013-05-08 | 1 | -0/+3
    It helps a bit with vertex shader performance on i915g (a couple percent faster with openarena). I have tried most other passes, and they weren't showing any measurable improvement. Note that my vertex shaders didn't have loops, so maybe the loop optimizations could still be useful in the future.
    Reviewed-by: Brian Paul <[email protected]>
* i915: Use Y tiling for textures | Stéphane Marchesin | 2013-05-08 | 1 | -2/+7
    This basically reverts commit 2acc7193743199701f8f6d1877a59ece0ec4fa5b.
    With the previous change, we are no longer batchbuffer limited, so we actually start seeing a performance difference between X and Y tiling. X tiling is funny because it is faster for screen-aligned quads but slower in games. So let's use Y tiling, which is 10% faster overall.
* i915g: Optimize batchbuffer sizes | Stéphane Marchesin | 2013-05-08 | 2 | -4/+6
    Now that we don't throttle at every batchbuffer, we can shrink the size of batchbuffers to achieve early flushing. This gives a significant speed boost in a lot of games (on the order of 20%).
* i915g: Add more PIPE_CAP_* support | Stéphane Marchesin | 2013-05-08 | 1 | -0/+9
* ilo: remove our own type inference | Chia-I Wu | 2013-05-08 | 1 | -97/+27
    tgsi_opcode_infer_{src,dst}_type() works just fine.
* ilo: use tgsi_util_get_texture_coord_dim() | Chia-I Wu | 2013-05-08 | 3 | -92/+4
    And remove toy_tgsi_get_texture_coord_dim().
* tgsi: fix operand type of TGSI_OPCODE_NOT | Chia-I Wu | 2013-05-08 | 2 | -1/+2
    It should be TGSI_TYPE_UNSIGNED, not TGSI_TYPE_FLOAT. Also fixed gallivm not_emit_cpu() to use the uint build context.
    Signed-off-by: Chia-I Wu <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
* tgsi: refactor tgsi_opcode_infer_src_type() | Chia-I Wu | 2013-05-08 | 1 | -35/+9
    Call tgsi_opcode_infer_type() from tgsi_opcode_infer_src_type().
    Signed-off-by: Chia-I Wu <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
* tgsi: refactor tgsi_opcode_infer_dst_type() | Chia-I Wu | 2013-05-08 | 1 | -25/+35
    Move the body of tgsi_opcode_infer_dst_type() to a new helper function, tgsi_opcode_infer_type(), and call the helper function from tgsi_opcode_infer_dst_type(). The diff looks complicated simply because the code is moved around.
    A following commit will make tgsi_opcode_infer_src_type() call tgsi_opcode_infer_type().
    Signed-off-by: Chia-I Wu <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
* tgsi: reorder opcodes in opcode type inference | Chia-I Wu | 2013-05-08 | 1 | -24/+24
    Reorder opcodes by their assigned numbers. This makes it easier to see the differences between tgsi_opcode_infer_src_type() and tgsi_opcode_infer_dst_type().
    Signed-off-by: Chia-I Wu <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
* tgsi: clean up exec_tex() | Chia-I Wu | 2013-05-08 | 1 | -168/+52
    Make use of tgsi_util_get_texture_coord_dim() to replace the big switch table.
    There is a subtle difference with this change. When TXP is used with an array texture, the layer is now also projected. This behavior matches the TGSI doc. Since GLSL does not allow TXP on an array texture, I am not sure which behavior is correct or preferred.
    Signed-off-by: Chia-I Wu <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
* tgsi: add tgsi_util_get_texture_coord_dim() | Chia-I Wu | 2013-05-08 | 2 | -0/+94
    This util function returns the dimension of the texture coordinates for a texture target, and the location of the shadow reference value. For example, when the texture target is TGSI_TEXTURE_SHADOW2D, the dimension of the texture coordinates is 2, and the location of the ref value is 2 (that is, the Z channel).
    Signed-off-by: Chia-I Wu <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
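The mapping this commit describes can be pictured as a small lookup. What follows is a minimal sketch and not the Mesa implementation: the enum `example_target`, the function `example_coord_dim()`, and the convention of reporting -1 when a target has no shadow reference value are assumptions made for illustration. Only the SHADOW2D row (dimension 2, reference value in channel 2) comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical texture targets standing in for TGSI_TEXTURE_* values. */
enum example_target {
    EXAMPLE_TEXTURE_1D,
    EXAMPLE_TEXTURE_2D,
    EXAMPLE_TEXTURE_3D,
    EXAMPLE_TEXTURE_SHADOW2D
};

/*
 * Return the number of coordinate channels for a target. If ref_pos is
 * not NULL, store the channel index of the shadow reference value there,
 * or -1 when the target has none (the -1 convention is an assumption).
 */
int example_coord_dim(enum example_target target, int *ref_pos)
{
    int dim = 0;
    int ref = -1;

    switch (target) {
    case EXAMPLE_TEXTURE_1D:
        dim = 1;
        break;
    case EXAMPLE_TEXTURE_2D:
        dim = 2;
        break;
    case EXAMPLE_TEXTURE_3D:
        dim = 3;
        break;
    case EXAMPLE_TEXTURE_SHADOW2D:
        dim = 2;   /* two coordinates ...                  */
        ref = 2;   /* ... and the ref value in channel Z   */
        break;
    }

    if (ref_pos != NULL)
        *ref_pos = ref;
    return dim;
}
```

A caller such as a texture-fetch interpreter could use the two results to decide how many channels to read and where to find the comparison value, instead of repeating a per-target switch at every call site.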
* nv50: initialize kick_notify callback in nv50_create | Bryan Cain | 2013-05-07 | 1 | -0/+1
    Fixes infinite loop on startup in Portal and Left 4 Dead 2.
    NOTE: This is a candidate for the 9.0 and 9.1 branches.
* gallium: more tgsi documentation updates | Roland Scheidegger | 2013-05-07 | 1 | -131/+250
    Adds the remaining integer opcodes, and moves some opcodes to more appropriate places, along with getting rid of the (already nearly empty) ps_2_x section. The CAP bits for some of these are still somewhat up in the air, though, so the documentation isn't quite as watertight as is desirable.
    Reviewed-by: Jose Fonseca <[email protected]>
* ilo: Add missing break statement in aos_tex TGSI_OPCODE_TEX2 case. | Vinson Lee | 2013-05-07 | 1 | -0/+1
    Fixes "Missing break in switch" defect reported by Coverity.
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Chia-I Wu <[email protected]>
* r600g/sb: optimize some cases for CNDxx instructions | Vadim Girlin | 2013-05-07 | 2 | -5/+81
    We can replace CNDxx with MOV (and possibly eliminate it after propagation) in the following cases:
    If src1 is equal to src2 in a CNDxx instruction, then the result doesn't depend on the condition and we can replace the instruction with "MOV dst, src1".
    If src0 is const, then we can evaluate the condition at compile time and also replace the instruction with MOV.
    Signed-off-by: Vadim Girlin <[email protected]>
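The two cases above can be sketched on a toy instruction representation. This is a hedged illustration, not the r600g/sb code: the `example_*` types and functions are hypothetical, and only a CNDE-style opcode is modelled, under the assumption that it computes `dst = (src0 == 0) ? src1 : src2`.

```c
#include <stdbool.h>

/* Hypothetical, simplified IR; not the r600g/sb data structures. */
enum example_op { EXAMPLE_CNDE, EXAMPLE_MOV };

struct example_value {
    bool  is_const;   /* true when the value is a compile-time constant */
    float const_val;  /* meaningful only when is_const is true          */
    int   reg;        /* register number when is_const is false         */
};

struct example_insn {
    enum example_op op;
    int dst;
    struct example_value src[3];
};

/* Two operands are "the same" if they are equal constants or the same register. */
bool example_same_value(const struct example_value *a,
                        const struct example_value *b)
{
    if (a->is_const != b->is_const)
        return false;
    return a->is_const ? a->const_val == b->const_val : a->reg == b->reg;
}

/*
 * Try to turn a CNDE into "MOV dst, src". Returns true when the
 * instruction was rewritten; the chosen operand ends up in src[0].
 */
bool example_fold_cnde(struct example_insn *insn)
{
    int pick;

    if (insn->op != EXAMPLE_CNDE)
        return false;

    if (example_same_value(&insn->src[1], &insn->src[2])) {
        /* Case 1: both choices are identical, the condition is irrelevant. */
        pick = 1;
    } else if (insn->src[0].is_const) {
        /* Case 2: constant condition, evaluate it at compile time. */
        pick = (insn->src[0].const_val == 0.0f) ? 1 : 2;
    } else {
        return false;
    }

    insn->op = EXAMPLE_MOV;
    insn->src[0] = insn->src[pick];
    return true;
}
```

Once the instruction is a plain MOV, a later copy-propagation pass has the chance to remove it entirely, which is the "possibly eliminate" part of the commit message.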
* r600g/sb: fix memory leaks | Vadim Girlin | 2013-05-07 | 2 | -1/+7
    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix kcache handling on r6xx | Vadim Girlin | 2013-05-07 | 1 | -1/+5
    Use the same limit for kcache constants in an alu group on r6xx as on other chips (two const pairs). Relaxing this will require additional checks to make sure that all 4 consts in the group come from 2 kcache sets (clause limit), probably without noticeable improvements of shader performance.
    Signed-off-by: Vadim Girlin <[email protected]>
* gallivm: Fix build for LLVM < 3.3 | Tom Stellard | 2013-05-06 | 1 | -0/+6
    The C API versions of the LLVM multithreaded functions were added in LLVM 3.3.
* r600g/llvm: Parse config values in register / value pairs | Tom Stellard | 2013-05-06 | 2 | -4/+31
    Rather than relying on a predetermined order for the config values.
* r600g/llvm: Don't feed LLVM output through r600_bytecode_build() | Tom Stellard | 2013-05-06 | 4 | -395/+21
    The LLVM backend emits raw ISA now, so we can just use its output unmodified.
* r600g/llvm: Don't emit CALL_FS for vertex shaders | Tom Stellard | 2013-05-06 | 2 | -8/+10
    The LLVM backend takes care of this now.
* radeon/llvm: Always build libradeonllvm as static | Tom Stellard | 2013-05-06 | 3 | -17/+10
    This library is very small, so there is not much to gain from building it as a shared library. Also, when linking statically with LLVM, a shared libradeonllvm exports LLVM symbols and creates problems when used with other shared objects that also link statically to LLVM.
    Reviewed-by: [email protected]
* radeon/llvm: Use LLVM C API for compiling LLVM IR to ISA v2 | Tom Stellard | 2013-05-06 | 5 | -203/+173
    The LLVM C API is considered stable and should never change, so it is much more desirable to use than the LLVM C++ API, which is constantly in flux.
    v2:
      - Split target initialization and lookup into separate functions
    Reviewed-by: [email protected]
* gallivm: Move LLVMStartMultithreaded() static initializer into gallivm | Tom Stellard | 2013-05-06 | 2 | -14/+15
    This does not solve all of the problems with using LLVM in a multithreaded environment, but it should help in some cases.
    Reviewed-by: [email protected]
* radeon/llvm: Don't use the global context when parsing LLVM IR | Tom Stellard | 2013-05-06 | 1 | -2/+3
    Using the global context leads to crashes when multiple threads try to compile compute shaders at the same time.
    Fixes a crash in bfgminer when using more than one thread.
* r600g/llvm: Update radeon family mappings for LLVM backend | Tom Stellard | 2013-05-06 | 2 | -4/+8
    New processors were added to the backend to distinguish between GPUs with and without vertex caches.
* android: add ilo to the build system | Chia-I Wu | 2013-05-06 | 4 | -0/+87
    It can be selected with
        BOARD_GPU_DRIVERS := ilo
    Signed-off-by: Chia-I Wu <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* ilo: correctly set return types of sampler messages | Chia-I Wu | 2013-05-05 | 2 | -0/+3
    Correctly set the types of the temporaries. We do not want type conversions when moving the results to the final destinations.
* r600g/llvm: Undefines unrequired texture coord values | Vincent Lejeune | 2013-05-04 | 1 | -1/+28
    This is a port of the "r600g:mask unused source components for SAMPLE" patch from Vadim Girlin.
* nvc0: fixup video decoding with 2D_ARRAY | Maarten Lankhorst | 2013-05-04 | 2 | -5/+4
    Signed-off-by: Maarten Lankhorst <[email protected]>
* gallium: fix type of flags in pipe_context::flush() | Chia-I Wu | 2013-05-04 | 22 | -23/+25
    It should be unsigned, not enum pipe_flush_flags.
    Fixed a build error:
        src/gallium/state_trackers/egl/android/native_android.cpp:426:29: error: invalid conversion from 'int' to 'pipe_flush_flags' [-fpermissive]
    v2: replace all occurrences of enum pipe_flush_flags by unsigned
    Signed-off-by: Chia-I Wu <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    [olv: document the parameter now that the type is unsigned]
* draw/pt: adjust overflow calculations | Zack Rusin | 2013-05-03 | 2 | -2/+8
    gallium lies. buffer_size is not actually the buffer size but the available size, which is 'buffer_size - buffer_offset', so by adding the buffer offset we'd incorrectly compute overflow.
    Signed-off-by: Zack Rusin <[email protected]>
    Reviewed-by: José Fonseca <[email protected]>
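The arithmetic behind this fix can be shown with two small checks. This is a sketch under stated assumptions rather than the draw module's code: the function names and the parameters (`available_size`, `stride`, `index`, `attrib_size`) are hypothetical, and the fetch is modelled as reading `attrib_size` bytes at `stride * index` from the start of the bound range.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Correct check: available_size already excludes the buffer offset,
 * so only the position inside the bound range is compared against it.
 * 64-bit arithmetic avoids wrap-around in stride * index.
 */
bool example_fetch_overflows(uint32_t available_size, uint32_t stride,
                             uint32_t index, uint32_t attrib_size)
{
    uint64_t end = (uint64_t)stride * index + attrib_size;
    return end > available_size;
}

/*
 * Mistaken variant: the buffer offset is added again even though the
 * size it is compared against was already reduced by that offset, so
 * valid fetches near the end of the buffer are reported as overflows.
 */
bool example_fetch_overflows_double_offset(uint32_t available_size,
                                           uint32_t buffer_offset,
                                           uint32_t stride, uint32_t index,
                                           uint32_t attrib_size)
{
    uint64_t end = (uint64_t)buffer_offset +
                   (uint64_t)stride * index + attrib_size;
    return end > available_size;
}
```

For a 96-byte buffer bound at offset 32, the available size is 64 bytes. With a 16-byte stride and 16-byte attributes, element 3 ends exactly at byte 64 of the bound range and is valid, yet the second function rejects it because it effectively compares 96 against 64.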
* tgsi/ureg: make the dst register match the src indirection | Zack Rusin | 2013-05-03 | 2 | -4/+11
    In ureg, src registers could have an indirect register that was either a temp or an addr register, while dst registers allowed only addr. That made moving between them a little difficult, so make them behave the same way and allow temp and addr registers as indirect files for both (tgsi supports it, just ureg didn't).
    Signed-off-by: Zack Rusin <[email protected]>
    Reviewed-by: José Fonseca <[email protected]>
* gallium: tgsi documentation updates and clarification for integer opcodes. | Roland Scheidegger | 2013-05-03 | 1 | -73/+289
    A lot of them were missing. Others were moved from the Compute ISA to a new Integer ISA section as that seemed more appropriate.
    Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: get rid of depth swizzling. | Roland Scheidegger | 2013-05-03 | 7 | -273/+414
    With this eliminated, we no longer need to copy between linear and swizzled layouts. This is probably not quite ideal, since it's a bit more work for now; we could do some optimizations by moving depth testing outside the fragment shader loop (but that is tricky for early depth test, as we have neither the mask nor the interpolated z handy in the right order). The large amount of tile/untile code is no longer needed and will be deleted in the next commit.
    No piglit regressions.
    v2: change a forgotten LAYOUT_NONE to LAYOUT_LINEAR.
    v3: fix (bogus) uninitialized variable warnings, add comments, fix a bad type
    Reviewed-by: Jose Fonseca <[email protected]>
* r600g: Correctly initialize the shader key, v2 | Lauri Kasanen | 2013-05-03 | 1 | -1/+2
    Assigning a struct only copies the members - any padding is left as is. Thus this code:
        struct foo_t foo;
        foo = bar;
    leaves the padding of foo intact, i.e. uninitialized random garbage.
    This patch fixes constant shader recompiles by initializing the struct to zero. For completeness, memcpy is used to copy the key to the shader struct.
    NOTE: This is a candidate for the stable branches.
    Signed-off-by: Lauri Kasanen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Signed-off-by: Andreas Boll <[email protected]>
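The pattern described here (zero the whole struct, then fill in members, then copy byte-wise) can be sketched as follows. The struct `example_key` and its helpers are hypothetical and do not reflect the real r600g shader key layout. The sketch also assumes that keys are compared byte-wise, for example with memcmp, which is the situation in which stray padding bytes would make two logically equal keys look different.

```c
#include <string.h>

/*
 * Hypothetical key: on common ABIs the char member is followed by
 * padding bytes so that the int is suitably aligned.
 */
struct example_key {
    char color_two_side;
    int  nr_cbufs;
};

/* Build a key whose bytes, padding included, start out as zero. */
void example_key_init(struct example_key *key, char two_side, int nr_cbufs)
{
    memset(key, 0, sizeof(*key));   /* clears members and padding alike */
    key->color_two_side = two_side;
    key->nr_cbufs = nr_cbufs;
}

/* Copy every byte of the key, padding included, unlike plain assignment. */
void example_key_copy(struct example_key *dst, const struct example_key *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Byte-wise comparison; meaningful only for keys built with the helpers above. */
int example_key_equal(const struct example_key *a, const struct example_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

With keys built this way, a cache lookup based on memcmp sees identical bytes for identical settings, so an existing shader variant can be reused rather than compiled again. Strictly speaking, the C standard leaves padding contents unspecified after a member store, so this relies on how mainstream compilers behave in practice.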
* st/xvmc/tests: Fix build failure, v2 | Lauri Kasanen | 2013-05-03 | 1 | -1/+1
    v2: Removed extra libs as requested by Matt Turner.
    Signed-off-by: Lauri Kasanen <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Andreas Boll <[email protected]>
* scons: remove nouveau build | Andreas Boll | 2013-05-03 | 5 | -58/+0
    One build system for Linux/Unix-only drivers should be enough. Additionally, the nouveau target was disabled anyway.
    Acked-by: Jose Fonseca <[email protected]>
* scons: remove radeon build | Andreas Boll | 2013-05-03 | 9 | -185/+0
    One build system for Linux/Unix-only drivers should be enough.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48694
    Acked-by: Jose Fonseca <[email protected]>
* r600g: don't emit surface_sync after FLUSH_AND_INV_EVENT | Alex Deucher | 2013-05-03 | 1 | -26/+0
    It shouldn't be needed since the FLUSH_AND_INV_EVENT has already made sure the destination caches are flushed. Additionally, we didn't previously emit the surface_sync until this commit:
        http://cgit.freedesktop.org/mesa/mesa/commit/?id=e5e4c07e7964a3258ed02b530bcdc24c0650204b
    Emitting them together causes hangs in compute on cayman/TN and hangs in Heaven on evergreen.
    Note: this patch is a candidate for the 9.1 branch, but requires:
        http://cgit.freedesktop.org/mesa/mesa/commit/?id=156bcca62c9f4e79e78929f72bc085757f36a65a
    as well.
    Reviewed-by: Tom Stellard <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* r600g/sb: zero-initialize bytecode structs | Vadim Girlin | 2013-05-03 | 2 | -3/+6
    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix constant propagation in gvn pass | Vadim Girlin | 2013-05-03 | 1 | -1/+2
    Fixes the bug that prevented propagation of literals in some cases.
    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: don't run unnecessary passes | Vadim Girlin | 2013-05-03 | 1 | -3/+0
    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: silence warnings with gcc 4.8 | Vadim Girlin | 2013-05-03 | 2 | -14/+15
    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix handling of interference sets in post_scheduler | Vadim Girlin | 2013-05-03 | 2 | -8/+8
    post_scheduler clears the interference set for reallocatable values when the value first becomes live, and then updates it to take into account the modified order of operations, but this was not handled properly if the value first appears as a source in a copy operation.
    Fixes issues with webgl demo: http://madebyevan.com/webgl-water/
    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix allocation of indirectly addressed input arrays | Vadim Girlin | 2013-05-03 | 4 | -10/+25
    Some inputs may be preloaded into predefined GPRs, so we can't reallocate arrays with such inputs.
    Fixes issues with webgl demo: http://oos.moxiecode.com/js_webgl/snake/
    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: use hex instead of binary constants | Vadim Girlin | 2013-05-03 | 5 | -15/+15
    This should fix build issues with GCC < 4.3
    Signed-off-by: Vadim Girlin <[email protected]>
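As a small illustration of the kind of change this commit makes: binary literals such as `0b1111` are a compiler extension in C, and per the commit message GCC releases older than 4.3 reject them, whereas hexadecimal literals are accepted everywhere. The mask names below are hypothetical and are not taken from r600g/sb.

```c
/*
 * Bit masks spelled in hexadecimal. The binary spelling each one
 * replaces is kept in a comment for readability.
 */
enum {
    EXAMPLE_MASK_LOW_NIBBLE = 0x0F,  /* 0b00001111 */
    EXAMPLE_MASK_HIGH_BIT   = 0x80,  /* 0b10000000 */
    EXAMPLE_MASK_PATTERN    = 0xA5   /* 0b10100101 */
};

/* Keep only the low four bits of v. */
unsigned example_low_nibble(unsigned v)
{
    return v & EXAMPLE_MASK_LOW_NIBBLE;
}
```

Keeping the binary form in a comment preserves the bit-level intent for readers while the code itself stays portable.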
* r600g: use old shader disassembler by default | Vadim Girlin | 2013-05-03 | 4 | -19/+18
    The new disassembler is not yet completely isolated from further processing in r600g/sb that is not required for printing the dump, so it has a higher probability of failing in case of any unexpected features in the bytecode.
    This patch adds an "sbdisasm" flag for R600_DEBUG that allows using the new disassembler in r600g/sb for shader dumps when shader optimization is not enabled.
    If shader optimization is enabled, the new disassembler is used by default.
    Signed-off-by: Vadim Girlin <[email protected]>