path: root/src/gallium/auxiliary
Commit message  (Author, Age, Files, Lines)
* gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.  (José Fonseca, 2013-04-23, 18 files, -27/+39)
| Squashed commit of the following:
|
| commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852
| Author: José Fonseca <[email protected]>
| Date: Tue Apr 23 17:37:18 2013 +0100
|
|     gallium: s/lower_left_origin/bottom_edge_rule/
|
| commit 4dff4f64fa83b9737def136fffd161d55e4f1722
| Author: José Fonseca <[email protected]>
| Date: Tue Apr 23 17:35:04 2013 +0100
|
|     gallium: Move diagram to docs.
|
| commit 442a63012c8c3c3797f45e03f2ca20ad5f399832
| Author: James Benton <[email protected]>
| Date: Fri May 11 17:50:55 2012 +0100
|
|     gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
|
|     This change is necessary to achieve correct results when using OpenGL FBOs.
|
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_blitter: implement buffer clearing  (Marek Olšák, 2013-04-23, 2 files, -8/+97)
| Although this might be useful for ARB_clear_buffer_object, I need it for initializing resources in r600g.
|
| Reviewed-by: Alex Deucher <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
|
| v2: comment cleanups
|
| NOTE: This is a candidate for the 9.1 branch.
* gallivm: Fix build with LLVM >= r180063  (Tom Stellard, 2013-04-23, 2 files, -0/+8)
|
* draw: use the prim count for ia primitives  (Zack Rusin, 2013-04-22, 1 file, -1/+2)
| The number of vertices to fetch doesn't always equal the number of input vertices. To correctly compute the number of IA primitives we need to use the total number of input vertices, not only those that need to be fetched.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* tgsi/scan: set correct input limits for geometry shader  (Zack Rusin, 2013-04-22, 1 file, -0/+17)
| TGSI geometry shader input declarations are of the IN[][2] format, and the dimensions of the array have to be deduced from the input primitive property.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
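Note: as a rough illustration of the deduction described in the commit above, the size of the first dimension of a geometry shader's IN[][n] array follows from the input primitive type. The sketch below is not the code from tgsi/scan; the enum and function names are hypothetical, and only the per-primitive vertex counts (taken from the standard primitive definitions) are meant literally.

```c
/* Hypothetical enum for illustration only; Mesa itself uses its
 * PIPE_PRIM_* constants for this purpose. */
enum gs_input_prim {
   GS_PRIM_POINTS,
   GS_PRIM_LINES,
   GS_PRIM_LINES_ADJACENCY,
   GS_PRIM_TRIANGLES,
   GS_PRIM_TRIANGLES_ADJACENCY
};

/* Number of vertices the geometry shader sees per input primitive,
 * i.e. the size of the first dimension of IN[][n]. */
static unsigned
gs_vertices_per_input_prim(enum gs_input_prim prim)
{
   switch (prim) {
   case GS_PRIM_POINTS:              return 1;
   case GS_PRIM_LINES:               return 2;
   case GS_PRIM_LINES_ADJACENCY:     return 4;
   case GS_PRIM_TRIANGLES:           return 3;
   case GS_PRIM_TRIANGLES_ADJACENCY: return 6;
   }
   return 0; /* unknown primitive type */
}
```

A scanner could use such a count as the upper bound when validating the first index of an input register access.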
* draw: add code to reset instance dependent data  (Zack Rusin, 2013-04-22, 5 files, -1/+31)
| We want to be able to reset certain parts of the pipeline, in particular the input primitive index, but only with either separate invocations of draw_vbo or new instances. In all other cases (e.g. new invocations due to primitive restart) that data needs to be preserved. Add a function through which we can reset instance dependent data.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* gallium: Add a new clip_halfz rasterizer state.  (José Fonseca, 2013-04-22, 6 files, -24/+19)
| gl_rasterization_rules lumps too many different flags.
|
| Reviewed-by: Brian Paul <[email protected]>
* gallivm: Fix assignment of unsigned values to OUT register.  (José Fonseca, 2013-04-22, 1 file, -77/+52)
| TEMP is not the only register file that accepts unsigned values; OUT does too. Actually, what determines the appropriate type of the destination value is not the opcode, but rather the register.
|
| Also clean up and simplify the code. Add a few more asserts, but also make the code more robust by handling the case gracefully if an assert fails.
|
| This fixes a segfault / assertion in the included vert-uadd.sh graw shader.
|
| Reviewed-by: Roland Scheidegger <[email protected]>
* Revert "gallivm: Emit vector selects."  (José Fonseca, 2013-04-21, 1 file, -2/+14)
| It caused numerous regressions (LLVM 3.1) in blending. In particular:
|
| - lp_test_blend
|
|   type=u8nx16 rgb_func=sub rgb_src_factor=zero rgb_dst_factor=inv_src_color alpha_func=rev_sub alpha_src_factor=one alpha_dst_factor=const_color ... MISMATCH
|     Src:  0 0 0 b5 49 29 0 a2 0 21 de 0 c3 1b ec 0
|     Src1: 2d 85 14 0 f8 0 79 a1 99 0 d8 0 59 16 0 0
|     Dst:  0 a9 97 0 c0 0 78 0 0 8b aa f0 bd 0 78 f6
|     Con:  7d 0 c0 0 0 bb 77 0 0 0 50 0 40 51 0 0
|     Res:  0 0 0 0 0 29 0 0 0 0 c8 0 97 1b e3 0
|     Ref:  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
|
|   type=u8nx16 rgb_func=max rgb_src_factor=one rgb_dst_factor=inv_const_color alpha_func=min alpha_src_factor=zero alpha_dst_factor=inv_src1_alpha ... MISMATCH
|     Src:  d 0 0 e9 0 37 35 f0 62 0 0 b2 e9 f7 0 5c
|     Src1: 8f 0 bf 0 a8 5 0 0 c4 0 d7 7 92 a 0 17
|     Dst:  cb 0 1e 0 0 0 19 8e 0 4d 0 0 0 0 3 46
|     Con:  aa 5a 5f 8f 0 0 bc 92 0 88 0 0 b7 8a c0 88
|     Res:  44 0 13 0 0 0 7 8e 0 24 0 0 0 0 1 40
|     Ref:  44 0 13 0 0 37 35 0 62 24 0 0 e9 f7 1 0
|
| This reverts commit 1e266c7ef01251ecf72347a2ba1d174b035cbe3b.
* gallivm: Disable LLVM 2.7 workaround on other versions.  (José Fonseca, 2013-04-20, 1 file, -2/+1)
| 2.7 was a particularly trouble-ridden release. Furthermore, the bug can no longer be reproduced ever since the first_level state was taken into account.
|
| Reviewed-by: Brian Paul <[email protected]>
* gallivm: Emit vector selects.  (José Fonseca, 2013-04-20, 1 file, -12/+2)
| They are supported on LLVM 3.1, at least on x86. (I haven't tested on PPC though.)
|
| Actually lp_build_linear_mip_levels() already has been emitting them for some time.
|
| This avoids intrinsics, which tend to be an obstacle for certain optimization passes.
|
| Reviewed-by: Brian Paul <[email protected]>
* gallivm: implement switch opcode  (Roland Scheidegger, 2013-04-20, 3 files, -12/+340)
| Should be able to handle all things which make this tricky to implement. Fallthroughs, including most notably into/out of default, should be handled correctly but are quite a mess.
|
| If we see largely unoptimized switches in the wild, we should probably think about some "real" switch optimization pass, e.g. things like this:
|
|   switch
|   case1
|     someinst
|     brk
|   case2
|   default
|   case3
|     someinst
|     brk
|   case4
|     someinst
|   endswitch
|
| are legal, but the pointless case2/case3 statements not only cause condition evaluation but will turn this into a "fake" fallthrough case (because mask and defaultmask are already updated for case2 when default is encountered), requiring executing code twice.
|
| If default is at the end, though, there is never any code re-execution. Even when it is not at the end, there is no code re-execution either as long as there is no fallthrough into (not even a fake one) or out of default.
|
| v2: add comments, and use enum for break type instead of magic boolean.
|
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: use uint build context for mask instead of float  (Roland Scheidegger, 2013-04-20, 1 file, -1/+1)
| Unsurprisingly no one was using it except for grabbing the builder.
|
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm/tgsi: fix up breakc  (Roland Scheidegger, 2013-04-20, 3 files, -2/+6)
| It seems there was a typo in gallivm breakc handling (I am actually still not sure it is really needed but otherwise that statement really should go away). Also fix the wrong src argument type, even though they weren't really used.
|
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: increase nesting limit to 66  (Roland Scheidegger, 2013-04-20, 1 file, -2/+4)
| This is still not really correct, since at least for sm 4.0 the nesting limit is 64 per subroutine, and subroutine nesting itself has a limit of 32, so since we have a flat stack we'd need 32*64. But this should probably be better fixed with per-subroutine stacks, since otherwise these structures get really big (like 100kB for the lp_exec_mask).
|
| Reviewed-by: Jose Fonseca <[email protected]>
* draw: implement primitive assembler  (Zack Rusin, 2013-04-18, 7 files, -4/+386)
| The input assembler needs to be able to decompose adjacency primitives into something that can be understood by the rest of the pipeline. The specs say that the adjacency primitives are *only* visible in the geometry shader; for everything else they need to be decomposed. In most cases this is not an issue, because the geometry shader always decomposes them for us, but without a geometry shader we were passing unchanged adjacency primitives to the rest of the pipeline and causing crashes everywhere.
|
| This commit introduces a primitive assembler which, if the geometry shader is missing and the input primitive is one of the adjacency primitives, decomposes them into something that the rest of the pipeline can understand.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
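Note: the decomposition mentioned in the commit above can be pictured as dropping the adjacent vertices and keeping only the ones that form the actual primitive. The following is a minimal sketch, assuming the usual OpenGL vertex ordering for adjacency primitives; the function names are hypothetical and this is not the draw module's implementation.

```c
/* A line with adjacency has 4 vertices; the drawn segment runs from
 * the second to the third vertex, the outer two are only neighbours. */
static void
decompose_line_adj(const unsigned in[4], unsigned out[2])
{
   out[0] = in[1];
   out[1] = in[2];
}

/* A triangle with adjacency has 6 vertices; the even-numbered ones
 * (0, 2, 4) form the triangle, the odd ones belong to neighbours. */
static void
decompose_tri_adj(const unsigned in[6], unsigned out[3])
{
   out[0] = in[0];
   out[1] = in[2];
   out[2] = in[4];
}
```

Working on vertex indices rather than vertex data keeps the sketch independent of the vertex layout.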
* util/prim: fix decomposed counts for adjacency primitives  (Zack Rusin, 2013-04-18, 1 file, -4/+4)
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* draw/so: uses the correct index with the pre clipped coordinates  (Zack Rusin, 2013-04-18, 1 file, -6/+6)
| pre_clip_pos is a float[4]; we just used a float (*)[4] to be able to jump within the array of vertex_headers with it. So if the idx happened to be anything but 0, we'd actually read from some garbage in memory. Change it to just be a simple pointer instead of casting it to something that it's not. As suggested by Jose.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
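Note: the pitfall described in the commit above comes down to pointer stride. Indexing a pointer to float[4] advances by sizeof(float[4]) per step, while consecutive vertices are a whole vertex apart. The sketch below uses a hypothetical, simplified vertex struct (the real vertex_header is laid out differently and has a variable size) purely to show the correct way to reach the field.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vertex header: a position followed by
 * other per-vertex data. */
struct vertex {
   float pre_clip_pos[4];
   float data[8];
};

/* Stride a float (*)[4] would step by, versus the real vertex stride.
 * They differ, which is why indexing the former with idx != 0 reads
 * the wrong memory. */
static const size_t wrong_stride = sizeof(float[4]);
static const size_t right_stride = sizeof(struct vertex);

/* Correct access: index the vertex array first, then take the field. */
static const float *
pre_clip_pos_of(const struct vertex *verts, unsigned idx)
{
   return verts[idx].pre_clip_pos;
}
```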
* gallivm: Fix half floats with MCJIT.  (José Fonseca, 2013-04-19, 1 file, -0/+3)
| Prevents:
|
|   LLVM ERROR: Cannot select: intrinsic %llvm.x86.vcvtph2ps.128
* st/mesa: optionally apply texture swizzle to border color v2  (Christoph Bumiller, 2013-04-18, 2 files, -0/+46)
| This is the only sane solution for nv50 and nvc0 (really, trust me), but since on other hardware the border colour is tightly coupled with texture state they'd have to undo the swizzle, so I've added a cap.
|
| The dependency of update_sampler on the texture updates was introduced to avoid doing the apply_depthmode to the swizzle twice.
|
| v2: Moved swizzling helper to u_format.c, extended the CAP to provide more accurate information.
* gallivm: change cubemaps / derivatives handling, take 55  (Roland Scheidegger, 2013-04-18, 3 files, -104/+119)
| Turns out the previous "fix" for handling per-pixel face selection and derivatives didn't work out that well - the derivatives were wrong by quite a bit. In theory, transformation of the derivatives into cube space should work, but it would be _a lot_ more work than the "simplified" transform used. So, for explicit derivatives, I'm just giving up and going back to not honoring them.
|
| For implicit derivatives (and the fake explicit ones), however, we try something a little different: we just calculate rho as we would for a 3d texture, that is after scaling the coords by the inverse major axis. This gives the same results as calculating the derivs after projection of the coords to the same face as long as all pixels hit the same face (and only without rho_no_opt, otherwise it should be a bit worse). And when not all pixels are hitting the same face, the results aren't so hot but not catastrophically bad (I believe not off by more than a factor of 2 without no_rho_approx and not more than sqrt(2) with no_rho_approx). I think this is better than just picking the wrong face but who knows...
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Add no_rho_approx debug option  (Roland Scheidegger, 2013-04-18, 3 files, -118/+185)
| This will calculate rho correctly as
|
|   sqrt(max((ds/dx)^2 + (dt/dx)^2 + (dr/dx)^2, (ds/dy)^2 + (dt/dy)^2 + (dr/dy)^2))
|
| instead of
|
|   max(|ds/dx|, |dt/dx|, |dr/dx|, |ds/dy|, |dt/dy|, |dr/dy|)
|
| (for 3 coords - 2 coords work analogously, for 1 coord there's no point doing the exact version), for both implicit and explicit derivatives.
|
| While such an approximation seems to be allowed in OpenGL, some APIs may be less forgiving, and the error can be quite large (sqrt(2) for 2 coords, sqrt(3) for 3 coords, so wrong by nearly one mip level in the latter case). This also helps to single out "real" bugs from "expected" ones, so it is debug only (though at least combined with no_brilinear I didn't really see much of a performance difference, but only tested with a debug build - at least with implicit mipmaps the instruction count is almost exactly the same, though the instructions are more complex (1 sqrt and mul/adds instead of and/max mostly)). The code when the option isn't set stays exactly the same.
|
| v2: rename no_rho_opt to no_rho_approx.
|
| Reviewed-by: Brian Paul <[email protected]>
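Note: the two ways of computing rho from the commit above can be compared with a small scalar sketch. This is only an illustration with hypothetical function names, not the gallivm code (which builds vectorized LLVM IR). To avoid a dependency on the math library, the exact variant returns rho squared; rho is its square root.

```c
/* Larger of two floats. */
static float
max2f(float a, float b)
{
   return a > b ? a : b;
}

/* Absolute value of a float. */
static float
absf(float a)
{
   return a < 0.0f ? -a : a;
}

/* Exact variant, squared: the squared length of the longer of the two
 * derivative vectors d(s,t,r)/dx and d(s,t,r)/dy. */
static float
rho_exact_squared(const float ddx[3], const float ddy[3])
{
   float len2_x = ddx[0] * ddx[0] + ddx[1] * ddx[1] + ddx[2] * ddx[2];
   float len2_y = ddy[0] * ddy[0] + ddy[1] * ddy[1] + ddy[2] * ddy[2];
   return max2f(len2_x, len2_y);
}

/* Approximate variant: the largest absolute derivative component. */
static float
rho_approx(const float ddx[3], const float ddy[3])
{
   float rho = 0.0f;
   int i;
   for (i = 0; i < 3; i++) {
      rho = max2f(rho, absf(ddx[i]));
      rho = max2f(rho, absf(ddy[i]));
   }
   return rho;
}
```

With d/dx = (1, 1, 1) and d/dy = (0, 0, 0), the exact variant gives rho squared = 3 while the approximation gives rho = 1, which is the sqrt(3) worst case for 3 coords mentioned in the commit message.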
* gallivm: Drop pos arg from lp_build_tgsi_soa.  (José Fonseca, 2013-04-18, 3 files, -6/+0)
| Never used.
|
| Reviewed-by: Roland Scheidegger <[email protected]>
* draw/so: respect leading/provoking vertex info  (Zack Rusin, 2013-04-17, 1 file, -1/+1)
| We were ignoring leading/provoking vertex settings, which was breaking decomposition of some strips.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm/gs: fix indirect addressing in geometry shaders  (Zack Rusin, 2013-04-17, 3 files, -6/+30)
| We were always treating the vertex index as a scalar, but when the shader is using indirect addressing it will be a vector of indices for each channel. This was causing some nasty crashes inside LLVM.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* draw/gs: make sure geometry shaders don't overflow  (Zack Rusin, 2013-04-16, 5 files, -11/+81)
| The specification says that the geometry shader should exit if the number of emitted vertices is greater than or equal to max_output_vertices. We can't do that because we're running in SoA mode, which means that our storing routines will keep getting called on channels that have overflowed (they will be masked out, but we just can't skip them). So we need some scratch area where we can keep writing the overflowed vertices without overwriting anything important or crashing.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* draw/gs: Return early if the passed geometry shader is null  (Zack Rusin, 2013-04-16, 1 file, -0/+3)
| Can happen if we were using stream output without geometry shader, by returning early we avoid a crash.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* draw: implement pipeline statistics in the draw module  (Zack Rusin, 2013-04-16, 10 files, -20/+112)
| This is a basic implementation of the pipeline statistics in the draw module. The interface is similar to the stream output statistics and also requires that the callers explicitly enable it. Included is the implementation of the interface in llvmpipe and softpipe. Only softpipe enables the pipeline statistics capability though because llvmpipe is lacking gathering of the fragment shading and rasterization statistics.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm/gs: fix the end primitive calls  (Zack Rusin, 2013-04-16, 2 files, -27/+50)
| The issue with SoA execution and the end_primitive opcode is that it can be executed both when we haven't emitted any vertices, in which case we don't want to emit an empty primitive, and when the execution mask is zero and the execution should be skipped. We handled only the latter of those conditions. Now we're combining the execution mask with a mask created from emitted vertices to handle both cases. As a result we don't need the pending_end_primitive flag, which was broken because it was static and could be affected by both of the above-mentioned conditions at run-time.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
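Note: the mask combination described in the commit above can be sketched per lane in scalar C. This is only an illustration under assumptions: a fixed width of 4 lanes, all-ones/all-zeros lane masks, and hypothetical names. The real code emits LLVM vector instructions rather than looping.

```c
#include <stdint.h>

#define GS_LANES 4 /* assumed SoA width for this sketch */

/* For each lane, end_primitive should take effect only if the lane is
 * active (execution mask set) AND it has emitted at least one vertex
 * since the last primitive ended. */
static void
end_primitive_mask(const uint32_t exec_mask[GS_LANES],
                   const uint32_t emitted_verts[GS_LANES],
                   uint32_t out_mask[GS_LANES])
{
   int i;
   for (i = 0; i < GS_LANES; i++) {
      uint32_t has_verts = emitted_verts[i] != 0 ? UINT32_C(0xffffffff) : 0;
      out_mask[i] = exec_mask[i] & has_verts;
   }
}
```

Because the decision is recomputed from per-lane state each time, no separate pending flag is needed.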
* tgsi/exec: geometry shaders are executed on a single primitive  (Zack Rusin, 2013-04-16, 1 file, -13/+17)
| which means that our execution mask in GS is equal to 1, not 0xf.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* tgsi/exec: fix the udiv and umod instructions  (Zack Rusin, 2013-04-16, 1 file, -8/+8)
| Same as with llvmpipe: we can't be dividing/modding by zero, and we need to make sure that dividing/modding by zero produces 0xffffffff.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
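Note: the rule stated in the commit above (never actually divide by zero, and return 0xffffffff when the divisor is zero) can be written as a pair of small helpers. This is a minimal sketch with hypothetical function names, not the tgsi_exec code, which applies the same idea per channel.

```c
#include <stdint.h>

/* Unsigned divide that never traps: a zero divisor yields 0xffffffff. */
static uint32_t
udiv_safe(uint32_t a, uint32_t b)
{
   return b ? a / b : UINT32_C(0xffffffff);
}

/* Unsigned remainder with the same convention for a zero divisor. */
static uint32_t
umod_safe(uint32_t a, uint32_t b)
{
   return b ? a % b : UINT32_C(0xffffffff);
}
```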
* gallivm: JIT symbol resolution with linux perf.  (José Fonseca, 2013-04-17, 5 files, -59/+101)
| Details in docs/llvmpipe.html
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* draw: Silence uninitialized var warnings.  (José Fonseca, 2013-04-17, 2 files, -3/+7)
| Trivial.
* gallium: Disambiguate TGSI_OPCODE_IF.  (José Fonseca, 2013-04-17, 8 files, -1/+47)
| TGSI_OPCODE_IF condition had two possible interpretations:
|
| - src.x != 0.0f
|   - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false for either vertex or fragment shaders
|   - gallivm/llvmpipe
|   - postprocess
|   - vl state tracker
|   - vega state tracker
|   - most old drivers
|   - old internal state trackers
|   - many graw examples
|
| - src.x != 0U
|   - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both vertex and fragment shaders
|   - tgsi_exec/softpipe
|   - r600
|   - radeonsi
|   - nv50
|
| And drivers that use the draw module also were a mess (because Mesa would emit float IFs, but the draw module supports native integers so it would interpret the IF arg as integers...)
|
| This sort of works if the source argument is limited to float +0.0f or +1.0f, or integer 0, but would fail if the source is float -0.0f, or an integer in the float NaN range. It could also fail if the source is integer 1 and the hardware flushes denormalized numbers to zero.
|
| But with this change there are now two opcodes, IF and UIF, with clear meaning.
|
| Drivers that do not support native integers do not need to worry about UIF. However, for backwards compatibility with old state trackers and examples, it is advisable that native integer capable drivers also support the float IF opcode.
|
| I tried to implement this for r600 and radeonsi based on the surrounding code. I couldn't do this for nouveau, so I just shunted IF/UIF together, which matches the current behavior.
|
| Reviewed-by: Roland Scheidegger <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
|
| v2:
| - Incorporate Roland's feedback.
| - Fix r600_shader.c merge conflict.
| - Fix typo in radeon, spotted by Michel Dänzer.
| - Incorporate Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float) properly in nv50/ir.
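Note: the ambiguity described in the commit above is easy to reproduce in plain C. The sketch below uses hypothetical helper names and simply evaluates the same 32 bits under the two interpretations; it is not taken from any driver.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer. */
static uint32_t
float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof u);
   return u;
}

/* IF semantics: the condition is a float compared against 0.0f. */
static int
cond_if(float x)
{
   return x != 0.0f;
}

/* UIF semantics: the condition is the raw bit pattern compared against 0. */
static int
cond_uif(uint32_t bits)
{
   return bits != 0u;
}
```

For +0.0f and 1.0f both tests agree, but -0.0f compares equal to zero as a float while its bit pattern (0x80000000) is non-zero, so the two interpretations take different branches.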
* gallium: Eliminate TGSI_OPCODE_IFC.  (José Fonseca, 2013-04-17, 4 files, -4/+1)
| Never used or implemented.
|
| Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/hud: fix FPS computation for framerate > 4.2k  (Marek Olšák, 2013-04-16, 1 file, -1/+2)
|
* gallium/hud: increase vertex buffer size for background black rectangles  (Marek Olšák, 2013-04-16, 1 file, -1/+1)
| Reviewed-by: Brian Paul <[email protected]>
* gallium/hud: update the contents of GALLIUM_HUD=help  (Marek Olšák, 2013-04-16, 1 file, -2/+17)
| Reviewed-by: Brian Paul <[email protected]>
* gallium/hud: remove pipeline-statistics- prefix in query names  (Marek Olšák, 2013-04-16, 1 file, -21/+22)
| for the env var string not to be awfully long
|
| v2: fix bug in indexing of "name"
|
| Reviewed-by: Brian Paul <[email protected]>
* gallivm: fix small but severe bug in handling multiple lod level strides  (Roland Scheidegger, 2013-04-15, 1 file, -1/+1)
| The value for the second quad was inserted in the wrong place for the following shuffle. This meant the row or image stride was undefined, which is quite catastrophic: it can lead to bogus texels being fetched or just a segfault. This code is only hit for the SoA path currently; still, it is surprising it didn't crash more or cause more visible issues (I think llvm used a broadcast shuffle for the undefined parts of the vector, hence the undefined value for the second quad was just the same as that from the first quad, so as long as both quads hit the same mip level everything was fine, and since lower mips always have the same large stride it made it less likely to hit out-of-bound memory in case of differing lods).
|
| Note: this is a candidate for stable branches.
|
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm/tgsi: handle untyped moves  (Zack Rusin, 2013-04-10, 2 files, -0/+10)
| Both mov and ucmp can be used to move variables of any type. Correctly note that about ucmp in the tgsi_info and make sure gallivm can handle that by correctly casting the untyped moves.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix loops and conditionals within GS  (Zack Rusin, 2013-04-10, 2 files, -19/+105)
| We were using simple temporaries, without using alloca or phi nodes, which meant that on every iteration of the loop our temporaries, which were holding the number of vertices and primitives which were emitted, were being reset to zero. Now we're using alloca to allocate those variables to preserve them across conditionals.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix unsigned divide and remainder opcodes  (Zack Rusin, 2013-04-10, 1 file, -4/+33)
| We want to make sure both that we never divide by zero, so as not to generate SIGFPE, and that divide by zero is guaranteed to return 0xffffffff. Based on José's idea.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix breakc  (Zack Rusin, 2013-04-10, 1 file, -12/+14)
| We break when the mask values are 0, not 1; plus it's a bit comparison, not a floating point comparison. This fixes both.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi: Ensure struct tgsi_ind_register field Index is initialized.  (Vinson Lee, 2013-04-08, 1 file, -0/+1)
| Fixes uninitialized scalar variable defect reported by Coverity.
|
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* gallium/util: add const to a parameter of util_max_layer  (Marek Olšák, 2013-04-06, 1 file, -1/+1)
|
* util: add ETC as compressed format  (Wladimir, 2013-04-05, 1 file, -0/+1)
| Add UTIL_FORMAT_LAYOUT_ETC to util_format_is_compressed. It was missing.
|
| Signed-off-by: Wladimir J. van der Laan <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: fix is_blit_generic_supported() stencil checking  (Brian Paul, 2013-04-05, 1 file, -12/+14)
| Don't check if there's sampler support for stencil if we're not going to actually blit/copy stencil values. Fixes the case where we mistakenly said we can't support a blit of depth values from S8Z24 to X8Z24.
|
| Also, rename the is_stencil variable to dst_has_stencil to improve readability.
|
| NOTE: This is a candidate for the stable branches.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* gallium/hud: add GALLIUM_HUD_PERIOD env var  (Brian Paul, 2013-04-04, 1 file, -1/+16)
| To set the graph update rate, in seconds. The default update rate has also been changed to 1/2 second.
|
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/hud: initialize sampler state  (Brian Paul, 2013-04-04, 1 file, -0/+6)
| The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with unnormalized texcoords (at least for softpipe).
|
| v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE
|
| Reviewed-by: Marek Olšák <[email protected]>