summaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary
Commit message (Collapse)AuthorAgeFilesLines
* gallivm/gs: fix indirect addressing in geometry shadersZack Rusin2013-04-173-6/+30
| | | | | | | | | | | We were always treating the vertex index as a scalar but when the shader is using indirect addressing it will be a vector of indices for each channel. This was causing some nasty crashes insides LLVM. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw/gs: make sure geometry shaders don't overflowZack Rusin2013-04-165-11/+81
| | | | | | | | | | | | | | The specification says that the geometry shader should exit if the number of emitted vertices is bigger or equal to max_output_vertices and we can't do that because we're running in the SoA mode, which means that our storing routines will keep getting called on channels that have overflown (even though they will be masked out, but we just can't skip them). So we need some scratch area where we can keep writing the overflown vertices without overwriting anything important or crashing. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw/gs: Return early if the passed geometry shader is nullZack Rusin2013-04-161-0/+3
| | | | | | | | Can happen if we were using stream output without geometry shader, by returning early we avoid a crash. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: implement pipeline statistics in the draw moduleZack Rusin2013-04-1610-20/+112
| | | | | | | | | | | | | This is a basic implementation of the pipeline statistics in the draw module. The interface is similar to the stream output statistics and also requires that the callers explicitly enable it. Included is the implementation of the interface in llvmpipe and softpipe. Only softpipe enables the pipeline statistics capability though because llvmpipe is lacking gathering of the fragment shading and rasterization statistics. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm/gs: fix the end primitive callsZack Rusin2013-04-162-27/+50
| | | | | | | | | | | | | | | The issue with SOA execution and end_primitive opcode is that it can be executed both when we haven't emitted any vertices, in which case we don't want to emit an empty primitive, and when the execution mask is zero and the execution should be skipped. We handled only the latter of those conditions. Now we're combining the execution mask with a mask created from emitted vertices to handle both cases. As a result we don't need the pending_end_primitive flag which was broken because it was static and could be affected by both above mentioned conditions at run-time. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* tgsi/exec: geometry shaders are executed on a single primitiveZack Rusin2013-04-161-13/+17
| | | | | | | | which means that our execution mask in GS is equal to 1 not 0xf. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* tgsi/exec: fix the udiv and umod instructionsZack Rusin2013-04-161-8/+8
| | | | | | | | | Same as with llvmpipe: we can't be divind/moding by zero and we need to make sure that dividing/moding by zero produces 0xffffffff. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: JIT symbol resolution with linux perf.José Fonseca2013-04-175-59/+101
| | | | | | | Details on docs/llvmpipe.html Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: Silence uninitialized var warnings.José Fonseca2013-04-172-3/+7
| | | | Trivial.
* gallium: Disambiguate TGSI_OPCODE_IF.José Fonseca2013-04-178-1/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TGSI_OPCODE_IF condition had two possible interpretations: - src.x != 0.0f - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false either for vertex and fragment shaders - gallivm/llvmpipe - postprocess - vl state tracker - vega state tracker - most old drivers - old internal state trackers - many graw examples - src.x != 0U - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both vertex and fragment shaders - tgsi_exec/softpipe - r600 - radeonsi - nv50 And drivers that use draw module also were a mess (because Mesa would emit float IFs, but draw module supports native integers so it would interpret IF arg as integers...) This sort of works if the source argument is limited to float +0.0f or +1.0f, integer 0, but would fail if source is float -0.0f, or integer in the float NaN range. It could also fail if source is integer 1, and hardware flushes denormalized numbers to zero. But with this change there are now two opcodes, IF and UIF, with clear meaning. Drivers that do not support native integers do not need to worry about UIF. However, for backwards compatibility with old state trackers and examples, it is advisable that native integer capable drivers also support the float IF opcode. I tried to implement this for r600 and radeonsi based on the surrounding code. I couldn't do this for nouveau, so I just shunted IF/UIF together, which matches the current behavior. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]> v2: - Incorporate Roland's feedback. - Fix r600_shader.c merge conflict. - Fix typo in radeon, spotted by Michel Dänzer. - Incorporte Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float) properly in nv50/ir.
* gallium: Eliminate TGSI_OPCODE_IFC.José Fonseca2013-04-174-4/+1
| | | | | | Never used or implemented. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/hud: fix FPS computation for framerate > 4.2kMarek Olšák2013-04-161-1/+2
|
* gallium/hud: increase vertex buffer size for background black rectanglesMarek Olšák2013-04-161-1/+1
| | | | Reviewed-by: Brian Paul <[email protected]>
* gallium/hud: update the contents of GALLIUM_HUD=helpMarek Olšák2013-04-161-2/+17
| | | | Reviewed-by: Brian Paul <[email protected]>
* gallium/hud: remove pipeline-statistics- prefix in query namesMarek Olšák2013-04-161-21/+22
| | | | | | | | for the env var string not to be awfully long v2: fix bug in indexing of "name" Reviewed-by: Brian Paul <[email protected]>
* gallivm: fix small but severe bug in handling multiple lod level stridesRoland Scheidegger2013-04-151-1/+1
| | | | | | | | | | | | | | | | | Inserting the value for the second quad in the wrong place for the following shuffle. This meant the row or image stride was undefined which is quite catastrophic, can lead to bogus texels fetched or just segfault. This code is only hit for SoA path currently, still surprising it didn't crash more or caused more visible issues (I think llvm used a broadcast shuffle for the undefined parts of the vector, hence the undefined value for the second quad was just the same as that from the first quad, so as long as both quads hit the same mip level everything was fine, and since lower mips always have the same large stride it made it less likely to hit out-of-bound memory in case of differing lods). Note: this is a candidate for stable branches. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm/tgsi: handle untyped movesZack Rusin2013-04-102-0/+10
| | | | | | | | | | | both mov and ucmp can be used to move variables of any type. correctly note that about ucmp in the tgsi_info and make sure gallivm can handle that by correctly casting the untyped moves. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix loops and conditionals within GSZack Rusin2013-04-102-19/+105
| | | | | | | | | | | | | We were using simple temporaries, without using alloca or phi nodes which meant that on every iteration of the loop our temporaries, which were holding the number of vertices and primitives which were emitted, were being reset to zero. Now we're using alloca to allocate those variables to preserve them across conditionals. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix unsigned divide and remainder opcodesZack Rusin2013-04-101-4/+33
| | | | | | | | | | We want to both make sure we never divide by zero to not generate sigfpe and that divide by zero is guaranteed to return 0xffffffff. Based on José idea. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix breakcZack Rusin2013-04-101-12/+14
| | | | | | | | | we break when the mask values are 0 not, 1, plus it's bit comparison not a floating point comparison. This fixes both. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi: Ensure struct tgsi_ind_register field Index is initialized.Vinson Lee2013-04-081-0/+1
| | | | | | | Fixes uninitialized scalar variable defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/util: add const to a parameter of util_max_layerMarek Olšák2013-04-061-1/+1
|
* util: add ETC as compressed formatWladimir2013-04-051-0/+1
| | | | | | | Add UTIL_FORMAT_LAYOUT_ETC to util_format_is_compressed. It was missing. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: fix is_blit_generic_supported() stencil checkingBrian Paul2013-04-051-12/+14
| | | | | | | | | | | | | | | Don't check if there's sampler support for stencil if we're not going to actually blit/copy stencil values. Fixes the case where we mistakenly said we can't support a blit of depth values from S8Z24 to X8Z24. Also, rename the is_stencil variable to dst_has_stencil to improve readability. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* gallium/hud: add GALLIUM_HUD_PERIOD env varBrian Paul2013-04-041-1/+16
| | | | | | | To set the graph update rate, in seconds. The default update rate has also been changed to 1/2 second. Reviewed-by: Marek Olšák <[email protected]>
* gallium/hud: initialize sampler stateBrian Paul2013-04-041-0/+6
| | | | | | | | | The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with unnormalized texcoords (at least for softpipe). v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE Reviewed-by: Marek Olšák <[email protected]>
* gallivm: some minor cube map cleanupRoland Scheidegger2013-04-041-10/+15
| | | | | | | | | | | | | | | The ar_ge_as_at variable was just very very confusing since the condition was actually the other way around (as_at_ge_ar). So change the condition (and the selects depending on it) to match the variable name. And also change the chosen major axis in case the coord values are the same. OpenGL doesn't care one bit which one is chosen in this case but it looks like dx10 would require z chosen over y, and y chosen over x (previously did x chosen over y, y chosen over z). Since it's all the same effort just honor dx10's wishes. (Though actually, for some prefered orderings, we could save one (or two with derivatives) selects since the tnewx and tnewz (and the corresponding dmax values) are the same.) Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: implement ucmpZack Rusin2013-04-041-0/+21
| | | | | | | and add a test for it Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* util: add debug_memory_check_block(), debug_memory_tag()Brian Paul2013-04-042-0/+61
| | | | | | | | | | The former just checks that the given block is valid by checking the header and footer. The later sets the memory block's tag. With extra debug code, we can use that for monitoring/checking particular allocations. Reviewed-by: José Fonseca <[email protected]>
* gallium/hud: replace malloc w/ MALLOCBrian Paul2013-04-041-1/+1
| | | | | | To match the FREE() called used later. Fixes things on Windows. Reviewed-by: Marek Olšák <[email protected]>
* gallivm: honor explicit derivatives values for cube maps.Roland Scheidegger2013-04-044-28/+60
| | | | | | | | | | | | This is trivial now, though need to make sure we pass all the necessary derivative values (which is 3 each for ddx/ddy not 2). Passes piglit arb_shader_texture_lod-texgradcube test. v2: add the forgotten abs() for all incoming derivatives (discovered by new piglit arb_shader_texture_lod-texgradcube test, though more by luck as it was failing only for exactly one pixel...). Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: do per-pixel cube face selection (finally!!!)Roland Scheidegger2013-04-043-82/+180
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This proved to be tricky, the problem is that after selection/mirroring we cannot calculate reasonable derivatives (if not all pixels in a quad end up on the same face the derivatives could get "randomly" exceedingly large). However, it is actually quite easy to simply calculate the derivatives before selection/mirroring and then transform them similar to the cube coordinates (they only need selection/projection, but not mirroring as we're not interested in the sign bit, of course). While there is a tiny bit more work to do (need to calculate derivs for 3 coords instead of 2, and additional selects) it also simplifies things somewhat for the coord selection itself (as we save some broadcast aos shuffles, and we don't need to calculate the average vector) - hence if derivatives aren't needed this should actually be faster. Also, this has the benefit that this will (trivially) work for explicit derivatives too, which we completely ignored before that (will be in a separate commit for better trackability). Note that while the way for getting rho looks very different, it should result in "nearly" the same values as before (the "nearly" is only because before the code would choose the face based on an "average" vector and hence the derivatives calculated according to this face, where now (for implicit derivatives) the derivatives are projected on the face selected for the first (top-left) pixel in a quad, so not necessarly the same face). The transformation done might not quite be state-of-the-art, calculating length(dx,dy) as max(dx,dy) certainly isn't neither but this stays the same as before (that is I think a better transform would _somehow_ take the "derivative major axis" into account so that derivative changes in the major axis wouldn't get ignored). Should solve some accuracy problems with cubemaps (can easily be seen with the cubemap demo when switching wrapping/filtering), though we still don't do seamless filtering to fix it completely (so not per-sample but per-pixel is certainly better than per-quad and already sufficient for accurate results with nearest tex filter). As for performance, it seems to be a tiny bit faster too (maybe 3% or so with cubemap demo). Which I'd have expected with nearest/nearest filtering where this will be less instructions, but the difference seems to actually be larger with linear/linear_mipmap_linear where it is slightly more instructions, probably the code appears less serialized allowing better scheduling (on a sandy bridge cpu). It actually seems to be now at least as fast as the old path using a conditional when using 128bit vectors too (that is probably more a result of testing with a newer cpu though), for now that old path is still there but unused. No piglit regressions. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: minor rho calculation optimization for 1 or 3 coordsRoland Scheidegger2013-04-042-29/+22
| | | | | | | Using a different packing for the single coord case should save a shuffle. Plus some minor style fixes. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: use f16c hw support for float->half and half->float conversionRoland Scheidegger2013-04-044-4/+53
| | | | | | | | Should be way faster of course on cpus supporting this (includes AMD Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)). Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge. Reviewed-by: Brian Paul <[email protected]>
* draw/llvmpipe: allow independent so attachments to the vsZack Rusin2013-04-034-14/+16
| | | | | | | | | | | | | | When geometry shaders are present, one needs to be able to create an empty geometry shader with stream output that needs to be resolved later and attached to the currently bound vertex shader. Lets add support for it to llvmpipe and draw. draw allows attaching independent stream output info to any vertex shader and llvmpipe resolves at draw time which vertex shader the given empty geometry shader should be linked to. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw: remove unused functionZack Rusin2013-04-032-12/+0
| | | | | | | | we use draw_set_mapped_so_targets nowadays Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/llvm: use an enum instead of magic numbersZack Rusin2013-04-032-10/+15
| | | | | | | | | | | I think this was there before and got accidently removed during a merge. Same code as for the GS context, which is also using an enum instead of hardcoded numbers. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/gs: cleanup some debugging codeZack Rusin2013-04-031-4/+0
| | | | | | Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/so: maintain an exact number of written verticesZack Rusin2013-04-033-7/+33
| | | | | | | | | It's quite helpful during the rendering when we know exactly the count of the vertices available in the buffer. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw: Implement support for primitive idZack Rusin2013-04-038-8/+33
| | | | | | | We were largely ignoring primitive id. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/so: Fix bogus assertZack Rusin2013-04-031-1/+0
| | | | | | | We do support so with multiple primitives. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/gs: Fix memory corruption with multiple primitivesZack Rusin2013-04-031-10/+15
| | | | | | | | | | We were flushing with incorrect number of primitives. TGSI exec can only work with a single primitive at a time. Plus the fetching with multiple primitives on llvm paths wasn't copying the last element. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* gallivm: cleanup the gs interfaceZack Rusin2013-04-033-50/+85
| | | | | | | | Instead of void pointers use a base interface. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* util: add new util_resource_size() function in u_resource.[ch]Brian Paul2013-04-032-1/+98
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* util: move functions from u_resource.c to u_transfer.cBrian Paul2013-04-032-75/+74
| | | | | | | | | The functions are prototyped in u_transfer.h and are related to the other functions in u_transfer.c. The next patch will re-use the u_resource.c file for new code. Reviewed-by: Jose Fonseca <[email protected]>
* gallium/hud: try L8 texture for font if I8 format isn't supportedBrian Paul2013-04-031-4/+13
|
* gallium/hud: add support for PIPE_QUERY_PIPELINE_STATISTICSChristoph Bumiller2013-04-034-9/+52
| | | | | | | Also, renamed "pixels-rendered" to "samples-passed" because the occlusion counter increments even if colour and depth writes are disabled, or (on some implementations) for killed fragments that passed the depth test when PS early_fragment_tests is set.
* gallivm: bring back optimized but incorrect float to smallfloat optimizationsRoland Scheidegger2013-04-021-38/+78
| | | | | | | | | | | | | Conceptually the same as previously done in float_to_half. Should cut down number of instructions from 14 to 10 or so, but will promote some NaNs to Infs, so it's disabled. It gets a bit tricky though handling all the cases correctly... Passes basic tests either way (though there are no tests testing special cases, but some manual tests injecting them seemed promising). v2: style and comment fixes suggested by Jose Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: consolidate code for float-to-half and float-to-packed conversion.Roland Scheidegger2013-04-023-108/+102
| | | | | | | | | | | | | | | | | | This replaces the existing float-to-half implementation. There are definitely a couple of differences - the old implementation had unspecified(?) rounding behavior, and could at least in theory construct Inf values out of NaNs. NaNs and Infs should now always be properly propagated, and rounding behavior is now towards zero (note this means too large but non-Infinity values get propagated to max representable value, not Infinity). The implementation will definitely not match util code, however (which does nearest rounding, which also means too large values will get propagated to Infinity). Also fix a bogus round mask probably leading to rounding bugs... v2: fix a logic bug in handling infs/nans. Reviewed-by: Jose Fonseca <[email protected]>
* gallium/hud: do .xxxx swizzling for the font texture in the fragment shaderMarek Olšák2013-04-021-6/+30
| | | | | | This allows using L8 and R8 for the font if I8 isn't supported. Tested-by: Brian Paul <[email protected]>