path: root/src/gallium/auxiliary
Commit message (Author, Age, Files, Lines)
* gallium/hud: add GALLIUM_HUD_PERIOD env var (Brian Paul, 2013-04-04, 1 file, -1/+16)
| To set the graph update rate, in seconds. The default update rate has also been changed to 1/2 second.
| Reviewed-by: Marek Olšák <[email protected]>
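As a rough illustration of the behaviour described above, the sketch below reads an update period in seconds from GALLIUM_HUD_PERIOD and falls back to 0.5 when the variable is unset or not a positive number. The function names and the fallback policy for invalid input are assumptions made for this example; the actual HUD code may parse the variable differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not the real HUD code): convert the text of the
 * environment variable into an update period in seconds. */
double
hud_period_from_string(const char *value)
{
   const double default_period = 0.5; /* default stated in the commit message */
   char *end;
   double period;

   if (!value || !*value)
      return default_period;

   period = strtod(value, &end);

   /* Reject unparsable, zero, negative and NaN input (assumed policy). */
   if (end == value || !(period > 0.0))
      return default_period;

   assert(period > 0.0);
   return period;
}

/* Reads the variable named in the commit subject. */
double
hud_period_from_env(void)
{
   return hud_period_from_string(getenv("GALLIUM_HUD_PERIOD"));
}
```

With this sketch, running an application with GALLIUM_HUD_PERIOD=2 in the environment would yield a period of two seconds, and leaving the variable unset would yield half a second.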
* gallium/hud: initialize sampler state (Brian Paul, 2013-04-04, 1 file, -0/+6)
| The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with unnormalized texcoords (at least for softpipe).
| v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE
| Reviewed-by: Marek Olšák <[email protected]>
* gallivm: some minor cube map cleanup (Roland Scheidegger, 2013-04-04, 1 file, -10/+15)
| The ar_ge_as_at variable was just very very confusing since the condition was actually the other way around (as_at_ge_ar). So change the condition (and the selects depending on it) to match the variable name.
| And also change the chosen major axis in case the coord values are the same. OpenGL doesn't care one bit which one is chosen in this case but it looks like dx10 would require z chosen over y, and y chosen over x (previously did x chosen over y, y chosen over z). Since it's all the same effort just honor dx10's wishes. (Though actually, for some preferred orderings, we could save one (or two with derivatives) selects since the tnewx and tnewz (and the corresponding dmax values) are the same.)
| Reviewed-by: Jose Fonseca <[email protected]>
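The tie-breaking order described above (z over y, y over x when magnitudes are equal) can be sketched in scalar C as follows. This is only an illustration of the ordering; the enum and function names are made up for the example, and the real gallivm code builds vector selects rather than scalar branches.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum cube_axis { CUBE_AXIS_X, CUBE_AXIS_Y, CUBE_AXIS_Z };

static float
abs_f(float v)
{
   return v < 0.0f ? -v : v;
}

/* Pick the major axis of a cube map coordinate. On equal magnitudes,
 * z wins over y and y wins over x, the order the commit message
 * attributes to dx10. */
enum cube_axis
cube_major_axis(float x, float y, float z)
{
   const float ax = abs_f(x), ay = abs_f(y), az = abs_f(z);

   assert(ax >= 0.0f && ay >= 0.0f && az >= 0.0f);

   if (az >= ax && az >= ay)
      return CUBE_AXIS_Z;
   if (ay >= ax)
      return CUBE_AXIS_Y;
   return CUBE_AXIS_X;
}
```

Using `>=` in both comparisons is what gives the later axis priority on ties; swapping to `>` would restore the previous x-over-y, y-over-z preference.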
* llvmpipe: implement ucmp (Zack Rusin, 2013-04-04, 1 file, -0/+21)
| and add a test for it
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* util: add debug_memory_check_block(), debug_memory_tag() (Brian Paul, 2013-04-04, 2 files, -0/+61)
| The former just checks that the given block is valid by checking the header and footer. The latter sets the memory block's tag. With extra debug code, we can use that for monitoring/checking particular allocations.
| Reviewed-by: José Fonseca <[email protected]>
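The header/footer check and the tag can be pictured with a small stand-alone allocator. Everything in this sketch is hypothetical: the struct layout, magic values and function names are invented for illustration and are not the ones used by Mesa's debug memory code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary magic values chosen for this sketch. */
#define SKETCH_BEGIN_MAGIC 0xcafe4321u
#define SKETCH_END_MAGIC   0x883675edu

struct sketch_header {
   uint32_t magic;
   uint32_t tag;   /* free-form marker for particular allocations */
   size_t size;    /* size of the user block in bytes */
};

/* Allocate a block surrounded by a header and a footer magic word. */
void *
sketch_malloc(size_t size)
{
   const uint32_t end_magic = SKETCH_END_MAGIC;
   struct sketch_header *hdr = malloc(sizeof *hdr + size + sizeof end_magic);

   if (!hdr)
      return NULL;
   hdr->magic = SKETCH_BEGIN_MAGIC;
   hdr->tag = 0;
   hdr->size = size;
   /* The footer may be unaligned, so copy it bytewise. */
   memcpy((char *)(hdr + 1) + size, &end_magic, sizeof end_magic);
   return hdr + 1;
}

/* Return 1 if both header and footer are intact, 0 otherwise. */
int
sketch_check_block(const void *ptr)
{
   const struct sketch_header *hdr = (const struct sketch_header *)ptr - 1;
   uint32_t end_magic;

   if (hdr->magic != SKETCH_BEGIN_MAGIC)
      return 0;
   memcpy(&end_magic, (const char *)ptr + hdr->size, sizeof end_magic);
   return end_magic == SKETCH_END_MAGIC;
}

/* Set the tag stored in the block's header. */
void
sketch_tag(void *ptr, uint32_t tag)
{
   struct sketch_header *hdr = (struct sketch_header *)ptr - 1;

   assert(hdr->magic == SKETCH_BEGIN_MAGIC);
   hdr->tag = tag;
}

uint32_t
sketch_get_tag(const void *ptr)
{
   return ((const struct sketch_header *)ptr - 1)->tag;
}

void
sketch_free(void *ptr)
{
   if (ptr)
      free((struct sketch_header *)ptr - 1);
}
```

A one-byte overrun past the end of the user block lands in the footer, so a later call to the check function reports the block as invalid.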
* gallium/hud: replace malloc w/ MALLOC (Brian Paul, 2013-04-04, 1 file, -1/+1)
| To match the FREE() call used later. Fixes things on Windows.
| Reviewed-by: Marek Olšák <[email protected]>
* gallivm: honor explicit derivatives values for cube maps. (Roland Scheidegger, 2013-04-04, 4 files, -28/+60)
| This is trivial now, though need to make sure we pass all the necessary derivative values (which is 3 each for ddx/ddy not 2). Passes piglit arb_shader_texture_lod-texgradcube test.
| v2: add the forgotten abs() for all incoming derivatives (discovered by new piglit arb_shader_texture_lod-texgradcube test, though more by luck as it was failing only for exactly one pixel...).
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: do per-pixel cube face selection (finally!!!) (Roland Scheidegger, 2013-04-04, 3 files, -82/+180)
| This proved to be tricky, the problem is that after selection/mirroring we cannot calculate reasonable derivatives (if not all pixels in a quad end up on the same face the derivatives could get "randomly" exceedingly large). However, it is actually quite easy to simply calculate the derivatives before selection/mirroring and then transform them similar to the cube coordinates (they only need selection/projection, but not mirroring as we're not interested in the sign bit, of course).
| While there is a tiny bit more work to do (need to calculate derivs for 3 coords instead of 2, and additional selects) it also simplifies things somewhat for the coord selection itself (as we save some broadcast aos shuffles, and we don't need to calculate the average vector) - hence if derivatives aren't needed this should actually be faster. Also, this has the benefit that this will (trivially) work for explicit derivatives too, which we completely ignored before that (will be in a separate commit for better trackability).
| Note that while the way for getting rho looks very different, it should result in "nearly" the same values as before (the "nearly" is only because before the code would choose the face based on an "average" vector and hence the derivatives calculated according to this face, where now (for implicit derivatives) the derivatives are projected on the face selected for the first (top-left) pixel in a quad, so not necessarily the same face). The transformation done might not quite be state-of-the-art, calculating length(dx,dy) as max(dx,dy) certainly isn't either but this stays the same as before (that is I think a better transform would _somehow_ take the "derivative major axis" into account so that derivative changes in the major axis wouldn't get ignored).
| Should solve some accuracy problems with cubemaps (can easily be seen with the cubemap demo when switching wrapping/filtering), though we still don't do seamless filtering to fix it completely (so not per-sample but per-pixel is certainly better than per-quad and already sufficient for accurate results with nearest tex filter).
| As for performance, it seems to be a tiny bit faster too (maybe 3% or so with cubemap demo). Which I'd have expected with nearest/nearest filtering where this will be fewer instructions, but the difference seems to actually be larger with linear/linear_mipmap_linear where it is slightly more instructions, probably the code appears less serialized allowing better scheduling (on a sandy bridge cpu). It actually seems to be now at least as fast as the old path using a conditional when using 128bit vectors too (that is probably more a result of testing with a newer cpu though), for now that old path is still there but unused.
| No piglit regressions.
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: minor rho calculation optimization for 1 or 3 coords (Roland Scheidegger, 2013-04-04, 2 files, -29/+22)
| Using a different packing for the single coord case should save a shuffle. Plus some minor style fixes.
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: use f16c hw support for float->half and half->float conversion (Roland Scheidegger, 2013-04-04, 4 files, -4/+53)
| Should be way faster of course on cpus supporting this (includes AMD Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)). Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge.
| Reviewed-by: Brian Paul <[email protected]>
* draw/llvmpipe: allow independent so attachments to the vs (Zack Rusin, 2013-04-03, 4 files, -14/+16)
| When geometry shaders are present, one needs to be able to create an empty geometry shader with stream output that needs to be resolved later and attached to the currently bound vertex shader. Let's add support for it to llvmpipe and draw. draw allows attaching independent stream output info to any vertex shader and llvmpipe resolves at draw time which vertex shader the given empty geometry shader should be linked to.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw: remove unused function (Zack Rusin, 2013-04-03, 2 files, -12/+0)
| we use draw_set_mapped_so_targets nowadays
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/llvm: use an enum instead of magic numbers (Zack Rusin, 2013-04-03, 2 files, -10/+15)
| I think this was there before and got accidentally removed during a merge. Same code as for the GS context, which is also using an enum instead of hardcoded numbers.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/gs: cleanup some debugging code (Zack Rusin, 2013-04-03, 1 file, -4/+0)
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/so: maintain an exact number of written vertices (Zack Rusin, 2013-04-03, 3 files, -7/+33)
| It's quite helpful during the rendering when we know exactly the count of the vertices available in the buffer.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw: Implement support for primitive id (Zack Rusin, 2013-04-03, 8 files, -8/+33)
| We were largely ignoring primitive id.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/so: Fix bogus assert (Zack Rusin, 2013-04-03, 1 file, -1/+0)
| We do support so with multiple primitives.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/gs: Fix memory corruption with multiple primitives (Zack Rusin, 2013-04-03, 1 file, -10/+15)
| We were flushing with an incorrect number of primitives. TGSI exec can only work with a single primitive at a time. Plus the fetching with multiple primitives on llvm paths wasn't copying the last element.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* gallivm: cleanup the gs interface (Zack Rusin, 2013-04-03, 3 files, -50/+85)
| Instead of void pointers use a base interface.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* util: add new util_resource_size() function in u_resource.[ch] (Brian Paul, 2013-04-03, 2 files, -1/+98)
| Reviewed-by: Jose Fonseca <[email protected]>
* util: move functions from u_resource.c to u_transfer.c (Brian Paul, 2013-04-03, 2 files, -75/+74)
| The functions are prototyped in u_transfer.h and are related to the other functions in u_transfer.c. The next patch will re-use the u_resource.c file for new code.
| Reviewed-by: Jose Fonseca <[email protected]>
* gallium/hud: try L8 texture for font if I8 format isn't supported (Brian Paul, 2013-04-03, 1 file, -4/+13)
* gallium/hud: add support for PIPE_QUERY_PIPELINE_STATISTICS (Christoph Bumiller, 2013-04-03, 4 files, -9/+52)
| Also, renamed "pixels-rendered" to "samples-passed" because the occlusion counter increments even if colour and depth writes are disabled, or (on some implementations) for killed fragments that passed the depth test when PS early_fragment_tests is set.
* gallivm: bring back optimized but incorrect float to smallfloat optimizations (Roland Scheidegger, 2013-04-02, 1 file, -38/+78)
| Conceptually the same as previously done in float_to_half. Should cut down number of instructions from 14 to 10 or so, but will promote some NaNs to Infs, so it's disabled. It gets a bit tricky though handling all the cases correctly... Passes basic tests either way (though there are no tests testing special cases, but some manual tests injecting them seemed promising).
| v2: style and comment fixes suggested by Jose
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: consolidate code for float-to-half and float-to-packed conversion. (Roland Scheidegger, 2013-04-02, 3 files, -108/+102)
| This replaces the existing float-to-half implementation. There are definitely a couple of differences - the old implementation had unspecified(?) rounding behavior, and could at least in theory construct Inf values out of NaNs. NaNs and Infs should now always be properly propagated, and rounding behavior is now towards zero (note this means too large but non-Infinity values get propagated to max representable value, not Infinity). The implementation will definitely not match util code, however (which does nearest rounding, which also means too large values will get propagated to Infinity). Also fix a bogus round mask probably leading to rounding bugs...
| v2: fix a logic bug in handling infs/nans.
| Reviewed-by: Jose Fonseca <[email protected]>
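The rounding and special-value rules stated above (round toward zero, finite overflow clamped to the largest half value, NaN and Inf preserved) can be written out in scalar C as a reference. This is a plain sketch of those rules, not the vectorized LLVM IR the commit generates; the function names are invented, and returning one fixed quiet-NaN pattern is an assumption of the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert IEEE-754 binary32 bits to binary16 bits, rounding toward zero. */
uint16_t
half_from_float_bits_rtz(uint32_t bits)
{
   const uint16_t sign = (uint16_t)((bits >> 16) & 0x8000u);
   const uint32_t mag = bits & 0x7fffffffu;

   if (mag > 0x7f800000u)          /* NaN stays NaN (quiet pattern assumed) */
      return (uint16_t)(sign | 0x7e00u);
   if (mag == 0x7f800000u)         /* Inf stays Inf */
      return (uint16_t)(sign | 0x7c00u);
   if (mag >= 0x47800000u)         /* finite, >= 65536: clamp to 65504 */
      return (uint16_t)(sign | 0x7bffu);
   if (mag < 0x33800000u)          /* below 2^-24: truncates to zero */
      return sign;
   if (mag < 0x38800000u) {        /* below 2^-14: half denormal */
      const uint32_t exp = mag >> 23;                       /* 103..112 */
      const uint32_t mant = (mag & 0x007fffffu) | 0x00800000u;
      assert(exp >= 103u && exp <= 112u);
      return (uint16_t)(sign | (mant >> (126u - exp)));     /* truncating shift */
   }
   /* Normal range: rebias the exponent (127 - 15 = 112) and drop the
    * 13 low mantissa bits, which is what rounds toward zero. */
   return (uint16_t)(sign | ((mag - 0x38000000u) >> 13));
}

/* Convenience wrapper taking a float. */
uint16_t
half_from_float_rtz(float f)
{
   uint32_t bits;

   memcpy(&bits, &f, sizeof bits);
   return half_from_float_bits_rtz(bits);
}
```

For comparison, a round-to-nearest conversion would map 1.0019f to 0x3c02 and values above the half range to Infinity, whereas this truncating version gives 0x3c01 and 0x7bff respectively.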
* gallium/hud: do .xxxx swizzling for the font texture in the fragment shader (Marek Olšák, 2013-04-02, 1 file, -6/+30)
| This allows using L8 and R8 for the font if I8 isn't supported.
| Tested-by: Brian Paul <[email protected]>
* hud: flush/unmap the vertex buffer before drawing (Brian Paul, 2013-04-02, 1 file, -0/+3)
| The VMware svga driver is picky about making sure the VBO is unmapped before drawing.
| Reviewed-by: Marek Olšák <[email protected]>
* draw: use pipe_transfer_unmap() to match pipe_transfer_map() (Brian Paul, 2013-04-02, 1 file, -1/+1)
* gallivm: fix signed small float to float conversion (Roland Scheidegger, 2013-04-02, 1 file, -1/+1)
| Introduced by 5f41e08cf39d585d600aa506cdcd2f5380c60ddd, just a silly typo.
| Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62921.
* gallivm: Minor comment cleanup (Adam Jackson, 2013-04-01, 1 file, -1/+1)
| Signed-off-by: Adam Jackson <[email protected]>
* gallivm: consolidate some half-to-float and r11g11b10-to-float code (Roland Scheidegger, 2013-03-29, 3 files, -63/+52)
| Similar enough that we can try to use shared code.
| v2: fix a stupid bug using wrong variable causing mayhem with Inf and NaNs.
| Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix some build breakage when LLVM is not used (Brian Paul, 2013-03-28, 2 files, -1/+8)
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62883
| Tested-by: Vinson Lee <[email protected]>
* llvmpipe/draw: Fix texture sampling in geometry shaders (Zack Rusin, 2013-03-27, 3 files, -62/+84)
| We weren't correctly propagating the samplers and sampler views when they were related to geometry shaders.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/llvm: Cleanup the store debugging code (Zack Rusin, 2013-03-27, 1 file, -8/+5)
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw: Allocate the output buffer for output primitives (Zack Rusin, 2013-03-27, 1 file, -2/+1)
| We were allocating the output buffer but using the input primitives. We need to allocate that buffer using the maximum number of output, not input, primitives.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* gallivm: Implement the breakc instruction (Zack Rusin, 2013-03-27, 2 files, -0/+34)
| Required by more modern examples. Like BRK but with a condition.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
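One way to picture "BRK but with a condition" in an SoA execution model is as a per-lane mask update: each active lane whose condition holds leaves the loop, while the other lanes keep running. The sketch below assumes a four-lane vector, all-ones/zero lane masks, and that a non-zero condition value means "break"; these conventions and all names are assumptions for illustration, not the gallivm implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LANES 4

/* break_mask[i] is all-ones while lane i is still inside the loop.
 * exec_mask[i] is all-ones if lane i currently executes instructions.
 * cond[i] is the per-lane condition operand (non-zero means break,
 * which is an assumption of this sketch). */
void
sketch_soa_breakc(uint32_t break_mask[SKETCH_LANES],
                  const uint32_t exec_mask[SKETCH_LANES],
                  const uint32_t cond[SKETCH_LANES])
{
   int i;

   assert(break_mask && exec_mask && cond);

   for (i = 0; i < SKETCH_LANES; i++) {
      /* Only lanes that are executing and whose condition holds break. */
      if (exec_mask[i] != 0 && cond[i] != 0)
         break_mask[i] = 0;
   }
}
```

An unconditional BRK would correspond to calling this with a condition that is non-zero in every lane.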
* gallivm: implement implicit primitive flushing (Zack Rusin, 2013-03-27, 2 files, -0/+15)
| TGSI semantics currently require an implicit endprim at the end of GS if an ending primitive hasn't been emitted.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* gallium/llvm: implement geometry shaders in the llvm paths (Zack Rusin, 2013-03-27, 9 files, -77/+1283)
| This commit implements code generation of the geometry shaders in the SOA paths. All the code is there but bugs are likely present.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/gs: Fetch more than one primitive per invocation (Zack Rusin, 2013-03-27, 2 files, -13/+48)
| Allows executing gs on up to 4 primitives at a time. Will also be required by the llvm code because there we definitely don't want to flush with just a single primitive.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/gs: Abstract the portions of GS that are tgsi specific (Zack Rusin, 2013-03-27, 2 files, -128/+156)
| To be able to add llvm paths later on we need to have some common interface for them.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* draw/llvm: Remove unused gs_constants from jit_context (Zack Rusin, 2013-03-27, 3 files, -25/+11)
| The member was never used and we'll need to handle it differently because gs will also need samplers/textures setup.
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* gallium: implement a heads-up display module (Marek Olšák, 2013-03-26, 9 files, -0/+2082)
| Reviewed-by: Brian Paul <[email protected]>
| v2: lots of cosmetic changes
* gallium: add interface for driver queries like performance counters, etc. (Marek Olšák, 2013-03-26, 1 file, -1/+1)
| The pipe query interface is reused. The list of available queries can be obtained using pipe_screen::get_driver_query_info.
| Reviewed-by: Brian Paul <[email protected]>
* gallium/tgsi: fix valgrind warning (Marek Olšák, 2013-03-26, 1 file, -1/+1)
| "Conditional jump or move depends on uninitialised value(s)"
| Reviewed-by: Brian Paul <[email protected]>
* cso: add constant buffer save/restore feature for postprocessing (Marek Olšák, 2013-03-26, 4 files, -2/+78)
| Postprocessing is an internal meta op and should restore the states it changes.
* gallium: undef PACKAGE_* macros to silence warnings (Brian Paul, 2013-03-25, 1 file, -0/+8)
| Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: move code for dealing with rgb9e5 and r11g11b10 formats to own file (Roland Scheidegger, 2013-03-24, 5 files, -343/+392)
| This is really not generic conversion stuff and the code is very particular to these formats.
* gallivm: Add code for rgb9e5 shared exponent format to float conversion (Roland Scheidegger, 2013-03-24, 3 files, -3/+118)
| And use this (and the code for r11g11b10 packed float to float conversion) in the soa texturing code (the generated code looks quite good). Should be an order of magnitude faster probably than using the fallback (not measured). Tested with piglit texwrap GL_EXT_packed_float and GL_EXT_texture_shared_exponent respectively (didn't find much else using it).
| Reviewed-by: Jose Fonseca <[email protected]>
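For readers unfamiliar with the shared exponent format, a scalar reference decode is sketched below. It follows the layout of GL_EXT_texture_shared_exponent (9-bit red, green and blue mantissas in the low 27 bits, a 5-bit shared exponent with bias 15 in the top bits), so each channel is mantissa * 2^(exponent - 15 - 9). The function name is invented, and this is not the SoA code the commit adds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode one packed rgb9e5 texel into three floats. */
void
sketch_rgb9e5_to_float3(uint32_t packed, float rgb[3])
{
   const uint32_t exponent = packed >> 27;        /* 0..31 */
   /* Build 2^(exponent - 24) directly as a float bit pattern:
    * biased float exponent = exponent - 24 + 127 = exponent + 103. */
   const uint32_t scale_bits = (exponent + 103u) << 23;
   float scale;

   assert(rgb != NULL);
   memcpy(&scale, &scale_bits, sizeof scale);

   rgb[0] = (float)(packed & 0x1ffu) * scale;
   rgb[1] = (float)((packed >> 9) & 0x1ffu) * scale;
   rgb[2] = (float)((packed >> 18) & 0x1ffu) * scale;
}
```

Because the mantissas have no implicit leading one, the decode needs no special cases for denormals, and the format cannot represent Inf or NaN.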
* llvmpipe: add EXT_packed_float render target format support (Roland Scheidegger, 2013-03-22, 2 files, -0/+252)
| New conversion code to handle conversion from/to r11g11b10 AoS to/from SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA (which works pretty much the same as r11g11b10 except for the packing). (This code should also be used for texture sampling instead of relying on u_format conversion but it's not yet, so rgb9e5 is unused.)
| Unfortunately a crazy amount of hacks is necessary to get the conversion code running in llvmpipe's generate_unswizzled_blend, which isn't well suited for formats where the storage representation has nothing to do with what's needed for blending (moreover, the conversion will convert from packed AoS values, which is the storage format, to float SoA values, because this is much more natural for the conversion, and likewise from SoA values to packed AoS values - but the "blend" (which includes trivial things like partial mask) works on AoS values, so incoming fs values will go SoA->AoS, values from destination will go packed AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably isn't the most efficient way though the shuffles are probably bearable).
| Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter), still need to verify Inf/NaNs (where most of the complexity in the conversion comes from actually).
| v2: drop the (very bogus) rgb9e5 part, and do component extraction in the helper code for r11g11b10 to float conversion, making the code slightly more compact (suggested by Jose), now that there are no other callers left this works quite well. (Could do the same for the opposite way but it's less than ideal there, final part of packing needs to be done in caller anyway and there'd be another conditional.)
| v3: minor style and comment fixes. Also fix a potential issue with negative zero being potentially returned by max(src, zero) as we don't have well-defined min/max behavior (fortunately no additional cost).
| Reviewed-by: Jose Fonseca <[email protected]>
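The unpacking direction (packed r11g11b10 to floats) can be illustrated with a scalar reference. It assumes the GL_EXT_packed_float layout: red in bits 0-10 and green in bits 11-21 as unsigned 11-bit floats (5-bit exponent, 6-bit mantissa), blue in bits 22-31 as an unsigned 10-bit float (5-bit exponent, 5-bit mantissa), exponent bias 15. Function names are made up for the example, and the real code operates on vectors and also handles the float-to-packed direction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Expand an unsigned small float (5 exponent bits, mant_bits mantissa
 * bits, bias 15) to a 32-bit float. */
float
sketch_small_ufloat_to_float(uint32_t v, unsigned mant_bits)
{
   const uint32_t mant = v & ((1u << mant_bits) - 1u);
   const uint32_t exponent = (v >> mant_bits) & 0x1fu;
   uint32_t bits;
   float f;

   assert(mant_bits == 5u || mant_bits == 6u);

   if (exponent == 0u) {
      /* Zero or denormal: mant * 2^(-14 - mant_bits). */
      bits = (127u - 14u - mant_bits) << 23;
      memcpy(&f, &bits, sizeof f);
      return (float)mant * f;
   }
   if (exponent == 31u)
      bits = 0x7f800000u | (mant << (23u - mant_bits));   /* Inf or NaN */
   else
      bits = ((exponent + 112u) << 23) | (mant << (23u - mant_bits));
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Decode one packed r11g11b10 texel into three floats. */
void
sketch_r11g11b10_to_float3(uint32_t packed, float rgb[3])
{
   rgb[0] = sketch_small_ufloat_to_float(packed & 0x7ffu, 6u);
   rgb[1] = sketch_small_ufloat_to_float((packed >> 11) & 0x7ffu, 6u);
   rgb[2] = sketch_small_ufloat_to_float(packed >> 22, 5u);
}
```

The exponent-31 branch is where Inf and NaN inputs are carried over, which is the part of the conversion the commit message singles out as needing the most care.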
* postprocess: silence some MSVC float/int warnings (Brian Paul, 2013-03-21, 2 files, -4/+4)