path: root/src/gallium/auxiliary/util
Commit message (Author, Age, Files, Lines)
* gallium/util: easy fixes for NULL colorbuffers (Marek Olšák, 2014-01-13, 2 files, -1/+7)
    Reviewed-by: Brian Paul <[email protected]>
* st/mesa: bind NULL colorbuffers as specified by glDrawBuffers (Marek Olšák, 2014-01-13, 2 files, -0/+25)
    An example of why it is required:

    Let's say there's a fragment shader writing to gl_FragData[0..1]. The user calls:

        glDrawBuffers(2, {GL_NONE, GL_COLOR_ATTACHMENT0});

    That means gl_FragData[0] is unused and gl_FragData[1] is written to GL_COLOR_ATTACHMENT0.

    st/mesa was skipping the GL_NONE draw buffer, therefore gl_FragData[0] was written to GL_COLOR_ATTACHMENT0, which was wrong.

    This commit fixes it, but drivers must also be fixed not to crash when binding NULL colorbuffers. There is also a new set of piglit tests for this.

    The MSAA state also had to be fixed not to crash when reading fb->cbufs[0].

    Reviewed-by: Brian Paul <[email protected]>
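The slot mapping described in this commit message can be sketched as follows. This is a minimal illustration, not the st/mesa code: `struct surface`, `DRAW_BUFFER_NONE`, `MAX_DRAW_BUFFERS` and `bind_draw_buffers` are hypothetical names invented for the example. The point is that slot i always corresponds to gl_FragData[i], so a GL_NONE entry produces a NULL slot rather than shifting later entries down.

```c
#include <stddef.h>

#define MAX_DRAW_BUFFERS 8      /* assumed limit, for illustration only */
#define DRAW_BUFFER_NONE (-1)   /* hypothetical stand-in for GL_NONE */

/* Hypothetical stand-in for a colorbuffer surface. */
struct surface {
   int attachment;
};

/*
 * Fill cbufs[] so that cbufs[i] receives the output of gl_FragData[i].
 * draw_buffers[i] is either DRAW_BUFFER_NONE or an index into attachments[].
 * A "none" entry becomes a NULL slot instead of being skipped.
 * Returns the number of slots written.
 */
static unsigned
bind_draw_buffers(const int *draw_buffers, unsigned count,
                  struct surface *attachments, struct surface **cbufs)
{
   unsigned i;

   if (count > MAX_DRAW_BUFFERS)
      count = MAX_DRAW_BUFFERS;

   for (i = 0; i < count; i++) {
      if (draw_buffers[i] == DRAW_BUFFER_NONE)
         cbufs[i] = NULL;                          /* keep the slot, bind nothing */
      else
         cbufs[i] = &attachments[draw_buffers[i]];
   }
   return i;
}
```

With the draw buffer list {none, attachment 0}, slot 0 is NULL and slot 1 points at attachment 0, which matches the behavior the message asks for. Any consumer of such an array has to tolerate NULL entries.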
* gallium/u_blitter: implement shader-based MSAA resolve with bilinear filtering (Marek Olšák, 2013-12-14, 3 files, -31/+149)
    For scaled resolve.

    The filter is only good for magnification. If somebody has an idea how to implement a good filter for minification, I'm all ears. I'd have to use derivatives probably.

    Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: implement shader-based MSAA resolve (Marek Olšák, 2013-12-14, 3 files, -23/+158)
    We need this for integer formats and upside-down blits, which Radeons don't support for MSAA resolving.

    It can be used by calling util_blitter_blit.

    Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: remove useless parameters from some functions (Marek Olšák, 2013-12-14, 2 files, -22/+13)
    Reviewed-by: Brian Paul <[email protected]>
* util: fix compile breakage (Brian Paul, 2013-12-12, 1 file, -1/+1)
    D'oh!
* util: move variable declaration out of for-loop (Brian Paul, 2013-12-12, 1 file, -1/+3)
    To fix MSVC build.
* gallium/util: implement new color clear API in u_blitter (Marek Olšák, 2013-12-12, 1 file, -3/+42)
* gallium: allow choosing which colorbuffers to clear (Marek Olšák, 2013-12-12, 1 file, -3/+4)
    Required for glClearBuffer, which only clears one colorbuffer attachment.

    Example: If the first colorbuffer is float and the second one is int:

        pipe->clear(pipe, PIPE_CLEAR_COLOR0, float_clear_color, ...);
        pipe->clear(pipe, PIPE_CLEAR_COLOR1, int_clear_color, ...);

    This doesn't need any driver changes yet, because all drivers just use:

        if (flags & PIPE_CLEAR_COLOR) ..

    The drivers which support GL 3.0 will have to implement it properly though.
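A driver that implements the per-colorbuffer flags "properly" might test one bit per colorbuffer rather than the combined mask. The sketch below is an assumption about what that could look like, not code from any driver: `CLEAR_COLOR0`, `MAX_CBUFS` and `cbufs_to_clear` are illustrative names, and the flag value is made up for the example rather than taken from the Gallium headers.

```c
#include <stddef.h>

/* Illustrative flag value; the real PIPE_CLEAR_* values live in Gallium's headers. */
#define CLEAR_COLOR0 (1u << 2)
#define MAX_CBUFS 8

/*
 * Return a bitmask of colorbuffer indices that a clear call would touch.
 * Bit i of the result is set when the flag for colorbuffer i is present
 * in 'flags' and a colorbuffer is actually bound in slot i.
 */
static unsigned
cbufs_to_clear(unsigned flags, void *const *cbufs, unsigned nr_cbufs)
{
   unsigned mask = 0;
   unsigned i;

   for (i = 0; i < nr_cbufs && i < MAX_CBUFS; i++) {
      if ((flags & (CLEAR_COLOR0 << i)) && cbufs[i] != NULL)
         mask |= 1u << i;
   }
   return mask;
}
```

Checking for NULL slots here also covers the case, introduced by the later glDrawBuffers commit, where a colorbuffer slot is deliberately left unbound.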
* gallium/util: implement layered framebuffer clear in u_blitter (Marek Olšák, 2013-12-03, 6 files, -25/+106)
    All bound layers (from first_layer to last_layer) should be cleared.

    This uses a vertex shader which outputs gl_Layer = gl_InstanceID, so each instance goes to a different layer. By rendering a quad and setting the instance count to the number of layers, it will trivially clear all layers.

    This requires AMD_vertex_shader_layer (or PIPE_CAP_TGSI_VS_LAYER), which only radeonsi supports at the moment. r600 could do this too. Standard DX11 hardware will have to use a geometry shader though, which has higher overhead.
* trace: Dump PIPE_QUERY_* enums. (José Fonseca, 2013-11-28, 2 files, -0/+36)
    Reviewed-by: Roland Scheidegger <[email protected]>
* u_gen_mipmap: Use untampered cubemap texture coords when generating mipmaps. (José Fonseca, 2013-11-20, 5 files, -6/+19)
    It's not necessary to scale down cubemap texture coords when generating mipmaps: we are doing a 2x minification, therefore it's guaranteed that the texture coords will always be at least 1 texel away from the edges.

    Scaling down can actually be harmful, as it may cause artefacts when generating mipmaps with nearest filtering. Sample points will lie exactly in the middle of each 2x2 texel block, so the scaling factor was causing different texels to be taken on each quadrant of the cube face. This is apparent with a 1x1 checkerboard pattern in the base mipmap level: instead of the next mipmap level receiving a constant color throughout the face, it will have different colors for each quadrant of the face.

    The behaviour for blits is left untouched for now, but the cubemap texture coord scaling hack should be reconsidered eventually.

    Reviewed-by: Brian Paul <[email protected]>
* util: set all unused cbufs to NULL in util_copy_framebuffer_state() (Brian Paul, 2013-11-11, 1 file, -1/+2)
    This helps fix an issue in the svga driver, and is just safer all-around.

    Reviewed-by: José Fonseca <[email protected]>
* draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float (Matthew McClure, 2013-11-07, 3 files, -11/+36)
    With this patch, the llvmpipe and draw modules will calculate the depth bias according to floating point depth buffer semantics described in the arb_depth_buffer_float specification, when the driver has a z buffer bound with a format type of UTIL_FORMAT_TYPE_FLOAT. By default, the driver will use the existing UNORM calculation for depth bias.

    A new function, draw_set_zs_format, was added to calculate the Minimum Resolvable Depth value and floating point depth sense for the draw module.

    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
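For a floating-point depth buffer, the OpenGL rule is that the minimum resolvable difference depends on the primitive: r = 2^(e - n), where e is the exponent of the largest depth value in the primitive and n is the number of mantissa bits. The polygon offset is then factor * m + units * r, with m the maximum depth slope. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and it is not the draw module's implementation.

```c
#include <stdint.h>
#include <string.h>

#define FLOAT32_MANTISSA_BITS 23

/* Unbiased IEEE-754 exponent of a 32-bit float (zero and denormals give -127). */
static int
float32_exponent(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof bits);
   return (int)((bits >> 23) & 0xff) - 127;
}

/* Exact 2^e as a double, valid for -1022 <= e <= 1023. */
static double
pow2(int e)
{
   uint64_t bits = (uint64_t)(e + 1023) << 52;
   double d;
   memcpy(&d, &bits, sizeof d);
   return d;
}

/* Minimum resolvable difference r = 2^(e - n) for a 32-bit float depth buffer. */
static double
float_depth_mrd(float max_abs_z)
{
   return pow2(float32_exponent(max_abs_z) - FLOAT32_MANTISSA_BITS);
}

/* Polygon offset: units * r + factor * (maximum depth slope). */
static double
depth_bias(double units, double factor, double max_slope, double mrd)
{
   return units * mrd + factor * max_slope;
}
```

For a primitive whose largest depth is 1.0 the sketch gives r = 2^-23, and for 0.75 it gives 2^-24, so the bias shrinks as the primitive gets closer to zero depth. A UNORM buffer would instead use one constant r for the whole buffer.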
* util/u_format: take normalized flag in consideration in util_format_is_rgba8_variant (José Fonseca, 2013-11-05, 1 file, -0/+3)
    Just happened to notice it was missing while looking at it.
* util,llvmpipe: correctly set the minimum representable depth value (Matthew McClure, 2013-10-29, 2 files, -0/+52)
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* gallium: new, unified pipe_context::set_sampler_views() function (Brian Paul, 2013-10-23, 1 file, -6/+6)
    The new function replaces four old functions: set_fragment/vertex/geometry/compute_sampler_views().

    Note: at this time, it's expected that the 'start' parameter will always be zero.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Tested-by: Emil Velikov <[email protected]>
* util: Fix MinGW build. (José Fonseca, 2013-10-09, 1 file, -1/+1)
    _GNU_SOURCE appears to not be used reliably. Use _MSC_VER instead so that MSVC alone is affected.
* util/u_math: Fix C++ include of u_math.h on MSVC. (José Fonseca, 2013-10-10, 1 file, -1/+1)
    GNU C++ compiler declares the C99 lrint, etc. when _GNU_SOURCE is defined, but MSVC does not.

    Trivial.
* util: when packing depth values, round to nearest. (Matthew McClure, 2013-10-04, 2 files, -4/+56)
    This patch adds the lrint, lrintf, llrint, and llrintf rounding utility functions. When packing unorm depth values, we will round to nearest.

    Reviewed-by: Roland Scheidegger <[email protected]>
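The difference between truncating and rounding to nearest when packing a UNORM depth value can be seen in a small sketch. Note the substitution: to stay free of math-library dependencies this example rounds by adding 0.5 and truncating, as a stand-in for the lrint-style functions the commit adds (the two differ only in how exact ties are resolved). The function names are hypothetical.

```c
#include <stdint.h>

/* Clamp a depth value to the [0, 1] range expected by UNORM formats. */
static double
clamp01(double z)
{
   if (z < 0.0)
      return 0.0;
   if (z > 1.0)
      return 1.0;
   return z;
}

/* Pack to 16-bit UNORM by truncation: always rounds toward zero. */
static uint16_t
pack_z16_truncate(double z)
{
   return (uint16_t)(clamp01(z) * 65535.0);
}

/* Pack to 16-bit UNORM rounding to the nearest representable value. */
static uint16_t
pack_z16_nearest(double z)
{
   return (uint16_t)(clamp01(z) * 65535.0 + 0.5);
}
```

For z = 0.00001 the scaled value is about 0.655, which truncation maps to 0 and rounding maps to 1. Truncation is therefore biased toward smaller depth values, while rounding keeps the error within half a step in either direction.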
* util: remove old bind_fragment_sampler_states() calls from blitter code (Brian Paul, 2013-10-03, 1 file, -22/+9)
* util: use pipe_context::bind_sampler_states() if non-null (Brian Paul, 2013-10-03, 1 file, -6/+22)
* util/u_format: Assert that format block size is at least 1 byte. (Vinson Lee, 2013-09-30, 1 file, -1/+6)
    The block size for all formats is currently at least 1 byte. Add an assertion for this.

    This should silence several Coverity "Division or modulo by zero" defects.

    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* gallium: include u_surface.h instead of u_rect.h (Brian Paul, 2013-09-30, 3 files, -8/+2)
    u_rect.h was including u_surface.h just to avoid touching a bunch of other source files after some functions were moved from u_rect.h to u_surface.h. This patch cleans up that hack.

    Reviewed-by: Roland Scheidegger <[email protected]>
* util/u_blit: Implement util_blit_pixels via pipe_context::blit. (José Fonseca, 2013-09-18, 1 file, -410/+37)
    This removes a lot of code, but not everything, as util_blit_pixels_tex is still useful when one needs to override pipe_sampler_view::swizzle_?.

    Reviewed-by: Zack Rusin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* util/u_blit: Support blits from cubemaps. (José Fonseca, 2013-09-18, 2 files, -3/+32)
    By calling util_map_texcoords2d_onto_cubemap.

    A new parameter for util_blit_pixels_tex is necessary, as pipe_sampler_view::first_layer is always supposed to point to the first face when sampling from cubemaps.

    Reviewed-by: Zack Rusin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* util: Fix unmatched parenthesis. (Vinson Lee, 2013-09-10, 1 file, -1/+1)
    Fixes MSVC build error introduced with commit 923d3467147dd301d94ed3e6b41295fb2bcd6f47.

        src\gallium\auxiliary\util\u_cpu_detect.c(286) : fatal error C1012: unmatched parenthesis : missing '('

    Signed-off-by: Vinson Lee <[email protected]>
* util: don't use _fxsave() with MSVC 2010 or older (Brian Paul, 2013-09-10, 1 file, -1/+4)
    And update _MSC_VER comments in p_config.h

    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: Support PIPE_FORMAT_R10G10B10A2_UINT. (José Fonseca, 2013-08-22, 1 file, -0/+1)
    Same as PIPE_FORMAT_B10G10R10A2_UINT but without the swizzling.

    Reviewed-by: Roland Scheidegger <[email protected]>
* util: add avx2 and xop detection to cpu detection code (Roland Scheidegger, 2013-08-20, 2 files, -0/+50)
    Going to need this soon (not going to bother with avx2 intrinsics at this time, but don't want to do workarounds for true vector shifts if llvm itself can use them just fine and won't need the gazillion-instruction emulation).

    Not really tested other than that my cpu returns 0 for these features... (I have no idea if llvm actually would emit avx2/xop instructions either...)

    Reviewed-by: Jose Fonseca <[email protected]>
* vl: rename enum pipe_video_codec to pipe_video_format (Christian König, 2013-08-19, 1 file, -6/+6)
    Signed-off-by: Christian König <[email protected]>
* util: (trivial) fix asm input/output list for fxsave (Roland Scheidegger, 2013-08-09, 1 file, -1/+1)
    Otherwise gcc might do very unsafe optimizations, spotted by Uros Bizjak. Hopefully this time it's finally right?
* util: (trivial) fix more compile errors in u_cpu_detect (gcc/x86 this time). (Dieter Nützel, 2013-08-09, 1 file, -1/+1)
    Oops. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=67921
* util: (trivial) fix compile error with MSVC on x86 (Roland Scheidegger, 2013-08-08, 1 file, -1/+1)
* util: try much harder to set DAZ flag (Roland Scheidegger, 2013-08-08, 3 files, -1/+31)
    While so far this only causes some harmless test failures, there are many more CPUs with DAZ. All 64-bit capable ones can do it (particularly relevant for AMD CPUs as they supported sse3 very very late), but if really necessary we can check support for that for real with some more magic. (In fact just about ANY cpu with sse2 can support DAZ. I believe the only exceptions are first gen P4 (Willamette), and from those only early steppings which can't do it; it's almost like Intel forgot to add it... A real pity, though docs say you can't just try to set it as they will throw a GPF.)

    While this was meant to address https://bugs.freedesktop.org/show_bug.cgi?id=67672 it does not fix it. Most likely the tests need fixing, as I don't think there's any guarantee about denorm handling in the reference math library functions if the flags aren't set to standard values. Nevertheless, enabling DAZ on all CPUs which can do it should be the right thing to do.

    Reviewed-by: Jose Fonseca <[email protected]>
* util: implement table-based + linear interpolation linear-to-srgb conversion (Roland Scheidegger, 2013-08-08, 2 files, -11/+102)
    Should be much faster, seems to work in softpipe.

    While here (also it's now disabled) fix up the pow factor - the former value is what is in GL core it is however not actually accurate to fp32 standard (as it is 1.0/2.4), and if someone would do all the accurate math there's no reason to waste 8 mantissa bits or so...

    v2: use real table generating function instead of just printing the values (might take a bit longer as it does calculations on some 3+ million floats but much more descriptive obviously). Also fix up another inaccurate pow factor (this time in the python code) - wondering where the couple one bit errors came from :-(.

    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Zack Rusin <[email protected]>
* gallium/util: reformat, comment util_get_offset() (Brian Paul, 2013-07-31, 1 file, -3/+7)
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/util: comments, var renaming in u_inlines.h (Brian Paul, 2013-07-31, 1 file, -13/+48)
    The variable 'usage' was being used for two different things. Sometimes for PIPE_USAGE_x and other times for PIPE_TRANSFER_x. This renames usage to access when we're talking about PIPE_TRANSFER_x flags. Plus, add a bunch of comments to remind us what's going on.

    Also, use unsigned for PIPE_TRANSFER_x bitmask to be consistent with other places. And add a missing const qualifier.

    Reviewed-by: Roland Scheidegger <[email protected]>
* st/dri: implement the driconf option force_s3tc_enable properly (Marek Olšák, 2013-07-30, 1 file, -10/+2)
    Reviewed-by: Brian Paul <[email protected]>
* util: don't flush overflowing values to infinity in half-float conversion (Roland Scheidegger, 2013-07-27, 2 files, -9/+17)
    I am not able to find _any_ rounding behavior specified for OpenGL for float to half-float conversions. However, it is specified for fp11/fp10 which suggests round to next finite value but round-to-zero would also be allowed, but finite values must not be flushed to infinity in either case. Hence I believe it makes sense to do the same for half-floats too. We could probably also use round-to-zero consistently, which is in fact required by d3d10 (but it doesn't seem to matter much).

    Does not match the mesa core function doing the same though (which is saying it was built to match intel gpus which I don't believe for a second as it would cause failures in d3d10, moreover the PRM (for ivy bridge, not listed in older manuals) while not specifying rounding behavior clearly states finite numbers are never flushed to infinity).

    Reviewed-by: Jose Fonseca <[email protected]>
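The rule "finite values must not be flushed to infinity" can be illustrated with a small converter. This is a sketch under stated assumptions, not the util conversion code: it uses round-to-zero (truncation) throughout, which the message mentions as an allowed alternative, and `float_to_half_rtz` is a hypothetical name. Finite inputs too large for a half float saturate to the largest finite half (0x7bff, i.e. 65504) and only a true infinity produces 0x7c00.

```c
#include <stdint.h>
#include <string.h>

/* Convert a 32-bit float to IEEE half-float bits, rounding toward zero. */
static uint16_t
float_to_half_rtz(float f)
{
   uint32_t bits;
   uint16_t sign;
   uint32_t exp, mant;
   int e;

   memcpy(&bits, &f, sizeof bits);
   sign = (uint16_t)((bits >> 16) & 0x8000);
   exp = (bits >> 23) & 0xff;
   mant = bits & 0x7fffff;

   if (exp == 0xff)                       /* infinity or NaN stays non-finite */
      return (uint16_t)(sign | 0x7c00 | (mant ? 0x200 : 0));

   e = (int)exp - 127 + 15;               /* rebias the exponent for half */

   if (e >= 31)                           /* finite overflow: saturate, never infinity */
      return (uint16_t)(sign | 0x7bff);

   if (e <= 0) {                          /* result is a half denormal or zero */
      if (e < -10)
         return sign;
      mant |= 0x800000;                   /* make the implicit leading one explicit */
      return (uint16_t)(sign | (mant >> (14 - e)));
   }

   return (uint16_t)(sign | ((uint32_t)e << 10) | (mant >> 13));
}
```

A round-to-nearest variant would need extra care at the top of the range, since rounding the mantissa up can carry into the exponent and produce the infinity encoding by accident.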
* gallium/util: Fix detection of AVX cpu caps (Andre Heider, 2013-07-23, 1 file, -2/+25)
    For AVX it's not sufficient to only rely on the cpuid flags. If the CPU supports these extensions, but the OS doesn't, issuing these insns will trigger an undefined opcode exception.

    In addition to the AVX cpuid bit we also need to:
    * test cpuid for OSXSAVE support
    * XGETBV to check if the OS saves/restores AVX regs on context switches

    See "Detecting Availability and Support" at http://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions

    Signed-off-by: Andre Heider <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: José Fonseca <[email protected]>
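The decision logic from the list above can be written as a pure function over the raw register values. This is a sketch, not the u_cpu_detect code: `avx_usable` is a hypothetical name, and the caller is assumed to have obtained ECX from CPUID leaf 1 and XCR0 from XGETBV (which must only be executed when the OSXSAVE bit is set, so pass 0 otherwise). The bit positions used are the architectural ones: OSXSAVE is ECX bit 27, AVX is ECX bit 28, and XCR0 bits 1 and 2 cover the SSE and AVX register state.

```c
#include <stdint.h>

#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_SSE_STATE     (1u << 1)   /* OS saves XMM registers */
#define XCR0_AVX_STATE     (1u << 2)   /* OS saves upper halves of YMM registers */

/*
 * Return 1 when AVX instructions are safe to issue, 0 otherwise.
 * cpuid1_ecx: ECX output of CPUID leaf 1.
 * xcr0:       value read with XGETBV(0), or 0 if OSXSAVE is not set.
 */
static int
avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
   const uint64_t needed = XCR0_SSE_STATE | XCR0_AVX_STATE;

   if (!(cpuid1_ecx & CPUID1_ECX_AVX))
      return 0;                        /* CPU lacks AVX */
   if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
      return 0;                        /* XGETBV unavailable, cannot ask the OS */
   return (xcr0 & needed) == needed;   /* OS must preserve both XMM and YMM state */
}
```

Keeping the check separate from the instructions that read the registers makes it easy to test on any host, including ones without AVX.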
* util/u_math: Define NAN/INFINITY macros for MSVC. (José Fonseca, 2013-07-20, 1 file, -0/+4)
    Untested. But should hopefully fix the build.
* util/u_format_s3tc: handle srgb formats correctly. (Roland Scheidegger, 2013-07-17, 2 files, -185/+254)
    Instead of just ignoring the srgb/linear conversions, simply call the corresponding conversion functions, for all of pack/unpack/fetch, both for float and unorm8 versions (though some don't make a whole lot of sense, i.e. unorm8/unorm8 srgb/linear combinations).

    Refactored some functions a bit so don't have to duplicate all the code (there's a slight change for packing dxt1_rgb, as there will now be always 4 components initialized and sent to the external compression function so the same code can be used for all, the quite horrid and ad-hoc interface (by now) should always have worked with that).

    Fixes llvmpipe/softpipe piglit texwrap GL_EXT_texture_sRGB-s3tc.

    Reviewed-by: Jose Fonseca <[email protected]>
* gallium/util: use explicitly sized types for {un,}pack_rgba_{s,u}int (Emil Velikov, 2013-07-17, 2 files, -8/+8)
    Every function but the above four uses explicitly sized types for their src and dst arguments. Even fetch_rgba_{s,u}int follows the convention.

    Signed-off-by: Emil Velikov <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* util/u_format: Comment out half float denormal test case. (José Fonseca, 2013-07-12, 1 file, -0/+5)
    So that lp_test_format doesn't fail until we decide what should be done.
* tgsi: rename the TGSI fragment kill opcodes (Brian Paul, 2013-07-12, 1 file, -5/+5)
    TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional kill (if any src component < 0). The latter was unconditional kill. At one time KILP was supposed to work with NV-style condition codes/predicates but we never had that in TGSI.

    This patch renames both opcodes:
        TGSI_OPCODE_KIL  -> KILL_IF (kill if src.xyzw < 0)
        TGSI_OPCODE_KILP -> KILL    (unconditional kill)

    Note: I didn't just transpose the opcode names to help ensure that I didn't miss updating any code anywhere.

    I believe I've updated all the relevant code and comments but I'm not 100% sure that some drivers had this right in the first place. For example, the radeon driver might have llvm.AMDGPU.kill and llvm.AMDGPU.kilp mixed up. Driver authors should review their code.

    Reviewed-by: Jose Fonseca <[email protected]>
* util: add casts to silence MSVC warnings in u_blit.c (Brian Paul, 2013-07-12, 1 file, -14/+14)
* util/u_math: Use xmmintrin.h whenever possible. (José Fonseca, 2013-07-10, 1 file, -9/+17)
    It seems __builtin_ia32_ldmxcsr is only available on gcc and only when -msse is used.

    xmmintrin.h/pmmintrin.h provide portable intrinsics, but these too are only available with gcc when -msse/-msse3 are set.

    scons build always sets -msse on x86 builds, but autotools doesn't seem to.

    We could try to get this working on gcc x86 without -msse by emitting assembly, but I believe that in this day and age we really should be building Mesa with -msse and -msse2.
* util: treat denorm'ed floats like zero (Zack Rusin, 2013-07-09, 2 files, -0/+63)
    The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few CPUs it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow-up commit.

    Signed-off-by: Zack Rusin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
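What "treated like zero" means for a single value can be shown in software. This is only an illustration of the semantics; judging by the neighboring commits about the DAZ flag and ldmxcsr, the actual change most likely configures CPU floating-point control flags rather than testing each value. The helper names below are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw IEEE-754 bit pattern. */
static uint32_t
float_to_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof u);
   return u;
}

/* Reinterpret a raw IEEE-754 bit pattern as a float. */
static float
bits_to_float(uint32_t u)
{
   float f;
   memcpy(&f, &u, sizeof f);
   return f;
}

/*
 * Replace a denormal input with a zero of the same sign, leaving every
 * other value (normals, zeros, infinities, NaNs) untouched.
 */
static float
flush_denorm_to_zero(float f)
{
   uint32_t bits = float_to_bits(f);
   uint32_t exponent = bits & 0x7f800000u;
   uint32_t mantissa = bits & 0x007fffffu;

   if (exponent == 0 && mantissa != 0)
      bits &= 0x80000000u;   /* keep only the sign bit */

   return bits_to_float(bits);
}
```

The sign is preserved so that a negative denormal becomes -0, matching the "-0 or +0" wording in the message.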
* draw/translate: fix instancing (Zack Rusin, 2013-06-28, 1 file, -4/+4)
    We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to:

        start_instance * stride + (instance_num / instance_divisor) * stride

    We were completely ignoring the start instance, quite often producing instances that were completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be 1 * stride, not 0 as we'd have.

    This fixes it and adjusts all the code to the changes.

    Signed-off-by: Zack Rusin <[email protected]>
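The offset formula from this message translates directly into code. The sketch below is a standalone illustration with a hypothetical function name, and it assumes instance_divisor is at least 1 (a divisor of zero conventionally means the attribute is not instanced, which is outside this formula).

```c
/*
 * Byte offset of the per-instance attribute for a given instance:
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * instance_num counts from 0 within the draw call; the division is an
 * integer division, so the attribute advances once every
 * instance_divisor instances.
 */
static unsigned
instance_buffer_offset(unsigned start_instance, unsigned instance_num,
                       unsigned instance_divisor, unsigned stride)
{
   return start_instance * stride +
          (instance_num / instance_divisor) * stride;
}
```

With a stride of 16 bytes, the two cases from the message give 5 * 16 = 80 and 1 * 16 = 16 on the first iteration. Applying the divisor to the start instance as well would give (5 / 2) * 16 = 32 and (1 / 3) * 16 = 0, which are the wrong values the message describes.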