path: root/src/gallium/auxiliary
Commit message | Author | Age | Files | Lines
* gallium: fixup definitions of the rsq and sqrt | Zack Rusin | 2013-07-11 | 2 | -13/+8
| GLSL spec says that rsq is undefined for src<=0, but the D3D10 spec says it needs to be a NaN, so let's stop taking an absolute value of the source, which completely breaks that behavior. For the GL program we can simply insert an extra abs instruction, which produces the desired behavior there.
| Signed-off-by: Zack Rusin
| Reviewed-by: Roland Scheidegger
| Reviewed-by: Brian Paul
* util/u_format: Comment out half float denormal test case. | José Fonseca | 2013-07-12 | 1 | -0/+5
| So that lp_test_format doesn't fail until we decide what should be done.
* gallivm: Eliminate redundant lp_build_select calls. | José Fonseca | 2013-07-12 | 1 | -12/+2
| lp_build_cmp already returns 0 / ~0, so the lp_build_select call is unnecessary.
| Reviewed-by: Roland Scheidegger
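A scalar illustration of why the select was redundant. This is only a sketch: `cmp_less` and `select_bits` are hypothetical stand-ins written for this note, not the gallivm `lp_build_*` builders, which operate on LLVM vector values. When a comparison already yields an all-zeros or all-ones mask, selecting between ~0 and 0 with that mask returns the mask unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a comparison that returns a bit mask:
 * all ones when a < b, all zeros otherwise. */
static uint32_t cmp_less(float a, float b)
{
   return (a < b) ? ~UINT32_C(0) : UINT32_C(0);
}

/* Hypothetical bitwise select: takes the bits of a where mask is set and
 * the bits of b elsewhere. The mask must be all zeros or all ones. */
static uint32_t select_bits(uint32_t mask, uint32_t a, uint32_t b)
{
   assert(mask == 0 || mask == ~UINT32_C(0));
   return (mask & a) | (~mask & b);
}
```

With these helpers, `select_bits(m, ~0, 0)` equals `m` for either mask value, so the extra select adds nothing.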
* tgsi: rename the TGSI fragment kill opcodes | Brian Paul | 2013-07-12 | 14 | -46/+44
| TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional kill (if any src component < 0). The latter was unconditional kill. At one time KILP was supposed to work with NV-style condition codes/predicates but we never had that in TGSI.
| This patch renames both opcodes:
|   TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0)
|   TGSI_OPCODE_KILP -> KILL (unconditional kill)
| Note: I didn't just transpose the opcode names to help ensure that I didn't miss updating any code anywhere.
| I believe I've updated all the relevant code and comments but I'm not 100% sure that some drivers had this right in the first place. For example, the radeon driver might have llvm.AMDGPU.kill and llvm.AMDGPU.kilp mixed up. Driver authors should review their code.
| Reviewed-by: Jose Fonseca
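The two semantics described above can be sketched in plain C. The function names `kill_if` and `kill_always` are hypothetical and only model the rule stated in the commit message; they are not the TGSI executor's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch of KILL_IF: the fragment is discarded when any of the four
 * source components is negative. */
static bool kill_if(const float src[4])
{
   assert(src != NULL);
   for (size_t i = 0; i < 4; i++) {
      if (src[i] < 0.0f)
         return true;
   }
   return false;
}

/* Sketch of KILL: no operand, the fragment is always discarded. */
static bool kill_always(void)
{
   return true;
}
```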
* tgsi: fix-up KILP comments | Brian Paul | 2013-07-12 | 2 | -5/+3
| KILP is really unconditional fragment kill. We've had KIL and KILP transposed forever. I'll fix that next.
| Reviewed-by: Jose Fonseca
* tgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vector | Brian Paul | 2013-07-12 | 1 | -1/+1
| To align with the docs and the state tracker.
| Reviewed-by: Jose Fonseca
* tgsi: use X component of the second operand in exec_scalar_binary() | Brian Paul | 2013-07-12 | 1 | -1/+1
| The code happened to work in the past since the (scalar) src args effectively always have a swizzle of .xxxx, .yyyy, .zzzz, or .wwww so whether you grab the X or Y component doesn't really matter. Just fixing the code to make it look right.
| Reviewed-by: Roland Scheidegger
* os: add os_get_process_name() function | Brian Paul | 2013-07-12 | 3 | -0/+133
| v2: explicitly test for BSD/APPLE, #warning for unexpected environments.
* hud: silence some MSVC warnings | Brian Paul | 2013-07-12 | 1 | -8/+8
* util: add casts to silence MSVC warnings in u_blit.c | Brian Paul | 2013-07-12 | 1 | -14/+14
* tgsi: s/unsigned/int/ to silence MSVC warning | Brian Paul | 2013-07-12 | 1 | -1/+1
* util/u_math: Use xmmintrin.h whenever possible. | José Fonseca | 2013-07-10 | 1 | -9/+17
| It seems __builtin_ia32_ldmxcsr is only available on gcc and only when -msse is used.
| xmmintrin.h/pmmintrin.h provide portable intrinsics, but these too are only available with gcc when -msse/-msse3 are set.
| scons build always sets -msse on x86 builds, but autotools doesn't seem to.
| We could try to get this working on gcc x86 without -msse by emitting assembly, but I believe that in this day and age we really should be building Mesa with -msse and -msse2.
* util: treat denorm'ed floats like zero | Zack Rusin | 2013-07-09 | 4 | -0/+72
| The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few CPUs it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow-up commit.
| Signed-off-by: Zack Rusin
| Reviewed-by: Jose Fonseca
| Reviewed-by: Roland Scheidegger
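To make the intended result concrete, here is a software-only sketch of "denormals behave like signed zero" for IEEE 754 single precision. The commit message does not spell out the mechanism it uses, so this bit-level check is purely an illustration; `flush_denorm_to_zero` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns f unchanged unless it is a denormal, in which case a zero of
 * the same sign is returned. Assumes float is IEEE 754 binary32. */
static float flush_denorm_to_zero(float f)
{
   uint32_t bits;

   assert(sizeof bits == sizeof f);
   memcpy(&bits, &f, sizeof bits);

   /* An all-zero exponent field means the value is zero or denormal. */
   if ((bits & UINT32_C(0x7f800000)) == 0)
      bits &= UINT32_C(0x80000000);   /* keep only the sign bit */

   memcpy(&f, &bits, sizeof f);
   return f;
}
```

Normal values, infinities and NaNs have a non-zero exponent field and pass through untouched.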
* gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch | Roland Scheidegger | 2013-07-05 | 1 | -1/+2
| The logic for choosing number of lods was bogus. (The code should ultimately handle the case of only one lod even with multiple quads but currently can't.)
* gallivm: Remove bogus assert. | José Fonseca | 2013-07-05 | 1 | -4/+1
| It is perfectly valid for the swizzle to be bigger than 2. For example the texel offsets could be
|     SAMPLE ..., IMM[0].zzz
| What is not correct is for chan_index to be bigger than 2.
| Trivial.
* gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources | Roland Scheidegger | 2013-07-05 | 2 | -2/+1
| The assertion was always broken but the code was unused until enabling the per-element lod code. Fixes piglit texelFetch vs isampler1D and similar tests (only run with GL 3.0 version override).
* gallivm: do per-pixel lod calculations for explicit lod | Roland Scheidegger | 2013-07-04 | 9 | -125/+193
| d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too, at least if they are used from vertex or geometry shaders (so this doesn't apply to lod bias), since there this doesn't just affect neighboring pixels. Some code was already there to handle this so fix it up and enable it.
| There will no doubt be a performance hit unfortunately. We could do better if we knew we had a real vector shift instruction (with variable shift count) but this requires AVX2 on x86 (or an AMD Bulldozer family cpu).
| Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them either. Likewise, the size query is still broken just the same.
| v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before).
| Reviewed-by: Jose Fonseca
* draw: fix overflows in the indexed rendering paths | Zack Rusin | 2013-07-03 | 4 | -43/+159
| The semantics for overflow detection are a bit tricky with indexed rendering. If the base index in the elements array overflows, then the index of the first element should be used; if the index with bias overflows, then it should be treated like a normal overflow. Also, overflows need to be checked for in all paths that use either the bias or the starting index location.
| Signed-off-by: Zack Rusin
| Reviewed-by: Brian Paul
| Reviewed-by: Roland Scheidegger
* draw/llvm: index overflows if it's greater than elt max | Zack Rusin | 2013-07-03 | 1 | -1/+1
| The comparison, incorrectly, was greater-than-or-equal to elt max.
| Signed-off-by: Zack Rusin
| Reviewed-by: Brian Paul
| Reviewed-by: Roland Scheidegger
* postprocess: move second temporary assertion into isolated configuration | Matthew McClure | 2013-07-03 | 1 | -2/+2
| With this patch we will only assert that the second temporary is allocated when there are more than two active filters.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423
| Signed-off-by: Brian Paul
* gallivm: Simplify intrinsic name construction. | José Fonseca | 2013-07-02 | 1 | -23/+10
| Just noticed this could be slightly shortened when fixing MSVC build.
| Trivial.
* gallivm: Fix MSVC build. | José Fonseca | 2013-07-02 | 1 | -8/+7
* gallivm: Fix indirect immediate registers. | José Fonseca | 2013-07-02 | 1 | -2/+2
| If reg->Register.Indirect is true then the immediate is not truly a constant LLVM expression.
| There is no performance regression in using LLVMBuildBitCast, as it will fall back to LLVMConstBitCast internally when the argument is a constant.
| Reviewed-by: Roland Scheidegger
| Reviewed-by: Zack Rusin
* draw/translate: fix instancing | Zack Rusin | 2013-06-28 | 13 | -24/+93
| We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to:
|     start_instance * stride + (instance_num / instance_divisor) * stride
| We were completely ignoring the start instance quite often, producing instances that were completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be 1 * stride, not 0 as we'd have.
| This fixes it and adjusts all the code to the changes.
| Signed-off-by: Zack Rusin
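The offset formula and the two cases from the message can be checked with a small sketch. The function name and the use of plain `unsigned` are assumptions made for this illustration; the draw/translate code itself is not reproduced here, and no overflow handling is shown.

```c
#include <assert.h>

/* Buffer offset for a per-instance attribute, following the formula in
 * the commit message:
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * instance_num counts from 0 on the first iteration. */
static unsigned instance_buffer_offset(unsigned start_instance,
                                       unsigned instance_num,
                                       unsigned instance_divisor,
                                       unsigned stride)
{
   assert(instance_divisor != 0);
   return start_instance * stride +
          (instance_num / instance_divisor) * stride;
}
```

With a stride of 16, start instance 5 and divisor 2 give an offset of 80 (5 * stride) on the first iteration, and start instance 1 with divisor 3 gives 16 (1 * stride), matching the values the message says are expected.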
* draw: fix incorrect clipper invocation statistics | Zack Rusin | 2013-06-28 | 1 | -6/+0
| Clipper invocations are computed earlier (of course before the emission) so this code was adding bogus numbers to already computed clipper invocations.
| Signed-off-by: Zack Rusin
* draw/gallivm: export overflow arithmetic to its own file | Zack Rusin | 2013-06-28 | 4 | -44/+234
| We'll be reusing this code so let's put it in a common file and use it in the draw module.
| Signed-off-by: Zack Rusin
* draw: check for integer overflows in instance computation | Zack Rusin | 2013-06-28 | 2 | -0/+7
| Integers could easily overflow if the starting instance was large enough. Instead of letting bogus counts through, set the instance to max if it overflowed and let our regular buffer overflow computation handle it.
| Signed-off-by: Zack Rusin
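One way to read "set the instance to max if it overflowed" is a saturating add, sketched below. `add_saturate_u32` is a hypothetical helper written for this note; the commit's actual code is not shown in the log.

```c
#include <stdint.h>

/* Adds two unsigned 32-bit values. If the sum wraps around, returns
 * UINT32_MAX so that a later bounds check sees an obviously
 * out-of-range value instead of a small bogus one. */
static uint32_t add_saturate_u32(uint32_t a, uint32_t b)
{
   uint32_t sum = a + b;          /* unsigned wraparound is well defined */
   return (sum < a) ? UINT32_MAX : sum;
}
```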
* draw: check for an integer overflow when computing stride | Zack Rusin | 2013-06-28 | 1 | -10/+43
| Our buffer overflow arithmetic was susceptible to integer overflows, which caused the buffer overflow logic to break. Let's use the llvm overflow intrinsics to check for integer overflows while computing the stride/needed buffer size.
| Signed-off-by: Zack Rusin
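LLVM's overflow intrinsics (for example `llvm.umul.with.overflow`) return a result together with an overflow bit. The sketch below mimics that shape in plain C for a 32-bit unsigned multiply; the struct and function names are hypothetical, and this is not the IR the draw module emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result pair in the style of an overflow-checking intrinsic. */
struct checked_u32 {
   uint32_t value;     /* low 32 bits of the product */
   bool overflow;      /* true if the full product did not fit */
};

/* Multiplies in 64 bits so the overflow can be detected exactly. */
static struct checked_u32 umul_checked(uint32_t a, uint32_t b)
{
   uint64_t wide = (uint64_t)a * (uint64_t)b;
   struct checked_u32 r;

   r.value = (uint32_t)wide;
   r.overflow = wide > UINT32_MAX;
   return r;
}
```

A caller computing `stride * index` would test `overflow` before trusting `value` in a buffer-size comparison.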
* draw: account for elem size when computing overflow | Zack Rusin | 2013-06-28 | 1 | -7/+23
| We weren't taking into account the size of the element that is to be fetched, which meant that it was possible to overflow the buffer reads if the stride was very close to the end of the buffer, e.g. stride = 3, buffer size = 4, and the element to be read = 4. This should be properly detected as an overflow.
| Signed-off-by: Zack Rusin
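A sketch of the check, reading the example above as a 4-byte element fetched at byte offset 3 from a 4-byte buffer (that reading, and the name `fetch_overflows`, are assumptions). Comparing only the offset against the buffer size would accept this fetch; including the element size rejects it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if reading elem_size bytes starting at offset would run
 * past the end of a buffer of buffer_size bytes. Written so that no
 * intermediate sum can wrap around. */
static bool fetch_overflows(uint32_t buffer_size,
                            uint32_t offset,
                            uint32_t elem_size)
{
   if (elem_size > buffer_size)
      return true;
   return offset > buffer_size - elem_size;
}
```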
* st/mesa: handle SNORM formats in generic CopyPixels path | Marek Olšák | 2013-06-30 | 2 | -0/+23
| v2: check desc->is_mixed in util_format_is_snorm
* postprocess: handle partial initialization failures. | Matthew McClure | 2013-06-27 | 7 | -95/+281
| This patch fixes segfaults observed when enabling the post processing features. When the format is not supported, or a texture cannot be created, the code must gracefully handle failure and report the error to the calling code for proper failure handling.
| To accomplish this the following changes were made to the filters.h prototypes:
| - bool return for pp_init_func
| - Added pp_free_func for filter specific resource destruction
| Fixes segfaults from backtraces:
| * util_destroy_blit
|     pp_free
| * u_transfer_inline_write_vtbl
|     pp_jimenezmlaa_init_run
|     pp_init
| This patch also uses tgsi_alloc_tokens to allocate temporary tokens in pp_tgsi_to_state, instead of allocating the array on the stack. This fixes the following stack corruption segfault in pp_run.c:
| * _int_free
|     aaline_delete_fs_state
|     pp_free
| Bug Number: 1021843
| Reviewed-by: Brian Paul
* hud: add float casts to silence MSVC warnings | Brian Paul | 2013-06-26 | 1 | -49/+49
* hud: include stdio.h since we use fprintf(), fscanf(), etc | Brian Paul | 2013-06-26 | 1 | -0/+2
* hud: add cast to silence MSVC warning | Brian Paul | 2013-06-26 | 1 | -1/+1
* os: add cast in os_time_sleep() to silence MSVC warning | Brian Paul | 2013-06-26 | 1 | -1/+1
* util: int/unsigned changes to silence some MSVC warnings | Brian Paul | 2013-06-26 | 2 | -3/+3
* util: add some casts to silence some MSVC warnings | Brian Paul | 2013-06-26 | 1 | -2/+2
* util: s/int/unsigned/ to silence some MSVC warnings | Brian Paul | 2013-06-26 | 1 | -2/+2
* vl/mpeg12: handle mpeg-1 bitstreams more correctly | Maarten Lankhorst | 2013-06-26 | 1 | -5/+16
| Add support for D-frames.
| Add support for slices ending on a different horizontal row of macroblocks.
* util/debug: Cleanup/improve debug_symbol_name_dbghelp. | José Fonseca | 2013-06-25 | 1 | -78/+161
| - use mgwhelp -- the successor for bfdhelp which does not have a hard dependency on BFD, and works on 64bits.
| - use a macro instead of hand-typing to dispatch DbgHelp functions
| - dump line numbers
| - dump module names when symbols are not available
| - support 64bits.
| - add comments
| Reviewed-by: Brian Paul
* util/debug: Make debug_backtrace_capture work for 64bit windows. | José Fonseca | 2013-06-25 | 2 | -2/+61
| Rely on Windows' CaptureStackBackTrace to do the grunt work.
| Reviewed-by: Brian Paul
* draw: allow overflows in the llvm paths | Zack Rusin | 2013-06-25 | 1 | -4/+8
| Because our code couldn't handle it we were skipping rendering if we detected overflows. According to the spec we should still render but with all 0 vertices, which is what the llvm code already does. So for the llvm paths let's enable processing even if an overflow condition has been detected.
| Signed-off-by: Zack Rusin
| Reviewed-by: Jose Fonseca
| Reviewed-by: Roland Scheidegger
* draw: avoid overflows in the llvm draw loop | Zack Rusin | 2013-06-25 | 1 | -8/+6
| Before, we could easily overflow if start+count>max integer. To avoid it we can just iterate over the count. This makes sure that we never crash, since most of the overflow conditions are already handled.
| Signed-off-by: Zack Rusin
| Reviewed-by: Jose Fonseca
| Reviewed-by: Roland Scheidegger
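The difference between the two loop shapes can be seen in a plain C sketch. Both functions are hypothetical illustrations (the real loop is generated as LLVM IR): the first uses `start + count` as the loop bound, which wraps for large `start`; the second iterates over `count` only.

```c
#include <limits.h>   /* UINT_MAX, useful when trying values near the limit */

/* Loop bound computed as start + count: if the sum wraps around, the
 * bound ends up below start and the loop body never runs. */
static unsigned iterations_start_plus_count(unsigned start, unsigned count)
{
   unsigned n = 0;
   unsigned end = start + count;   /* may wrap */

   for (unsigned i = start; i < end; i++)
      n++;
   return n;
}

/* Loop over the count itself: always runs exactly count times. The
 * per-iteration index may still wrap, which is left to the separate
 * overflow handling the message refers to. */
static unsigned iterations_over_count(unsigned start, unsigned count)
{
   unsigned n = 0;

   for (unsigned i = 0; i < count; i++) {
      unsigned index = start + i;
      (void)index;
      n++;
   }
   return n;
}
```

For `start = UINT_MAX - 1` and `count = 4`, the first variant runs zero iterations while the second runs four.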
* gallium/hud: do not use free() for the free_query_data hook | Brian Paul | 2013-06-24 | 3 | -3/+23
| That confuses Gallium's memory debugging code where CALLOC/MALLOC must be matched with FREE, not free().
| Reviewed-by: Marek Olšák
* draw: check for out-of-memory conditions in the AA line module. | Matthew McClure | 2013-06-24 | 1 | -7/+34
| To prevent segfaults in the AA line module, the code will check for a valid pointer to the aaline_stage in the draw context.
| Fixes segfault from backtrace:
| * aaline_stage_from_pipe
|     aaline_delete_fs_state
| Reviewed-by: Brian Paul
* gallium: Fix llvmpipe on big-endian machines | Adam Jackson | 2013-06-24 | 15 | -256/+160
| Squashed commit of the following:
|
| commit 0857a7e105bfcbc4d1431b2cc56612094c747ca3
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:07 2013 -0400
|     gallivm: Fix lp_build_rgba8_to_fi32_soa for big endian
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 0d65131649a8aa140e2db228ba779d685c4333e3
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:07 2013 -0400
|     gallivm: Fix big-endian machines
|     This adds a bit-shift count to the format table, and adds the concept of vector or bitwise alignment on gathers.
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 9740bda9b7dc894b629ed38be9b51059ce90818f
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:07 2013 -0400
|     llvmpipe: Fix convert_to_blend_type on big-endian
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit ae037c2de0f029e4e99371c0de25560484f0d8df
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:06 2013 -0400
|     util: Convert color pack to packed formats
|     This fixes them on big-endian.
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 5b05ac0c89ae092ea8ba5bba9f739708d7396b5c
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:06 2013 -0400
|     graw-xlib: Convert to packed formats
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 51396e7d098cb6ff794391cf11afe4dbf86dbea0
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:06 2013 -0400
|     format: Convert to packed formats
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 417b60bc66eb450e68a92ab0e47f76e292b385e6
| Author: Adam Jackson
| Date: Tue Jun 18 12:25:06 2013 -0400
|     st/dri: Convert to packed formats
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 0934b2e022a5e0847d312c40734e2b44cac52fd8
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:06 2013 -0400
|     st/xlib: Convert to packed formats
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit a307ea3c3716a706963acce7966b5e405ba11db9
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:06 2013 -0400
|     gbm: Convert to packed formats
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 53eebdd253e1960a645ea278f31d7ef6a6cf4aeb
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:06 2013 -0400
|     tests: Convert to packed formats
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 2f77fe3ee524945eacd546efcac34f7799fb3124
| Author: Adam Jackson
| Date: Tue Jun 18 13:07:37 2013 -0400
|     gallium: Document packed formats
|     Signed-off-by: Adam Jackson
|
| commit 1f1017159ce951f922210a430de9229f91f62714
| Author: Richard Sandiford
| Date: Tue Jun 18 12:25:06 2013 -0400
|     gallium: Introduce 32-bit packed format names
|     These are for interacting with buffers natively described in terms of bit shifts, like X11 visuals:
|         uint32_t xyzw8888 = (x << 0) | (y << 8) | (z << 16) | (w << 24);
|     Define these in terms of (endian-dependent) aliases to the array-style format names.
|     Reviewed-by: Adam Jackson
|     Signed-off-by: Richard Sandiford
|
| commit 6cc7ab1ee66ed668da78c1d951dfd7782b4e786a
| Author: Adam Jackson
| Date: Mon Jun 3 12:10:32 2013 -0400
|     gallium: Document format name conventions
|     v2:
|     - Fix a channel name thinko (Michel Dänzer)
|     - Elaborate on SCALED versus INT
|     - Add links to DirectX and FOURCC docs
|     Signed-off-by: Adam Jackson
|
| commit df4d269e7fb62051a3c029b84147465001e5776e
| Author: Adam Jackson
| Date: Tue Jun 18 12:25:06 2013 -0400
|     gallivm: Remove all notion of byte-swapping
|     Signed-off-by: Adam Jackson
|
| Signed-off-by: Adam Jackson
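The packed layout quoted in the squashed "Introduce 32-bit packed format names" commit can be written out as a small helper. `pack_xyzw8888` is a hypothetical name chosen for this sketch; only the shift expression comes from the commit message.

```c
#include <stdint.h>

/* Packs four 8-bit channels into one 32-bit word using the shifts from
 * the commit message: x in bits 0-7, y in 8-15, z in 16-23, w in 24-31.
 * The value of the word is the same on every host; which channel comes
 * first in memory depends on the host's byte order. */
static uint32_t pack_xyzw8888(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
   return ((uint32_t)x << 0) |
          ((uint32_t)y << 8) |
          ((uint32_t)z << 16) |
          ((uint32_t)w << 24);
}
```

For example, channels 0x11, 0x22, 0x33, 0x44 pack to the word 0x44332211. Because the byte order in memory differs between hosts, an array-style format name can only alias a packed one in an endian-dependent way, as the message notes.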
* vl/mpeg12: fix mpeg-1 bytestream parsing | Maarten Lankhorst | 2013-06-22 | 1 | -6/+24
| This fixes the bytestream parsing of mpeg-1 streams, but still leaves open a number of issues with the interpretation:
| - IDCT mismatch control is not correct for MPEG-1.
| - Slices do not have to start and end on the same horizontal row of macroblocks.
| - picture_coding_type = 4 (D-pictures) is not handled.
| - full_pel_*_vector is not handled.
| Signed-off-by: Maarten Lankhorst
* util: (trivial) add has_popcnt field | Roland Scheidegger | 2013-06-19 | 2 | -0/+2
| Not used yet but there's a couple of places in llvmpipe which should use this (occlusion count is currently very inefficient if there's no cpu popcnt instruction).
* indices: add some comments | Brian Paul | 2013-06-19 | 2 | -4/+28
| This is pretty complicated code with few, if any, comments. Here's a first stab.
| Reviewed-by: Jose Fonseca
* Revert "draw: clear the draw buffers in draw" | Zack Rusin | 2013-06-17 | 2 | -25/+3
| This reverts commit 41966fdb3b71c0b70aeb095e0eb3c5626c144a3a.
| While it's a lot cleaner it causes regressions because the draw interface is always called from the draw functions of the drivers (because the buffers need to be mapped), which means that the stream output buffers end up being cleared on every draw rather than on setting.
| Signed-off-by: Zack Rusin