summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* util: treat denorm'ed floats like zeroZack Rusin2013-07-096-0/+83
| | | | | | | | | | | | | The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few cpu's it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow up commit. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* r600g: improve the mechanism for recognizing an empty CSMarek Olšák2013-07-083-3/+8
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: explicitly flush caches for streamout-based buffer copying & clearingMarek Olšák2013-07-081-0/+13
| | | | | | | It's done automatically for vertex buffers, but not for constant buffers, textures, and colorbuffers. Reviewed-by: Alex Deucher <[email protected]>
* r600g: only flush the caches that need to be flushed during CP DMA operationsMarek Olšák2013-07-083-32/+117
| | | | | | | This should increase performance if constant uploads are done with the CP DMA, because only the cache that needs to be flushed is flushed. Reviewed-by: Alex Deucher <[email protected]>
* r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flagsMarek Olšák2013-07-085-27/+52
| | | | | | | also flushing any cache in evergreen_emit_cs_shader seems to be superfluous (we don't flush caches when changing the other shaders either) Reviewed-by: Alex Deucher <[email protected]>
* r600g: adjust flush flags (v3)Alex Deucher2013-07-086-7/+42
| | | | | | | | | | | | | 1. flush SH with read caches 2. add flag for DB flushes 3. add flag for CB flushes v2: flush all CBs, remove redundant emit_state variable. v3: Marek: also set the new flags in r600_context_flush, the CP dma functions, and texture_barrier, and rename them Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* r600g: don't call buffer_wait in buffer_mmap_sync_with_ringsMarek Olšák2013-07-081-2/+1
| | | | | | | | The winsys should do this, because it measures how much time we spend in buffer_map doing synchronization, which can be viewed with the gallium HUD. Reviewed-by: Alex Deucher <[email protected]>
* r600g: don't read back the MSAA depth buffer if the read flag is not setMarek Olšák2013-07-081-8/+8
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: don't flush the context in texture_transfer_mapMarek Olšák2013-07-081-5/+0
| | | | | | the winsys does this automatically Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix texture offset computation for mapped MSAA depth buffersMarek Olšák2013-07-082-16/+14
| | | | | | | | | It was wrong, because the offset shouldn't be applied to MSAA depth buffers. This small cleanup should prevent such issues in the future. This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n". Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix color resolve for RGBX8 and RGBX16 integer formatsMarek Olšák2013-07-081-2/+2
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: enable fast MSAA color clear for array/3D/cube texturesMarek Olšák2013-07-081-4/+3
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: implement fast MSAA color clear for integer texturesMarek Olšák2013-07-081-9/+12
| | | | | | | this also fixes the fast clear with multiple colorbuffers and each having a different format Reviewed-by: Alex Deucher <[email protected]>
* r600/uvd: fix check for UVD 2.xChristian König2013-07-081-1/+1
| | | | Signed-off-by: Christian König <[email protected]>
* gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetchRoland Scheidegger2013-07-051-1/+2
| | | | | | The logic for choosing number of lods was bogus. (The code should ultimately handle the case of only one lod even with multiple quads but currently can't.)
* gallivm: Remove bogus assert.José Fonseca2013-07-051-4/+1
| | | | | | | | | | | It is perfectly valid for the swizzle to be bigger than 2. For example the texel offsets could be SAMPLE ..., IMM[0].zzz What is not correct is for chan_index to be bigger than 2. Trivial.
* nvc0: enable very initial support for nvf0 (GK110)Ben Skeggs2013-07-055-5/+76
| | | | | | | Shaders need a lot of work still. Basic stuff generally works, so this is basically just fine for gnome-shell, OA etc at this point. Signed-off-by: Ben Skeggs <[email protected]>
* gallivm: (trivial) fix bogus assertion for per-element lod with 1d resourcesRoland Scheidegger2013-07-052-2/+1
| | | | | | The assertion was always broken but the code unused until enabling the per-element lod code. Fixes piglit texelFetch vs isampler1D and similar tests (only run with GL 3.0 version override).
* gallivm: do per-pixel lod calculations for explicit lodRoland Scheidegger2013-07-0410-126/+195
| | | | | | | | | | | | | | | | | | | | | d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too - at least if they are used from vertex or geometry shaders (so doesn't apply to lod bias) this doesn't just affect neighboring pixels. Some code was already there to handle this so fix it up and enable it. There will no doubt be a performance hit unfortunately, we could do better if we'd knew we had a real vector shift instruction (with variable shift count) but this requires AVX2 on x86 (or a AMD Bulldozer family cpu). Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them neither. Likewise, the size query is still broken just the same. v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before). Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix overflows in the indexed rendering pathsZack Rusin2013-07-034-43/+159
| | | | | | | | | | | | | The semantics for overflow detection are a bit tricky with indexed rendering. If the base index in the elements array overflows, then the index of the first element should be used, if the index with bias overflows then it should be treated like a normal overflow. Also overflows need to be checked for in all paths that either the bias, or the starting index location. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw/llvm: index overflows if it's greater than elt maxZack Rusin2013-07-031-1/+1
| | | | | | | | | The comparison, incorrectly, was greater-than-or-equal to elt max. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* postprocess: move second temporary assertion into isolated configurationMatthew McClure2013-07-031-2/+2
| | | | | | | | | With this patch we will only assert that the second temporary is allocated, when there are more than two active filters. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423 Signed-off-by: Brian Paul <[email protected]>
* targets/xvmc-nouveau: add in missing nv30 libIlia Mirkin2013-07-031-0/+1
| | | | | | | Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so that it may be dlopen'd. Signed-off-by: Ilia Mirkin <[email protected]>
* mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependenciesMarek Olšák2013-07-0213-13/+0
| | | | | | Not needed with do_dead_builtin_varyings. Reviewed-by: Ian Romanick <[email protected]>
* gallivm: Simplify intrinsic name construction.José Fonseca2013-07-021-23/+10
| | | | | | Just noticed this could be slightly shortened when fixing MSVC build. Trivial.
* gallivm: Fix MSVC build.José Fonseca2013-07-021-8/+7
|
* gallivm: Fix indirect immediate registers.José Fonseca2013-07-021-2/+2
| | | | | | | | | | | If reg->Register.Indirect is true then the immediate is not truly a constant LLVM expression. There is no performance regression in using LLVMBuildBitCast, as it will fallback to LLVMConstBitCast internally when the argument is a constant. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
* gallium/tests: fix the translate testZack Rusin2013-06-281-4/+4
|
* draw/translate: fix instancingZack Rusin2013-06-2817-40/+109
| | | | | | | | | | | | | | | | | | We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to: start_instance * stride + (instance_num / instance_divisor) * stride We were completely ignoring the start instance quite often producing instances that completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be: 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be: 1 * stride, not 0 as we'd have. This fixes it and adjusts all the code to the changes. Signed-off-by: Zack Rusin <[email protected]>
* draw: fix incorrect clipper invocation statisticsZack Rusin2013-06-281-6/+0
| | | | | | | | clipper invocations are computed earlier (of course before the emittion) so this code was adding bogus numbers to already computed clipper invocations. Signed-off-by: Zack Rusin <[email protected]>
* draw/gallivm: export overflow arithmetic to its own fileZack Rusin2013-06-284-44/+234
| | | | | | | We'll be reusing this code so lets put it in a common file and use it in the draw module. Signed-off-by: Zack Rusin <[email protected]>
* draw: check for integer overflows in instance computationZack Rusin2013-06-282-0/+7
| | | | | | | | | Integers could easily overflow is the starting instance was large enough. Instead of letting bogus counts through set the instance to max if it overflown and let our regular buffer overflow computation handle it. Signed-off-by: Zack Rusin <[email protected]>
* draw: check for an integer overflow when computing strideZack Rusin2013-06-281-10/+43
| | | | | | | | | Our buffer overflow arithmetic was susceptible to integer overflows which was the buffer overflow logic to break. Lets use the llvm overflow intrinsics to check for integer overflows while computing the stride/needed buffer size. Signed-off-by: Zack Rusin <[email protected]>
* draw: account for elem size when computing overflowZack Rusin2013-06-281-7/+23
| | | | | | | | | | | We weren't taking into account the size of element that is to be fetched, which meant that it was possible to overflow the buffer reads if the stride was very close to the end of the buffer, e.g. stride = 3, buffer size = 4, and the element to be read = 4. This should be properly detected as an overflow. Signed-off-by: Zack Rusin <[email protected]>
* tools/trace: Return dummy fence object to silence warnings.José Fonseca2013-07-011-1/+2
|
* tools/trace: Don't crash if a trace has no timing information.José Fonseca2013-07-012-3/+4
|
* nvc0: allow frame dropping in h264Maarten Lankhorst2013-07-011-3/+0
| | | | | | | | The only reason the checks existed were paranoia, when I first wrote the code I wasn't sure it was correct. Now that I am, the asserts triggered when XBMC was dropping frames, so remove it. NOTE: This is a candidate for the 9.1 branch.
* r300g/compiler: Prevent regalloc from swizzling texture operands v2Tom Stellard2013-06-305-0/+124
| | | | | | | | | https://bugs.freedesktop.org/show_bug.cgi?id=63520 NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* r300g/compiler/tests: Add an assembly parserTom Stellard2013-06-305-16/+200
| | | | | | | The assembly parser can be used to load r300 assembly dumps and run them through any of the r300 compiler passes. Reviewed-by: Alex Deucher <[email protected]>
* r300g: Fix make checkTom Stellard2013-06-301-1/+2
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: implement fast color clears for MSAA on evergreen+Grigori Goronzy2013-07-013-2/+79
| | | | | | | | | | | | | | | | | | | | | | Allows MSAA colorbuffers, which have a CMASK automatically and don't need any further special handling, to be fast cleared. Instead of clearing the buffer, set the clear color and the CMASK to the cleared state. Fast clear is used only when all bound colorbuffers fulfill certain conditions: a CMASK is required, we have to be able to create a clear color value for the format and the texture mustn't contain multiple images. Technically, it should be possible to support array textures and cubemaps if all images are attached to the framebuffer, but this does not appear to be common. v2: fix fast clear check v3: Marek: - disable fast clear with 128-bit formats, which are unsupported - set tex->dirty_level_mask in r600_clear, so that the driver knows the resource must be decompressed/expanded - return early from r600_clear if there's nothing else to do Signed-off-by: Marek Olšák <[email protected]>
* r600g/compute: disable unused colorbuffer slotsMarek Olšák2013-07-011-1/+12
| | | | | Reviewed-by: Alex Deucher <[email protected]> Tested-by: Tom Stellard <[email protected]>
* st/mesa: handle SNORM formats in generic CopyPixels pathMarek Olšák2013-06-302-0/+23
| | | | v2: check desc->is_mixed in util_format_is_snorm
* llvmpipe: fix timer query if there's no binsRoland Scheidegger2013-06-291-0/+10
| | | | | | | | | | | | | b04a295a4a0cd2defe352b3193b5fa79ca8fc9fc removed seemingly unnecessary code in get_query. Turns out this code could in fact be reached - while timestamps are always binned, if there are no bins (which happens if fb size is 0) then the rasterization query code filling this in is still never executed. So fix this up by filling in some timestamp, but do it at EndQuery time not GetQuery time which should be more appropriate. Makes piglit arb_timer_query-timestamp-get happy again. Reviewed-by: Jose Fonseca <[email protected]>
* clover: Don't segfault when compiling a program with no kernelTom Stellard2013-06-281-0/+7
|
* radeonsi: disable 2D tiling on CIK for nowAlex Deucher2013-06-281-1/+4
| | | | | | Causes GPU hangs. Signed-off-by: Alex Deucher <[email protected]>
* radeonsi: add llvm processor names for CIKAlex Deucher2013-06-281-0/+3
| | | | | | Requires updated llvm. Signed-off-by: Alex Deucher <[email protected]>
* radeonsi: emit PA_SC_RASTER_CONFIG[_1] on cikAlex Deucher2013-06-281-17/+34
| | | | | | | | Use the golden values for each asic. Todo: update Kabini and Kaveri. Signed-off-by: Alex Deucher <[email protected]>
* radeonsi: PA_CL_ENHANCE is privileged on CIKAlex Deucher2013-06-281-2/+3
| | | | | | Needs to be and is set by the kernel. Signed-off-by: Alex Deucher <[email protected]>
* radeonsi: update surface sync packet emit for CIKAlex Deucher2013-06-281-6/+17
| | | | Signed-off-by: Alex Deucher <[email protected]>