path: root/src/gallium/drivers/swr
Commit message, Author, Age, Files, Lines
* swr: Implement fence attached work queues for deferred deletion.Bruce Cherniak2016-12-169-54/+255
| | | | | | | Work can now be added to fences and triggered by fence completion. This allows for deferred resource deletion, and other asynchronous tasks. Reviewed-by: George Kyriazis <[email protected]>
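The idea of attaching work to a fence can be sketched as follows. This is a minimal single-threaded illustration, not the driver's implementation: `SketchFence`, `add_work`, `signal` and `defer_delete` are hypothetical names, and a real driver would also need synchronization between the submitting and completing threads.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical fence type: names are illustrative, not the swr driver's API.
class SketchFence {
public:
    // Queue a callback to run once the fence has completed.
    // If the fence has already completed, run the callback immediately.
    void add_work(std::function<void()> work) {
        assert(work);  // an empty std::function would throw when invoked
        if (signaled_) {
            work();
        } else {
            pending_.push_back(std::move(work));
        }
    }

    // Mark the fence complete and drain the queue in submission order.
    void signal() {
        signaled_ = true;
        std::vector<std::function<void()>> ready;
        ready.swap(pending_);
        for (auto &work : ready) {
            work();
        }
    }

    bool signaled() const { return signaled_; }

private:
    bool signaled_ = false;
    std::vector<std::function<void()>> pending_;
};

// Defer freeing a buffer until the fence that guards it completes.
// *freed is set to true once the deletion has actually happened.
inline void defer_delete(SketchFence &fence, int *buffer, bool *freed) {
    fence.add_work([buffer, freed]() {
        delete buffer;
        *freed = true;
    });
}
```

With this shape, a resource that may still be in use by in-flight rendering is handed to `defer_delete` instead of being freed on the spot, and the memory is released only when `signal()` runs.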
* swr: [rasterizer core/memory] StoreTile: AVX512 progressTim Rowley2016-12-122-222/+138
| | | | | | Fixes to 128-bit formats. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer common/core/jitter] fetch support for GL_FIXEDTim Rowley2016-12-095-34/+188
| | | | | | v2: use fmul(1/65536) instead of fdiv(65535) Reviewed-by: Bruce Cherniak <[email protected]>
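As a scalar illustration of the v2 note: GL_FIXED is a signed 16.16 fixed-point format, so the conversion scale is 1/65536. The commit itself changes JIT-generated fetch code; the function below is only a hypothetical stand-in (`fixed_to_float` is not a driver function) showing the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// GL_FIXED is a signed 16.16 fixed-point value: real value = raw / 65536.
// Multiplying by the reciprocal gives the same result as dividing here,
// because 1/65536 is a power of two and is exactly representable.
inline float fixed_to_float(std::int32_t raw) {
    const float scale = 1.0f / 65536.0f;
    return static_cast<float>(raw) * scale;
}
```

For example, a raw value of 65536 maps to 1.0 and a raw value of -32768 maps to -0.5.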
* swr: [rasterizer core/memory] Finish R24_UNORM_X8_TYPELESS for AVX512Tim Rowley2016-12-092-26/+24
| | | | | | This one-off specialization was missed. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] supply proper clip distances to point spritesIlia Mirkin2016-12-081-3/+9
| | | | | | | | | | | | | | | Large points become pairs of triangles when rasterized, so we must feed it three clip distances, one for each vertex. The clip distance is not subject to sprite coord replacement, so there's no interpolation of it. We just take its value and put it in the "z" component of the barycentric-ready plane equation. (We could also just cull it at an earlier point in time, but that would require larger changes.) Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] perform perspective division on clip distancesIlia Mirkin2016-12-081-6/+8
| | | | | | | | Clip distances need to be perspective-divided. This fixes all the interpolation-*-{distance,vertex} piglits. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
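For reference, the standard perspective-correct interpolation of a per-vertex scalar looks like the sketch below. It shows the general technique only; how the rasterizer core folds the division into its plane equations is not shown in this log, and `interpolate_perspective` is a hypothetical name.

```cpp
#include <array>
#include <cassert>

// Perspective-correct interpolation of one scalar attribute over a triangle.
// bary: screen-space barycentric weights (assumed to sum to 1)
// attr: per-vertex attribute values, for example a clip distance
// w:    per-vertex clip-space w (assumed non-zero)
inline float interpolate_perspective(const std::array<float, 3> &bary,
                                     const std::array<float, 3> &attr,
                                     const std::array<float, 3> &w) {
    float num = 0.0f;  // screen-space interpolation of attr / w
    float den = 0.0f;  // screen-space interpolation of 1 / w
    for (int i = 0; i < 3; ++i) {
        assert(w[i] != 0.0f);
        num += bary[i] * attr[i] / w[i];
        den += bary[i] / w[i];
    }
    return num / den;
}
```

When all three `w` values are equal the result reduces to plain linear interpolation; when they differ, the vertex nearer the viewer (smaller `w`) is weighted more heavily.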
* swr: mark PIPE_CAP_NATIVE_FENCE_FD unsupportedTim Rowley2016-12-051-0/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: include llvm version and vector width in renderer stringTim Rowley2016-12-051-1/+11
| | | | | | Uses llvmpipe's string formatting. Reviewed-by: Bruce Cherniak <[email protected]>
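A renderer string of this kind could be assembled roughly as below. The exact format string is an assumption modeled on the description above, and `renderer_string` with its parameters is a hypothetical helper, not the function in the driver.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Build a renderer string that reports the LLVM version and the native
// vector width in bits. The format is an assumed, illustrative one.
inline std::string renderer_string(unsigned llvm_major,
                                   unsigned llvm_minor,
                                   unsigned vector_bits) {
    assert(vector_bits > 0);
    char buf[64];
    std::snprintf(buf, sizeof(buf), "SWR (LLVM %u.%u, %u bits)",
                  llvm_major, llvm_minor, vector_bits);
    return std::string(buf);
}
```

Exposing both values in the string makes it easy to tell from a bug report which code generator and which SIMD width were in use.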
* swr: Fix active_queries countBruce Cherniak2016-12-021-6/+7
| | | | | | | The active_query count was incorrect for query types that don't require a begin_query. Removed the unnecessary assert. Reviewed-by: Tim Rowley <[email protected]>
* swr: Fix type to match parameters of std::max()George Kyriazis2016-12-021-7/+7
| | | | | | Include propagation of comparisons further down. Reviewed-by: Tim Rowley <[email protected]>
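The underlying C++ rule is that `std::max` is a template whose two arguments must have the same type, so mixing, say, `uint32_t` and `int` fails template argument deduction. A minimal sketch of the two usual fixes, with `clamp_min_size` as a hypothetical helper:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// std::max(a, b) needs a and b to share one type. Either convert one operand
// or name the template argument explicitly; this sketch does both.
inline std::uint32_t clamp_min_size(std::uint32_t size, int floor_value) {
    // Assumption for this sketch: floor_value is never negative.
    assert(floor_value >= 0);
    return std::max<std::uint32_t>(size,
                                   static_cast<std::uint32_t>(floor_value));
}
```

Changing the variable's declared type at the source, as the commit title suggests, avoids the cast entirely and keeps later comparisons in one signedness.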
* swr: [rasterizer jitter] include cstdarg in builder_misc.cppTim Rowley2016-12-021-1/+2
| | | | | | | | Fixes build problem with llvm-svn. v2: use cstdarg instead of stdarg.h Reviewed-by: Bruce Cherniak <[email protected]>
* swr: add streamout buffer offset into pBuffer pointerIlia Mirkin2016-11-301-2/+3
| | | | | | | | The buffer_size does not take the offset into account. Just add the offset into the pointer which lines up the structures much better. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
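The pointer adjustment described above can be sketched like this. Only `pBuffer` comes from the commit title; `SketchSoBinding`, `bind_so_target` and the other names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stream-output binding; field names are illustrative only.
struct SketchSoBinding {
    std::uint8_t *pBuffer;   // first writable byte
    std::size_t bufferSize;  // writable bytes, counted from pBuffer
};

// Fold the target's offset into the pointer so that bufferSize, which covers
// only the bound range, is measured from the same origin as pBuffer.
inline SketchSoBinding bind_so_target(std::uint8_t *resource_base,
                                      std::size_t buffer_offset,
                                      std::size_t buffer_size) {
    assert(resource_base != nullptr);
    return SketchSoBinding{resource_base + buffer_offset, buffer_size};
}
```

Keeping the offset separate would require every bounds check to subtract it again; moving it into the pointer lets the size be compared directly against the bytes written.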
* swr: fix assertion for max number of so targetsIlia Mirkin2016-11-301-1/+1
| | | | | | | The number has to be less than or equal to the max, not just less than. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
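The off-by-one is easy to see in isolation. In this sketch `kMaxSoTargets` is a hypothetical limit, since the driver's actual constant does not appear in the log.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical limit; the driver's real constant is not shown in the log.
constexpr std::size_t kMaxSoTargets = 4;

// A count equal to the maximum is valid: it means every slot is in use.
inline bool so_target_count_valid(std::size_t num_targets) {
    return num_targets <= kMaxSoTargets;  // "<" would reject a full set
}
```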
* swr: properly report max number of SO componentsIlia Mirkin2016-11-301-1/+1
| | | | | | | | The components count the number of individual values, not the number of slots. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: turn off queries around blitsIlia Mirkin2016-11-301-1/+9
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: don't advertise stream pause/resumeIlia Mirkin2016-11-301-1/+1
| | | | | | | | | There is no support for resuming streamout. Furthermore, this also controls glDrawTransformFeedback functionality which requires the same ability to query how many primitives were sent out of TF. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: fix range computation for instanced client-side arraysIlia Mirkin2016-11-302-24/+52
| | | | | | | | | | | We need to take the instance divisor and number of instances into account for instanced client-side arrays, rather than the vertex parameters. Loosely based on the comparable nvc0 logic. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
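A rough sketch of the range computation for a per-instance attribute follows. It relies on the GL rule that instance `i` reads element `i / divisor` plus the base instance; it is not the driver's code nor nvc0's, and `ByteRange` and `instanced_range` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct ByteRange {
    std::uint32_t offset;  // first byte read
    std::uint32_t size;    // number of bytes read
};

// Bytes of a per-instance attribute array touched by one draw.
// Instance i reads element (i / divisor) + start_instance, so the draw
// touches ceil(instance_count / divisor) consecutive elements.
inline ByteRange instanced_range(std::uint32_t start_instance,
                                 std::uint32_t instance_count,
                                 std::uint32_t divisor,
                                 std::uint32_t stride,
                                 std::uint32_t element_size) {
    assert(divisor > 0 && instance_count > 0);
    const std::uint32_t elems = (instance_count + divisor - 1) / divisor;
    ByteRange range;
    range.offset = start_instance * stride;
    // The last element contributes only its own size, not a full stride.
    range.size = (elems - 1) * stride + element_size;
    return range;
}
```

Using the vertex count here instead would either upload far too much data or, worse, too little when there are more instances than vertices.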
* swr: [rasterizer memory] assert when trying to convert an unknown formatIlia Mirkin2016-11-301-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: remove warning about multi-layer surfacesIlia Mirkin2016-11-301-4/+0
| | | | | | | | | | We now support clearing these, and actually rendering to multiple layers would require GS support, which will fail in much more spectacular ways for now. Once that is hooked up, there won't be anything else to do here. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] don't attempt to load another RTAI when storingIlia Mirkin2016-11-301-1/+1
| | | | | | | | | Since we don't pass a renderTargetArrayIndex in, and the current hot tile may be for a different index, we may end up loading the RTAI=0 into the hot tile for no reason. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* gallium: add PIPE_CAP_TGSI_CAN_READ_OUTPUTSNicolai Hähnle2016-11-301-0/+1
| | | | | | | | | | | Drivers that support this benefit by saving one lowering pass in the GLSL-to-TGSI conversion. radeonsi already supports this because all outputs are stored in temporary variables before the export (except for TCS outputs, which have always been readable in TGSI anyway due to their special semantics). Reviewed-by: Marek Olšák <[email protected]>
* swr: [rasterizer jit] use signed integer representation for logic opIlia Mirkin2016-11-291-5/+12
| | | | | | | | Instead of (incorrectly) biasing the snorm value to make it look like a unorm, just use signed integer math. This fixes arb_color_buffer_float-render GL_RGBA8_SNORM. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: add missing rgbx8_srgb variantIlia Mirkin2016-11-291-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: reorder renderable formats, add grouping commentsIlia Mirkin2016-11-291-65/+87
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: use util_copy_framebuffer_state helperIlia Mirkin2016-11-291-12/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: enable cubemap arraysIlia Mirkin2016-11-291-1/+1
| | | | | | | Everything is in place for these. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: rearrange caps into limits/supported/unsupported groupsIlia Mirkin2016-11-291-129/+84
| | | | | | | | | | I find this a lot more readable and compact - much easier to scan through the list and see what's on and what's off. No functional change intended. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: only store up to the LOD sizeIlia Mirkin2016-11-291-1/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
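Limiting a store (or a clear) to the size of the mip level amounts to clamping each tile against the minified dimensions. A minimal sketch, assuming a fixed tile size and hypothetical helper names `minify` and `pixels_in_level`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Size of one dimension at a given mip level: halve per level, never below 1.
inline std::uint32_t minify(std::uint32_t base, std::uint32_t lod) {
    assert(lod < 32);  // shifting a 32-bit value by 32 or more is undefined
    return std::max<std::uint32_t>(1u, base >> lod);
}

// How many pixels of a tile starting at tile_origin lie inside the level.
// Anything past level_size belongs to padding and must not be written.
inline std::uint32_t pixels_in_level(std::uint32_t tile_origin,
                                     std::uint32_t tile_size,
                                     std::uint32_t level_size) {
    if (tile_origin >= level_size) {
        return 0;
    }
    return std::min(tile_size, level_size - tile_origin);
}
```

For a 100-pixel-wide texture at LOD 2 the level is 25 pixels wide, so a 64-pixel tile at origin 0 stores only 25 columns.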
* swr: [rasterizer common] add SwrTrace() and macrosTim Rowley2016-11-292-15/+95
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer memory] only clear up to the LOD sizeIlia Mirkin2016-11-281-2/+8
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer memory] hook up stencil clears for ClearTileIlia Mirkin2016-11-281-5/+8
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer memory] add support for clearing Z32F_X32 and Z16Ilia Mirkin2016-11-281-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: don't clear all dirty bits when changing so targetsIlia Mirkin2016-11-281-1/+1
| | | | | | | | | | Among other things, blits would clear existing SO targets which would cause a bunch of updates from u_blitter to be missed. Fixes fbo-scissor-blit fbo, probably among many others. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
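The class of bug described here typically comes from assigning a dirty mask where it should be OR-ed in. The sketch below illustrates that pattern under that assumption; the flag names and `SketchContext` are hypothetical, and the log does not show the actual one-line change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical dirty flags; the driver's real bit names are not shown here.
enum : std::uint32_t {
    DIRTY_FRAMEBUFFER = 1u << 0,
    DIRTY_SAMPLERS    = 1u << 1,
    DIRTY_SO          = 1u << 2,
};

struct SketchContext {
    std::uint32_t dirty = 0;
};

// Flag stream-output state as dirty while preserving pending flags.
// Writing "=" instead of "|=" would drop every other pending update.
inline void mark_so_dirty(SketchContext &ctx) {
    ctx.dirty |= DIRTY_SO;
}
```

A blit helper that saves and restores state touches many state objects in a row, which is why losing earlier flags at this point would cause several updates to be skipped.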
* swr: [rasterizer core] fix typo in scissor tile-alignment logicIlia Mirkin2016-11-281-1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* scons: Recognize LLVM_CONFIG environment variable.Vinson Lee2016-11-241-1/+2
| | | | | | Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* swr: clear every layer of the attached surfacesIlia Mirkin2016-11-231-6/+29
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] pipe renderTargetArrayIndex through to clearsIlia Mirkin2016-11-237-20/+35
| | | | | | | | | | Currently clears only operate on the 0th array index (ignoring surface layout parameters). Instead normalize to take a RTAI like all the load/store tile logic does, and use ComputeSurfaceAddress to properly take the surface state's lod/array index into account. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] clear data now comes in as floatIlia Mirkin2016-11-231-10/+4
| | | | | | | | The non-fast-clear path was never updated after clear colors were passed in as floats. Remove the now-harmful conversion from unorm8. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] actually perform clear before store in GetHotTileIlia Mirkin2016-11-231-0/+12
| | | | | | | | | When switching render target array indexes (as might happen in a GS, or in a future change, with layered clears), if the previous state is HOTTILE_CLEAR, we should actually clear the tile before saving it off. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
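The ordering problem can be modeled with a small state machine. This is a simplified sketch, not the rasterizer's hot-tile code: `SketchHotTile`, `TileState` and `switch_array_index` are hypothetical, and the surface is modeled as one pixel vector per layer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class TileState { Invalid, Clear, Dirty };

struct SketchHotTile {
    TileState state = TileState::Invalid;
    std::size_t array_index = 0;
    float clear_value = 0.0f;
    std::vector<float> pixels;
};

// Retarget a cached tile to another array index, flushing the old one first.
inline void switch_array_index(SketchHotTile &tile, std::size_t new_index,
                               std::vector<std::vector<float>> &layers) {
    assert(tile.array_index < layers.size() && new_index < layers.size());
    if (tile.array_index == new_index) {
        return;
    }
    if (tile.state == TileState::Clear) {
        // The clear was only recorded, never executed: apply it now so the
        // store below writes the cleared contents rather than stale pixels.
        std::fill(tile.pixels.begin(), tile.pixels.end(), tile.clear_value);
        tile.state = TileState::Dirty;
    }
    if (tile.state == TileState::Dirty) {
        layers[tile.array_index] = tile.pixels;  // store to the old layer
    }
    tile.array_index = new_index;
    tile.state = TileState::Invalid;  // new layer's contents are not loaded
}
```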
* swr: [rasterizer core] fix cast for stencil clear valueTim Rowley2016-11-221-3/+2
| | | | | | | Bad type cast for stencil clear value was picking up structure padding bytes. Reviewed-by: Ilia Mirkin <[email protected]>
* swr: color interpolation is also supposed to get perspective divisionIlia Mirkin2016-11-221-2/+4
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: add sprite coord enable mask to fs keyIlia Mirkin2016-11-222-1/+3
| | | | | | | This fixes gl-coord-replace-doesnt-eliminate-frag-tex-coords Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: rework vert <-> frag shader linkage logicIlia Mirkin2016-11-221-43/+50
Fixes a few things:
- sprite coords only apply to generic varyings, and are a bitmask
- back color only applies in 2-sided lighting mode
- handle some odd situations between only some front/back colors being there.
This is only semi-legal in GL, but we shouldn't start crashing.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
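The first point in the list above, that sprite coordinate replacement is a per-index bitmask applying only to generic varyings, can be sketched as a predicate. `Semantic` and `uses_sprite_coord` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>

enum class Semantic { Position, Color, BackColor, Generic };

// True when a fragment shader input should be replaced by the point sprite
// coordinate: only generic varyings qualify, selected by one bit per index.
inline bool uses_sprite_coord(Semantic semantic, unsigned index,
                              std::uint32_t sprite_coord_enable) {
    if (semantic != Semantic::Generic || index >= 32) {
        return false;
    }
    return ((sprite_coord_enable >> index) & 1u) != 0;
}
```

Treating the enable value as a boolean, or applying it to color inputs, would replace inputs the application never asked to have replaced.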
* swr: flatshading makes color outputs flat, it doesn't affect othersIlia Mirkin2016-11-221-4/+2
| | | | | | | | We were previously not marking the "regular" flat outputs as flat when flatshading was enabled. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: only broadcast color0 value, not all color valuesIlia Mirkin2016-11-221-1/+2
| | | | | | | | | | | | | The way that dual-source blending is described for GLES2 is very odd, and we end up with a shader that both has this property set *and* has a color1 value to be used as the second source. While changing the state tracker is an option, it seems more reliable to verify that the broadcast is only done on color0. Fixes arb_blend_func_extended-fbo-extended-blend-pattern_gles2 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: report a reasonable max lod biasIlia Mirkin2016-11-221-1/+1
| | | | | | | This is the same value that llvmpipe uses. Since swr uses the same sampler logic, it makes sense for this value to also be the same. Most applications don't care. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: avoid using exceptions for expected condition handlingIlia Mirkin2016-11-221-5/+4
I was getting a weird segfault from GCC 4.9.3:

0x00007ffff54f27aa in strlen () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff54f27aa in strlen () from /lib64/libc.so.6
#1 0x00007ffff4f128e5 in get_cie_encoding (cie=cie@entry=0x7ffff6e09813) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:272
#2 0x00007ffff4f1318e in classify_object_over_fdes (ob=ob@entry=0xd7bb90, this_fde=0x7ffff7f11010) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:628
#3 0x00007ffff4f135ba in init_object (ob=0xd7bb90) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:749
#4 search_object (ob=ob@entry=0xd7bb90, pc=pc@entry=0x7ffff4f11f4d <_Unwind_RaiseException+61>) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:961
#5 0x00007ffff4f13e62 in _Unwind_Find_registered_FDE (bases=0x7fffffffd358, pc=0x7ffff4f11f4d <_Unwind_RaiseException+61>) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:1025
#6 _Unwind_Find_FDE (pc=0x7ffff4f11f4d <_Unwind_RaiseException+61>, bases=bases@entry=0x7fffffffd358) at /gcc-4.9.3/libgcc/unwind-dw2-fde-dip.c:450
#7 0x00007ffff4f11197 in uw_frame_state_for (context=context@entry=0x7fffffffd2b0, fs=fs@entry=0x7fffffffd100) at /gcc-4.9.3/libgcc/unwind-dw2.c:1245
#8 0x00007ffff4f11b15 in uw_init_context_1 (context=context@entry=0x7fffffffd2b0, outer_cfa=outer_cfa@entry=0x7fffffffd660, outer_ra=0x7ffff518d23b <__cxa_throw+91>) at /gcc-4.9.3/libgcc/unwind-dw2.c:1566
#9 0x00007ffff4f11f4e in _Unwind_RaiseException (exc=0xd7c250) at /gcc-4.9.3/libgcc/unwind.inc:88
#10 0x00007ffff518d23b in __cxa_throw () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.9.3/libstdc++.so.6
#11 0x00007ffff51ed556 in std::__throw_out_of_range(char const*) () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.9.3/libstdc++.so.6
#12 0x00007fffea778be0 in std::map<pipe_format, SWR_FORMAT, std::less<pipe_format>, std::allocator<std::pair<pipe_format const, SWR_FORMAT> > >::at (this=0x7fffebeb4c40 <mesa_to_swr_format(pipe_format)::mesa2swr>, __k=@0x7fffffffd73c: PIPE_FORMAT_RGTC1_UNORM) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.9.3/include/g++-v4/bits/stl_map.h:549
#13 0x00007fffea776aee in mesa_to_swr_format (format=PIPE_FORMAT_RGTC1_UNORM) at swr_screen.cpp:597

We can just avoid this whole issue by not using exceptions in the first place.

Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
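The backtrace shows `std::map::at()` throwing `std::out_of_range` for a format missing from the table. An exception-free lookup can be written with `find()`, as in this sketch; the enums and `translate_format` are hypothetical stand-ins for `pipe_format`, `SWR_FORMAT` and `mesa_to_swr_format`, and the sentinel value is an assumption.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for the real source and destination format enums.
enum SrcFormat { SRC_RGBA8, SRC_BGRA8, SRC_RGTC1 };
enum DstFormat { DST_INVALID = -1, DST_RGBA8 = 0, DST_BGRA8 = 1 };

// map::at() throws std::out_of_range on a missing key, while find() lets a
// miss be handled as an ordinary return value.
inline DstFormat translate_format(SrcFormat format) {
    static const std::map<SrcFormat, DstFormat> table = {
        {SRC_RGBA8, DST_RGBA8},
        {SRC_BGRA8, DST_BGRA8},
    };
    const auto it = table.find(format);
    return it == table.end() ? DST_INVALID : it->second;
}
```

Since an unsupported format is an expected query result rather than an error, returning a sentinel keeps the unwinder out of the common path altogether.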
* swr: remove formats from mapping table that don't have StoreTile implsIlia Mirkin2016-11-221-38/+48
| | | | | | | | | This table exists for the purpose of determining renderable formats. Without a StoreTile implementation, that can't happen. This basically removes rendering support for all L/LA/I formats. They can be re-added when/if StoreTile implementations are added. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: remove unnecessary -1 entries in format mapping tableIlia Mirkin2016-11-221-126/+0
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: rework resource layout and surface setupIlia Mirkin2016-11-226-160/+352
This is a bit of a mega-commit, but unfortunately there's no great way to break this up since a lot of different pieces have to match up. Here we do the following:
- change surface layout to match swr's Load/StoreTile expectations
- fix sampler settings to respect all sampler view parameters
- fix stencil sampling to read from secondary resource
- respect pipe surface format, level, and layer settings
- fix resource map/unmap based on the new layout logic
- fix resource map/unmap to copy proper parts of stencil values in and out of the matching depth texture
These fix a massive quantity of piglits, including all the tex-miplevel-selection ones.
Note that the swr native miptree layout isn't extremely space-efficient, and we end up using it for all textures, not just the renderable ones. A back-of-the-envelope calculation suggests about 10%-25% increased memory usage for miptrees, depending on the number of LODs. Single-LOD textures should be unaffected.
There are a handful of regressions as a result of this change:
- Some textureGrad tests, these failures match llvmpipe. (There are debug settings allowing improved gallivm sampling accuracy.)
- Some layered clearing tests as swr doesn't currently support that. It was getting lucky before because enough other things were broken.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>