aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/swr/swr_shader.cpp
Commit message (Collapse)AuthorAgeFilesLines
* swr/rast: Removed superfluous JitManager argument from passesAlok Hota2018-05-251-1/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Optimize late/bindless JIT of samplersGeorge Kyriazis2018-04-181-0/+9
| | | | | | | | | Add per-worker thread private data to all shader calls Add per-worker sampler cache and jit context Add late LoadTexel JIT support Add per-worker-thread Sampler / LoadTexel JIT Reviewed-by: Bruce Cherniak <[email protected]>
* swr: add x86 lowering pass to fragment shaderGeorge Kyriazis2018-04-181-0/+7
| | | | | | | | Needed because some FP paths (namely stipple) use gather intrinsics that now need to be lowered to x86. v2: fix typo in commit message Reviewed-by: Bruce Cherniak <[email protected]>
* tgsi/scan: use wrap-around shift behavior explicitly for file_maskRoland Scheidegger2018-03-061-1/+1
| | | | | | | | | | | | | | The comment said it will only represent the lowest 32 regs. This was not entirely true in practice, since at least on x86 you'll get masked shifts (unless the compiler could recognize it already and toss it out). It turns out this actually works out alright (presumably noone uses it for temp regs) when increasing max sampler views, so make that behavior explicit. Albeit it feels a bit hacky (but in any case, explicit behavior there is better than undefined behavior). Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* swr: Support simd16 vertex shadersGeorge Kyriazis2018-01-191-12/+30
| | | | | | | | | | | | | | | Supporting simd16 vertex shaders involves packing the output of the fetch shader appropriately, especially the vertexID buffers that have to be formatted in one simd16 register, needed by the VS. As part of this support, we needed to remove the 2nd JitManager, since it was not accounting for vector width correctly. USE_SIMD16_SHADERS is also split into two defines. The additional one (USE_SIMD16_VS) controls the width of the vertex shader (VS), while the original one (USE_SIMD16_SHADERS) controls overall front end width. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: Handle indirect indices in GSGeorge Kyriazis2018-01-101-8/+39
| | | | | | | | | | | | | | | BuilderSWR::swr_gs_llvm_fetch_input() (and consequently swr_gs_llvm_fetch_input()), did not handle the case where is_vindex_indirect or is_aindex_direct is set. Implement it, using the code in draw_llvm.c as a guideline. Fixes the following piglit tests: dynamic_input_array_index (crash) gs-input-array-vec4-index-rd vs-output-array-vec4-index-wr-before-gs Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Simplify GATHER* jit builder apiTim Rowley2017-11-201-1/+1
| | | | | | | General cleanup, and prep work for possibly moving to llvm masked gather intrinsic. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: simd16 shaders work in progressTim Rowley2017-10-111-2/+12
| | | | | | | | Start building vertex shaders as simd16. Disabled by default, set USE_SIMD16_SHADERS in knobs.h to experiment. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: New GS state/context APITim Rowley2017-09-251-98/+85
| | | | | | | One piglit regression, which was a false pass: [email protected]@execution@geometry@dynamic_input_array_index Reviewed-by: Bruce Cherniak <[email protected]>
* swr: fix transform feedback logicGeorge Kyriazis2017-07-271-1/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The shader that is used to copy vertex data out of the vs/gs shaders to the user-specified buffer (streamout or SO shader) was not using the correct offsets. Adjust the offsets that are used just for the SO shader: - Make sure that position is handled in the same special way as in the vs/gs shaders - Use the correct offset to be passed in the core - consolidate register slot mapping logic into one function, since it's been calculated in 2 different places (one for calcuating the slot mask, and one for the register offsets themselves Also make room for all attibutes in the backend vertex area. Fixes: - all vtk GL2PS tests - 18 piglit tests (16 ext_transform_feedback tests, arb-quads-follow-provoking-vertex and primitive-type gl_points v2: - take care of more SGV slots in slot mapping logic - trim feState.vsVertexSize - fix GS interface and incorporate GS while calculating vsVertexSize Note that vsVertexSize is used in the core as the one parameter that controls vertex size between all stages, so it has to be adjusted appropriately for the whole vs/gs/fs pipeline. Also note that GS and SO is not fully implemented. This will be addressed later. fixes: - fixes total of 20 piglit tests CC: 17.2 <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Support dynamically sized vertex layoutTim Rowley2017-06-301-0/+2
| | | | | | | | | | | | | | | Each shader stage state (VS, TS, GS, SO, BE/CLIP) now has a vertexAttribOffset to specify the offset to the start of the general attribute section of the incoming verts for that stage. It is up to the driver to set this up correctly based on the active stages. All the shader stages use this value instead of VERTEX_ATTRIB_START_SLOT to offset to the incoming attributes. Only the vertex shader stage supports dynamic layout output currently. The other stages continue to expect the output to be the fixed layout slots as before. Will be enabling GS next. Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
* swr/rast: Fix read-back of viewport array indexTim Rowley2017-06-161-2/+0
| | | | | | | Binner/clipper read viewport array index from the vertex header as needed. Move viewport state to BACKEND_STATE. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix read-back of render target array indexTim Rowley2017-06-161-1/+0
| | | | | | | | The last FE stage can emit render target array index. Currently we only check to see if GS is emitting it. Moved the state to BACKEND_STATE and plumbed the driver to set it. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Rework attribute layoutTim Rowley2017-06-161-21/+57
| | | | | | | Move fixed attributes to the top and pack single component SGVs. WIP to support dynamically allocated vertex size. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove explicit primitive id slot in the vertex layoutTim Rowley2017-06-161-12/+9
| | | | | | | | - Remove any special casing in the PS stage when primitive ID is input. Treat as a normal attribute that must be set up properly in the FE linkage. - Remove primitive id from the PS_CONTEXT and TRI_FLAGS Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: SIMD16 FE - interleaved simdvertex output in GSTim Rowley2017-05-301-3/+26
| | | | | | Eliminates conversion copies on GS output from simdvertex to simd16vertex. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: don't use AttributeSet with llvm >= 5Tim Rowley2017-05-171-15/+21
| | | | | | | | | | | This change fixes the build break with llvm-svn. r301981 of llvm-svn made add/remove of function attributes use AttrBuilder instead of AttributeList. Tested with llvm-3.9, llvm-4.0, llvm-svn. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: simd16 vs workTim Rowley2017-04-191-5/+25
| | | | | | | Build VS with alternating output for the current simd16 fe double-pump of a simd8 shader. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: Add polygon stipple supportGeorge Kyriazis2017-04-141-6/+50
| | | | | | | | | Add polygon stipple functionality to the fragment shader. Explicitly turn off polygon stipple for lines and points, since we do them using tris. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: fix unused variable warningsTim Rowley2017-04-071-1/+0
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: fix llvm-5.0.0 build bustageTim Rowley2017-03-281-9/+15
| | | | | | | Handle rename of llvm AttributeSet to AttributeList in the same fashion as ac_llvm_helper.cpp. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Cleanup naming of codegen filesTim Rowley2017-03-201-2/+2
| | | | | | All template files and generated files are prefixed with gen_. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: support layer output in geometry shadersIlia Mirkin2017-03-151-0/+2
| | | | | | | | | This makes bin/gl-3.2-layered-rendering-gl-layer-render fail only with 2DMS_ARRAY, which is expected given the lackluster MSAA support. However all the regular types pass. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: s/unsigned/enum pipe_shader_type/Brian Paul2017-03-081-1/+1
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* swr: implement geometry shadersTim Rowley2017-03-051-5/+470
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: fix crash in swr_update_derived following st/mesa state changesBruce Cherniak2017-03-021-0/+6
| | | | | | | | | | | | | | | | | | | | Recent change to st/mesa state update logic caused major regressions to swr validation code. swr uses the same validation logic (swr_update_derived) for both draw and Clear calls. New st/mesa state update logic results in certain state objects not being set/bound during Clear. This was causing null ptr exceptions. Creation of static dummy state objects allows setting these pointers during Clear validation, without interfering with relevant state validation. Once fixed, new logic also highlighted an error in dirty bit checking for fragment shader and clip validation. (The alternative is to have a simplified validation routine for Clear. Which may do that at some point.) Reviewed-by: Tim Rowley <[email protected]>
* swr: add fetch shader cacheGeorge Kyriazis2017-02-231-0/+14
| | | | | | | | | For now, the cache key is all of FETCH_COMPILE_STATE. Use new/delete for swr_vertex_element_state, since we have to call the constructors/destructors of the struct elements. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: remove unneeded llvm version checkTim Rowley2017-01-051-4/+0
| | | | | | | Old test caused breakage with llvm-svn (4.0.0svn), and not needed as the minimum required llvm version for swr is 3.6. Reviewed-by: George Kyriazis <[email protected]>
* swr: color interpolation is also supposed to get perspective divisionIlia Mirkin2016-11-221-2/+4
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: add sprite coord enable mask to fs keyIlia Mirkin2016-11-221-1/+2
| | | | | | | This fixes gl-coord-replace-doesnt-eliminate-frag-tex-coords Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: rework vert <-> frag shader linkage logicIlia Mirkin2016-11-221-43/+50
| | | | | | | | | | | | Fixes a few things: - sprite coords only apply to generic varyings, and are a bitmask - back color only applies in 2-sided lighting mode - handle some odd situations between only some front/back colors being there. This is only semi-legal in GL, but we shouldn't start crashing. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: only broadcast color0 value, not all color valuesIlia Mirkin2016-11-221-1/+2
| | | | | | | | | | | | | The way that dual-source blending is described for GLES2 is very odd, and we end up with a shader that both has this property set *and* has a color1 value to be used as the second source. While changing the state tracker is an option, it seems more reliable to verify that the broadcast is only done on color0. Fixes arb_blend_func_extended-fbo-extended-blend-pattern_gles2 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: rework resource layout and surface setupIlia Mirkin2016-11-221-4/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a bit of a mega-commit, but unfortunately there's no great way to break this up since a lot of different pieces have to match up. Here we do the following: - change surface layout to match swr's Load/StoreTile expectations - fix sampler settings to respect all sampler view parameters - fix stencil sampling to read from secondary resource - respect pipe surface format, level, and layer settings - fix resource map/unmap based on the new layout logic - fix resource map/unmap to copy proper parts of stencil values in and out of the matching depth texture These fix a massive quantity of piglits, including all the tex-miplevel-selection ones. Note that the swr native miptree layout isn't extremely space-efficient, and we end up using it for all textures, not just the renderable ones. A back-of-the-envelope calculation suggests about 10%-25% increased memory usage for miptrees, depending on the number of LODs. Single-LOD textures should be unaffected. There are a handful of regressions as a result of this change: - Some textureGrad tests, these failures match llvmpipe. (There are debug settings allowing improved gallivm sampling accurancy.) - Some layered clearing tests as swr doesn't currently support that. It was getting lucky before because enough other things were broken. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: add support for upper-left fragcoord positionIlia Mirkin2016-11-151-2/+8
| | | | | | | Fixes glsl-arb-fragment-coord-conventions. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] fixes for icc in vs2015 compat modeTim Rowley2016-10-031-0/+2
| | | | | | | - Move most jitter functionality into SwrJit namespace - Avoid global "using namespace llvm" in headers Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer] attribute swizzling and linkageTim Rowley2016-07-201-12/+0
| | | | | | | | | Add support for enhanced attribute swizzling. Currently supports constant source overrides to handle PrimitiveID support. No support yet for input select swizzling or wrap shortest. Removes obsoleted linkageMask and associated code. Signed-off-by: Tim Rowley <[email protected]>
* swr: push/pop DEBUG macro around llvm includesTim Rowley2016-06-231-3/+7
| | | | | | | | | | llvm redefines DEBUG; adding push/pop prevents a undefined reference to debug_refcnt_state in llvm-3.7+. v2: add undef DEBUG Cc: "12.0" <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: implement clipPlanes/clipVertex/clipDistance/cullDistanceTim Rowley2016-06-091-0/+65
| | | | | | | | | | | | | | v2: only load the clip vertex once v3: fix clip enable logic, add cullDistance v4: remove duplicate fields in vs jit key, fix test of clip fixup needed v5: fix clipdistance linkage for slot!=0,4 v6: support clip+cull; passes most piglit clip (failures understood) Reviewed-by: Bruce Cherniak <[email protected]>
* swr: fix provoking vertexTim Rowley2016-06-071-7/+5
| | | | | | | | | | Use rasterizer provoking vertex API. Fix rasterizer provoking vertex for tristrips and quad list/strips. v2: make provoking vertex tables static const Reviewed-by: Bruce Cherniak <[email protected]>
* swr: fix memory leaks from vs/fs compilationTim Rowley2016-04-221-17/+24
| | | | | | v2: varient -> variant Reviewed by: George Kyriazis <[email protected]>
* gallium/swr: Make flat shading tris work.George Kyriazis2016-04-131-0/+4
| | | | | | | - Incorporate flatshade flag into the shader generation - Use provoking vertex (vc) in shader when flat shading. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: support samplers in vertex shadersTim Rowley2016-04-121-34/+64
| | | | Reviewed-by: George Kyriazis <[email protected]>
* gallium/swr: add OpenSWR driverTim Rowley2016-03-021-0/+591
OpenSWR is a new software rasterizer for x86 processors designed for high performance and high scalablility on visualization workloads. Acked-by: Roland Scheidegger <[email protected]> Acked-by: Jose Fonseca <[email protected]>