summaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary/draw
Commit message (Collapse)AuthorAgeFilesLines
* draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offsetRoland Scheidegger2013-11-121-8/+13
| | | | | | | | | | Since we explicitly require a integer input we should avoid using exp2 math (even if we were using optimized versions), which turns the exp2 into a int sub (plus some casts). v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments Reviewed-by: Jose Fonseca <[email protected]>
* draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_floatMatthew McClure2013-11-074-8/+44
| | | | | | | | | | | | | | | With this patch, the llvmpipe and draw modules will calculate the depth bias according to floating point depth buffer semantics described in the arb_depth_buffer_float specification, when the driver has a z buffer bound with a format type of UTIL_FORMAT_TYPE_FLOAT. By default, the driver will use the existing UNORM calculation for depth bias. A new function, draw_set_zs_format, was added to calculate the Minimum Resolvable Depth value and floating point depth sense for the draw module. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: move type construction out of loopBrian Paul2013-11-041-1/+3
| | | | | | We can create clip_ptr_type once instead of n times inside the loop. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: new, unified pipe_context::set_sampler_views() functionBrian Paul2013-10-232-33/+37
| | | | | | | | | | | | The new function replaces four old functions: set_fragment/vertex/ geometry/compute_sampler_views(). Note: at this time, it's expected that the 'start' parameter will always be zero. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Tested-by: Emil Velikov <[email protected]>
* draw: make vs_slot signed.José Fonseca2013-10-151-2/+4
| | | | | | Otherwise (vs_slot < 0) will never be true. Trivial.
* draw: remove use of old bind_fragment_sampler_states()Brian Paul2013-10-032-82/+13
|
* draw: use pipe_context::bind_sampler_states() if non-nullBrian Paul2013-10-032-7/+97
|
* draw: rename bind_sampler_states variablesBrian Paul2013-10-032-19/+19
| | | | | Put 'fragment' in the names. In preparation for upcoming function renaming.
* draw: Add a null check for draw.Vinson Lee2013-09-301-1/+1
| | | | | | | | | | There is an earlier null check for draw so draw could be null here as well. Fixes "Dereference after null check" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* draw/clip: don't emit so many empty trianglesZack Rusin2013-09-251-0/+39
| | | | | | | | | | | | Compress empty triangles (don't emit more than one in a row) and never emit empty triangles if we already generated a triangle covering a non-null area. We can't skip all null-triangles because c_primitives expects ones that were generated from vertices exactly at the clipping-plane, to be emitted. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: Ensure draw_pt_middle_end::bind_parameters is never NULL.José Fonseca2013-09-202-0/+15
| | | | | | Prevents calling NULL pointer with softpipe in certain cases. Trivial.
* gallivm: support indirect registers on both dimensionsZack Rusin2013-09-061-6/+17
| | | | | | | | | | | We support indirect addressing only on the vertex index, but some shaders also use indirect addressing on attributes. This patch adds support for indirect addressing on both dimensions inside gs arrays. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw: fix segfaults with aaline and aapoint stages disabledMarek Olšák2013-08-311-2/+4
| | | | | | | | | | There are drivers not using these optional stages. Broken by a3ae5dc7dd5c2f8893f86a920247e690e550ebd4. Cc: [email protected] Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix PIPE_MAX_SAMPLER/PIPE_MAX_SHADER_SAMPLER_VIEWS issuesRoland Scheidegger2013-08-302-6/+6
| | | | | | | | | | | | pstipple/aaline stages used PIPE_MAX_SAMPLER instead of PIPE_MAX_SHADER_SAMPLER_VIEWS when dealing with sampler views. Now these stages can't actually handle sampler_unit != texture_unit anyway (they cannot work with d3d10 shaders at all due to using tex not sample opcodes as "mixed mode" shaders are impossible) but this leads to crashes if a driver just installs these stages and then more than PIPE_MAX_SAMPLER views are set even if the stages aren't even used. Reviewed-by: Zack Rusin <[email protected]>
* draw: fix point/line/triangle determination in draw_need_pipeline()Brian Paul2013-08-291-25/+6
| | | | | | The previous point/line/triangle() functions didn't handle GS primitives. Reviewed-by: Roland Scheidegger <[email protected]>
* draw: clean up setting stream out information a bitRoland Scheidegger2013-08-272-5/+0
| | | | | | | | | | | | | | | | | In particular noone is interested in the vertex count, so drop that, and also drop the duplicated num_primitives_generated / so.primitives_storage_needed variables in drivers. I am unable for now to figure out if primitives_storage_needed in SO stats (used for d3d10) should increase if SO is disabled, though the equivalent num_primitives_generated used for OpenGL definitely should increase. In any case we were only counting when SO is active both in softpipe and llvmpipe anyway so don't pretend there's an independent num_primitives_generated counter which would count always. (This means the PIPE_QUERY_PRIMITIVES_GENERATED count will still be wrong just as before, should eventually fix this by doing either separate counting for this query or adjust the code so it always counts this even if SO is inactive depending on what's correct for d3d10.) Reviewed-by: Brian Paul <[email protected]>
* gallivm: implement better control of per-quad/per-element/scalar lodRoland Scheidegger2013-08-201-4/+4
| | | | | | | | | | | | | | | | There's a new debug value used to disable per-quad lod optimizations in fragment shader (ignored for vs/gs as the results are just too wrong typically). Also trying to detect if a supplied lod value is really a scalar (if it's coming from immediate or constant file) in which case sampler code can use this to stay on per-quad-lod path (in fact for explicit lod could simplify even further and use same lod for both quads in the avx case but this is not implemented yet). Still need to actually implement per-element lod bias (and derivatives), and need to handle per-element lod in size queries. v2: fix comments, prettify. Reviewed-by: Jose Fonseca <[email protected]>
* draw: handle nan clipdistanceZack Rusin2013-08-153-4/+16
| | | | | | | | If clipdistance for one of the vertices is nan (or inf) then the entire primitive should be discarded. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: make sure that the stages setup outputsZack Rusin2013-08-145-30/+62
| | | | | | | | | | | | Calling the prepare outputs cleans up the slot assignments for outputs, unfortunately aapoint and aaline didn't have code to reset their slots after the initial setup, this was messing up our slot assignments. The unfilled stage was just missing the initial assignment of the face slot. This fixes all of the reported piglit failures. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: simplify prim mask constructionRoland Scheidegger2013-08-121-22/+10
| | | | | | | | | The code was quite weird, the second comparison was in fact a complete no-op and we can also do the comparison with the vector directly instead of scalar, which should not also be faster but it is way more obvious how that mask is actually going to look like. Reviewed-by: Zack Rusin <[email protected]>
* draw: (trivial) dump tgsi for geometry shaders with GALLIVM_DEBUG_TGSIRoland Scheidegger2013-08-121-0/+5
| | | | | | And dump the variant key too (same as vs does). Just so I can stop wondering why I see the tgsi dump for fs and vs but not gs...
* gallivm: set non-existing values really to zero in size queries for d3d10Roland Scheidegger2013-08-091-2/+2
| | | | | | | | | | | My previous attempt at doing so double-failed miserably (minification of zero still gives one, and even if it would not the value was never written anyway). While here also rename the confusingly named int_vec bld as we have int vecs of different sizes, and rename need_nr_mips (as this also changes out-of-bounds behavior) to is_sviewinfo too. Reviewed-by: Zack Rusin <[email protected]>
* gallivm: use texture target from shader instead of static state for size queryRoland Scheidegger2013-08-091-0/+2
| | | | | | | | | | | | | | | | | | | d3d10 has no notion of distinct array resources neither at the resource nor sampler view level. However, shader dcl of resources certainly has, and d3d10 expects resinfo to return the values according to that - in particular a resource might have been a 1d texture with some array layers, then the sampler view might have only used 1 layer so it can be accessed both as 1d or 1d array texture (I think - the former definitely works). resinfo of a resource decleared as array needs to return number of array layers but non-array resource needs to return 0 (and not 1). Hence fix this by passing the target from the shader decl to emit_size_query and use that (in case of OpenGL the target will come from the instruction itself). Could probably do the same for actual sampling, though it may not matter there (as the bogus components will essentially get clamped away), possibly could wreak havoc though if it REALLY doesn't match (which is of course an error but still). Reviewed-by: Zack Rusin <[email protected]>
* draw: rewrite primitive assemblerZack Rusin2013-08-088-296/+180
| | | | | | | | | | | | We can't be injecting the primitive id's in the pipeline because by that time the primitives have already been decomposed. To properly number the primitives we need to handle the adjacency primitives by hand. This patch moves the prim id injection into the original primitive assembler and completely removes the useless pipeline stage. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: reset the vertex id when injecting new primitive idZack Rusin2013-08-081-0/+9
| | | | | | | | | | | | Without reseting the vertex id, with primitives where the same vertex is used with different primitives (e.g. tri/lines strips) our vbuf module won't re-emit those vertices with the changed primitive id. So lets reset the vertex id whenever injecting new primitive id to make sure that the vertex data is correctly emitted. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: cleanup the extra attribsZack Rusin2013-08-081-0/+1
| | | | | | | | | | Before inserting new front face and prim id outputs cleanup the old extra outputs, otherwise our cache will use previous output slots which will break as soon as outputs of the current shader don't match the last. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: propagate scalar_lod to emit_size_query tooRoland Scheidegger2013-08-081-0/+2
| | | | | | | Clearly the returned values need to be per-element if the lod is per element. Does not actually change behavior yet. Reviewed-by: Zack Rusin <[email protected]>
* draw: Change slot from unsigned to int.Vinson Lee2013-08-051-1/+1
| | | | | | | | | unfilled_stage::face_slot is of type int. Fixes "Unsigned compared against 0" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* draw: add back separate input assemblerZack Rusin2013-08-035-4/+350
| | | | | | | | | | | the issue is that stream output is run before the pipeline, which means that unless we decompose the primitives before the so then things crash. we could convert the entire stream output code into a pipeline stage but it will take a bit, so for now fix the crashes by simply re-adding the old input assembler which is run before the SO. Signed-off-by: Zack Rusin <[email protected]>
* draw: implement proper primitive assembler as a pipeline stageZack Rusin2013-08-0311-351/+279
| | | | | | | | | | | | | we used to have a face primitive assembler that we ran after if the gs was missing but we had adjacency primitives in the pipeline, lets convert it to a pipeline stage, which allows us to use it to inject outputs (primitive id) into the vertices. it's also a lot cleaner because the decomposition is already handled for us. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* draw: fix front face injectionZack Rusin2013-08-031-9/+15
| | | | | | | | | | | | | Inject front face only if the fragment shader uses it and propagate through all channels because otherwise we'll need to figure out the exact swizzle that the fs expects and it's just simpler to make sure all the components within the front face register are correctly set. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* draw: make sure clipping works with injected outputsZack Rusin2013-08-021-35/+54
| | | | | | | | | | | | | clipping would drop the extra outputs because it always used the number of standard vertex shader outputs, without geometry shader or extra outputs. The commit makes sure that clipping with geometry shaders which have more outputs than the current vertex shader and with extra outputs correctly propagates the entire vertex. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: inject frontface info into wireframe outputsZack Rusin2013-08-024-0/+101
| | | | | | | | | | | | | | Draw module can decompose primitives into wireframe models, which is a fancy word for 'lines', unfortunately that decomposition means that we weren't able to preserve the original front-face info which could be derived from the original primitives (lines don't have a 'face'). To fix it allow draw module to inject a fake face semantic into outputs from which the backends can figure out the original frontfacing info of the primitives. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: stop crashing with extra shader outputsZack Rusin2013-08-029-15/+52
| | | | | | | | | | | | | | | | | | Draw sometimes injects extra shader outputs (aa points, lines or front face), unfortunately most of the pipeline and llvm code didn't handle them at all. It only worked if number of inputs happened to be bigger or equal to the number of shader outputs plus the extra injected outputs. In particular when running the pipeline which depends on the vertex_id in the vertex_header things were completely broken. The patch adjust the code to correctly use the total number of shader outputs (the standard ones plus the injected ones) to make it all stop crashing and work. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* draw: use the vertex sizeZack Rusin2013-08-021-1/+1
| | | | | | | | | | Instead of using the magical 4 use the above computed vertex size. Doesn't change the behavior, just makes the code a bit cleaner. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw/llvm: add some extra debugging outputZack Rusin2013-08-021-0/+6
| | | | | | | | | | when dumping shader outputs it's nice to have the integer values of the outputs, in particular because some values are integers. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix vertex id computationZack Rusin2013-07-255-13/+37
| | | | | | | | | | | | | vertex id has to be unaffected by the start index (i.e. when calling draw arrays with start_index = 5, the first vertex_id has to still be 0, not 5) and it has to be equal to the index when performing indexed rendering (in which case it has to be unaffected by the index bias). This fixes our behavior. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: cleanup and fix instance id computationZack Rusin2013-07-252-7/+7
| | | | | | | | | | | | | | The instance id system value always starts at 0, even if the specified start instance is larger than 0. Instead of implicitly setting instance id to instance id plus start instance and then having to subtract instance id when computing the buffer offsets lets just set instance id to the proper instance id. This fixes instance id computation and cleansup buffer offset computation. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: always call util_cpu_detect() in draw context creation.Roland Scheidegger2013-07-241-1/+4
| | | | | | | | | | | | | | | | | | | | Since disabling denorms in draw_vbo() we require the util_cpu_caps to be initialized there. Hence add another util_cpu_detect() call in draw_create_context() which should ensure this. (There is another call in draw_get_option_use_llvm() which only gets called with x86 (not x86_64) but calling it always there wouldn't help since it most likely wouldn't get called when compiling without llvm, so leave it alone there.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=66806. (Because util_cpu_caps wasn't initialized when first calling util_fpstate_get() hence it returning zero, but it would later get initialized by rtasm translate code hence when draw call returned it unmasked all exceptions by calling util_fpstate_set(). This was happening only with DRAW_USE_LLVM=0 or not compiling with llvm, otherwise the llvm init code was calling it on time too.) Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Zack Rusin <[email protected]> Tested-by: Vinson Lee <[email protected]>
* tgsi: rename the TGSI fragment kill opcodesBrian Paul2013-07-122-6/+6
| | | | | | | | | | | | | | | | | | | | | TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional kill (if any src component < 0). The later was unconditional kill. At one time KILP was supposed to work with NV-style condition codes/predicates but we never had that in TGSI. This patch renames both opcodes: TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0) TGSI_OPCODE_KILP -> KILL (unconditional kill) Note: I didn't just transpose the opcode names to help ensure that I didn't miss updating any code anywhere. I believe I've updated all the relevant code and comments but I'm not 100% sure that some drivers had this right in the first place. For example, the radeon driver might have llvm.AMDGPU.kill and llvm.AMDGPU.kilp mixed up. Driver authors should review their code. Reviewed-by: Jose Fonseca <[email protected]>
* util: treat denorm'ed floats like zeroZack Rusin2013-07-091-0/+8
| | | | | | | | | | | | | The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few cpu's it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow up commit. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: do per-pixel lod calculations for explicit lodRoland Scheidegger2013-07-041-1/+2
| | | | | | | | | | | | | | | | | | | | | d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too - at least if they are used from vertex or geometry shaders (so doesn't apply to lod bias) this doesn't just affect neighboring pixels. Some code was already there to handle this so fix it up and enable it. There will no doubt be a performance hit unfortunately, we could do better if we'd knew we had a real vector shift instruction (with variable shift count) but this requires AVX2 on x86 (or a AMD Bulldozer family cpu). Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them neither. Likewise, the size query is still broken just the same. v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before). Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix overflows in the indexed rendering pathsZack Rusin2013-07-034-43/+159
| | | | | | | | | | | | | The semantics for overflow detection are a bit tricky with indexed rendering. If the base index in the elements array overflows, then the index of the first element should be used, if the index with bias overflows then it should be treated like a normal overflow. Also overflows need to be checked for in all paths that either the bias, or the starting index location. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw/llvm: index overflows if it's greater than elt maxZack Rusin2013-07-031-1/+1
| | | | | | | | | The comparison, incorrectly, was greater-than-or-equal to elt max. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw/translate: fix instancingZack Rusin2013-06-289-9/+47
| | | | | | | | | | | | | | | | | | We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to: start_instance * stride + (instance_num / instance_divisor) * stride We were completely ignoring the start instance quite often producing instances that completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be: 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be: 1 * stride, not 0 as we'd have. This fixes it and adjusts all the code to the changes. Signed-off-by: Zack Rusin <[email protected]>
* draw: fix incorrect clipper invocation statisticsZack Rusin2013-06-281-6/+0
| | | | | | | | clipper invocations are computed earlier (of course before the emittion) so this code was adding bogus numbers to already computed clipper invocations. Signed-off-by: Zack Rusin <[email protected]>
* draw/gallivm: export overflow arithmetic to its own fileZack Rusin2013-06-281-44/+11
| | | | | | | We'll be reusing this code so lets put it in a common file and use it in the draw module. Signed-off-by: Zack Rusin <[email protected]>
* draw: check for integer overflows in instance computationZack Rusin2013-06-282-0/+7
| | | | | | | | | Integers could easily overflow is the starting instance was large enough. Instead of letting bogus counts through set the instance to max if it overflown and let our regular buffer overflow computation handle it. Signed-off-by: Zack Rusin <[email protected]>
* draw: check for an integer overflow when computing strideZack Rusin2013-06-281-10/+43
| | | | | | | | | Our buffer overflow arithmetic was susceptible to integer overflows which was the buffer overflow logic to break. Lets use the llvm overflow intrinsics to check for integer overflows while computing the stride/needed buffer size. Signed-off-by: Zack Rusin <[email protected]>
* draw: account for elem size when computing overflowZack Rusin2013-06-281-7/+23
| | | | | | | | | | | We weren't taking into account the size of element that is to be fetched, which meant that it was possible to overflow the buffer reads if the stride was very close to the end of the buffer, e.g. stride = 3, buffer size = 4, and the element to be read = 4. This should be properly detected as an overflow. Signed-off-by: Zack Rusin <[email protected]>