path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* softpipe: don't clamp reference value for shadow comparison for float formats | Roland Scheidegger | 2013-08-08 | 1 | -12/+32
  Clamping is only done for fixed-point formats as part of conversion to texture format.
  Reviewed-by: Zack Rusin <[email protected]>
* gallivm: propagate scalar_lod to emit_size_query too | Roland Scheidegger | 2013-08-08 | 1 | -0/+2
  Clearly the returned values need to be per-element if the lod is per element. Does not actually change behavior yet.
  Reviewed-by: Zack Rusin <[email protected]>
* ilo: get rid of GPE tables completely | Chia-I Wu | 2013-08-08 | 6 | -108/+54
  Move the estimate functions out of the tables and kill the tables.
* ilo: clean up GPE header inclusions | Chia-I Wu | 2013-08-08 | 6 | -14/+8
  Besides being a regular cleanup, this reduces the number of source files that need to be recompiled when GPE functions are changed.
* ilo: initialize alpha test state in ilo_gpe_init_dsa | Chia-I Wu | 2013-08-08 | 5 | -38/+46
  This could speed up BLEND_STATE and COLOR_CALC_STATE emission a bit.
* ilo: fold gen6_translate_index_size into the caller | Chia-I Wu | 2013-08-08 | 1 | -17/+15
  There is only one caller so fold it.
* ilo: fold gen6_translate_depth_format into the caller | Chia-I Wu | 2013-08-08 | 1 | -33/+9
  There is only one caller so fold it.
* ilo: Call GPE emit functions directly. | Courtney Goeltzenleuchter | 2013-08-08 | 8 | -1248/+141
  Eliminate pipeline and GPE function vectors and have the pipeline functions call the GPE emit functions directly.
* ilo: move emit functions so that they can be inlined. | Courtney Goeltzenleuchter | 2013-08-08 | 4 | -3467/+3453
* r300g/compiler/tests: Pass the required LDFLAGS when building the test program | Tom Stellard | 2013-08-07 | 1 | -1/+2
  CC: "9.2 <[email protected]>"
* r300g/compiler/tests: Fix segfault | Tom Stellard | 2013-08-07 | 3 | -4/+4
  CC: "9.2" <[email protected]>
* ilo: speed up 3DSTATE_VERTEX_BUFFERS emission a bit | Chia-I Wu | 2013-08-07 | 3 | -26/+12
  Ignore vbuffer_mask which does not gain us anything.
* ilo: skip state emission when reducing sampler count | Chia-I Wu | 2013-08-07 | 1 | -19/+29
  When the number of sampler states bound is reduced, we are good to keep referencing the old SAMPLER_STATE array and skip emitting a new one.
* ilo: simplify setting of shader samplers and views | Chia-I Wu | 2013-08-07 | 1 | -44/+33
  Remove the special path that unbinds all samplers/views not in the range. Just make another call to unbind them.
* ilo: correctly check for stencil ref change | Chia-I Wu | 2013-08-07 | 1 | -1/+1
  I intended to do a memcmp(), not a memcpy()...
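The one-word difference matters because memcpy() returns its destination pointer, which is never null for valid arguments, so a change check written with it is always true and also overwrites the cached state. A minimal sketch of the intended comparison, assuming a hypothetical `struct stencil_ref` (this is not ilo's actual type or function):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the stencil reference state; not ilo's real struct. */
struct stencil_ref {
   unsigned char ref_value[2];
};

/* Returns true only when the incoming values differ from the cached ones. */
static bool
stencil_ref_changed(const struct stencil_ref *cached,
                    const struct stencil_ref *incoming)
{
   /* memcmp() compares the bytes; memcpy() here would yield a non-null
    * pointer and therefore always evaluate as "changed". */
   return memcmp(cached, incoming, sizeof(*cached)) != 0;
}
```

A caller would typically mark the state dirty and copy the new values only when this returns true.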
* draw: fix slot detection | Zack Rusin | 2013-08-06 | 2 | -2/+1
  Nowadays -1 for slots means that the semantic is not present, so we need to store it in a signed variable, otherwise <0 comparisons are pointless.
  Fixes http://bugzilla.eng.vmware.com/show_bug.cgi?id=67811 (at least with softpipe, edgeflags don't work with llvmpipe)
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
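To illustrate why the variable has to be signed: converting the -1 sentinel to an unsigned type wraps it to a very large positive value, so it can no longer be told apart from a valid slot with a `< 0` test. The helper names below are hypothetical and only model the idea, not the draw module's code:

```c
#include <stdbool.h>

/* With a signed slot, the "semantic not present" sentinel (-1) is detectable. */
static bool
slot_is_missing(int slot)
{
   return slot < 0;
}

/* Models the bug: storing the slot in an unsigned variable wraps -1 around,
 * so the sentinel ends up looking like a (huge) valid slot index. */
static unsigned
store_slot_unsigned(int slot)
{
   return (unsigned)slot;
}
```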
* nvc0: don't access array out of bounds on unexpected sample count | Christoph Bumiller | 2013-08-06 | 1 | -2/+1
* nv50: handle pure integer vertex attributes | Emil Velikov | 2013-08-06 | 2 | -2/+14
  And as a side effect fix a crash in the following piglit test: general/attribs GL3
  Signed-off-by: Emil Velikov <[email protected]>
  Cc: "9.2 and 9.1" [email protected]
* nvc0: implement MP performance counters for nvc0:nvd9 | Samuel Pitoiset | 2013-08-06 | 2 | -93/+370
* nvc0: implement compute support for nvc0 | Samuel Pitoiset | 2013-08-06 | 9 | -32/+706
  Tested on nvc0, nvc1, nvcf and nvd9.
* nvc0: add more MP counters for nve4 | Samuel Pitoiset | 2013-08-06 | 3 | -14/+47
* radeonsi: Number of SGPRs retrieved from LLVM already includes VCC | Michel Dänzer | 2013-08-06 | 1 | -8/+8
  Fixes spurious 'Assertion `num_sgprs <= 104' failed.' with shaders using all 104 SGPRs.
  Cc: [email protected]
  Reviewed-by: Christian König <[email protected]>
* llvmpipe: Do not need to free anything if there is no geometry shader. | Vinson Lee | 2013-08-05 | 1 | -2/+5
  If gs is null, then freeing state->shader.tokens would result in a null dereference.
  Fixes "Dereference after null check" defect reported by Coverity.
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
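The pattern behind this fix is an early return before any member of the possibly-null object is touched. A minimal sketch, assuming a hypothetical `struct gs_state` with a heap-allocated `tokens` member (llvmpipe's real structures and delete function differ):

```c
#include <stdlib.h>

/* Hypothetical geometry shader state; only the member relevant here is shown. */
struct gs_state {
   void *tokens;
};

/* Frees the state and returns 1, or returns 0 when there is nothing to free. */
static int
delete_gs_state(struct gs_state *gs)
{
   if (!gs)
      return 0;        /* no geometry shader: do not dereference gs */

   free(gs->tokens);   /* safe only because gs was checked above */
   free(gs);
   return 1;
}
```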
* nvc0: Initialize ptr for unexpected sample_count on release builds. | Vinson Lee | 2013-08-05 | 1 | -0/+1
  Fixes "Uninitialized pointer read" defect reported by Coverity.
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: fix frontface behavior again | Zack Rusin | 2013-08-02 | 1 | -3/+11
  Let's make sure the frontface is 1 for front and -1 for back. Discussed with Roland and Jose.
  Signed-off-by: Zack Rusin <[email protected]>
* r600g/sb: Dump correct value for CND. | Vinson Lee | 2013-08-04 | 1 | -1/+1
  Fixes "Copy-paste error" reported by Coverity.
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Vadim Girlin <[email protected]>
* nv50: fix some h264 interlaced decoding on vp2 | Ilia Mirkin | 2013-08-03 | 2 | -7/+8
  Some videos specify mb_adaptive_frame_field_flag instead of field_pic_flag. This implies that the pic height needs to be halved, and this field needs to be passed to the VP engine.
  Cc: "9.2" [email protected]
  Signed-off-by: Ilia Mirkin <[email protected]>
* llvmpipe: don't interpolate front face or prim id | Zack Rusin | 2013-08-02 | 1 | -15/+13
  The loop was iterating over all the fs inputs and setting them to perspective interpolation, then after the loop we were creating extra output slots with the correct interpolation. Instead of injecting bogus extra outputs, just set the interpolation on front face and prim id correctly when doing the initial scan of fs inputs.
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* draw: inject frontface info into wireframe outputs | Zack Rusin | 2013-08-02 | 10 | -4/+45
  Draw module can decompose primitives into wireframe models, which is a fancy word for 'lines'. Unfortunately, that decomposition means that we weren't able to preserve the original front-face info which could be derived from the original primitives (lines don't have a 'face'). To fix it, allow the draw module to inject a fake face semantic into outputs, from which the backends can figure out the original frontfacing info of the primitives.
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: make the front-face behavior match the gallium spec | Zack Rusin | 2013-08-02 | 1 | -1/+4
  The spec says that front-face is true if the value is >0 and false if it's <0. To make sure that we follow the spec, let's just subtract 0.5 from our value (llvmpipe did 1 for frontface and 0 otherwise), which will get us a positive num for frontface and negative for backface.
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
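The arithmetic in this message can be sketched in a few lines of scalar C. This is only an illustration with hypothetical function names; llvmpipe does the equivalent on LLVM IR vectors rather than on plain floats:

```c
#include <stdbool.h>

/* Old llvmpipe encoding per the commit message: 1.0 for front, 0.0 otherwise.
 * Subtracting 0.5 maps that to +0.5 (front) and -0.5 (back). */
static float
facing_to_spec_value(float old_facing)
{
   return old_facing - 0.5f;
}

/* Consumer-side test as the message describes the spec: > 0 means front. */
static bool
is_front_face(float face)
{
   return face > 0.0f;
}
```

The point of the shift is that the old back-face value of 0 is neither greater than nor less than zero, so it did not satisfy either side of the spec's definition.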
* r600g: honour semantic index in fragment color exports | Christoph Bumiller | 2013-08-02 | 1 | -5/+5
  Signed-off-by: Marek Olšák <[email protected]>
* nvc0: properly align NVE4_COMPUTE_MP_TEMP_SIZE | Samuel Pitoiset | 2013-07-31 | 2 | -2/+3
  MP_TEMP_SIZE must be aligned to 0x8000, while TEMP_SIZE on NVE4_3D must be aligned to 0x20000, so perform both alignments to be sure we allocate enough space (actually the bo will most likely use 128 KiB pages and not aligning to that would be a waste anyway).
  Cc: "9.2" [email protected]
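The two alignments named above can be sketched with a generic round-up helper. `align_pot` and `align_temp_size` are hypothetical names for illustration and are not the macros or functions the nvc0 driver uses:

```c
#include <stdint.h>

/* Round value up to a power-of-two alignment (generic helper). */
static uint64_t
align_pot(uint64_t value, uint64_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Apply both alignments mentioned in the commit message, in order. */
static uint64_t
align_temp_size(uint64_t size)
{
   size = align_pot(size, 0x8000);    /* MP_TEMP_SIZE requirement */
   return align_pot(size, 0x20000);   /* NVE4_3D TEMP_SIZE requirement (128 KiB) */
}
```

Since 0x20000 is a multiple of 0x8000, a size aligned to 0x20000 also satisfies the 0x8000 requirement; the sketch keeps both steps only to mirror the wording of the message.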
* softpipe: use new softpipe_resource_data() accessor | Brian Paul | 2013-07-31 | 3 | -4/+20
  We should probably be using map()/unmap() when accessing resource data, but this is a little better.
  v2: assert that the resource is not a display target, per Jose.
  Reviewed-by: José Fonseca <[email protected]>
* softpipe: don't ignore pipe_constant_buffer::buffer_offset | Brian Paul | 2013-07-31 | 1 | -3/+5
  This was never a problem since the Mesa state tracker always gives us a user-space constant buffer with buffer_offset=0. But if another state tracker ever gave us a "HW" constant buffer with non-zero buffer_offset we'd mis-render.
  Also, use the correct buffer size. And move an assertion to the top of the function.
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
* Revert "r300g: Give CLIP_DISABLE another try" | Marek Olšák | 2013-07-30 | 2 | -3/+2
  This reverts commit e866bd1adea2c3b4971ad68e69c644752f2ab7b6.
  https://bugs.freedesktop.org/show_bug.cgi?id=57875
  Cc: [email protected]
* r600g/compute: Added missing address space checking of kernel parameters | Jonathan Charest | 2013-07-30 | 1 | -3/+2
  To have non-static buffers in local memory, it is necessary to pass them as arguments to the kernel.
  For r600, the correct lds size must be set to the SQ_LDS_ALLOC register. The correct size is the clover size plus the size reported by the compiler.
  Reviewed-by: Tom Stellard <[email protected]>
* nvc0: force use of correct firmware file | Maarten Lankhorst | 2013-07-28 | 1 | -1/+1
  Signed-off-by: Maarten Lankhorst <[email protected]>
* nv50,nvc0: s/uint16/uint32 for constant buffer offset | Christoph Bumiller | 2013-07-24 | 2 | -2/+2
  Looks like a thinko, "Hey, constant buffers can be at most 64 KiB in size, offset can't be larger." But it can, of course.
  I think piglit lacks a test for UBO and BindBufferRange that tests if it actually works.
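The failure mode is plain integer truncation: the bound range may be at most 64 KiB, but its starting offset inside a larger backing buffer can exceed 0xFFFF, and a 16-bit field silently drops the high bits. A small sketch with hypothetical helper names (not the driver's code):

```c
#include <stdint.h>

/* Models the old 16-bit field: offsets of 64 KiB and above lose their high bits. */
static uint16_t
store_offset_u16(uint32_t offset)
{
   return (uint16_t)offset;
}

/* Models the fixed 32-bit field: the offset is preserved. */
static uint32_t
store_offset_u32(uint32_t offset)
{
   return offset;
}
```

With the 16-bit field, an offset of exactly 64 KiB would be stored as 0, so the range would be read from the start of the buffer instead of from where it was bound.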
* gallium: Add PIPE_CAP_ENDIANNESS | Tom Stellard | 2013-07-22 | 12 | -0/+25
  Cc: [email protected]
  [ Francisco Jerez: Fix "PIPE_ENDIAN_SMALL" in the documentation, define PIPE_ENDIAN_NATIVE. ]
* llvmpipe: Ensure FTZ/DAZ flags are set on deferred draw flushes. | Zack Rusin | 2013-07-22 | 1 | -0/+8
  Tested-by: José Fonseca <[email protected]>
* llvmpipe: Remove lp_rast_get_num_threads(). | José Fonseca | 2013-07-22 | 2 | -11/+0
  Never called.
  Trivial.
* llvmpipe/tests: update arith test to check for edge cases | Zack Rusin | 2013-07-19 | 1 | -9/+19
  Test infs, zeros and nans with our arith functions to assure correct/defined behavior with those values.
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: clamp inputs for srgb render buffers | Roland Scheidegger | 2013-07-18 | 1 | -0/+35
  Usually with fixed point renderbuffers clamping is done as part of conversion. However, since we blend in float format, we essentially skip all conversion steps pre-blend but since this is still a fixed point renderbuffer we must still clamp the inputs in this case. Makes no difference for piglit though.
  Obviously we could skip this if fragment color clamping is enabled, but a) this is deprecated in OpenGL (d3d never had it) and b) we don't support it natively so it gets baked into the shader.
  Also add some comment about logic ops being broken for srgb, luckily no test tries to do that as there's no easy fix...
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alpha | Roland Scheidegger | 2013-07-18 | 2 | -8/+26
  We were fixing up the blend factor to ZERO, however this only works correctly with fixed point render buffers where the input values are clamped to 0/1 (because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped inputs). Haven't seen any failure anywhere due to that with fixed point SNORM buffers (which clamp inputs to -1/1) but it should apply there as well (snorm blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all, d3d10 requires them but they are not blendable). Doesn't look like piglit hits this though (some internal testing hits the float case at least).
  (With legacy OpenGL we could theoretically still use the fixup to zero if the fragment color clamp is enabled, but we can't detect that easily since we don't support native clamping hence it gets baked into the shader.)
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Zack Rusin <[email protected]>
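The reasoning about min(As, 1-Ad) can be checked with a scalar sketch. It assumes that a render target without an alpha channel reads back a destination alpha of 1, which is what makes the factor collapse to min(As, 0); the function name is hypothetical and llvmpipe's actual blend code is generated LLVM IR:

```c
/* SRC_ALPHA_SATURATE factor as given in the commit message: min(As, 1 - Ad). */
static float
src_alpha_saturate_factor(float src_alpha, float dst_alpha)
{
   float a = src_alpha;
   float b = 1.0f - dst_alpha;
   return a < b ? a : b;
}
```

With Ad = 1 the factor is 0 for any source alpha clamped to [0, 1], which is why replacing it with ZERO was valid for unorm targets. With an unclamped negative source alpha, as a float target allows, the factor equals that negative value and the ZERO fixup gives a different result.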
* r600g: use WAIT_3D_IDLE before using CP DMA | Marek Olšák | 2013-07-18 | 2 | -0/+2
  I broke this with 7948ed1250cae78ae1b22dbce4ab23aceacc6159 for r700 at least.
* r300g: make use of gallium's os_get_process_name() | Jonathan Gray | 2013-07-18 | 1 | -1/+6
  Lets the code compile on non Linux systems.
  Signed-off-by: Jonathan Gray <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* nv50: H.264/MPEG2 decoding support via VP2, available on NV84-NV96, NVA0 | Ilia Mirkin | 2013-07-18 | 11 | -3/+1815
  Adds H.264 and MPEG2 codec support via VP2, using firmware from the blob. Acceleration is supported at the bitstream level for H.264 and IDCT level for MPEG2.
  Known issues:
  - H.264 interlaced doesn't render properly
  - H.264 shows very occasional artifacts on a small fraction of videos
  - MPEG2 + VDPAU shows frequent but small artifacts, which aren't there when using XvMC on the same videos
  Signed-off-by: Ilia Mirkin <[email protected]>
* r600g/sb: improve alu packing on cayman | Vadim Girlin | 2013-07-17 | 2 | -15/+89
  Scheduler/register allocator in r600-sb was developed and optimized on evergreen (VLIW-5) hardware, so currently it's not optimal for VLIW-4 chips. This patch should improve performance on cayman gpus due to better alu packing, but also it tends to increase register usage, so overall positive effect on performance has to be proven by real benchmarks yet.
  Some results with bfgminer kernel on cayman:
    source bytecode: 60 gprs, 3905 alu groups
    sbcl before the patch: 45 gprs, 4088 alu groups
    sbcl with this patch: 55 gprs, 3474 alu groups
  Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix handling of new multislot instructions on cayman | Vadim Girlin | 2013-07-17 | 3 | -5/+6
  Ex-scalar instructions that became multislot on cayman do replicate result to all channels - handle them similar to DOT4.
  Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix debug dump code in scheduler | Vadim Girlin | 2013-07-17 | 1 | -4/+5
  Update the stale debug code for other changes related to debug output.
  Signed-off-by: Vadim Girlin <[email protected]>