aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* gallivm: revert accidentally commited hunkRoland Scheidegger2013-08-151-12/+1
| | | | That magic wasn't meant to be commited, need to work on some proper fix.
* gallivm: do per-sample depth comparison instead of doing it post-filterRoland Scheidegger2013-08-152-106/+195
| | | | | | | | | | | | | | | | | | | | | | | | | Doing the comparisons pre-filter is highly recommended by OpenGL (and d3d9) and definitely required by d3d10. This actually doesn't do it pre-filter but more "in-filter" as otherwise need to push the comparisons even further down into fetch code and this also trivially allows using a somewhat cheaper lerp. Doing it pre-filter would actually have some performance advantage for UNORM formats (because the comparisons should be done in texture format, we'd only need to convert the shadow ref coord to texture format once, but in turn would save converting the per-sample texture values to floats) but this gets a bit messy as this has implications for border color handling as well (which needs to be done prior to depth comparisons, hence would also need to convert border color to texture format too or use some other tricks like doing separate border color / shadow ref comparison and simply using that result directly when doing border replacement). Should make no difference for nearest filtering, and performance for linear filtering should be mostly the same too (essentially have one more comparison instruction per sample, and replace the sub/mul/add lerp with a sub/and/and/add special "lerp" which all in all shouldn't be much of a difference). v2: get rid of old code completely Reviewed-by: Zack Rusin <[email protected]>
* radeonsi: Pixel shaders pre-load one more SGPRMichel Dänzer2013-08-151-2/+3
| | | | Acked-by: Marek Olšák <[email protected]>
* radeonsi: TGSI_SEMANTIC_CLIPVERTEX doesn't use any parametersMichel Dänzer2013-08-151-0/+1
|
* radeonsi: Don't export unused clip distance vectors from vertex shaderMichel Dänzer2013-08-153-1/+14
| | | | | | | | E.g. the Source engine seems to always write to gl_ClipVertex, but normally doesn't enable any GL_CLIP_DISTANCEn states. This change removes some irrelevant parts from the generated vertex shader code in such cases. Reviewed-by: Tom Stellard <[email protected]>
* radeonsi: Don't leave gaps between position exports from vertex shaderMichel Dänzer2013-08-153-59/+83
| | | | | | | | | | | If the vertex shader exports clip distances but not point size, use position exports 1/2 instead of 2/3 for the clip distances. Fixes geometry corruption in that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974 Cc: [email protected] Reviewed-by: Tom Stellard <[email protected]>
* llvmpipe: fix stencil bug if we have both stencil and depth testsRoland Scheidegger2013-08-151-14/+13
| | | | | | | | | | | | | This is a very well hidden bug found by accident (only the fixed glean tstencil2 test so far seems to hit it). We must use new mask with combined s_pass values and orig_mask values for zpass/zfail stencil ops, otherwise both the sfail op and one of zpass/zfail op are applied (probably not hit in most tests because some of the ops tend to be KEEP usually). Note: this is a candidate for the 9.2 branch. Reviewed-by: Zack Rusin <[email protected]>
* nvc0: move video param and format support functions to nouveauIlia Mirkin2013-08-155-70/+76
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: move firmware loading functions to nouveauIlia Mirkin2013-08-153-90/+108
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: move some of the simpler decoder functions into nouveauIlia Mirkin2013-08-153-62/+69
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: move vp param filling logic into nouveauIlia Mirkin2013-08-156-476/+499
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: move bsp param-filling logic into nouveauIlia Mirkin2013-08-154-276/+324
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: move nvc0_decoder into nouveau, rename to nouveau_vp3_decoderIlia Mirkin2013-08-156-224/+227
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: standardize on using #if for NVC0_DEBUG_FENCEIlia Mirkin2013-08-155-8/+8
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: refactor video buffer management logic into nouveau_vp3Ilia Mirkin2013-08-158-175/+243
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv50: allow forcing PMPEG use, for ease of testingIlia Mirkin2013-08-152-2/+4
| | | | | | | This also allows people who don't want to install the binary blobs required for VP2 to still get MPEG decoding. Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: hook up PMPEG support via nouveau_video, enables XvMC to workIlia Mirkin2013-08-153-15/+15
| | | | | | | Force the format to be the reasonable format that doesn't require an inverse z-scan. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: set buffer format of video bufferIlia Mirkin2013-08-151-0/+1
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: fix number of surfaces in video buffer, use definesIlia Mirkin2013-08-151-4/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: U8_USCALED only works for size 4Ilia Mirkin2013-08-151-3/+0
| | | | | | | | | See https://bugs.freedesktop.org/show_bug.cgi?id=61635 for a sample program. Changing it to use a vec4 makes it work. Remove the unsupported formats. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "9.2 and 9.1" <[email protected]>
* ilo: fix fragment shaders that use PCB on GEN7+Chia-I Wu2013-08-152-3/+7
| | | | Missed this commit when preparing PCB changes for upstreaming.
* nouveau: Fix variable name.Vinson Lee2013-08-141-1/+1
| | | | | | | | | | | | Fixes build error introduced with commit d1ba1055d98c246d1ee9d9c14706bb9fba6a98c7. CC nouveau_video.lo nouveau_video.c: In function 'nouveau_screen_get_video_param': nouveau_video.c:866:33: error: 'screen' undeclared (first use in this function) nouveau_video.c:866:33: note: each undeclared identifier is reported only once for each function it appear Signed-off-by: Vinson Lee <[email protected]>
* radeonsi: unduplicate code in create_contextMarek Olšák2013-08-151-6/+0
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: initialize the radeon_surface structureMarek Olšák2013-08-151-1/+1
| | | | | | this fixes valgrind warnings Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: correct sampler function namesMarek Olšák2013-08-151-23/+23
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: rename r600_texture::dirty_db_mask to dirty_level_maskMarek Olšák2013-08-154-8/+8
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: rename r600_resource_texture to r600_textureMarek Olšák2013-08-157-48/+48
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* tgsi: add info about MSAA samplers to tgsi_shader_infoMarek Olšák2013-08-152-0/+14
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* tgsi: fix the location of sample indexMarek Olšák2013-08-151-1/+3
| | | | | | The sample index is always in W. Reviewed-by: Michel Dänzer <[email protected]>
* r600/radeonsi: implement new float comparison instructionsRoland Scheidegger2013-08-152-19/+48
| | | | | | | Also use ordered comparisons for old cmp instructions. Tested-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* nv50: implement new float comparison instructionsRoland Scheidegger2013-08-151-0/+17
| | | | | | untested. Reviewed-by: Christoph Bumiller <[email protected]>
* ilo: implement new float comparison instructionsRoland Scheidegger2013-08-151-8/+12
| | | | | | untested. Reviewed-by: Chia-I Wu <[email protected]>
* gallivm: already pass coords in the right place in the sampler interfaceRoland Scheidegger2013-08-153-99/+90
| | | | | | | | | | | | | | | | | This makes things a bit nicer, and more importantly it fixes an issue where a "downgraded" array texture (due to view reduced to 1 layer and addressed with (non-array) samplec instruction) would use the wrong coord as shadow reference value. (This could also be fixed by passing target through the sampler interface much the same way as is done for size queries, might do this eventually anyway.) And if we'd ever want to support (shadow) cube map arrays, we'd need 5 coords in any case. v2: fix bugs (texel fetch using wrong layer coord for 1d, shadow tex using wrong shadow coord for 2d...). Plus need to project the shadow coord, and just for fun keep projecting the layer coord too. Reviewed-by: Zack Rusin <[email protected]>
* gallivm: change coordinate handling throughout functionsRoland Scheidegger2013-08-153-133/+133
| | | | | | | | | | | | | | | | Instead of passing s,t,r coordinates pass a coord array - the reason is that I need to pass more coords (in particular for shadow "coord", future will also need another one for cube map arrays) so just pass them as an array. Also, to simplify things, use fixed location for the shadow reference value I want to get rid of the silly "where is the right coord value" game. Keep old-style however for aos sampling (which is not going to need shadow coord, though for cube map arrays it still would need fixing). (Next patch will pass those through using the new arrangement directly from sampler interface.) v2: fix up soa split path (unreachable currently but still...) Reviewed-by: Zack Rusin <[email protected]>
* gallivm: fix border color with normalized texture formatsRoland Scheidegger2013-08-151-13/+53
| | | | | | | | | | | | | We need to put border color into texture format color space which essentially means clamping for non-float, normalized formats (not entirely sure if we're also meant to quantize the float but it's probably ok not to do it thankfully). For OpenGL we could do this easily outside generated code due to the 1:1 sampler/texture correspondence but not for d3d10 which is terrible (as we recalculate a constant over and over again per shader invocation). Fortunately border color should be rare enough that we don't care THAT much. Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: fix pipeline statistics with a null psZack Rusin2013-08-148-8/+43
| | | | | | | | | | If the fragment shader is null then pixel shader invocations have to be equal to zero. And if we're running a null ps then clipper invocations and primitives should be equal to zero but only if both stancil and depth testing are disabled. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: make sure that the stages setup outputsZack Rusin2013-08-145-30/+62
| | | | | | | | | | | | Calling the prepare outputs cleans up the slot assignments for outputs, unfortunately aapoint and aaline didn't have code to reset their slots after the initial setup, this was messing up our slot assignments. The unfilled stage was just missing the initial assignment of the face slot. This fixes all of the reported piglit failures. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* vl: Add support for max level query v2Rico Schüller2013-08-1412-4/+99
| | | | | | | | | This patch adds the level query support to the video decoders and uses some more reasonable defaults. v2: (ck) add commit message Reviewed-by: Christian König <[email protected]>
* radeon/llvm: Add missing "%s" format string to fprintf.Jon Severinsson2013-08-131-1/+1
| | | | | | | | This fixes a compilation warning with -Wformat-security. CC: "9.2" <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* r600g/sb: use MULADD workaround on R7xx for MULADD_IEEEVadim Girlin2013-08-141-1/+2
| | | | | | | | | | | | Looks like the same issue that was seen with MULADD in trans slot on R7xx also affects MULADD_IEEE (maybe all OP3 instructions and MULADD is just a most frequently used?). So the workaround is to not allow affected instructions to be placed into the trans slot. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=67927 Signed-off-by: Vadim Girlin <[email protected]> Cc: "9.2" <[email protected]>
* gallivm: implement new float comparison instructions returning integer masksRoland Scheidegger2013-08-131-2/+79
| | | | | | | | | FSEQ/FSGE/FSLT/FSNE work just the same as SEQ/SGE/SLT/SNE except skip the select. And just for consistency use the same appropriate ordered/unordered comparisons for the old opcodes as well. Reviewed-by: Zack Rusin <[email protected]>
* tgsi: implement new float comparison instructions returning integer masksRoland Scheidegger2013-08-134-4/+102
| | | | | | | | Also while here add a bunch of other forgotten (integer) instructions to tgsi_util_get_inst_usage_mask() (which isn't used for much except optimizing away unused input components), though it may still be incomplete. Reviewed-by: Zack Rusin <[email protected]>
* gallium: add new float comparison instructions returning integer masksRoland Scheidegger2013-08-132-17/+82
| | | | | | | | | | | | Newer graphic languages don't want messy float mask results but instead true "boolean" mask results for float comparisons. Otherwise just need to convert the floats back to integers. Need to keep the old opcodes however due to both legacy (gl and d3d9) needing them and because older hw can't really deal with integers. These new FSEQ/FSGE/FSLT/FSNE opcodes are part of integer API and hence must be supported if a driver claims to support glsl 1.30 (or PIPE_SHADER_CAP_INTEGERS). Reviewed-by: Zack Rusin <[email protected]>
* ilo: enable dumping of WM PCBChia-I Wu2013-08-131-1/+5
| | | | It was disabled because it wasn't supported.
* ilo: no binding table change when constants are pushedChia-I Wu2013-08-131-21/+17
| | | | | When constants can be pushed, and nothing else requires new SURFACE_STATEs, there is no need to emit BINDING_TABLE_STATE.
* ilo: support push constant model in shadersChia-I Wu2013-08-136-12/+143
| | | | | Source constants from URB constant data when the constant data can fit in the PCB.
* ilo: support copying constant buffer 0 to PCBChia-I Wu2013-08-135-19/+105
| | | | | Add ILO_KERNEL_PCB_CBUF0_SIZE so that a kernel can specify how many bytes of constant buffer 0 need to be copied to PCB.
* ilo: make constant buffer 0 upload optionalChia-I Wu2013-08-133-29/+35
| | | | | Add ILO_KERNEL_SKIP_CBUF0_UPLOAD so that we can skip constant buffer 0 upload when the kernel does not need it.
* Revert "ilo: initialize constant buffer SURFACE_STATE early"Chia-I Wu2013-08-134-32/+27
| | | | | | This reverts commit a9b800aa81cffdcaef2490ff49986099feae2663. With push constant support, the constructed SURFACE_STATE is unused and wasted. The change only slows things down.
* gallivm: fix exec_mask interaction with geometry shader after end of mainRoland Scheidegger2013-08-122-16/+14
| | | | | | | | | | | | | | | | Because we must maintain an exec_mask even if there's currently nothing on the mask stack, we can still have an exec_mask at the end of the program. Effectively, this mask should be set back to default when returning from main. Without relying on END/RET opcode (I think it's valid to have neither) it is actually difficult to do this, as there doesn't seem any reasonable place to do it, so instead let's just say the exec_mask is invalid outside main (which it really is effectively). The problem is that geometry shader called end_primitive outside the shader (in the epilogue), and as a result used a bogus mask, leading to bugs if we had to set the (somewhat misnamed) ret_in_main bit anywhere. So just avoid the mask combining function when called from outside the shader. Reviewed-by: Zack Rusin <[email protected]>