path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* llvmpipe: fix bogus layer clamping in setup | Roland Scheidegger | 2013-10-29 | 2 | -8/+25
| The layer coming from GS needs to be clamped (not sure if that's actually the correct error behavior but we need something) as the number can be higher than the number of layers in the fb. However, this code was using the layer calculation from the scene, and this was actually calculated in lp_scene_begin_rasterization(), hence too late (so setup was using the value from the _previous_ scene or just zero if it was the first scene).
|
| Since the value is used in both rasterization and setup, move the calculation up to lp_scene_begin_binning(), though it's a bit more inconvenient to calculate there. (Theoretically we could move _all_ code which was in lp_scene_begin_rasterization() to there, because ever since we got rid of swizzled render/depth buffers our "map" functions preparing the fb data for render don't actually change the data in there at all, but it feels like it would be a hack.)
|
| v2: improve comments
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
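The ordering problem described in this message can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct scene`, `scene_begin_binning()` and `setup_clamp_layer()` are hypothetical stand-ins, not the llvmpipe code. The point is that the maximum layer is computed when binning begins, so it is already valid by the time setup clamps the layer coming from the geometry shader.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-scene state; not the llvmpipe struct. */
struct scene {
    unsigned fb_max_layer; /* highest valid layer index of the framebuffer */
};

/* Compute the clamp value at the start of binning, i.e. before setup runs.
 * Computing it only when rasterization begins would be too late for setup. */
static void scene_begin_binning(struct scene *s, unsigned num_fb_layers)
{
    assert(num_fb_layers > 0);
    s->fb_max_layer = num_fb_layers - 1;
}

/* Clamp a layer index produced by the geometry shader to the valid range. */
static unsigned setup_clamp_layer(const struct scene *s, unsigned gs_layer)
{
    return gs_layer < s->fb_max_layer ? gs_layer : s->fb_max_layer;
}
```

With a four-layer framebuffer, a requested layer of 9 would be clamped to 3, while a requested layer of 2 passes through unchanged.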
* util,llvmpipe: correctly set the minimum representable depth value | Matthew McClure | 2013-10-29 | 1 | -19/+12
| Reviewed-by: Roland Scheidegger <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* svga: reindent drawing code | Brian Paul | 2013-10-29 | 3 | -266/+199
|
* r600g/sb: fix value::is_fixed() | Vadim Girlin | 2013-10-29 | 1 | -2/+2
| This prevents unnecessary (and wrong) register allocation in the scheduler for preloaded values in fixed registers.
|
| Fixes interpolation-mixed.shader_test on rv770 (and probably on all other pre-evergreen chips).
|
| Signed-off-by: Vadim Girlin <[email protected]>
| Tested-by: Andreas Boll <[email protected]>
* vl/h264: split fields into SPS/PPS | Christian König | 2013-10-28 | 5 | -80/+79
| Add a lot of missing fields as well.
|
| Signed-off-by: Christian König <[email protected]>
* radeon/uvd: fix H264 chroma format handling | Christian König | 2013-10-28 | 1 | -1/+15
| Signed-off-by: Christian König <[email protected]>
* ilo: minor cleanups for recent interface changes | Chia-I Wu | 2013-10-28 | 3 | -156/+9
| Kill ilo_bind_sampler_states2 and ilo_set_sampler_views2.
|
| Map PIPE_FORMAT_R10G10B10A2_UINT to BRW_SURFACEFORMAT_R10G10B10A2_UINT.
* gallium: add PIPE_CAP_MIXED_FRAMEBUFFER_SIZES | Ilia Mirkin | 2013-10-26 | 12 | -0/+12
| This CAP will determine whether ARB_framebuffer_object can be enabled. The nv30 driver does not allow mixing swizzled and linear zsbuf/cbuf textures.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* r600g,radeonsi: use fences provided by the winsys | Marek Olšák | 2013-10-25 | 8 | -462/+37
|
* winsys/radeon: add the implementation of fences from r300g | Marek Olšák | 2013-10-25 | 2 | -33/+8
|
* radeonsi: add the vertex shader position output if it's missing | Marek Olšák | 2013-10-25 | 1 | -0/+13
| This fixes a lockup in piglit/spec/glsl-1.40/execution/tf-no-position.
|
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: respect semantic indices for COLOR[i] fragment shader outputs | Marek Olšák | 2013-10-25 | 1 | -5/+2
| Reviewed-by: Michel Dänzer <[email protected]>
* freedreno/a3xx/compiler: relative addressing | Rob Clark | 2013-10-24 | 1 | -1/+123
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix const/rel/const-rel encoding | Rob Clark | 2013-10-24 | 4 | -88/+300
| The encoding of constant, relative, and relative-const src registers is a bit more complex than originally thought, which gives an extra bit to encode const reg # at expense of taking a bit from relative offset.
|
| In most cases a3xx seems to actually use a scheme whereby it can encode an extra bit for const register. You have three possible encodings in thirteen bits:
|
|   register: (11 bits for N.c)
|     00........... rN.c
|   relative: (10 bits for N)
|     010.......... r<a0.x + N>
|     011.......... c<a0.x + N>
|   const: (12 bits for N.c)
|     1............ cN.c
|
| Which means we can deal w/ more consts than previously thought.
|
| Signed-off-by: Rob Clark <[email protected]>
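The three encodings listed in this message can be read as a prefix code over a 13-bit field. The following is a minimal decoding sketch under stated assumptions: the type and function names are hypothetical and do not come from the freedreno sources, the payload is returned as a raw unsigned value, and no claim is made about how N and the component c are packed inside it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of a 13-bit source operand field. */
enum src_kind {
    SRC_REG,       /* 00...........  rN.c         (11-bit payload) */
    SRC_REL_REG,   /* 010..........  r<a0.x + N>  (10-bit payload) */
    SRC_REL_CONST, /* 011..........  c<a0.x + N>  (10-bit payload) */
    SRC_CONST      /* 1............  cN.c         (12-bit payload) */
};

struct src_decoded {
    enum src_kind kind;
    uint16_t payload; /* raw bits following the prefix */
};

/* Classify a 13-bit field by its leading bits and extract the payload. */
static struct src_decoded decode_src13(uint16_t bits)
{
    struct src_decoded d;

    assert(bits < (1u << 13));
    if (bits & (1u << 12)) {            /* prefix 1 */
        d.kind = SRC_CONST;
        d.payload = bits & 0xfff;
    } else if (bits & (1u << 11)) {     /* prefix 01x */
        d.kind = (bits & (1u << 10)) ? SRC_REL_CONST : SRC_REL_REG;
        d.payload = bits & 0x3ff;
    } else {                            /* prefix 00 */
        d.kind = SRC_REG;
        d.payload = bits & 0x7ff;
    }
    return d;
}
```

The sketch makes the trade-off in the message visible: the const form gets 12 payload bits, one more than a plain register, while the relative forms are left with 10.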
* freedreno/a3xx: add blend state | Rob Clark | 2013-10-24 | 2 | -5/+23
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/resource: fail more gracefully | Rob Clark | 2013-10-24 | 1 | -1/+13
| Fail more gracefully when buffer allocation/import fails.
|
| Signed-off-by: Rob Clark <[email protected]>
* svga: remove user-space vertex/index buffer code | Brian Paul | 2013-10-24 | 6 | -259/+13
| The gallium vbuf module, which we've been using for some time now, takes care of uploading user-space vertex/index data into real buffers. The upload code in the svga driver was unused.
|
| Reviewed-by: José Fonseca <[email protected]>
* freedreno: fix compile error | Rob Clark | 2013-10-23 | 1 | -1/+1
| Small typo introduced in a3ed98f.
|
| Signed-off-by: Rob Clark <[email protected]>
* nv50: clamp PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS to PIPE_MAX_SAMPLERS | Brian Paul | 2013-10-23 | 1 | -1/+1
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70212
| Tested-by: Aaron Watry <[email protected]>
* radeonsi: remove unused si_set_cs_sampler_view() | Brian Paul | 2013-10-23 | 1 | -4/+0
| Fixes build breakage.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70804
| Tested-by: Vinson Lee <[email protected]>
* gallium: new, unified pipe_context::set_sampler_views() function | Brian Paul | 2013-10-23 | 21 | -368/+188
| The new function replaces four old functions: set_fragment/vertex/geometry/compute_sampler_views().
|
| Note: at this time, it's expected that the 'start' parameter will always be zero.
|
| Reviewed-by: Roland Scheidegger <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Tested-by: Emil Velikov <[email protected]>
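The idea of folding four per-stage entry points into one function that takes the shader stage as a parameter can be sketched as follows. This is an illustration only: `struct context`, `struct sampler_view`, `enum shader_stage`, `MAX_VIEWS` and the parameter list are assumptions made for this example and are not the gallium interface; the only details taken from the message are the four stages and the expectation that 'start' is zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the gallium definitions. */
enum shader_stage {
    STAGE_VERTEX,
    STAGE_FRAGMENT,
    STAGE_GEOMETRY,
    STAGE_COMPUTE,
    STAGE_COUNT
};

#define MAX_VIEWS 16

struct sampler_view { int id; };

struct context {
    struct sampler_view *views[STAGE_COUNT][MAX_VIEWS];
    unsigned num_views[STAGE_COUNT];
};

/* One entry point for all stages instead of one function per stage. */
static void set_sampler_views(struct context *ctx, enum shader_stage stage,
                              unsigned start, unsigned count,
                              struct sampler_view **views)
{
    unsigned i;

    assert(stage < STAGE_COUNT);
    assert(start == 0); /* the message expects 'start' to be zero for now */
    assert(start + count <= MAX_VIEWS);

    for (i = 0; i < count; i++)
        ctx->views[stage][start + i] = views[i];
    ctx->num_views[stage] = start + count;
}
```

A caller that previously picked one of four functions would instead pass the stage, for example `set_sampler_views(&ctx, STAGE_FRAGMENT, 0, 2, views)`.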
* svga: remove unneeded include of u_double_list.h | Brian Paul | 2013-10-23 | 1 | -2/+0
|
* llvmpipe: enable seamless cube filtering | Roland Scheidegger | 2013-10-21 | 1 | -1/+1
| Reviewed-by: Jose Fonseca <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* r300g/compiler: Fix unsigned comparison with less than zero | David Heidelberger | 2013-10-21 | 1 | -1/+1
| rc_find_free_temporary_list() returns a signed integer (-1 when no free temporary register is available), so new_index in radeon_rename_regs() should be signed.
|
| https://bugs.freedesktop.org/show_bug.cgi?id=54867
|
| Signed-off-by: Marek Olšák <[email protected]>
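Why the index has to be signed can be shown with a small sketch. The helpers below are hypothetical and only mimic the "index or -1" convention described in the message; they are not the r300 compiler functions. Storing the -1 sentinel in an unsigned variable turns it into UINT_MAX, so a later "less than zero" check can never fire.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allocator: index of the first free slot, or -1 if none. */
static int find_free_slot(const bool *used, unsigned count)
{
    unsigned i;

    assert(used != NULL);
    for (i = 0; i < count; i++)
        if (!used[i])
            return (int)i;
    return -1;
}

/* Correct: keep the result signed so the -1 sentinel stays detectable. */
static bool alloc_failed(const bool *used, unsigned count)
{
    int idx = find_free_slot(used, count);
    return idx < 0;
}

/* What the sentinel becomes when stored in an unsigned variable:
 * -1 converts to UINT_MAX, and an unsigned value is never below zero. */
static unsigned find_free_slot_as_unsigned(const bool *used, unsigned count)
{
    return (unsigned)find_free_slot(used, count);
}
```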
* r600g/sb: Initialize shader::dce_flags. | Vinson Lee | 2013-10-20 | 1 | -1/+2
| Fixes "Uninitialized scalar field" defect reported by Coverity.
|
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Vadim Girlin <[email protected]>
* r600g/sb: fix issue with DCE between GVN and GCM (v2) | Vadim Girlin | 2013-10-17 | 4 | -12/+39
| We can't perform DCE using the liveness pass between GVN and GCM because it relies on the correct schedule, but GVN doesn't care about preserving correctness - it's rescheduled later by GCM.
|
| This patch makes dce_cleanup pass perform simple DCE between GVN and GCM instead of relying on liveness pass.
|
| Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70088
|
| Signed-off-by: Vadim Girlin <[email protected]>
* Revert "scons: Fix build when rtti is disabled" | José Fonseca | 2013-10-16 | 1 | -5/+4
| This reverts commit 94d05bf87a21bd364e84f699a0064e5fba58a6f9 as it has a few problems:
| - it breaks windows builds because env[LLVM_CXXFLAGS] is never set there
| - it is merging not only rtti, but the whole cxxflags (defines etc) which has proven to be a source of troubles (breaks debugging etc.)
* radeonsi: Use 'SI' as the LLVM processor for CIK on LLVM <= 3.3 | Tom Stellard | 2013-10-16 | 1 | -0/+4
| LLVM 3.3 does not know about CIK processors, and the code paths for SI and CIK are the same.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Cc: "9.2" <[email protected]>
* r600g/compute Improve debugging output | Tom Stellard | 2013-10-16 | 2 | -5/+7
|
* svga: minor fix-ups in svga_get_shader_param() | Brian Paul | 2013-10-16 | 1 | -2/+3
| Fix debug error message. Add switch case for PIPE_SHADER_COMPUTE.
|
| Trivial.
* scons: Fix build when rtti is disabled | Alexander von Gluck IV | 2013-10-15 | 1 | -4/+5
| * The rtti fix actually dug up a bug in the scons build scripts.
| * Autotools took the LLVM cpp and cxx flags, while scons only took the cpp flags.
| * This grabs the cxx flags and applies them where needed. We may want to make the same change for the llvm cpp flags in scons.
| * The only linux platform I can find with LLVM no-rtti is Ubuntu.
| * Fixes bug #70471
|
| Tested-by: Vinson Lee <[email protected]>
* llvmpipe: Advertise PIPE_CAP_DEPTH_CLIP_DISABLE. | José Fonseca | 2013-10-15 | 1 | -1/+1
| Actually implemented by draw module.
|
| Tested piglit ARB_depth_clamp tests, which pass 100%.
|
| Trivial.
* radeon: use staging for mapping linear textures | Grigori Goronzy | 2013-10-13 | 1 | -0/+6
| Textures that likely reside in VRAM, are mapped for reading, and don't require direct mapping should be staged into GTT to avoid bad performance. This fixes readback performance of VDPAU surfaces.
|
| Reviewed-by: Marek Olšák <[email protected]>
* radeon/uvd: use PIPE_BIND_LINEAR for video surfaces | Grigori Goronzy | 2013-10-13 | 2 | -7/+7
| This new bind flag forces linear storage, but does not have other side effects like R600_RESOURCE_FLAG_TRANSFER.
|
| Reviewed-by: Christian König <[email protected]>
* radeonsi: Allow Sinking pass to move preloaded const/res/sampl | Vincent Lejeune | 2013-10-13 | 2 | -5/+28
| This fixes a crash in Unigine Heaven 3.0, and probably in some other apps.
* radeonsi: pass alpha_ref value to PS in the user sgpr | Vadim Girlin | 2013-10-13 | 3 | -25/+29
| Currently it's hardcoded in the shader, so every change requires compilation of the shader variant, killing the performance in Serious Sam 3 and probably other apps.
|
| This patch passes alpha_ref in the user sgpr and removes it from the shader key.
|
| Signed-off-by: Vadim Girlin <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* r600g: fix tgsi_op2_s with trans-only instructions | Vadim Girlin | 2013-10-13 | 1 | -5/+31
| This fixes the issue when dst and src are the same reg and the operation on one channel overwrites the source for other channels, e.g.:
|
|   UMUL TEMP[2].xyz, TEMP[0].xyzz, TEMP[2].xxxx
|
| In this example the result of the operation on channel x is written in TEMP[2].x and then used as a second source operand for channels y and z instead of the original value in TEMP[2].x.
|
| This patch stores the results in a temp reg and moves them to dst after performing the operation on all channels.
|
| Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70327
|
| Signed-off-by: Vadim Girlin <[email protected]>
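The aliasing hazard in the UMUL example can be reproduced with a short C sketch. It is a model of the problem only, under stated assumptions: `struct vec4` and both functions are hypothetical and stand in for a per-channel lowering of `dst.xyz = a.xyz * b.xxxx`; they are not the r600g code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-channel register: c[0]=x, c[1]=y, c[2]=z, c[3]=w. */
struct vec4 { uint32_t c[4]; };

/* dst.xyz = a.xyz * b.xxxx, written channel by channel straight into dst.
 * If dst and b are the same register, channel x is overwritten first and
 * channels y and z then read the new value instead of the original b.x. */
static void umul_xyz_direct(struct vec4 *dst, const struct vec4 *a,
                            const struct vec4 *b)
{
    int i;

    for (i = 0; i < 3; i++)
        dst->c[i] = a->c[i] * b->c[0];
}

/* Same operation, but all results go to a temporary first and are moved
 * to dst only after every channel has been computed. */
static void umul_xyz_via_temp(struct vec4 *dst, const struct vec4 *a,
                              const struct vec4 *b)
{
    struct vec4 tmp;
    int i;

    for (i = 0; i < 3; i++)
        tmp.c[i] = a->c[i] * b->c[0];
    for (i = 0; i < 3; i++)
        dst->c[i] = tmp.c[i];
}
```

With TEMP[0] = (2, 3, 4, 0) and TEMP[2].x = 5, the direct version yields y = 3 * 10 = 30 because x has already become 10, while the temporary version yields the intended y = 3 * 5 = 15.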
* i915g: Fix assert | Stephane Marchesin | 2013-10-12 | 1 | -1/+1
| Now that we support start, assert on start + num < max samplers
|
| Reported by xexaxo
* radeon/llvm: show LLVM disassembly when available | Jay Cornwall | 2013-10-12 | 3 | -1/+9
| With code dump enabled LLVM may generate disassembly during compilation. Show this disassembly when available and prefer it to SI bytecode dump.
|
| Reviewed-by: Tom Stellard <[email protected]>
| Signed-off-by: Jay Cornwall <[email protected]>
* softpipe: fix seamless cube filtering | Roland Scheidegger | 2013-10-12 | 1 | -48/+151
| Fix coord wrapping (and face selection too) in case of edges. Unfortunately, the coord wrapping is way more complicated than what the code did, as it depends on the face and the direction where the texel falls off the face (the logic needed to get this right in fact seems utterly ridiculous). Also fix a bug in (y direction under/overflow) face selection.
|
| And get rid of complicated cube corner handling. Just like edge case, the coord wrapping was wrong and it seems very difficult to fix. I'm near certain it can't always work anyway (though ordinary seamless filtering on edge has actually a similar problem but not as severe) because we don't have per-pixel face, hence could have multiple corner texels which would make it very difficult to average the remaining texels correctly. Hence simply pick a texel which would only have fallen off one edge but not both instead, which is not quite accurate but actually I think should be enough to meet OpenGL (but not d3d10) requirements.
|
| v2: small fixes suggested by Brian, add some comments.
|
| Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: increase fs shader variant instruction cache limit by factor 4 | Roland Scheidegger | 2013-10-12 | 1 | -2/+2
| The previous limit of 128*1024 was reported on IRC to cause frequent recompiles in some apps due to shader variant thrashing, leading to noticeable lags.
|
| Note that the LP_MAX_SHADER_VARIANTS limit (1024) was more or less impossible to reach, since even simple fragment shaders without texturing (glxgears) used more than twice 128 instructions, hence the instruction limit would have always been reached first (excluding things like trivial shaders not writing color). Even with the new limit it is VERY likely the instruction limit is hit first.
|
| Should help with such lags due to recompiles (though other shader types have their own limits, LP_MAX_SETUP_VARIANTS and DRAW_MAX_SHADER_VARIANTS; in particular the latter seems a bit small (128)).
|
| Reviewed-by: Brian Paul <[email protected]>
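The interplay between the two cache limits in this message is plain arithmetic and can be checked with a small sketch. The constants 1024 and 128*1024 and the factor 4 come from the message; the enum names and helper functions are made up for this illustration, and 257 instructions per variant is just an arbitrary figure slightly above "twice 128".

```c
#include <assert.h>

enum {
    MAX_VARIANTS         = 1024,                     /* variant-count limit */
    OLD_MAX_INSTRUCTIONS = 128 * 1024,               /* previous limit */
    NEW_MAX_INSTRUCTIONS = 4 * OLD_MAX_INSTRUCTIONS  /* limit after this change */
};

/* How many variants of a given size fit before the instruction limit. */
static unsigned variants_until_instr_limit(unsigned instr_limit,
                                           unsigned instrs_per_variant)
{
    assert(instrs_per_variant > 0);
    return instr_limit / instrs_per_variant;
}

/* Average variant size at which both limits are reached together. Above
 * this size the instruction limit is always the one that is hit first. */
static unsigned break_even_instrs(unsigned instr_limit, unsigned max_variants)
{
    assert(max_variants > 0);
    return instr_limit / max_variants;
}
```

Under the old limit the break-even size is 128 instructions per variant, so variants of 257 instructions exhaust the instruction budget after 510 variants, well short of 1024. Under the new limit the break-even size is 512 instructions per variant.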
* svga: s/0/FALSE/ | Brian Paul | 2013-10-11 | 1 | -2/+2
|
* r600g: fix crash in set_framebuffer_state | Grigori Goronzy | 2013-10-11 | 2 | -12/+26
| We should be able to safely set the framebuffer state without a fragment shader bound. bind_ps_state will take care of updating the necessary state bits later.
|
| v2: check in update_db_shader_control
* llvmpipe: We don't use the draw pipeline for offset_point/line. | José Fonseca | 2013-10-09 | 1 | -2/+0
| Unless the polygon fill mode is different from PIPE_POLYGON_MODE_FILL, so checking the polygon mode is sufficient.
|
| Testing done: no regression in polygon-mode-offset
|
| Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: abstract the code to set number of subpixel bits | Zack Rusin | 2013-10-09 | 3 | -10/+15
| As we're moving towards expanding the number of subpixel bits and the width of the variables used in the computations we need to make this code a bit more centralized.
|
| Signed-off-by: Zack Rusin <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* radeon/uvd: disable VC-1 simple/main profile | Grigori Goronzy | 2013-10-09 | 1 | -1/+3
| It doesn't work (decodes to garbage) with most videos on UVD 3.0. Worse yet, it often results in random memory corruption or GPU hangs. Rumor has it only the newest UVD hardware could do it anyway.
|
| Reviewed-by: Christian König <[email protected]>
* radeon/uvd: try to fix VC-1 decoding | Grigori Goronzy | 2013-10-09 | 1 | -33/+38
| The DPB size calculations seem to be off; there is various random corruption happening, even with advanced profile. Always assuming a minimum number of references appears to fix it, similarly to H.264. This might overallocate the DPB.
|
| Also clean up the SPS/PPS field setup so that it matches VC-1 specifications better.
|
| With these changes, all advanced profile VC-1 files I could get my hands on work fine.
|
| Reviewed-by: Christian König <[email protected]>
* radeon/uvd: fix video format reporting | Grigori Goronzy | 2013-10-09 | 1 | -2/+5
| UVD can only support NV12 in the case of hardware decoding, but we can still use all other formats for software decoding. Use the UNKNOWN profile to signal that we're not interested in hardware decoding.
|
| v2: use profile instead of entrypoint
|
| Reviewed-by: Christian König <[email protected]>
* radeonsi: fix occlusion queries for CIK | Marek Olšák | 2013-10-09 | 1 | -3/+12
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: draw register fixes for CIK | Marek Olšák | 2013-10-09 | 2 | -9/+27
| This doesn't fix any known issue. I'm just following the docs.
|
| Reviewed-by: Michel Dänzer <[email protected]>