path: root/src/gallium
Commit message | Author | Age | Files | Lines
* gallium/tgsi: use CLAMP instead of open-coded clamps | Erik Faye-Lund | 2014-02-07 | 1 | -22/+4
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
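To illustrate what replacing an open-coded clamp with a macro looks like, here is a minimal sketch. The `CLAMP` definition and both helper functions are written for this example and are assumptions; they are not copied from the commit or from Mesa's headers.

```c
#include <assert.h>

/* Illustrative clamp macro, defined here for the sketch only. */
#define CLAMP(X, MIN, MAX) ((X) < (MIN) ? (MIN) : ((X) > (MAX) ? (MAX) : (X)))

/* Open-coded form of the kind such a cleanup replaces (hypothetical helper). */
static float clamp_open_coded(float x, float lo, float hi)
{
   assert(lo <= hi);
   if (x < lo)
      return lo;
   if (x > hi)
      return hi;
   return x;
}

/* Same result expressed through the macro (hypothetical helper). */
static float clamp_with_macro(float x, float lo, float hi)
{
   assert(lo <= hi);
   return CLAMP(x, lo, hi);
}
```

Both helpers return the same value for any input with `lo <= hi`; the macro form is simply shorter and harder to get subtly wrong.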
* nouveau/codegen: allow tex offsets on non-TXF instructions (e.g. TXL) | Ilia Mirkin | 2014-02-06 | 1 | -0/+8
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Christoph Bumiller <[email protected]>
* nv50: only over-allocate by a page for code | Ilia Mirkin | 2014-02-06 | 1 | -4/+5
    The pre-fetching doesn't go too far. Tested with over-allocating by only a page, and didn't see any errors in dmesg. Saves ~512KB of VRAM.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 10.1 <[email protected]>
    Reviewed-by: Christoph Bumiller <[email protected]>
* nv50: fix layerid to be the fp input number rather than vp output number | Ilia Mirkin | 2014-02-06 | 3 | -7/+9
    In the tests they were the same so it didn't matter, but indications are that this is the correct behaviour.
    Also take this opportunity to (trivially) support using gl_Layer in fp.
    Cc: 10.1 <[email protected]>
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Christoph Bumiller <[email protected]>
* nv50: rework primid logic | Ilia Mirkin | 2014-02-06 | 3 | -6/+4
    Functionally identical but much simpler. Should also better integrate with future layer/viewport changes/fixes.
    Cc: 10.1 <[email protected]>
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Christoph Bumiller <[email protected]>
* vdpau: flush the context before exporting the surface v2 | Marek Olšák | 2014-02-06 | 1 | -0/+1
    Bugzilla (bug needs XBMC changes as well): https://bugs.freedesktop.org/show_bug.cgi?id=73191
    When VL uploads vertex buffers, it uses PIPE_TRANSFER_DONTBLOCK, which always flushes the context in the winsys if the buffer being mapped is busy.
    Since I added handling of DISCARD_RANGE, DONTBLOCK has had no effect when combined with DISCARD_RANGE and I think the context isn't flushed anywhere else, so no commands are submitted to the GPU until the IB is full, which takes a lot of frames.
    Using DISCARD_RANGE is not the only way to trigger this bug. The other way is to reallocate the vertex buffer before every upload.
    BTW, I'm not sure if this is the right place for flushing, but it does fix the bug.
    v2 (chk): move the flush to the right place.
    Signed-off-by: Christian König <[email protected]>
    Tested-by: StrangeNoises ([email protected])
* gallium/radeon: fix warnings | Marek Olšák | 2014-02-06 | 3 | -5/+9
* gallium: remove PIPE_USAGE_STATIC | Marek Olšák | 2014-02-06 | 51 | -71/+67
    Reviewed-by: Brian Paul <[email protected]>
* gallium: define the behavior of PIPE_USAGE_* flags properly | Marek Olšák | 2014-02-06 | 2 | -12/+19
    STATIC will be removed in the following commit.
    v2: changed the definition of IMMUTABLE
    Reviewed-by: Brian Paul <[email protected]>
* gallium: remove PIPE_RESOURCE_FLAG_GEN_MIPS | Marek Olšák | 2014-02-06 | 1 | -1/+0
    Unused.
    Reviewed-by: Brian Paul <[email protected]>
* r600g,radeonsi: set resource domains in one place (v2) | Marek Olšák | 2014-02-06 | 5 | -27/+23
    v2: This doesn't change the behavior. It only moves the tiling check to r600_init_resource and removes the usage parameter.
    Reviewed-by: Christian König <[email protected]>
* st/omx: add workaround for bug in Bellagio | Christian König | 2014-02-06 | 3 | -2/+16
    Not blocking for the message thread can lead to accessing freed up memory.
    Signed-off-by: Christian König <[email protected]>
* st/omx: initial OpenMAX support v3 | Christian König | 2014-02-06 | 13 | -0/+2390
    Featuring a full grown MPEG2 and H264 decoder and a couple of hundred bugs.
    v2 (Leo): fix an error for pic_order_cnt_type 1
    v3 (Leo): implement support for field decoding
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Leo Liu <[email protected]>
* vl/rbsp: add H.264 RBSP implementation | Christian König | 2014-02-06 | 1 | -0/+164
    Signed-off-by: Christian König <[email protected]>
* vl/vlc: add function to limit the vlc size | Christian König | 2014-02-06 | 1 | -12/+41
    Signed-off-by: Christian König <[email protected]>
* vl/vlc: add remove bits function | Christian König | 2014-02-06 | 1 | -0/+12
    Signed-off-by: Christian König <[email protected]>
* radeon: just don't map VRAM buffers at all | Christian König | 2014-02-06 | 1 | -2/+2
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeon/video: directly create buffers in the right domain | Christian König | 2014-02-06 | 3 | -7/+12
    Avoid moving things around on start of stream.
    Signed-off-by: Christian König <[email protected]>
* radeon/video: separate common video functions | Christian König | 2014-02-06 | 9 | -315/+413
    Signed-off-by: Christian König <[email protected]>
* gallium/dri2: Fix dri2_dup_image | Axel Davy | 2014-02-05 | 1 | -0/+1
    dri2_dup_image was not copying the dri_format field. This was causing some bugs, for example:
    . we create a gbm_bo.
    . we get an EGLImage from the gbm_bo.
    . Bug: it is impossible to get the gbm_bo back from the EGLImage by importing. (gbm dri2 backend)
    Signed-off-by: Axel Davy <[email protected]>
* tgsi/ureg: increase the number of immediates | Zack Rusin | 2014-02-05 | 1 | -1/+1
    ureg_program is allocated on the heap so we can just bump the number of immediates that it can handle. It's needed for d3d10.
    Signed-off-by: Zack Rusin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: make sure analysis works with large number of immediates | Zack Rusin | 2014-02-05 | 1 | -8/+9
    We need to handle a lot more immediates and in order to do that we also switch from allocating this structure on the stack to allocating it on the heap.
    Signed-off-by: Zack Rusin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: handle huge number of immediates | Zack Rusin | 2014-02-05 | 4 | -44/+86
    We only supported up to 256 immediates, which isn't enough. We had code which was allocating immediates as an allocated array, but it was always used alongside a statically backed array for performance reasons. This commit adds code to skip that performance optimization and always use just the dynamically allocated immediates if the number of them is too great.
    Signed-off-by: Zack Rusin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
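The fallback described in this message (a fixed-size array for the common case, heap storage only when the count is too large) can be sketched roughly as follows. Every name here (`imm_store`, `FAST_IMMEDIATES`, the helper functions) is hypothetical and chosen for illustration; only the 256-entry limit comes from the commit text, and the real gallivm code builds LLVM IR rather than plain C arrays.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FAST_IMMEDIATES 256 /* static limit mentioned in the commit message */

/* Hypothetical container: a static array plus an optional heap array. */
struct imm_store {
   float fast[FAST_IMMEDIATES][4]; /* fast path, no allocation */
   float (*heap)[4];               /* used only when count > FAST_IMMEDIATES */
   unsigned count;
};

/* Returns 1 on success, 0 if the heap allocation failed. */
static int imm_store_init(struct imm_store *s, unsigned total)
{
   memset(s, 0, sizeof(*s));
   s->count = total;
   if (total > FAST_IMMEDIATES) {
      /* Too many immediates: skip the static array entirely. */
      s->heap = calloc(total, sizeof(*s->heap));
      if (!s->heap)
         return 0;
   }
   return 1;
}

/* Returns a pointer to the four components of immediate i. */
static float *imm_store_get(struct imm_store *s, unsigned i)
{
   assert(i < s->count);
   return s->heap ? s->heap[i] : s->fast[i];
}

static void imm_store_fini(struct imm_store *s)
{
   free(s->heap);
   s->heap = NULL;
}
```

The point of the design is that small shaders never pay for an allocation, while large ones take a single code path through the dynamic array instead of mixing the two.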
* gallivm: allow large numbers of temporaries | Zack Rusin | 2014-02-05 | 4 | -5/+20
    The number of allowed temporaries increases with almost every iteration of an API. We used to support 128, then we started increasing and the newer APIs support 4096+. So if we notice that the number of temporaries is larger than our statically allocated storage would allow, we just treat them as indexable temporaries and allocate them as an array from the start.
    Signed-off-by: Zack Rusin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix F2U opcode | Roland Scheidegger | 2014-02-05 | 1 | -20/+22
    Previously, we were really doing F2I. And also move it to generic section.
    (Note that for llvmpipe the code generated is definitely bad, due to lack of unsigned conversions with sse. I think though what llvm does (using scalar conversions to 64bit signed either with x87 fpu (32bit) or sse (64bit), including lots of domain changes) is quite suboptimal, could do something like
        is_large = arg >= 2^31
        half_arg = 0.5 * arg
        small_c = fptoint(arg)
        large_c = fptoint(half_arg) << 1
        res = select(is_large, large_c, small_c)
    which should be far fewer instructions but that's something llvm should do itself.)
    This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs GL 3.0 version override to run.)
    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Zack Rusin <[email protected]>
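The pseudocode in this message can be written out as a scalar C sketch. The function name `f2u_via_signed` is hypothetical, and this is not the gallivm implementation. One difference from the pseudocode is deliberate: a vector `select` computes both conversions and picks one, but in C converting an out-of-range float to `int32_t` is undefined, so the sketch branches instead. It assumes `0 <= arg < 2^32`.

```c
#include <assert.h>
#include <stdint.h>

/* Float to uint32 using only a signed float-to-int conversion.
 * Assumes 0 <= arg < 2^32 (checked by the assert below). */
static uint32_t f2u_via_signed(float arg)
{
   assert(arg >= 0.0f && arg < 4294967296.0f);

   if (arg >= 2147483648.0f) {
      /* Large case: halving is exact for floats this big, the half fits in
       * a signed 32-bit integer, and shifting left by one restores it. */
      return (uint32_t)(int32_t)(0.5f * arg) << 1;
   }
   /* Small case: the signed conversion is already correct. */
   return (uint32_t)(int32_t)arg;
}
```

Halving loses no precision here because a float at or above 2^31 has a spacing of at least 256 between representable values, so it is always an even integer.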
* tools/trace: Handle index buffer overflow gracefully. | José Fonseca | 2014-02-05 | 1 | -1/+4
    Trivial.
* r600g: add support for geom shaders to r600/r700 chipsets (v2) | Dave Airlie | 2014-02-05 | 7 | -49/+313
    This is my first attempt at enabling r600/r700 geometry shaders; the basic tests pass on both my rv770 and my rv635.
    It requires this kernel patch: http://www.spinics.net/lists/dri-devel/msg52745.html
    v2: address Alex comments.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: enable GLSL 3.30 on evergreen GPUs | Dave Airlie | 2014-02-05 | 1 | -1/+1
    This throws the switch to enable GL 3.3 and GLSL 330.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: properly propagate clip dist write value | Dave Airlie | 2014-02-05 | 1 | -0/+1
    This moves the value from the GS shader to the copy shader so the registers are set up correctly.
    Fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: calculate a better value for array_size (v2) | Dave Airlie | 2014-02-05 | 1 | -1/+1
    Attempt to calculate a better value for array size to avoid breaking apps.
    v2: use 0xfff like streamout, suggested by Grigori
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix CAYMAN geometry shader support | Dave Airlie | 2014-02-05 | 1 | -2/+6
    Cayman has a different end of program bit, so do that properly.
    Fixes hangs with geom shader tests on cayman.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix up shader out misc stuff for copy shader | Dave Airlie | 2014-02-05 | 2 | -1/+16
    Set the correct values so the misc out register is set up correctly for the copy shader.
    This also updates the state for the gs copy shader so the hw gets programmed correctly.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: port the layered surface rendering patch from radeonsi | Dave Airlie | 2014-02-05 | 3 | -21/+19
    This just makes r600 and evergreen do what the radeonsi codepaths do for layered rendering.
    This makes the 2d amd_vertex_shader_layer test pass on evergreen.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: initial VS output layer support | Dave Airlie | 2014-02-05 | 4 | -14/+50
    This just adds support for emitting the proper value in the VS out misc.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: setup const texture buffers for geom shaders | Dave Airlie | 2014-02-05 | 1 | -0/+6
    This just enables the workarounds we have for vertex/pixel shaders for geom shaders as well.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: calculate correct cut value | Dave Airlie | 2014-02-05 | 1 | -1/+11
    This selects the cut value depending on the shader selected.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix dynamic_input_array_index.shader_test | Dave Airlie | 2014-02-05 | 1 | -4/+44
    This follows what fglrx does: it unpacks the input we are going to indirect into a bunch of registers and indirects inside them.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: add support for indirect geom ring writes | Dave Airlie | 2014-02-05 | 1 | -7/+58
    We need to be able to write to the ring using a base register for when we emit vertices in a loop. In theory the SB compiler could collapse these indirect writes to direct writes if the register value is constant and known, but that is outside my pay grade.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: write proper output prim type | Dave Airlie | 2014-02-05 | 2 | -27/+26
    Vadim's code derived it from the info.mode, but it needs to be taken from the geometry shader output primitive.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: enable instance cnt register with new enough kernel | Dave Airlie | 2014-02-05 | 1 | -6/+6
    The instance cnt register was missing for a few kernels; with a new enough kernel we can output it.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: add primitive input support for gs | Dave Airlie | 2014-02-05 | 4 | -1/+19
    Only enable prim id if gs uses it.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: emit streamout from dma copy shader | Dave Airlie | 2014-02-05 | 2 | -2/+8
    This enables streamout with GS in the mix, from the VS dma shader.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g/gs: fix cases where number of gs inputs != number of gs outputs | Dave Airlie | 2014-02-05 | 1 | -1/+6
    This fixes a bunch of the geom shader built-in tests.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: increase array base for exported parameters | Dave Airlie | 2014-02-05 | 1 | -0/+3
    Trivial fix to Vadim's code.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: initialise the geom shader loop registers. | Dave Airlie | 2014-02-05 | 1 | -0/+2
    As we do for vertex and pixel shaders.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: emit NOPs at end of shaders in more cases | Dave Airlie | 2014-02-05 | 1 | -2/+5
    If the shader has no CF clauses at all, emit a NOP.
    If the last instruction is an ENDLOOP, add a NOP for the LOOP to go to.
    If the last instruction is CALL_FS, add a NOP.
    These fix a bunch of hangs in the geometry shader tests.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
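The three rules listed in this message amount to a small predicate on the last control-flow instruction. The sketch below is only an illustration: the `cf_op` enum and `needs_trailing_nop` are hypothetical names, not r600g identifiers, and the driver may append NOPs in other situations that this message does not describe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical control-flow opcodes, for illustration only. */
enum cf_op { CF_ALU, CF_TEX, CF_LOOP, CF_ENDLOOP, CF_CALL_FS, CF_NOP };

/* Returns 1 if a NOP should be appended under the three rules in the
 * commit message, 0 otherwise. cf may be NULL when count is 0. */
static int needs_trailing_nop(const enum cf_op *cf, size_t count)
{
   if (count == 0)
      return 1; /* no CF clauses at all */

   switch (cf[count - 1]) {
   case CF_ENDLOOP: /* the LOOP needs an instruction to jump to */
   case CF_CALL_FS:
      return 1;
   default:
      return 0;
   }
}
```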
* r600g: don't enable SB for geom shaders | Dave Airlie | 2014-02-05 | 1 | -0/+3
    SB needs fixes for three GS instructions; it seems to raise them outside loops, etc., despite my best efforts.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g/sb: add MEM_RING support | Dave Airlie | 2014-02-05 | 4 | -5/+8
    Although we don't use SB on geom shaders, the VS copy shader will use it, so we might as well implement MEM_RING support in sb.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: don't fail if we can't map VS->GS ring entries | Dave Airlie | 2014-02-05 | 1 | -4/+3
    This can happen in normal operation, so don't report an error on it, just continue.
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: initial support for geometry shaders on evergreen (v2) | Vadim Girlin | 2014-02-05 | 15 | -206/+909
    This is Vadim's initial work with a few regression fixes squashed in.
    v2: (airlied)
    fix regression in glsl-max-varyings - need to use vs and ps_dirty
    fix regression in shader exports from rebasing.
    whitespace fixing.
    v2.1: squash fix assert
    Signed-off-by: Vadim Girlin <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>