path: root/src/gallium
Commit message | Author | Age | Files | Lines
* radeon/uvd: add UVD implementation v5 | Christian König | 2013-04-11 | 15 | -21/+1924
| | | | | | | | | | | | | | Just everything you need for UVD with r600g and radeonsi. v2: move UVD code to radeon subdir, clean up build system additions, remove an unused SI function, disable tiling on SI for now. v3: some minor indentation fix and rebased v4: dpb size calculation fixed v5: implement proper fall-back in case the kernel doesn't support UVD, based on patches from Andreas Boll but cleaned up a bit more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* radeon/winsys: add uvd ring support to winsys v3 | Christian König | 2013-04-11 | 3 | -0/+31
| | | | | | | | | | | | | Separated from UVD patch for clarity. v2: sync with next tree for 3.10 v3: as pointed out by Andreas Boll check for drm minor >= 32 http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.10-wip Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Andreas Boll <[email protected]>
* r600g: Add support for GL_ARB_texture_buffer_range | Fredrik Höglund | 2013-04-11 | 3 | -5/+11
| | | | Reviewed-by: Marek Olšák <[email protected]>
* r600g: fix valgrind warning on Cayman | Marek Olšák | 2013-04-10 | 1 | -1/+1
| | | | Warning: "Conditional jump or move depends on uninitialised value(s)".
* gallivm/tgsi: handle untyped moves | Zack Rusin | 2013-04-10 | 2 | -0/+10
| | | | | | | | | | | both mov and ucmp can be used to move variables of any type. correctly note that about ucmp in the tgsi_info and make sure gallivm can handle that by correctly casting the untyped moves. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix loops and conditionals within GS | Zack Rusin | 2013-04-10 | 2 | -19/+105
| | | | | | | | | | | | | We were using simple temporaries, without using alloca or phi nodes which meant that on every iteration of the loop our temporaries, which were holding the number of vertices and primitives which were emitted, were being reset to zero. Now we're using alloca to allocate those variables to preserve them across conditionals. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: implement PIPE_QUERY_SO_STATISTICS | Zack Rusin | 2013-04-10 | 2 | -0/+21
| | | | | | | | | | We were missing the implementation of PIPE_QUERY_SO_STATISTICS query, this change implements it on top of the existing facilities. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix unsigned divide and remainder opcodes | Zack Rusin | 2013-04-10 | 1 | -4/+33
| | | | | | | | | | We want to make sure both that we never divide by zero, so as not to generate sigfpe, and that divide by zero is guaranteed to return 0xffffffff. Based on José's idea. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
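The behaviour described in this commit message can be sketched in scalar C. This is only a minimal illustration, not the gallivm code itself (which builds vector LLVM IR). The helper names `safe_udiv` and `safe_umod` are hypothetical, and giving the remainder the same all-ones result is an assumption based on the commit title.

```c
#include <stdint.h>

/* Hypothetical scalar helpers illustrating the idea from the commit message. */
static uint32_t safe_udiv(uint32_t num, uint32_t den)
{
    /* All ones when the divisor is zero, all zeros otherwise. */
    uint32_t zero_mask = (den == 0) ? 0xffffffffu : 0u;
    /* OR-ing the mask into the divisor guarantees it is never zero,
     * so the division cannot trap. */
    uint32_t quotient = num / (den | zero_mask);
    /* OR-ing the mask into the result forces 0xffffffff for a zero divisor. */
    return quotient | zero_mask;
}

/* Assumption: the remainder opcode gets the same treatment. */
static uint32_t safe_umod(uint32_t num, uint32_t den)
{
    uint32_t zero_mask = (den == 0) ? 0xffffffffu : 0u;
    uint32_t remainder = num % (den | zero_mask);
    return remainder | zero_mask;
}
```

The mask approach avoids a branch, which matters when the same operation is applied to every lane of a vector.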
* gallivm: fix breakc | Zack Rusin | 2013-04-10 | 1 | -12/+14
| | | | | | | | | we break when the mask values are 0, not 1; plus it's a bit comparison, not a floating point comparison. This fixes both. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: remove sampler writemask v3 | Christian König | 2013-04-10 | 2 | -13/+8
| | | | | | | | v2: fix intrinsic name as well v3: LLVM revision incremented as well Signed-off-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* pipe-loader: Fix out of source build | Niels Ole Salscheider | 2013-04-10 | 1 | -2/+2
| | | | Signed-off-by: Niels Ole Salscheider <[email protected]>
* st/osmesa: re-use buffers in OSMesaMakeCurrent() | Brian Paul | 2013-04-09 | 1 | -7/+54
| | | | | | | Rather than creating a new buffer each time. Fixes problems found with vtk. Tested-by: Kevin H. Hobbs <[email protected]>
* st/vdpau: fix subtitle related bug v2 | Christian König | 2013-04-09 | 1 | -0/+4
| | | | | | | | | | Drawing subtitles didn't increase the dirty area of the surface. Reported and tested by freeedrich on irc. v2: don't clear the surface Signed-off-by: Christian König <[email protected]>
* softpipe: misc updates to image dumping in softpipe_flush() | Brian Paul | 2013-04-09 | 1 | -3/+4
|
* tgsi: Ensure struct tgsi_ind_register field Index is initialized. | Vinson Lee | 2013-04-08 | 1 | -0/+1
| | | | | | | Fixes uninitialized scalar variable defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* r600g: Fix UMAD on Cayman | Martin Andersson | 2013-04-09 | 1 | -13/+32
| | | | | | | | | | | | The multiplication part of tgsi_umad did not work on Cayman, because it did not populate the correct vector slots. This fixed hardlocks in the EXT_transform_feedback/order tests. NOTE: This is a candidate for the stable branches. (might not be easy to cherry-pick though) Signed-off-by: Marek Olšák <[email protected]>
* r600g/llvm: Add support for native isa for pre EG | Vincent Lejeune | 2013-04-08 | 2 | -2/+6
| | | | | This fixes bug 62756 : https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12
* gallium/util: add const to a parameter of util_max_layer | Marek Olšák | 2013-04-06 | 1 | -1/+1
|
* radeonsi: Add compute support v3 | Tom Stellard | 2013-04-05 | 11 | -49/+378
| | | | | | | | | | | v2: - Only dump shaders when env variable is set. v3: - Don't emit VGT registers Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Set TCL1_ACTION_ENA when invalidating the texture cache | Tom Stellard | 2013-04-05 | 1 | -0/+1
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Remove si_pm4_inval_vertex_cache() | Tom Stellard | 2013-04-05 | 3 | -8/+1
| | | | | | | | This function is a holdover from r600g and is identical to si_pm4_inval_texture_cache(), so it is not needed. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2 | Tom Stellard | 2013-04-05 | 10 | -80/+105
| | | | | | | | | | | | This target string now contains four values instead of three. The old processor field (which was really being interpreted as arch) has been split into two fields: processor and arch. This allows drivers to pass a more detailed description of the hardware to compiler frontends. v2: - Adapt to libclc changes Reviewed-by: Francisco Jerez <[email protected]>
* util: add ETC as compressed format | Wladimir | 2013-04-05 | 1 | -0/+1
| | | | | | | Add UTIL_FORMAT_LAYOUT_ETC to util_format_is_compressed. It was missing. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
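A check of this kind is typically a switch over the format's layout. The sketch below is self-contained and uses a stand-in enum. The `sketch_`/`SKETCH_` names are hypothetical rather than Mesa's, and treating S3TC and RGTC as the other compressed layouts is an assumption, since the log only mentions the missing ETC case.

```c
#include <stdbool.h>

/* Stand-in for the driver's layout enum; names are illustrative only. */
enum sketch_format_layout {
    SKETCH_LAYOUT_PLAIN,
    SKETCH_LAYOUT_S3TC,
    SKETCH_LAYOUT_RGTC,
    SKETCH_LAYOUT_ETC
};

/* Returns true for every layout that stores texels in compressed blocks. */
static bool sketch_format_is_compressed(enum sketch_format_layout layout)
{
    switch (layout) {
    case SKETCH_LAYOUT_S3TC:
    case SKETCH_LAYOUT_RGTC:
    case SKETCH_LAYOUT_ETC:   /* the case the commit describes as missing */
        return true;
    default:
        return false;
    }
}
```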
* gallium/u_blitter: fix is_blit_generic_supported() stencil checking | Brian Paul | 2013-04-05 | 1 | -12/+14
| | | | | | | | | | | | | | | Don't check if there's sampler support for stencil if we're not going to actually blit/copy stencil values. Fixes the case where we mistakenly said we can't support a blit of depth values from S8Z24 to X8Z24. Also, rename the is_stencil variable to dst_has_stencil to improve readability. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* freedreno: use autogenerated register defs | Rob Clark | 2013-04-05 | 23 | -1617/+2116
| | | | | | | | | | | | | Switch to use the envytools generated headers for register/bitfield definitions. This is the first step in preparing to add a3xx support, since it avoids having conflicting names for a3xx and a2xx registers. And since I'm using envytools for a3xx it is simpler to just use it for everything. This shouldn't cause any functional change, it is really just a lot of renaming. Signed-off-by: Rob Clark <[email protected]>
* st/wgl: Install our windows message hook to threads created before the ICD is loaded. | José Fonseca | 2013-04-05 | 2 | -26/+196
| | | | | | | | | | | | | | | Otherwise we will not receive destroy windows events, causing framebuffers to leak. This happens particularly with java and jogl. Tested with java + jogl, MATLAB. VMware Internal Bug Number: 1013086. Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: Work without sse2 if llvm is new enough | Adam Jackson | 2013-04-05 | 1 | -2/+3
| | | | | | | | At least on llvm 3.2 this appears to work fine. Tested on an Athlon XP 2600+, which has sse and 3dnow but not sse2. Reviewed-by: Jose Fonseca <[email protected]> Signed-off-by: Adam Jackson <[email protected]>
* winsys/radeon: add command stream replay dump for faulty lockup v3 | Jerome Glisse | 2013-04-05 | 7 | -37/+443
| | | | | | | | | | | | | | | | | | Build time option, set RADEON_CS_DUMP_ON_LOCKUP to 1 in radeon_drm_cs.h to enable it. When enabled, after each cs submission the code will try to detect a lockup by waiting for one of the buffers of the cs to become idle; after a timeout it will consider that the cs triggered a lockup and will write a radeon_lockup.c file in the current directory that has all the information for replaying the cs. To build this file: gcc -O0 -g radeon_lockup.c -ldrm -o radeon_lockup -I/usr/include/libdrm v2: Add radeon_ctx.h file to mesa git tree v3: Slightly improve dumped file for easier editing, only dump first faulty cs Signed-off-by: Jerome Glisse <[email protected]>
* st/xlib: add HUD support for xlib/GLX | Brian Paul | 2013-04-04 | 4 | -0/+34
| | | | | | For the softpipe and llvmpipe drivers. Reviewed-by: Jose Fonseca <[email protected]>
* gallium/hud: add GALLIUM_HUD_PERIOD env var | Brian Paul | 2013-04-04 | 1 | -1/+16
| | | | | | | To set the graph update rate, in seconds. The default update rate has also been changed to 1/2 second. Reviewed-by: Marek Olšák <[email protected]>
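Reading such a setting usually means parsing the environment string and falling back to the default. The following is a minimal sketch of that pattern, not the HUD code. The function name `hud_period_seconds` is hypothetical, and rejecting unparsable or non-positive values in favour of the 0.5 second default is an assumption.

```c
#include <stdlib.h>

/* Hypothetical helper: turns the raw environment value into a period in
 * seconds, using the 1/2 second default named in the commit message. */
static double hud_period_seconds(const char *env_value)
{
    const double default_period = 0.5;
    char *end = NULL;
    double period;

    if (env_value == NULL)
        return default_period;

    period = strtod(env_value, &end);
    /* Assumption: fall back when nothing was parsed or the value is not
     * a positive number (this also rejects NaN). */
    if (end == env_value || !(period > 0.0))
        return default_period;

    return period;
}
```

A caller would pass in the result of `getenv("GALLIUM_HUD_PERIOD")`, which is `NULL` when the variable is unset.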
* gallium/hud: initialize sampler state | Brian Paul | 2013-04-04 | 1 | -0/+6
| | | | | | | | | The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with unnormalized texcoords (at least for softpipe). v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE Reviewed-by: Marek Olšák <[email protected]>
* gallivm: some minor cube map cleanup | Roland Scheidegger | 2013-04-04 | 1 | -10/+15
| | | | | | | | | | | | | | | The ar_ge_as_at variable was just very very confusing since the condition was actually the other way around (as_at_ge_ar). So change the condition (and the selects depending on it) to match the variable name. And also change the chosen major axis in case the coord values are the same. OpenGL doesn't care one bit which one is chosen in this case but it looks like dx10 would require z chosen over y, and y chosen over x (previously did x chosen over y, y chosen over z). Since it's all the same effort just honor dx10's wishes. (Though actually, for some preferred orderings, we could save one (or two with derivatives) selects since the tnewx and tnewz (and the corresponding dmax values) are the same.) Reviewed-by: Jose Fonseca <[email protected]>
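The tie-breaking order described here (z over y, y over x) can be written as a scalar selection. This is only an illustrative sketch, since the real code works on vectors with selects. The enum and function names are hypothetical.

```c
/* Hypothetical names; the real code operates on SIMD vectors. */
enum cube_axis { CUBE_AXIS_X, CUBE_AXIS_Y, CUBE_AXIS_Z };

static float abs_float(float v)
{
    return v < 0.0f ? -v : v;
}

/* Picks the major axis of a cube map direction. Using >= means that on
 * equal magnitudes z wins over y and y wins over x. */
static enum cube_axis cube_major_axis(float x, float y, float z)
{
    float ax = abs_float(x);
    float ay = abs_float(y);
    float az = abs_float(z);

    if (az >= ax && az >= ay)
        return CUBE_AXIS_Z;
    if (ay >= ax)
        return CUBE_AXIS_Y;
    return CUBE_AXIS_X;
}
```

Swapping each `>=` for `>` and reversing the test order would give the earlier preference (x over y, y over z).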
* llvmpipe: implement ucmp | Zack Rusin | 2013-04-04 | 2 | -0/+32
| | | | | | | and add a test for it Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* Avoid spurious GCC warnings in STATIC_ASSERT() macro. | Paul Berry | 2013-04-04 | 1 | -1/+1
| | | | | | | | | | | | | GCC 4.8 now warns about typedefs that are local to a scope and not used anywhere within that scope. This produced spurious warnings with the STATIC_ASSERT() macro (which used a typedef to provoke a compile error in the event of an assertion failure). This patch switches to a simpler technique that avoids the warning. v2: Avoid GCC-specific syntax. Also update p_compiler.h. Reviewed-by: Kenneth Graunke <[email protected]>
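The two styles of compile-time assertion can be compared side by side. The log does not show the macro bodies, so both definitions below are illustrative assumptions with hypothetical names. The second is one common typedef-free formulation in standard C.

```c
/* Typedef-based style (illustrative): the array size is negative when COND
 * is false, which is a compile error. When used inside a function, the
 * typedef is never referenced, which is what GCC 4.8 warns about. */
#define STATIC_ASSERT_TYPEDEF(COND) \
    typedef char static_assertion_failed[(COND) ? 1 : -1]

/* Typedef-free style (illustrative): the same negative-size trick inside a
 * sizeof expression, so no unused declaration is left behind. */
#define STATIC_ASSERT_SIZEOF(COND) \
    do { (void) sizeof(char[1 - 2 * !(COND)]); } while (0)

/* Example use: fails to compile if int were narrower than two bytes. */
static int checked_int_size(void)
{
    STATIC_ASSERT_SIZEOF(sizeof(int) >= 2);
    return (int) sizeof(int);
}
```

Because `sizeof` does not evaluate its operand, the statement generates no code at run time.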
* freedreno: document debug flag | Erik Faye-Lund | 2013-04-04 | 1 | -0/+4
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* st/wgl: add HUD support | Brian Paul | 2013-04-04 | 5 | -0/+42
| | | | | | v2: fix a few minor issues spotted by Jose. Reviewed-by: José Fonseca <[email protected]>
* st/wgl: make stw_current_context() non-static | Brian Paul | 2013-04-04 | 2 | -1/+3
| | | | Reviewed-by: José Fonseca <[email protected]>
* util: add debug_memory_check_block(), debug_memory_tag() | Brian Paul | 2013-04-04 | 2 | -0/+61
| | | | | | | | | | The former just checks that the given block is valid by checking the header and footer. The latter sets the memory block's tag. With extra debug code, we can use that for monitoring/checking particular allocations. Reviewed-by: José Fonseca <[email protected]>
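The header/footer idea can be sketched with a small tracked allocator. Everything here is a hypothetical illustration: the struct layout, the magic value, and the `tracked_*` function names are assumptions and are not taken from the actual util code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TRACKED_MAGIC 0x6e34090aU  /* arbitrary guard value for this sketch */

struct tracked_header { unsigned magic; unsigned tag; size_t size; };
struct tracked_footer { unsigned magic; };

/* Allocates size bytes with a guard header before and a guard footer after. */
static void *tracked_malloc(size_t size)
{
    struct tracked_footer ftr = { TRACKED_MAGIC };
    struct tracked_header *hdr = malloc(sizeof *hdr + size + sizeof ftr);
    if (hdr == NULL)
        return NULL;
    hdr->magic = TRACKED_MAGIC;
    hdr->tag = 0;
    hdr->size = size;
    /* memcpy because the footer may not be naturally aligned. */
    memcpy((char *)(hdr + 1) + size, &ftr, sizeof ftr);
    return hdr + 1;
}

/* Returns 1 when both guards are intact, 0 when either was overwritten. */
static int tracked_check_block(const void *ptr)
{
    const struct tracked_header *hdr = (const struct tracked_header *)ptr - 1;
    struct tracked_footer ftr;
    memcpy(&ftr, (const char *)ptr + hdr->size, sizeof ftr);
    return hdr->magic == TRACKED_MAGIC && ftr.magic == TRACKED_MAGIC;
}

/* Stores a caller-chosen tag so particular allocations can be found later. */
static void tracked_set_tag(void *ptr, unsigned tag)
{
    ((struct tracked_header *)ptr - 1)->tag = tag;
}

static unsigned tracked_get_tag(const void *ptr)
{
    return ((const struct tracked_header *)ptr - 1)->tag;
}

static void tracked_free(void *ptr)
{
    free((struct tracked_header *)ptr - 1);
}
```

Writing one byte past the end of a block clobbers the footer, so a later `tracked_check_block()` call reports the overrun.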
* gallium/hud: replace malloc w/ MALLOC | Brian Paul | 2013-04-04 | 1 | -1/+1
| | | | | | To match the FREE() call used later. Fixes things on Windows. Reviewed-by: Marek Olšák <[email protected]>
* r600g/llvm: Workaround for wrong tex.offset_* | Vincent Lejeune | 2013-04-04 | 1 | -0/+3
|
* gallivm: honor explicit derivatives values for cube maps. | Roland Scheidegger | 2013-04-04 | 4 | -28/+60
| | | | | | | | | | | | This is trivial now, though need to make sure we pass all the necessary derivative values (which is 3 each for ddx/ddy not 2). Passes piglit arb_shader_texture_lod-texgradcube test. v2: add the forgotten abs() for all incoming derivatives (discovered by new piglit arb_shader_texture_lod-texgradcube test, though more by luck as it was failing only for exactly one pixel...). Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: do per-pixel cube face selection (finally!!!) | Roland Scheidegger | 2013-04-04 | 3 | -82/+180
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This proved to be tricky, the problem is that after selection/mirroring we cannot calculate reasonable derivatives (if not all pixels in a quad end up on the same face the derivatives could get "randomly" exceedingly large). However, it is actually quite easy to simply calculate the derivatives before selection/mirroring and then transform them similar to the cube coordinates (they only need selection/projection, but not mirroring as we're not interested in the sign bit, of course). While there is a tiny bit more work to do (need to calculate derivs for 3 coords instead of 2, and additional selects) it also simplifies things somewhat for the coord selection itself (as we save some broadcast aos shuffles, and we don't need to calculate the average vector) - hence if derivatives aren't needed this should actually be faster. Also, this has the benefit that this will (trivially) work for explicit derivatives too, which we completely ignored before that (will be in a separate commit for better trackability). Note that while the way for getting rho looks very different, it should result in "nearly" the same values as before (the "nearly" is only because before the code would choose the face based on an "average" vector and hence the derivatives calculated according to this face, where now (for implicit derivatives) the derivatives are projected on the face selected for the first (top-left) pixel in a quad, so not necessarily the same face). The transformation done might not quite be state-of-the-art, calculating length(dx,dy) as max(dx,dy) certainly isn't either but this stays the same as before (that is I think a better transform would _somehow_ take the "derivative major axis" into account so that derivative changes in the major axis wouldn't get ignored). 
Should solve some accuracy problems with cubemaps (can easily be seen with the cubemap demo when switching wrapping/filtering), though we still don't do seamless filtering to fix it completely (so not per-sample but per-pixel is certainly better than per-quad and already sufficient for accurate results with nearest tex filter). As for performance, it seems to be a tiny bit faster too (maybe 3% or so with cubemap demo). Which I'd have expected with nearest/nearest filtering where this will be less instructions, but the difference seems to actually be larger with linear/linear_mipmap_linear where it is slightly more instructions, probably the code appears less serialized allowing better scheduling (on a sandy bridge cpu). It actually seems to be now at least as fast as the old path using a conditional when using 128bit vectors too (that is probably more a result of testing with a newer cpu though), for now that old path is still there but unused. No piglit regressions. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: minor rho calculation optimization for 1 or 3 coords | Roland Scheidegger | 2013-04-04 | 2 | -29/+22
| | | | | | | Using a different packing for the single coord case should save a shuffle. Plus some minor style fixes. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: use f16c hw support for float->half and half->float conversion | Roland Scheidegger | 2013-04-04 | 4 | -4/+53
| | | | | | | | Should be way faster of course on cpus supporting this (includes AMD Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)). Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge. Reviewed-by: Brian Paul <[email protected]>
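For reference, the half->float direction that F16C performs in hardware can be written as a scalar software routine. This is a sketch for illustration only, assuming IEEE 754 binary16 input and a 32-bit IEEE `float`. It is not the code gallivm emits, and the function name `half_to_float` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Converts an IEEE 754 half-precision bit pattern to a float. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exponent = (h >> 10) & 0x1fu;
    uint32_t mantissa = h & 0x3ffu;
    uint32_t bits;
    float result;

    if (exponent == 0) {
        if (mantissa == 0) {
            bits = sign;                       /* signed zero */
        } else {
            /* Subnormal half: shift until the implicit leading one appears,
             * adjusting the float exponent for each shift. */
            exponent = 127 - 15 + 1;
            while ((mantissa & 0x400u) == 0) {
                mantissa <<= 1;
                exponent--;
            }
            mantissa &= 0x3ffu;
            bits = sign | (exponent << 23) | (mantissa << 13);
        }
    } else if (exponent == 31) {
        bits = sign | 0x7f800000u | (mantissa << 13);   /* Inf or NaN */
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exponent + 112) << 23) | (mantissa << 13);
    }

    memcpy(&result, &bits, sizeof result);
    return result;
}
```

The branches and the normalization loop show why a single hardware instruction handling several values at once is so much faster than doing this per element.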
* draw/llvmpipe: allow independent so attachments to the vs | Zack Rusin | 2013-04-03 | 6 | -23/+43
| | | | | | | | | | | | | | When geometry shaders are present, one needs to be able to create an empty geometry shader with stream output that needs to be resolved later and attached to the currently bound vertex shader. Let's add support for it to llvmpipe and draw. draw allows attaching independent stream output info to any vertex shader and llvmpipe resolves at draw time which vertex shader the given empty geometry shader should be linked to. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* llvmpipe: reset so buffers when not appending | Zack Rusin | 2013-04-03 | 1 | -0/+6
| | | | | | | | | We need to reset the internal state of the so buffers or we'll keep appending even though we're not supposed to. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw: remove unused function | Zack Rusin | 2013-04-03 | 2 | -12/+0
| | | | | | | | we use draw_set_mapped_so_targets nowadays Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/llvm: use an enum instead of magic numbers | Zack Rusin | 2013-04-03 | 2 | -10/+15
| | | | | | | | | | | I think this was there before and got accidentally removed during a merge. Same code as for the GS context, which is also using an enum instead of hardcoded numbers. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/gs: cleanup some debugging code | Zack Rusin | 2013-04-03 | 1 | -4/+0
| | | | | | Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/so: maintain an exact number of written vertices | Zack Rusin | 2013-04-03 | 3 | -7/+33
| | | | | | | | | It's quite helpful during the rendering when we know exactly the count of the vertices available in the buffer. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>