path: root/src/gallium/drivers
Commit message (Author, Age, Files, Lines)
* gallium/util: implement layered framebuffer clear in u_blitter (Marek Olšák, 2013-12-03, 4 files, -5/+4)
| | | | | | | | | | | | | All bound layers (from first_layer to last_layer) should be cleared. This uses a vertex shader which outputs gl_Layer = gl_InstanceID, so each instance goes to a different layer. By rendering a quad and setting the instance count to the number of layers, it will trivially clear all layers. This requires AMD_vertex_shader_layer (or PIPE_CAP_TGSI_VS_LAYER), which only radeonsi supports at the moment. r600 could do this too. Standard DX11 hardware will have to use a geometry shader though, which has higher overhead.
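The layer selection described above can be modelled on the CPU. The following is only a sketch, not u_blitter code: `example_layered_fb`, `example_vs_layer` and `example_clear_layers` are hypothetical names, each layer is reduced to a single value, and it is an assumption that the layer written by the shader is relative to `first_layer`.

```c
#include <assert.h>

enum { EXAMPLE_MAX_LAYERS = 16 };

/* Hypothetical stand-in for a layered framebuffer attachment: one value per
 * layer instead of real pixel data. */
struct example_layered_fb {
   unsigned first_layer;
   unsigned last_layer;
   float layer_value[EXAMPLE_MAX_LAYERS];
};

/* Models the vertex shader output gl_Layer = gl_InstanceID. */
static unsigned example_vs_layer(unsigned instance_id)
{
   return instance_id;
}

/* Models one instanced quad draw whose instance count equals the number of
 * bound layers. Assumption: the shader's layer is relative to first_layer.
 * Returns the instance count that was used. */
static unsigned example_clear_layers(struct example_layered_fb *fb, float value)
{
   assert(fb->first_layer <= fb->last_layer);
   assert(fb->last_layer < EXAMPLE_MAX_LAYERS);

   unsigned instance_count = fb->last_layer - fb->first_layer + 1;
   for (unsigned instance = 0; instance < instance_count; instance++) {
      unsigned layer = fb->first_layer + example_vs_layer(instance);
      fb->layer_value[layer] = value;
   }
   return instance_count;
}
```

With `first_layer = 2` and `last_layer = 5` the model issues four instances and touches layers 2 through 5 only.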
* gallium: add support for AMD_vertex_shader_layer (Marek Olšák, 2013-12-03, 12 files, -0/+17)
|
* radeonsi: add driver support for layered rendering and AMD_vertex_shader_layer (Marek Olšák, 2013-12-03, 4 files, -12/+27)
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement OpenGL edge flags (Marek Olšák, 2013-12-03, 3 files, -6/+48)
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* freedreno: Add a few texture formats (Andreas Heider, 2013-12-02, 1 file, -0/+3)
|
* trace: Dump PIPE_QUERY_* enums. (José Fonseca, 2013-11-28, 5 files, -15/+62)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* trace: Dump query results faithfully. (José Fonseca, 2013-11-28, 3 files, -15/+133)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: new shader cap bit for the amount of sampler views (Roland Scheidegger, 2013-11-28, 12 files, -6/+26)
| | | | | | | | | Ever since introducing separate sampler and sampler view max this was really missing. Every driver but llvmpipe reports the same number as number of samplers for now, so nothing should break. Reviewed-by: Jose Fonseca <[email protected]>
* gallium/drivers: support more sampler views than samplers for more drivers (Roland Scheidegger, 2013-11-28, 6 files, -7/+7)
| | | | | | | | | This adds support for this to more drivers, in particular for all the "special" ones useful for debugging. HW drivers are left alone, some should be able to support it if they want but they may not be interested at this point. Reviewed-by: Jose Fonseca <[email protected]>
* radeon/compute: Unconditionally inline all functions v2 (Tom Stellard, 2013-11-25, 1 file, -2/+20)
| | | | | | | | | | | We need to do this until function calls are supported. v2: - Fix loop conditional https://bugs.freedesktop.org/show_bug.cgi?id=64225 CC: "10.0" <[email protected]>
* llvmpipe: support 8bit subpixel precision (Zack Rusin, 2013-11-25, 8 files, -148/+321)
| | | | | | | | | | | | | 8 bit precision is required by d3d10 but unfortunately requires a 64 bit rasterizer. This commit implements 64 bit rasterization with full support for 8bit subpixel precision. It's a combination of all individual commits from the llvmpipe-rast-64 branch. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
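To see why more subpixel bits push the rasterizer past 32 bits, consider the edge function, which multiplies two fixed-point coordinate differences. The sketch below is not llvmpipe code: the helper names are hypothetical and the 8192-pixel extent is an assumed framebuffer size chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_SUBPIXEL_BITS 8

/* Convert a whole-pixel coordinate to fixed point with 8 fractional bits. */
static int32_t example_to_fixed(int32_t pixels)
{
   return pixels * (1 << EXAMPLE_SUBPIXEL_BITS);
}

/* Edge function of point (px, py) against the edge (x0, y0) -> (x1, y1).
 * The operands are widened to 64 bits before multiplying, because the
 * product of two fixed-point differences can exceed the int32_t range. */
static int64_t example_edge(int32_t x0, int32_t y0, int32_t x1, int32_t y1,
                            int32_t px, int32_t py)
{
   int64_t dx = (int64_t)x1 - x0;
   int64_t dy = (int64_t)y1 - y0;
   return dx * ((int64_t)py - y0) - dy * ((int64_t)px - x0);
}
```

For an edge spanning 8192 pixels and a point 8192 pixels away, each difference is 2^21 in fixed point, so the product is 2^42, well beyond what a 32-bit integer can hold.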
* radeonsi: implement MSAA for CIK (Marek Olšák, 2013-11-23, 3 files, -11/+28)
| | | | | | There are also some changes to the printfs. Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
* radeonsi: enable 2D tiling on CIK (Marek Olšák, 2013-11-23, 1 file, -4/+0)
| | | | | | libdrm does the DRM version check and decides if 2D tiling is used. Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
* llvmpipe: (trivial) disable new accurate origin calculation (Roland Scheidegger, 2013-11-22, 1 file, -1/+1)
| | | | It looks like there's some bugs in it...
* nvc0: inform kernel about buffers that screen_create touches (Ben Skeggs, 2013-11-22, 1 file, -0/+2)
| | | | | | | Prevents a GPU page fault if somehow the uniform bo gets evicted before the screen_create pushbuf has been submitted. Signed-off-by: Ben Skeggs <[email protected]>
* radeonsi/compute: Fix LDS size calculation (Tom Stellard, 2013-11-21, 1 file, -1/+16)
| | | | | | We need to include the number of LDS bytes allocated by the state tracker. CC: "10.0" <[email protected]>
* r600g/compute: Add a work-around for flushing issues on Cayman (Tom Stellard, 2013-11-21, 3 files, -1/+17)
| | | | | | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]> https://bugs.freedesktop.org/show_bug.cgi?id=69321 CC: "10.0" <[email protected]>
* llvmpipe: calculate more accurate interpolation value at origin (Roland Scheidegger, 2013-11-21, 1 file, -6/+82)
| | | | | | | | | | | | | | | | | Some rounding errors could crop up when calculating a0. Use a more accurate method (barycentric interpolation essentially) to fix this, though to fix the REAL problem (which is that our interpolation will give very bad results with small triangles far away from the origin when they have steep gradients) this does absolutely nothing (actually makes it worse). (To fix the real problem, either would need to use a vertex corner (or some other point inside the tri) as starting point value instead of fb origin and pass that down to interpolation, or mimic what hw does, use barycentric interpolation (using the coordinates extracted from the rasterizer edge functions) - maybe another time.) Some (silly) tests though really want a high accuracy at fb origin and don't care much about anything else (Just. Don't. Ask.). Reviewed-by: Jose Fonseca <[email protected]>
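As an illustration of barycentric interpolation at the framebuffer origin, the sketch below weights the three vertex attributes by edge functions evaluated at (0, 0). It is not the llvmpipe implementation: `example_vertex` and `example_attr_at_origin` are hypothetical names, and plain floats stand in for whatever vector types the driver uses.

```c
#include <assert.h>

/* Hypothetical vertex: window position plus one scalar attribute. */
struct example_vertex {
   float x, y, attr;
};

/* Value of the linearly interpolated attribute at the origin (0, 0),
 * computed from barycentric weights rather than from a0 = a - dadx*x - dady*y. */
static float example_attr_at_origin(const struct example_vertex *v0,
                                    const struct example_vertex *v1,
                                    const struct example_vertex *v2)
{
   /* Twice the signed triangle area; also the sum of the three weights. */
   float area = (v1->x - v0->x) * (v2->y - v0->y) -
                (v1->y - v0->y) * (v2->x - v0->x);
   assert(area != 0.0f);

   /* Edge functions evaluated at (0, 0); w_i is the weight of vertex i. */
   float w0 = v1->x * v2->y - v1->y * v2->x;
   float w1 = v2->x * v0->y - v2->y * v0->x;
   float w2 = v0->x * v1->y - v0->y * v1->x;

   return (w0 * v0->attr + w1 * v1->attr + w2 * v2->attr) / area;
}
```

For the attribute a(x, y) = 2x + 3y + 5 sampled at (1, 1), (4, 1) and (1, 5), the function returns 5, the value of a at the origin.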
* svga: remove special-case code for texkil w component (Brian Paul, 2013-11-21, 1 file, -23/+6)
| | | | | | Not actually needed. Fixes piglit ARB_fragment_program/kil-swizzle test. Reviewed-by: José Fonseca <[email protected]>
* svga: improve check for 3D compressed textures (Brian Paul, 2013-11-19, 1 file, -5/+7)
| | | | | | | | | | | This is basically a respin of f1dfcf4bce35e6796f873d9a00103b280da81e4c per Jose's suggestion. Just set the SVGA3dSurfaceFormatCaps flags for 3D and cube textures when checking the texture format capabilities. This will filter out unsupported combinations like 3D+DXT. Reviewed-by: Jose Fonseca <[email protected]>
* svga: we don't supported 3D compressed textures (Brian Paul, 2013-11-18, 1 file, -0/+6)
| | | | Reviewed-by: Jakob Bornecrantz <[email protected]>
* r600g/compute: Fix handling of global buffers in r600_resource_copy_region() (Tom Stellard, 2013-11-18, 1 file, -1/+36)
| | | | | | | | | | | Global buffers do not have an associated cs_buf handle, so we can't copy them using r600_copy_buffer() https://bugs.freedesktop.org/show_bug.cgi?id=64226 Reviewed-by: Marek Olšák <[email protected]> CC: "10.0" <[email protected]>
* drivers/radeon: remove unused CXXFLAGS, LLVM_CPP_FILES (Emil Velikov, 2013-11-18, 1 file, -4/+0)
| | | | | | | | | | | | | | | | | | The above two variables are unused as of commit 024fe6852a76f33d7e2afc5621340e387c381bb0 (Author: Tom Stellard <[email protected]>, Date: Tue Apr 2 10:42:50 2013 -0700, "radeon/llvm: Use LLVM C API for compiling LLVM IR to ISA v2"), which removed the only cpp file from drivers/radeon but did not remove the CXXFLAGS. The subsequent commit reintroduced an empty LLVM_CPP_FILES. Let's clean up and remove both. Signed-off-by: Emil Velikov <[email protected]>
* r600/sb: Fix broken assert (Chris Forbes, 2013-11-17, 1 file, -1/+1)
| | | | | | | This would never fire. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* r600g/sb: work around hw issues with stack on eg/cm (Vadim Girlin, 2013-11-17, 5 files, -44/+123)
| | | | | | | | v2: make it actually work, improve condition Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68503 Cc: "10.0" <[email protected]> Signed-off-by: Vadim Girlin <[email protected]>
* gallium/drivers: compact compiler flags into Automake.inc (Emil Velikov, 2013-11-16, 16 files, -109/+62)
| | | | | | | | | | * minimise flags duplication * distinguish between VISIBILITY C and CXX flags * set only required flags - C and/or CXX v2: add LLVM_CFLAGS back to AM_CFLAGS (add missing backslash) Signed-off-by: Emil Velikov <[email protected]>
* gallium/drivers: enable automake subdir-objects (Emil Velikov, 2013-11-16, 6 files, -0/+12)
| | | | Signed-off-by: Emil Velikov <[email protected]>
* r300: move the final sources list to Makefile.sources (Emil Velikov, 2013-11-16, 2 files, -12/+15)
| | | | | Reviewed-by: Tom Stellard <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* r300: add symlink to ralloc.c and register_allocate.c (Emil Velikov, 2013-11-16, 3 files, -3/+5)
| | | | | | | | Make automake's subdir-objects work. Update includes. Reviewed-by: Tom Stellard <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* freedreno: compact a2xx and a3xx makefiles into parent ones (Johannes Obermayr, 2013-11-16, 6 files, -66/+36)
| | | | | | | | | Nearly everything within the three Makefile.am's is identical. Let's simplify things a little. v2: Rebase and rewrite the commit message (Emil Velikov) Signed-off-by: Emil Velikov <[email protected]>
* radeon/llvm: Free elf_buffer after use (Aaron Watry, 2013-11-15, 1 file, -0/+1)
| | | | | | | | Prevents a memory leak. v2: Remove null check CC: "10.0" <[email protected]>
* r600/llvm: Free binary.code/binary.config in r600_llvm_compile (Aaron Watry, 2013-11-15, 1 file, -0/+3)
| | | | | | | | | | | radeon_llvm_compile allocates memory for binary.code, binary.config, or neither depending on what's being done. We need to make sure to free that memory after it's no longer needed. v2: Don't bother checking for null before FREE() CC: "10.0" <[email protected]>
* r600/llvm: initialize radeon_llvm_binary (Aaron Watry, 2013-11-15, 1 file, -0/+1)
| | | | | | | | | | | | | use memset to initialize to 0's... otherwise code_size and config_size could be uninitialized when read later in this method. It's also hard to do NULL checks on uninitialized pointers. Reviewed-by: Tom Stellard <[email protected]> v2: Fix indentation CC: "10.0" <[email protected]>
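The pattern behind this commit and the preceding one (zero the struct up front, then free unconditionally) can be sketched as follows. `example_binary` is a hypothetical stand-in whose field names follow the commit messages; the real radeon_llvm_binary layout is not shown in this log, and plain `malloc`/`free` are used in place of Mesa's allocation macros.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the binary descriptor named in the commits. */
struct example_binary {
   unsigned char *code;
   unsigned code_size;
   unsigned char *config;
   unsigned config_size;
};

/* Zero every field so sizes read as 0 and pointers compare equal to NULL
 * (true on mainstream platforms, where NULL is all-bits-zero). */
static void example_binary_init(struct example_binary *binary)
{
   memset(binary, 0, sizeof(*binary));
}

/* Hypothetical compile step that allocates code but leaves config alone,
 * mirroring "code, config, or neither". Returns 0 on success. */
static int example_compile(struct example_binary *binary, unsigned code_size)
{
   binary->code = malloc(code_size);
   if (!binary->code)
      return -1;
   binary->code_size = code_size;
   return 0;
}

/* free(NULL) is a no-op, so no null checks are needed once the struct has
 * been zero-initialized. */
static void example_binary_release(struct example_binary *binary)
{
   free(binary->code);
   free(binary->config);
   example_binary_init(binary);
}
```

Without the initial `memset`, `config` would hold an indeterminate pointer after `example_compile` and releasing it would be undefined behavior.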
* svga: remove unused vars in svga_hwtnl_simple_draw_range_elements() (Brian Paul, 2013-11-15, 1 file, -12/+2)
| | | | | | And simplify the code. Reviewed-by: Jose Fonseca <[email protected]>
* svga: print warning for unsupported indirect dest reg indexing (Brian Paul, 2013-11-15, 1 file, -0/+4)
| | | | | | | | | For DX9-level shaders, there's only limited support for indirect indexing of registers (with the loop counter register, not the general address register.) Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* svga: mark dest image as defined in svga_surface_copy() (Brian Paul, 2013-11-15, 1 file, -0/+2)
| | | | | | | | | | | | After we blit/copy to a dest texture image we need to mark it as being defined. This fixes broken mipmap generation for quite a few texture formats. Mipgen involves making texture views and svga_texture_view_surface() skips texture images that are undefined. Cc: "10.0" <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* svga: do primitive trimming in translate_indices() (Brian Paul, 2013-11-15, 1 file, -3/+12)
| | | | | | | | | | | | | | The index translation code expects the number of indexes to be consistent with the primitive type (ex: a multiple of 3 for PIPE_PRIM_TRIANGLES). If it's not, we can write out of bounds in the destination buffer. Fixes failed assertions in the pipebuffer debug code found with Piglit primitive-restart-draw-mode test. Cc: "10.0" <[email protected]> Reviewed-by: José Fonseca <[email protected]>
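Trimming an index count to what the primitive type can actually consume might look like the sketch below. The enum and `example_trim_count` are hypothetical and cover only a few primitive types; this is not the svga code, which works with the PIPE_PRIM_* values.

```c
#include <assert.h>

/* Hypothetical subset of primitive types, for illustration only. */
enum example_prim {
   EXAMPLE_PRIM_POINTS,
   EXAMPLE_PRIM_LINES,
   EXAMPLE_PRIM_LINE_STRIP,
   EXAMPLE_PRIM_TRIANGLES,
   EXAMPLE_PRIM_TRIANGLE_STRIP
};

/* Return the largest index count <= count that forms only complete
 * primitives of the given type. */
static unsigned example_trim_count(enum example_prim prim, unsigned count)
{
   switch (prim) {
   case EXAMPLE_PRIM_POINTS:
      return count;
   case EXAMPLE_PRIM_LINES:
      return count - count % 2;      /* two indices per line */
   case EXAMPLE_PRIM_LINE_STRIP:
      return count >= 2 ? count : 0; /* needs at least one full line */
   case EXAMPLE_PRIM_TRIANGLES:
      return count - count % 3;      /* three indices per triangle */
   case EXAMPLE_PRIM_TRIANGLE_STRIP:
      return count >= 3 ? count : 0; /* needs at least one full triangle */
   }
   return 0;
}
```

Translating only the trimmed count keeps the writer within a destination buffer that was sized for complete primitives.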
* radeonsi/compute: Dispose of LLVM module after compiling kernels (Aaron Watry, 2013-11-15, 1 file, -0/+1)
| | | | | | | | v2: Fix indentation Reviewed-by: Tom Stellard <[email protected]> CC: "10.0" <[email protected]>
* radeonsi/compute: Free program and program.kernels on shutdown (Aaron Watry, 2013-11-15, 1 file, -1/+15)
| | | | | | | | v2: Fix indentation Reviewed-by: Tom Stellard <[email protected]> CC: "10.0" <[email protected]>
* radeon/llvm: Free created llvm memory buffer (Aaron Watry, 2013-11-15, 1 file, -0/+1)
| | | | | | | | v2: Fix indentation Reviewed-by: Tom Stellard <[email protected]> CC: "10.0" <[email protected]>
* radeon/llvm: Free libelf resources (Aaron Watry, 2013-11-15, 1 file, -0/+3)
| | | | | | | | v2: Fix indentation Reviewed-by: Tom Stellard <[email protected]> CC: "10.0" <[email protected]>
* radeon/llvm: fix spelling error (Aaron Watry, 2013-11-15, 1 file, -1/+1)
| | | | | | Reviewed-by: Tom Stellard <[email protected]> CC: "10.0" <[email protected]>
* trace: Dump user_buffer members. (José Fonseca, 2013-11-15, 1 file, -0/+2)
|
* radeonsi: add support for Hawaii asics (v2) (Alex Deucher, 2013-11-15, 3 files, -0/+15)
| | | | | | | Update additional register fields. Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* llvmpipe: (trivial) fix more fallout from the setup cleanup. (Roland Scheidegger, 2013-11-14, 1 file, -2/+4)
| | | | Oops... Should have done some more testing.
* llvmpipe: (trivial) fix misplaced bld context assignment. (Roland Scheidegger, 2013-11-14, 1 file, -2/+1)
| | | | Should fix polygon offset crashes...
* softpipe: (trivial) fix debug code (Roland Scheidegger, 2013-11-14, 1 file, -15/+10)
| | | | | | The debug printfs wouldn't actually compile when enabled, so kill them off and insert a new one in another place, and make sure it keeps compiling by enclosing it in an if-0 clause.
* llvmpipe: clean up state setup code a bit (Roland Scheidegger, 2013-11-14, 1 file, -115/+59)
| | | | | | | In particular get rid of home-grown vector helpers which didn't add much. And while here fix formatting a bit. No functional change. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm,llvmpipe: fix float->srgb conversion to handle NaNs (Roland Scheidegger, 2013-11-14, 1 file, -2/+2)
| | | | | | | | | | | | | | | | | | | | | | | | d3d10 requires us to convert NaNs to zero for any float->int conversion. We don't really do that but mostly seems to work. In particular I suspect the very common float->unorm8 path only really passes because it relies on sse2 pack intrinsics which just happen to work by luck for NaNs (float->int conversion in hw gives integer indeterminate value, which just happens to be -0x80000000 hence gets converted to zero in the end after pack intrinsics). However, float->srgb didn't get so lucky, because we need to clamp before blending and clamping resulted in NaN behavior being undefined (and actually got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp with defined nan behavior as we can handle the NaN for free this way. I suspect there's more bugs lurking in this area (e.g. converting floats to snorm) as we don't really use defined NaN behavior everywhere but this seems to be good enough. While here respecify nan behavior modes a bit, in particular the return_second mode didn't really do what we wanted. From the caller's perspective, we really wanted to say we need the non-nan result, but we already know the second arg isn't a NaN. So we use this now instead, which means that cpu architectures which actually implement min/max by always returning non-nan (that is adhering to ieee754-2008 rules) don't need to bend over backwards for nothing. Reviewed-by: Jose Fonseca <[email protected]>
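The NaN sensitivity of a min/max clamp can be reproduced in scalar C. The sketch below models SSE-style min/max, which return the second operand when either input is NaN; it is not gallivm code, the function names are hypothetical, and the operand orders are chosen purely to show both outcomes.

```c
#include <assert.h>
#include <math.h> /* NAN, for callers that want to feed a NaN in */

/* SSE-style max/min modelled in C: any comparison with NaN is false, so the
 * second operand is returned whenever either input is NaN. */
static float example_sse_max(float a, float b)
{
   return a > b ? a : b;
}

static float example_sse_min(float a, float b)
{
   return a < b ? a : b;
}

/* NaN is the second operand of max, so it survives the max and the
 * following min then returns its own second operand, 1.0. */
static float example_clamp_nan_to_one(float x)
{
   return example_sse_min(example_sse_max(0.0f, x), 1.0f);
}

/* NaN is the first operand of max, so max returns 0.0 and the clamp yields
 * 0.0, the result d3d10 asks for. */
static float example_clamp_nan_to_zero(float x)
{
   return example_sse_min(example_sse_max(x, 0.0f), 1.0f);
}
```

Both variants agree for ordinary inputs; they differ only for NaN, which is why the clamp needs a defined NaN behavior instead of an unspecified one.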
* nvc0: release 3d bufctx after drawing (Ben Skeggs, 2013-11-13, 1 file, -0/+3)
| | | | Signed-off-by: Ben Skeggs <[email protected]>