path: root/src/gallium
Commit message | Author | Age | Files | Lines
* svga: minor clean-ups in emit_hw_vs_vdecl() | Brian Paul | 2013-08-21 | 1 | -6/+6
* gallivm: unify sin and cos implementation | Roland Scheidegger | 2013-08-21 | 2 | -255/+53
    The (complicated!) math is all identical; there are just minimal differences in how the sign bit is calculated, plus there's an additional subtraction for the argument going into the polynomial for cos. The logic stays 100% the same, with one small exception: the sign bit calculation for sin is minimally simplified, applying the sign mask after xoring the arguments instead of applying it to each argument.
    Reviewed-by: Jose Fonseca <[email protected]>
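The sign-mask simplification mentioned above rests on a bitwise identity: masking after the xor gives the same result as masking each operand before it. A minimal sketch in C, assuming 32-bit float bit patterns; the function names are hypothetical and this is not the gallivm code itself:

```c
#include <stdint.h>

#define SIGN_MASK 0x80000000u  /* sign bit of an IEEE-754 binary32 value */

/* Mask each operand first, then combine (two ANDs, one XOR). */
uint32_t sign_mask_each(uint32_t a, uint32_t b)
{
   return (a & SIGN_MASK) ^ (b & SIGN_MASK);
}

/* Combine first, then mask once (one XOR, one AND); same result. */
uint32_t sign_mask_after_xor(uint32_t a, uint32_t b)
{
   return (a ^ b) & SIGN_MASK;
}
```

Both functions return the same value for any pair of inputs, because AND distributes over XOR.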
* gallivm: add comment for bogus min/mag filter selection with nearest mip filter | Roland Scheidegger | 2013-08-21 | 3 | -2/+10
    Detected this while hunting some other bug; not sure if it really needs fixing, but it is definitely wrong.
    Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix rho calculation for 1d case | Roland Scheidegger | 2013-08-21 | 1 | -1/+1
    Was using the wrong (undefined) vector element (the elements are at positions 0/2, not 0/1).
    Reviewed-by: Jose Fonseca <[email protected]>
* vdpau/decode: Fix comment. | Rico Schüller | 2013-08-21 | 1 | -1/+1
    Reviewed-by: Christian König <[email protected]>
* vl/query: Only support VDP_CHROMA_TYPE_420 for 12 bit formats. | Rico Schüller | 2013-08-21 | 1 | -1/+6
    Reviewed-by: Christian König <[email protected]>
* util: add avx2 and xop detection to cpu detection code | Roland Scheidegger | 2013-08-20 | 3 | -2/+59
    Going to need this soon (not going to bother with avx2 intrinsics at this time, but don't want to do workarounds for true vector shifts if llvm itself can use them just fine and won't need the gazillion-instruction emulation).
    Not really tested, other than that my cpu returns 0 for these features... (I have no idea whether llvm would actually emit avx2/xop instructions either...)
    Reviewed-by: Jose Fonseca <[email protected]>
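For reference, the two feature flags live in well-known CPUID bits: AVX2 is EBX bit 5 of leaf 7 (subleaf 0), and XOP is ECX bit 11 of extended leaf 0x80000001. The sketch below only decodes register values that the caller has already read; the function names are hypothetical and this is not the code from the commit. A real detector would also need to confirm that the OS saves the YMM state before reporting AVX2 as usable.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7, ECX=0):EBX bit 5 reports AVX2. */
bool cpuid_reports_avx2(uint32_t leaf7_ebx)
{
   return ((leaf7_ebx >> 5) & 1u) != 0;
}

/* CPUID.(EAX=0x80000001):ECX bit 11 reports AMD XOP. */
bool cpuid_reports_xop(uint32_t ext_leaf1_ecx)
{
   return ((ext_leaf1_ecx >> 11) & 1u) != 0;
}
```

Keeping the bit tests separate from the CPUID instruction itself makes them easy to check on any host, including one that, like the author's machine, reports 0 for both features.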
* gallivm: fix bogus aos path detection | Roland Scheidegger | 2013-08-20 | 1 | -5/+11
    Need to check the wrap mode of the actually used coords, not a fixed 2. While checking more than necessary would only potentially disable aos and not cause any harm, I'm pretty sure for 3d textures it could have caused assertion failures (if the s,t coords have a simple filter and r does not).
    Reviewed-by: Jose Fonseca <[email protected]>
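The fix described above amounts to looping over as many coordinates as the texture target actually uses. A minimal sketch, assuming a hypothetical `wrap_mode` enum and a made-up rule for which modes the fast path supports (the real set of supported modes is not stated in the commit message):

```c
#include <stdbool.h>

enum wrap_mode {
   WRAP_REPEAT,
   WRAP_CLAMP_TO_EDGE,
   WRAP_CLAMP_TO_BORDER
};

/* Assumption for illustration only: the fast path handles these two modes. */
static bool wrap_supported(enum wrap_mode mode)
{
   return mode == WRAP_REPEAT || mode == WRAP_CLAMP_TO_EDGE;
}

/* Check the wrap modes of the s, t, r coords that are really in use:
 * dims is 1 for 1D, 2 for 2D, 3 for 3D textures. */
bool fast_path_allowed(const enum wrap_mode wrap[3], unsigned dims)
{
   for (unsigned i = 0; i < dims; i++) {
      if (!wrap_supported(wrap[i]))
         return false;
   }
   return true;
}
```

With a fixed count of 2, a 1D texture could be rejected because of an unused t mode, and a 3D texture could be accepted despite an unsupported r mode.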
* gallivm: do clamping of border color correctly for all formats | Roland Scheidegger | 2013-08-20 | 2 | -46/+256
    Turns out it is actually very complicated to figure out what a format really is wrt range, as using channel information for determining unorm/snorm etc. doesn't work for a bunch of cases, namely compressed, subsampled, other.
    Also, while here, add clamping for uint/sint as well. d3d10 doesn't actually need this (it can only use ld with these formats, hence no border), and we could do this outside the shader for GL easily (due to the fixed texture/sampler relation), but do it here too just so I can forget about it.
    v2: move border color clamping out of fetch texel. Also change it to clamp the whole border vector at once (and use a vectorized load of the border color), which saves a couple of instructions. This needs some different handling of mixed signed/unsigned formats, so skip the per-channel stuff and just derive this from the first channel, except for special formats.
    Reviewed-by: Jose Fonseca <[email protected]>
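As a rough illustration of clamping the whole border vector by format range, the sketch below picks one range for all four channels and clamps in a single pass. The `range_kind` enum and function name are hypothetical, and the integer and special-format cases from the commit are left out:

```c
enum range_kind {
   RANGE_UNORM,   /* values limited to [0, 1] */
   RANGE_SNORM,   /* values limited to [-1, 1] */
   RANGE_FLOAT    /* no clamping needed */
};

/* Clamp all four border color channels to the range of the format. */
void clamp_border_color(float color[4], enum range_kind kind)
{
   float lo, hi;

   switch (kind) {
   case RANGE_UNORM: lo =  0.0f; hi = 1.0f; break;
   case RANGE_SNORM: lo = -1.0f; hi = 1.0f; break;
   default:          return;   /* float formats pass through unchanged */
   }

   for (int i = 0; i < 4; i++) {
      if (color[i] < lo)
         color[i] = lo;
      else if (color[i] > hi)
         color[i] = hi;
   }
}
```

The hard part the commit talks about, deciding which range a given format really has, is exactly what this sketch takes as an input.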
* gallivm: implement better control of per-quad/per-element/scalar lod | Roland Scheidegger | 2013-08-20 | 8 | -55/+149
    There's a new debug value used to disable per-quad lod optimizations in the fragment shader (ignored for vs/gs, as the results are typically just too wrong).
    Also try to detect if a supplied lod value is really a scalar (if it's coming from the immediate or constant file), in which case the sampler code can use this to stay on the per-quad-lod path (in fact, for explicit lod it could simplify even further and use the same lod for both quads in the avx case, but this is not implemented yet).
    Still need to actually implement per-element lod bias (and derivatives), and need to handle per-element lod in size queries.
    v2: fix comments, prettify.
    Reviewed-by: Jose Fonseca <[email protected]>
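The scalar-lod detection boils down to asking which register file the lod operand comes from. A minimal sketch with a hypothetical `reg_file` enum (not the actual TGSI definitions):

```c
#include <stdbool.h>

/* Hypothetical register files, for illustration only. */
enum reg_file {
   FILE_IMMEDIATE,
   FILE_CONSTANT,
   FILE_TEMPORARY,
   FILE_INPUT
};

/* An lod read from an immediate or a constant is identical for every
 * element, so the sampler can stay on the per-quad lod path. */
bool lod_is_scalar(enum reg_file file)
{
   return file == FILE_IMMEDIATE || file == FILE_CONSTANT;
}
```

Values from temporaries or inputs may differ per element, so they are treated as non-scalar here.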
* build: fix out-of-tree builds in gallium/auxiliary | Ross Burton | 2013-08-20 | 1 | -0/+4
    The rules were writing files to e.g. util/u_indices_gen.py, but in an out-of-tree build this directory doesn't exist in the build directory. So, create the directories just in case.
    Cc: [email protected]
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Ross Burton <[email protected]>
* radeonsi: Always pre-load separate VGPRs for centroid vs. center interpolation | Michel Dänzer | 2013-08-20 | 1 | -1/+2
    The LLVM R600 backend currently always uses separate VGPRs for these.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68162
    (Centroid interpolation is identical to center interpolation without multisampling, so the shader hardware was only pre-loading one set of interpolation coefficients, and the pixel shader code was using uninitialized values as the centroid interpolation coefficients)
    Cc: [email protected]
    Tested-by: Laurent Carlier <[email protected]>
* radeonsi: Fix SPI_BARYC_CNTL register initialization | Michel Dänzer | 2013-08-20 | 1 | -22/+3
    The centroid / center interpolation related bits have different meanings as of SI.
    Fixes 7 centroid interpolation related piglit tests.
* gallium/osmesa: add same checks to OSMesaMakeCurrent as the other osmesa | Maarten Lankhorst | 2013-08-20 | 1 | -2/+3
    Fixes an OpenGL crash in Wine.
    Cc: "9.2" <[email protected]>
    Signed-off-by: Maarten Lankhorst <[email protected]>
* gallium/osmesa: link against static libglapi library too to get the gl exports | Maarten Lankhorst | 2013-08-20 | 1 | -3/+2
    This should fix missing symbols in an osmesa build against shared glapi. All opengl exports that are defined in the static glapi were missing, so link against both to fix this.
    I could swear I've done this before, maybe there was a glitch in the matrix.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824
    Cc: "9.2" <[email protected]>
    Signed-off-by: Maarten Lankhorst <[email protected]>
* ilo: add ILO_DEBUG=flush | Chia-I Wu | 2013-08-20 | 10 | -12/+29
    When specified, ilo will print a line similar to
        cp flushed for render with 949+888 DWords (22.4%) because of frame end
    for every ilo_cp_flush() call.
* ilo: add ILO_DEBUG=draw | Chia-I Wu | 2013-08-20 | 5 | -2/+83
    It can print out pipe_draw_info and the dirty bits set, useful for debugging.
* r600g/sb: Move memsets of member structs to within constructor bodies. | Vinson Lee | 2013-08-19 | 2 | -6/+3
    Silences "Uninitialized pointer field" defects reported by Coverity.
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Vadim Girlin <[email protected]>
* vl/buffers: consistent use on VL_MAX_SURFACES | Emil Velikov | 2013-08-19 | 1 | -3/+3
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
* st/vdpau: drop unnecessary variable prof | Emil Velikov | 2013-08-19 | 2 | -6/+8
    Any decent compiler will do this for us, although doing this will make grepping through the code a lot easier.
    v2: In both mixer and query interface
    v3: rebase
    Reviewed-by: Christian König <[email protected]> [v1]
    Signed-off-by: Emil Velikov <[email protected]>
* vl/idct: cleanup all idct buffers | Emil Velikov | 2013-08-19 | 1 | -1/+1
    Code should loop through and clean up the three (VL_NUM_COMPONENTS) idct buffers, rather than doing the first one three times.
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
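The bug pattern here is a loop whose body indexes a fixed element instead of the loop counter. A minimal sketch of the corrected shape, with hypothetical type and function names standing in for the real idct buffer code:

```c
#define NUM_COMPONENTS 3   /* stand-in for VL_NUM_COMPONENTS */

/* Hypothetical buffer type; the counter only exists to observe cleanup. */
struct idct_buffer {
   int cleaned;
};

static void cleanup_buffer(struct idct_buffer *buf)
{
   buf->cleaned++;
}

/* Clean up every buffer once. The buggy form would pass &bufs[0]
 * on each iteration and leave bufs[1] and bufs[2] untouched. */
void cleanup_all(struct idct_buffer bufs[NUM_COMPONENTS])
{
   for (unsigned i = 0; i < NUM_COMPONENTS; i++)
      cleanup_buffer(&bufs[i]);
}
```

After `cleanup_all`, each of the three buffers has been visited exactly once.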
* vl/buffer: add sanity check after CALLOC_STRUCT | Emil Velikov | 2013-08-19 | 1 | -0/+2
    Check if we have successfully allocated memory.
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
* st/xvmc: exit gracefully if we fail to create video buffer | Emil Velikov | 2013-08-19 | 1 | -0/+4
    Free any allocated memory and return BadAlloc if create_video_buffer() has failed to create a buffer.
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
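The error path described above follows a common C pattern: check each allocation, release what was already allocated, and report an allocation error. The sketch below uses hypothetical names and a stub in place of create_video_buffer(); the status code stands in for X11's BadAlloc:

```c
#include <stdlib.h>

enum status { STATUS_OK = 0, STATUS_BAD_ALLOC = 1 };

struct surface {
   void *video_buffer;
};

/* Stub for the real buffer creation; 'fail' forces the error path. */
static void *create_video_buffer_stub(int fail)
{
   return fail ? NULL : malloc(16);
}

enum status surface_create(int fail, struct surface **out)
{
   struct surface *surf;

   *out = NULL;

   surf = calloc(1, sizeof(*surf));
   if (!surf)                    /* sanity check after allocation */
      return STATUS_BAD_ALLOC;

   surf->video_buffer = create_video_buffer_stub(fail);
   if (!surf->video_buffer) {
      free(surf);                /* do not leak the partially built object */
      return STATUS_BAD_ALLOC;
   }

   *out = surf;
   return STATUS_OK;
}

void surface_destroy(struct surface *surf)
{
   if (surf) {
      free(surf->video_buffer);
      free(surf);
   }
}
```

On failure the caller sees an error code and a NULL output pointer, so there is nothing left for it to clean up.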
* st/vdpau: don't try to create video buffer when the format is FORMAT_NONE | Emil Velikov | 2013-08-19 | 1 | -1/+4
    Not seen in the wild yet, but seems like a reasonable thing to do.
    [suggested by Christian]
    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* vdpau/vl 422 chroma width/height mix up | Andy Furniss | 2013-08-19 | 3 | -4/+4
    I was looking into some minor 422 issues/discrepancies I noticed long ago using vdpau on my rv790.
    I noticed that there is code that is halving height rather than width; 422 is full height AFAIK.
    Making the changes below doesn't actually make any noticeable difference to what I was looking into.
    Maybe there are more, but here are three I've found so far.
    Reviewed-by: Christian König <[email protected]>
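For reference, 4:2:2 subsampling halves the chroma plane horizontally only, while 4:2:0 halves it in both directions. A small sketch with hypothetical names, ignoring rounding for odd sizes:

```c
enum chroma_format { CHROMA_420, CHROMA_422, CHROMA_444 };

/* Compute the chroma plane size for a given luma size. */
void chroma_size(enum chroma_format fmt, unsigned width, unsigned height,
                 unsigned *chroma_width, unsigned *chroma_height)
{
   *chroma_width = width;
   *chroma_height = height;

   switch (fmt) {
   case CHROMA_420:
      *chroma_width = width / 2;
      *chroma_height = height / 2;
      break;
   case CHROMA_422:
      *chroma_width = width / 2;   /* half width, full height */
      break;
   case CHROMA_444:
      break;                       /* no subsampling */
   }
}
```

For a 1920x1080 frame this gives a 960x1080 chroma plane in 4:2:2, where the mixed-up code would have produced 1920x540.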
* radeonsi: Ensure fmask_format is initialized in release builds. | Vinson Lee | 2013-08-19 | 1 | -0/+1
    Fixes "Uninitialized scalar variable" defect reported by Coverity.
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* vl: add entrypoint to is_video_format_supported | Christian König | 2013-08-19 | 12 | -16/+29
    Signed-off-by: Christian König <[email protected]>
* vl: add entrypoint to get_video_param | Christian König | 2013-08-19 | 24 | -23/+58
    Signed-off-by: Christian König <[email protected]>
* vl: rename pipe_video_decoder to pipe_video_codec | Christian König | 2013-08-19 | 40 | -140/+140
    Signed-off-by: Christian König <[email protected]>
* vl: rename enum pipe_video_codec to pipe_video_format | Christian König | 2013-08-19 | 24 | -116/+116
    Signed-off-by: Christian König <[email protected]>
* vl: use a template for create_video_decoder | Christian König | 2013-08-19 | 21 | -252/+125
    Signed-off-by: Christian König <[email protected]>
* nv50: allow non-nv12 buffers to be created, just pass them through to vl | Ilia Mirkin | 2013-08-17 | 1 | -5/+1
    Since we expose non-NV12 formats as supported when there is no decoder profile selected, make sure that those formats are actually allowed to be allocated.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Tested-by: Emil Velikov <[email protected]>
    Cc: "9.2" <[email protected]>
* dri: Choose a decent global driNConfigOptions. | Eric Anholt | 2013-08-17 | 1 | -4/+1
    Previously, we were asserting that each driver specified an NConfigOptions exactly equal to the number of options they supplied, leading to frequent bugs when people would forget to adjust the value when adjusting driver options. Instead, just overallocate the table by a bit and leave sanity checking to the assert in findOption().
    Reviewed-by: Kenneth Graunke <[email protected]>
* radeonsi: fix feature support reporting | Marek Olšák | 2013-08-17 | 1 | -0/+1
    broken by 21d9a1b5ef51ce449e9a82641d0d605c5448b41c
* radeonsi: require LLVM 3.4 for MSAA | Marek Olšák | 2013-08-17 | 2 | -2/+3
* radeonsi: don't make scanout resources linear except for cursors | Marek Olšák | 2013-08-17 | 1 | -1/+1
    The surface allocator understands the scanout flag just fine.
    This seems to improve performance for Ubuntu Unity on top of st/xorg and it fixes the cursor.
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove useless code from tex_fetch_args | Marek Olšák | 2013-08-17 | 1 | -18/+0
    The array slice has already been added to "address".
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: disable unbound colorbuffers | Marek Olšák | 2013-08-17 | 1 | -2/+7
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: port texture improvements from r600g | Marek Olšák | 2013-08-17 | 8 | -268/+367
    This started as an attempt to add support for MSAA texture transfers and MSAA depth-stencil decompression for the DB->CB copy path. It has gotten a bit out of control, but it's for the greater good.
    Some changes do not make much sense on their own; they are there just to make it look like the other driver.
    With a few cosmetic modifications, r600_texture.c can be shared with a symlink.
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement texture fetching for compressed MSAA textures (v2) | Marek Olšák | 2013-08-17 | 1 | -5/+116
    v2: use resource slots 16..31 for FMASK textures
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add FMASK texture binding slots and resource setup (v2) | Marek Olšák | 2013-08-17 | 6 | -3/+67
    v2: bind FMASK textures to shader resource slots 16..31
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement FMASK decompression for MSAA texturing | Marek Olšák | 2013-08-17 | 5 | -17/+142
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: scanout buffers cannot be a destination of MSAA resolve | Marek Olšák | 2013-08-17 | 1 | -1/+3
    Resolving to scanout buffers just doesn't work.
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement MSAA colorbuffer compression for rendering | Marek Olšák | 2013-08-17 | 9 | -2/+208
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement uncompressed MSAA texturing | Marek Olšák | 2013-08-17 | 2 | -7/+13
    This is glBlitFramebuffer support for MSAA surfaces as required by GL 3.0 and texturing as required by GL 3.2 and GL_ARB_texture_multisample.
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: disable alpha-to-coverage for integer colorbuffers | Marek Olšák | 2013-08-17 | 2 | -1/+9
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement GL_SAMPLE_ALPHA_TO_ONE | Marek Olšák | 2013-08-17 | 5 | -1/+30
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement uncompressed MSAA rendering and color resolving | Marek Olšák | 2013-08-17 | 9 | -23/+423
    This is basic MSAA support which should work with most apps. Some features are missing; those will be implemented by other commits.
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add flexible shader descriptor management and use it for sampler views | Marek Olšák | 2013-08-17 | 11 | -56/+547
    It moves all sampler view descriptors to a buffer. It supports partial resource updates and it can also unbind resources (required for FMASK texturing).
    The buffer contains all sampler view descriptors for one shader stage, represented as an array. On top of that, there are N arrays in the buffer, which are used to emulate context registers as implemented by the previous ASICs (each array is a context).
    This uses the RCU synchronization approach to avoid read-after-write hazards as discussed in the thread: "radeonsi: add FMASK texture binding slots and resource setup"
    CP DMA is used to clear the descriptors at context initialization and to copy the descriptors from one context to the next.
    v2:
    - use PKT3_DMA_DATA on CIK (I'll test CIK later)
    - turn the bool CP DMA parameters into self-explanatory flags
    - add a nice simple API for packet emission to radeon_winsys.h
    - use 256 contexts, 128 causes texture corruption in openarena
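The copy-then-modify scheme described above can be pictured on the CPU as a ring of descriptor arrays: an update copies the current array into the next one, patches the changed slot there, and then switches over, so readers of the old array never see a half-written state. The sketch below is only an analogy under stated assumptions: the names and sizes are hypothetical, one word stands in for a full descriptor, and `memcpy` stands in for the CP DMA copy the driver performs on the GPU.

```c
#include <stdint.h>
#include <string.h>

#define NUM_SLOTS    4   /* descriptors per context; tiny for illustration */
#define NUM_CONTEXTS 4   /* the commit settles on 256 */

struct desc_buffer {
   uint32_t data[NUM_CONTEXTS][NUM_SLOTS];  /* N arrays, one per context */
   unsigned current;                        /* context in use right now */
};

/* Clear all descriptors at initialization. */
void desc_init(struct desc_buffer *buf)
{
   memset(buf, 0, sizeof(*buf));
}

/* Partial update: copy the current context to the next one, change a
 * single slot in the copy, then make the copy the current context. */
void desc_update(struct desc_buffer *buf, unsigned slot, uint32_t value)
{
   unsigned next = (buf->current + 1) % NUM_CONTEXTS;

   memcpy(buf->data[next], buf->data[buf->current], sizeof(buf->data[next]));
   buf->data[next][slot] = value;
   buf->current = next;
}
```

Writing 0 to a slot with `desc_update` plays the role of unbinding a resource in this model. Because the ring wraps around, too few contexts can let a context be overwritten while still in use, which may be related to the corruption the v2 notes report with 128 contexts.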
* radeonsi/compute: Let the state tracker do all the flushing | Tom Stellard | 2013-08-17 | 1 | -3/+0
    It shouldn't be necessary to call radeon_winsys::cs_flush() from radeonsi_launch_grid(), because the state tracker is responsible for flushing the pipeline at the appropriate time.
    The current behavior is also wrong, because radeonsi_launch_grid() submits packets to the compute ring, but when the state tracker calls pipe->flush() everything is submitted to the graphics ring. This has the potential to create a race condition.
    The downside of removing this flush is that the compute dispatch packets will be sent to the graphics ring rather than the compute ring. In the future we will need to come up with a way to detect 'compute' command streams and submit them to the appropriate ring.
    Signed-off-by: Marek Olšák <[email protected]>