Commit message  Author  Age  Files  Lines
* gallivm,llvmpipe: fix float->srgb conversion to handle NaNsRoland Scheidegger2013-11-145-28/+45
| | | | | | | | | | | | | | | | | | | | | | | | d3d10 requires us to convert NaNs to zero for any float->int conversion. We don't really do that but mostly seems to work. In particular I suspect the very common float->unorm8 path only really passes because it relies on sse2 pack intrinsics which just happen to work by luck for NaNs (float->int conversion in hw gives integer indeterminate value, which just happens to be -0x80000000 hence gets converted to zero in the end after pack intrinsics). However, float->srgb didn't get so lucky, because we need to clamp before blending and clamping resulted in NaN behavior being undefined (and actually got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp with defined nan behavior as we can handle the NaN for free this way. I suspect there's more bugs lurking in this area (e.g. converting floats to snorm) as we don't really use defined NaN behavior everywhere but this seems to be good enough. While here respecify nan behavior modes a bit, in particular the return_second mode didn't really do what we wanted. From the caller's perspective, we really wanted to say we need the non-nan result, but we already know the second arg isn't a NaN. So we use this now instead, which means that cpu architectures which actually implement min/max by always returning non-nan (that is adhering to ieee754-2008 rules) don't need to bend over backwards for nothing. Reviewed-by: Jose Fonseca <[email protected]>
* dri: Change value param to unsignedIan Romanick2013-11-134-4/+4
| | | | | | | | | This silences some compiler warnings in i915 and i965. See also 75982a5. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "10.0" <[email protected]>
* i965: Use drm_intel_get_aperture_sizes instead of hard-coded 2GiBIan Romanick2013-11-131-3/+7
| | | | | | | | | | Systems with little physical memory installed will report less than 2GiB, and some systems may (hypothetically?) have a larger address space for the GPU. My IVB still reports 1534. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: "10.0" <[email protected]>
* i915: Use drm_intel_get_aperture_sizes instead of drmAgpSizeIan Romanick2013-11-131-2/+6
| | | | | | | | Send the zombie back to the grave before it infects the townsfolk. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: "10.0" <[email protected]>
* i965: implement blit path for PBO glDrawPixelsAlexander Monakov2013-11-132-0/+120
| | | | | | | | | | | | This patch implements an accelerated path for glDrawPixels from a PBO in i965. The code follows what intel_pixel_read, intel_pixel_copy, intel_pixel_bitmap and intel_tex_image are doing. Piglit quick.tests show no regressions. In my testing on IVB, performance improvement is huge (about 30x, didn't measure exactly) since the generic path goes via _mesa_unpack_color_span_float, memcpy, extract_float_rgba. Signed-off-by: Alexander Monakov <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* docs: fill in md5 checksums for 9.2.3 releaseBrian Paul2013-11-131-1/+3
|
* docs: fix 9.2.2 -> 9.2.3 typosBrian Paul2013-11-132-2/+2
|
* haiku: add swrast driverAlexander von Gluck IV2013-11-135-0/+873
| | | | | | | | * This is pretty small and upkeep should be minimal. * Currently fully working. * Candidate for 10.0.0 branch Acked-by: Brian Paul <[email protected]>
* docs: Import 9.2.3 release notes, add news item.Carl Worth2013-11-133-0/+120
|
* dri: Remove redundant createNewContext function from __DRIimageDriverExtensionKristian Høgsberg2013-11-123-54/+15
| | | | | | | | | | createContextAttribs is a superset of what createNewContext provides. Also remove the function typedef, since createNewContext is deprecated and no longer used in multiple interfaces. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "10.0" <[email protected]>
* wayland: Use __DRIimage based getBuffers implementation when availableKristian Høgsberg2013-11-122-47/+96
| | | | | | | | | | | This lets us allocate color buffers as __DRIimages and pass them into the driver instead of having to create a __DRIbuffer with the flink that requires. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Chad Versace <[email protected]> Cc: "10.0" <[email protected]>
* gbm: Add support for __DRIimage based getBuffers when availableKristian Høgsberg2013-11-123-10/+72
| | | | | | | | | | | | | | | | | | This lets us allocate color buffers as __DRIimages and pass them into the driver instead of having to create a __DRIbuffer with the flink that requires. With this patch, we can now run gbm on render-nodes. A render-node is a drm device that doesn't support modesetting and all the legacy DRI ioctls. flink is also not supported, but now that gbm doesn't need flink, we can run piglit on head-less gbm or head-less GPGPU. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Tested-by: Jordan Justen <[email protected]> Cc: "10.0" <[email protected]>
* dri/i915, dri/i965: Fix support for planar imagesAnder Conselvan de Oliveira2013-11-122-2/+4
| | | | | | | | | | | Planar images have format __DRI_IMAGE_FORMAT_NONE, but the patch that moved the conversion from dri_format to the mesa format made it impossible to allocate an image with that format. Signed-off-by: Ander Conselvan de Oliveira <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "10.0" <[email protected]>
* i965/fs: Try a different pre-scheduling heuristic if the first spills.Eric Anholt2013-11-125-54/+76
| | | | | | | | | | | | | | | | | | | | | | | Since LIFO fails on some shaders in one particular way, and non-LIFO systematically fails in another way on different kinds of shaders, try them both, and pick whichever one successfully register allocates first. Slightly prefer non-LIFO in case we produce extra dependencies in register allocation, since it should start out with fewer stalls than LIFO. This is madness, but I haven't come up with another way to get unigine tropics to not spill while keeping other programs from spilling and retaining the non-unigine performance wins from texture-grf. total instructions in shared programs: 1626728 -> 1626288 (-0.03%) instructions in affected programs: 1015 -> 575 (-43.35%) GAINED: 50 LOST: 0 Improves Unigine Tropics performance by 14.5257% +/- 0.241838% (n=38) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70445 Cc: "10.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
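The "try both, keep the first that allocates" idea can be sketched in C as follows. This is an illustration under assumptions, not the i965 backend code: the enum values, the `try_alloc_fn` callback type, and the two toy callbacks are all hypothetical names invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pre-scheduling modes; names are illustrative only. */
enum sched_mode { SCHED_PRE_NON_LIFO, SCHED_PRE_LIFO, SCHED_MODE_COUNT };

/* Hypothetical callback: schedule with `mode`, then attempt register
 * allocation. Returns true when allocation succeeds without spilling. */
typedef bool (*try_alloc_fn)(enum sched_mode mode, void *shader);

/* Try each heuristic in preference order (non-LIFO first). Returns the
 * first mode that allocates, or SCHED_MODE_COUNT if every mode spills. */
static enum sched_mode pick_schedule(try_alloc_fn try_alloc, void *shader)
{
    static const enum sched_mode order[] = { SCHED_PRE_NON_LIFO, SCHED_PRE_LIFO };

    for (size_t i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
        if (try_alloc(order[i], shader))
            return order[i];
    }
    return SCHED_MODE_COUNT; /* caller would fall back to spilling */
}

/* Toy stand-ins for a real allocator, used only to exercise the loop. */
static bool only_lifo_fits(enum sched_mode mode, void *shader)
{
    (void)shader;
    return mode == SCHED_PRE_LIFO;
}

static bool nothing_fits(enum sched_mode mode, void *shader)
{
    (void)mode;
    (void)shader;
    return false;
}
```

With `only_lifo_fits` the loop rejects non-LIFO and settles on LIFO; with `nothing_fits` it reports that no mode worked.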
* i965/fs: Do instruction pre-scheduling just before register allocation.Eric Anholt2013-11-121-2/+2
| | | | | | | | | | | Long ago, the HW_REG usage in assign_curb/urb_setup() were scheduling barriers, so we had to run scheduler before them in order for it to be able to do basically anything. Now that that's fixed, we can delay the scheduling until we go to allocate (which will make the next change less scary). Cc: "10.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Ignore actual latency pre-reg-alloc.Eric Anholt2013-11-121-21/+29
| | | | | | | | | | | | | | | | We care about depth-until-program-end, as a proxy for "make sure I schedule those early instructions that open up the other things that can make progress while keeping register pressure low", not actual latency (since we're relying on the post-register-alloc scheduling to actually schedule for the hardware). total instructions in shared programs: 1609931 -> 1609931 (0.00%) instructions in affected programs: 0 -> 0 GAINED: 55 LOST: 43 Cc: "10.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Fix message setup for SIMD8 spills.Eric Anholt2013-11-121-1/+1
| | | | | | | | | | | | | | In the SIMD16 spilling changes, I replaced a "1" in the spill path with "mlen", but obviously it wasn't mlen before because spills have the g0 header along with the payload. The interface I was trying to use was asking for how many physical regs we're writing, so we're looking for "1" or "2". I'm guessing this actually passed piglit because the high 8 bits of the execution mask in SIMD8 mode are all 0s. Cc: "10.0" <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Prefer things we know reduce reg pressure when pre-scheduling.Eric Anholt2013-11-121-0/+144
| | | | | | | | | | | | | | | | | | | | | | | | | Previously, the best thing we had was to schedule the things unblocked by the last chosen instruction, on the hope that it would be consuming two values at the end of their live intervals while only producing one new value. But that's just a guess, and we can do counting of usage of registers to know when an instruction would (almost surely) reduce register pressure. The only failure mode I know of in this new dominant heuristic is that inside of a loop when scheduling the iterator (for example), choosing the last use of the iterator doesn't actually reduce the live interval of the iterator. But it doesn't seem to matter in shader-db: total instructions in shared programs: 1618700 -> 1618700 (0.00%) instructions in affected programs: 0 -> 0 GAINED: 13 LOST: 0 Note: The new functions are made virtual because I expect we'll soon lift the pre-regalloc scheduling heuristic over to the vec4 backend. Cc: "10.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
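One way to picture "counting usage of registers" is the toy scoring function below. It is a sketch under stated assumptions, not the scheduler from this commit: `struct toy_inst`, `pressure_benefit`, and the `remaining_uses` / `live` arrays are hypothetical, and the real heuristic may weigh things differently.

```c
#include <stddef.h>

/* Hypothetical instruction: up to 3 source registers and one optional
 * destination. A register index of -1 means "slot unused". */
struct toy_inst {
    int dst;
    int src[3];
};

/* remaining_uses[r]: reads of register r by not-yet-scheduled instructions.
 * live[r]: nonzero if r already holds a value.
 * Returns (registers freed) - (registers newly occupied); a positive
 * result means scheduling the instruction should lower pressure. */
static int pressure_benefit(const struct toy_inst *inst,
                            const int *remaining_uses, const int *live)
{
    int benefit = 0;

    for (size_t i = 0; i < 3; i++) {
        int r = inst->src[i];
        int reads = 0;
        int seen_before = 0;

        if (r < 0)
            continue;
        for (size_t j = 0; j < 3; j++) {
            if (inst->src[j] == r) {
                reads++;
                if (j < i)
                    seen_before = 1; /* count each register once */
            }
        }
        /* This instruction holds every remaining read: r dies here. */
        if (!seen_before && remaining_uses[r] == reads)
            benefit++;
    }

    /* Writing a register that is not live yet starts a new live range. */
    if (inst->dst >= 0 && !live[inst->dst])
        benefit--;

    return benefit;
}
```

An instruction that consumes two values at their last use and defines one new value scores +1, which matches the case the message describes as the old "hope".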
* i965: Fix undefined value usage in ABO setup.Eric Anholt2013-11-121-1/+1
| | | | | | | Fixes a compiler warning. Cc: "10.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add a warning if something ever hits a bug I noticed.Eric Anholt2013-11-121-0/+4
| | | | | | We'd have to map the VBO and rewrite things to a lower stride to fix it. Reviewed-by: Matt Turner <[email protected]>
* nvc0: release 3d bufctx after drawingBen Skeggs2013-11-131-0/+3
| | | | Signed-off-by: Ben Skeggs <[email protected]>
* clover: Fix the const variant of adaptor_range::end to deal with mismatching range sizes.Francisco Jerez2013-11-121-1/+2
| | | | | | | | | | | Fixes infinite loop in find_grid_optimal_factor() in cases where the user specifies a grid size with less dimensions than the device supports. Reported-by: Tom Stellard <[email protected]> Cc: "10.0" <[email protected]>
* draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offsetRoland Scheidegger2013-11-122-19/+28
| | | | | | | | | | Since we explicitly require an integer input, we should avoid using exp2 math (even if we were using optimized versions) and manipulate the exponent directly instead, which turns the exp2 into an int sub (plus some casts). v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments Reviewed-by: Jose Fonseca <[email protected]>
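A hedged C sketch of the exponent trick: a power of two with an integer exponent can be built by writing the biased exponent field of an IEEE-754 single directly, so no exp2 call is needed. The helper names are hypothetical, the code is scalar C and not the LLVM IR the draw module emits, and it only handles normal (non-denormal) results.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build 2^e as an IEEE-754 single by filling in the exponent field.
 * Only valid for normal results, i.e. -126 <= e <= 127. */
static float pow2_from_exponent(int e)
{
    uint32_t bits;
    float f;

    assert(e >= -126 && e <= 127);
    bits = (uint32_t)(e + 127) << 23;  /* sign = 0, mantissa = 0 */
    memcpy(&f, &bits, sizeof f);       /* bit cast without aliasing UB */
    return f;
}

/* Unbiased exponent of a finite, normal float. */
static int float_exponent(float z)
{
    uint32_t bits;

    memcpy(&bits, &z, sizeof bits);
    return (int)((bits >> 23) & 0xff) - 127;
}
```

For a float depth buffer with 23 mantissa bits, a polygon offset unit of the form 2^(exponent(z) - 23) then becomes `pow2_from_exponent(float_exponent(z) - 23)`: one integer subtraction plus casts. Whether the patch computes exactly this expression is an assumption.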
* gallium: fix build on GNU/Hurd due to missing PIPE_OS_HURD detectionCyril Brulebois2013-11-121-6/+6
| | | | | | | | Thanks to Pino Toscano. Patch from Debian package. Cc: "10.0" <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* meta: enable vertex attributes in the context of the newly created array objectPetr Sebor2013-11-121-2/+3
| | | | | | | | | | | | | | | Otherwise, the function would enable generic vertex attributes 0 and 1 of the array object it does not own. This was causing crashes in Euro Truck Simulator 2, since the incorrectly enabled generic attribute 0 in the foreign context got precedence before vertex position attribute at later time, leading to NULL pointer dereference. Cc: "9.2" <[email protected]> Cc: "10.0" <[email protected]> Signed-off-by: Petr Sebor <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: 80-column wrapping, remove trailing whitespace in arrayobj.cBrian Paul2013-11-121-13/+16
|
* mesa: add comment for struct gl_vertex_buffer_bindingBrian Paul2013-11-121-0/+6
|
* mesa: call update_array_format() after error checkingBrian Paul2013-11-121-5/+5
| | | | | | | | We try to do all error checking before changing any GL state. Cc: "10.0" <[email protected]> Jordan Justen <[email protected]>
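The "all error checking before changing any GL state" rule can be illustrated with a small C sketch. The struct and function are hypothetical simplifications, not Mesa's real vertex array code.

```c
#include <stdbool.h>

/* Hypothetical, simplified attribute state; not Mesa's real structures. */
struct toy_attrib {
    int size;    /* components per vertex, 1..4 */
    int stride;  /* bytes between vertices, >= 0 */
};

/* Returns false and leaves *attr untouched if any argument is invalid. */
static bool toy_set_attrib(struct toy_attrib *attr, int size, int stride)
{
    /* 1. Run every check first; nothing has been modified yet. */
    if (size < 1 || size > 4)
        return false;
    if (stride < 0)
        return false;

    /* 2. Only now update state, so a failed call has no side effects. */
    attr->size = size;
    attr->stride = stride;
    return true;
}
```

If the size were stored before the stride check, a call with a valid size and a bad stride would report an error yet still leave the state half updated.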
* mesa: use _mesa_is_bufferobj() helper in _mesa_vertex_attrib_address()Brian Paul2013-11-121-3/+4
| | | | | | And use a regular if statement to slightly improve readability. Jordan Justen <[email protected]>
* mesa: add const qualifiers to vertex array helper functionsBrian Paul2013-11-121-4/+4
| | | | Jordan Justen <[email protected]>
* nouveau/video: mark bitstream-level acceleration as unsupportedIlia Mirkin2013-11-121-2/+2
| | | | | | | | | | Adding a vl_mpeg-based helper didn't seem to work, as it produced data that the card couldn't handle. (And I didn't investigate further.) This makes the decoding functionality only accessible via XvMC and avoids crashes when attempting to use VDPAU. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.0" <[email protected]>
* nouveau/video: don't try on nv3xIlia Mirkin2013-11-121-2/+2
| | | | | | | | It doesn't work, I don't know why, but no point in hanging people's displays until it gets figured out. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.0" <[email protected]>
* egl-static: Only export necessary symbols v3Tom Stellard2013-11-112-0/+5
| | | | | | | | | | | | This fixes a crash in glamor when mesa links against static LLVM. v2: - Inline LINKER_SCRIPT variable v3: Kai Wasserbäch - Fix out-of-tree builds Tested-by: Kai Wasserbäch <[email protected]>
* configure.ac: Don't require shared LLVM when building OpenCLTom Stellard2013-11-111-6/+0
| | | | | | This works now that pipe_*.so is no longer exporting LLVM symbols. Tested-by: Kai Wasserbäch <[email protected]>
* pipe-loader: Only export necessary symbols v3Tom Stellard2013-11-112-0/+5
| | | | | | | | | | | | This makes it possible to use clover with statically linked LLVM. v2: - Inline LINKER_SCRIPT variable v3: Kai Wasserbäch - Fix out-of-tree builds Tested-by: Kai Wasserbäch <[email protected]>
* radeonsi/compute: Add Sea Islands supportTom Stellard2013-11-111-3/+12
|
* r600/llvm: Store inputs in function argumentsVincent Lejeune2013-11-113-0/+121
|
* tests: Fix make check for out of tree builds.Rico Schüller2013-11-112-0/+2
| | | | | | Cc: "10.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Rico Schüller <[email protected]>
* i965: Move #define's inside function as local variablesAnuj Phogat2013-11-111-8/+5
| | | | | | | | | X_f, Y_f, Xp_f, Yp_f variables are used just inside translate_dst_to_src(). So, they can be defined just as local variables. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i915, i965: Fix memory leak in intel_miptree_create_for_bo.Vinson Lee2013-11-112-2/+6
| | | | | | | Fixes "Resource leak" defects reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* osmesa: assorted code clean-upsBrian Paul2013-11-111-25/+38
|
* osmesa: fix broken triangle/line drawing when using float color bufferBrian Paul2013-11-111-0/+16
| | | | | | | | Doesn't seem to help with bug 71363 but it fixed a failure I found in my testing. Cc: "9.2" <[email protected]> Cc: "10.0" <[email protected]>
* svga: improve loops over color buffersBrian Paul2013-11-116-10/+20
| | | | | | | Only loop over the actual number of color buffers supported, not PIPE_MAX_COLOR_BUFS. Reviewed-by: José Fonseca <[email protected]>
* svga: document magic number of 8 render targets per batchBrian Paul2013-11-111-1/+13
| | | | | Grab the comments from commit message b84b7f19dfdc0 to explain what the code is doing.
* util: set all unused cbufs to NULL in util_copy_framebuffer_state()Brian Paul2013-11-111-1/+2
| | | | | | This helps fix an issue in the svga driver, and is just safer all-around. Reviewed-by: José Fonseca <[email protected]>
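A minimal C sketch of why clearing the unused color buffer slots is safer. The types are hypothetical stand-ins: `TOY_MAX_CBUFS` plays the role of PIPE_MAX_COLOR_BUFS, and the sketch copies raw pointers, skipping whatever reference counting the real util_copy_framebuffer_state() performs. It assumes `src->nr_cbufs` never exceeds `TOY_MAX_CBUFS`.

```c
#include <stddef.h>

#define TOY_MAX_CBUFS 8  /* stand-in for PIPE_MAX_COLOR_BUFS */

/* Hypothetical, simplified framebuffer state. */
struct toy_fb_state {
    unsigned nr_cbufs;
    void *cbufs[TOY_MAX_CBUFS];
};

static void toy_copy_fb_state(struct toy_fb_state *dst,
                              const struct toy_fb_state *src)
{
    unsigned i;

    dst->nr_cbufs = src->nr_cbufs;

    /* Copy only the slots that are actually in use. */
    for (i = 0; i < src->nr_cbufs; i++)
        dst->cbufs[i] = src->cbufs[i];

    /* Clear the tail so stale pointers from an earlier, larger state
     * cannot be seen by code that scans every slot. */
    for (; i < TOY_MAX_CBUFS; i++)
        dst->cbufs[i] = NULL;
}
```

Without the second loop, a destination that previously held two color buffers would keep its old second pointer after a one-buffer state is copied in.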
* glx: declare glx_screen struct to silence warningBrian Paul2013-11-111-0/+2
|
* glx: change query_renderer_integer() value param to unsignedBrian Paul2013-11-113-3/+4
| | | | | | | | | When this function was added, the returned value was signed in some places, unsigned in others. v2: also add unsigned in the unit test, per Ian. Reviewed-by: Ian Romanick <[email protected]>
* glx: Fix scons build.José Fonseca2013-11-111-0/+3
| | | | Reviewed-by: Brian Paul <[email protected]>
* EGL: fix build without libdrmSamuel Thibault2013-11-102-0/+22
| | | | | | This fixes building EGL without libdrm support. Signed-off-by: Samuel Thibault <[email protected]>
* i965: convert brw_lower_offset_array_visitor to ir_rvalue_visitorChris Forbes2013-11-101-7/+11
| | | | | | | | | | Previously, we would bogusly replace the entire statement containing the ir_texture node with an ir_dereference_variable. Correct this to just replace the ir_texture node itself as intended. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>