Commit message | Author | Age | Files | Lines
* gallium/radeon: implement PIPE_CAP_INVALIDATE_BUFFER | Nicolai Hähnle | 2016-01-14 | 5 | -9/+22
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: reset valid_buffer_range on PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE | Nicolai Hähnle | 2016-01-14 | 1 | -0/+3
    This accommodates a streaming pattern where the discard flag is set when the application wraps back to the beginning of the buffer.
    Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: implement Driver.InvalidateBufferSubData | Nicolai Hähnle | 2016-01-14 | 3 | -3/+33
    Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: use pipe->invalidate_resource instead of buffer re-allocation | Nicolai Hähnle | 2016-01-14 | 1 | -13/+18
    Drivers are expected to avoid unnecessary work when possible in this code path.
    Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_INVALIDATE_BUFFER | Nicolai Hähnle | 2016-01-14 | 17 | -2/+23
    It makes sense to re-use pipe->invalidate_resource for the purpose of glInvalidateBufferData, but this function is already implemented in vc4 where it doesn't have the expected behavior. So add a capability flag to indicate that the driver supports the expected behavior.
    Reviewed-by: Marek Olšák <[email protected]>
* mesa: add Driver.InvalidateBufferSubData | Nicolai Hähnle | 2016-01-14 | 2 | -8/+9
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: fix the checks in _mesa_InvalidateBuffer(Sub)Data | Nicolai Hähnle | 2016-01-14 | 1 | -3/+15
    Change the check to be in line with what the quoted spec fragment says. I have sent out a piglit test for this as well.
    Reviewed-by: Ian Romanick <[email protected]>
* winsys/radeon: fix warnings about incompatible pointer types | Nicolai Hähnle | 2016-01-14 | 1 | -6/+6
    Some confusion between pb_buffer and radeon_bo as well as between radeon_drm_winsys and radeon_winsys.
    Reviewed-by: Marek Olšák <[email protected]>
* texobj: Check completeness with InternalFormat rather than Mesa format | Neil Roberts | 2016-01-14 | 1 | -1/+1
    The internal Mesa format used for a texture might not match the one requested in the internalFormat when the texture was created, for example if the driver is internally remapping RGB textures to RGBA. Otherwise it can cause false positives for completeness if one mipmap image is created as RGBA and the other as RGB because they would both have an RGBA Mesa format. If we check the InternalFormat instead then we are directly checking the API usage which I think better matches the intention of the check.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93700
    Reviewed-by: Anuj Phogat <[email protected]>
* i965: Remove unused hw_must_use_separate_stencil | Ben Widawsky | 2016-01-13 | 3 | -5/+1
    I spotted this while looking for what needs updating in future platforms. I'm too lazy to go through the git logs, but it was probably missed by Jason when all the brw refactoring happened.
    Signed-off-by: Ben Widawsky <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop extra newline from shader compile messages. | Matt Turner | 2016-01-13 | 2 | -2/+2
    Ilia changed shader-db's run.c to not expect messages to contain a newline in shader-db commit 51bbc8035.
* nir: Change bfm's semantics to match Intel/AMD/SM5. | Matt Turner | 2016-01-13 | 1 | -3/+6
    Intel/AMD's hardware instructions do not handle arguments of 32. Constant evaluation should not produce a result different from the hardware instruction.
    The s/1ull/1u/ change is intentional: previously we wanted defined behavior for the "1 << 32" case, but we're making this case undefined so we can make it 1u and save ourselves a 64-bit operation.
    Reviewed-by: Ian Romanick <[email protected]>
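As a rough illustration of the point above, the sketch below builds a bitfield mask with 32-bit arithmetic only. It is not Mesa's constant-evaluation code: the name `bfm_sketch` and the choice to return 0 for out-of-range arguments are assumptions made for this example, since the message only says that the 32 case becomes undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not Mesa's implementation): a mask of `bits` ones
 * starting at bit `offset`. Arguments outside 0..31 are the "undefined"
 * case; returning 0 for them is an assumption of this sketch. */
static uint32_t bfm_sketch(int bits, int offset)
{
    if (bits < 0 || offset < 0 || bits > 31 || offset > 31)
        return 0;
    /* 1u keeps the shift in 32-bit unsigned arithmetic; no 64-bit op. */
    return ((UINT32_C(1) << bits) - 1u) << offset;
}

/* Small usage check for the defined range and the rejected case. */
static void bfm_sketch_self_check(void)
{
    assert(bfm_sketch(4, 8) == 0x00000f00u);
    assert(bfm_sketch(32, 0) == 0u); /* would need "1 << 32" otherwise */
}
```

Because the 32 case is rejected up front, the shift count never reaches the width of the type, which is what allows the 1u form.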
* glsl: Fix undefined shifts. | Matt Turner | 2016-01-13 | 2 | -7/+7
    Shifting into the sign bit is undefined, as is shifting by 32.
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
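A minimal sketch of how both cases can be avoided in C, assuming 32-bit masks. The helper names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-bit mask built from an unsigned constant. Writing (1 << 31) with a
 * signed int would shift into the sign bit, which is undefined. */
static uint32_t sign_bit_mask(void)
{
    return UINT32_C(1) << 31;
}

/* Mask with the low `count` bits set. count == 32 is special-cased because
 * shifting a 32-bit value by 32 is undefined. */
static uint32_t low_bits_mask(unsigned count)
{
    assert(count <= 32);
    return count == 32 ? UINT32_MAX : (UINT32_C(1) << count) - 1u;
}
```

Both helpers stay in unsigned arithmetic, where shifts by 0..31 are fully defined.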
* glsl: Handle failure of Python codegen scripts. | Matt Turner | 2016-01-13 | 1 | -5/+5
    If a Python codegen script failed, it would write a zero-byte file, which on subsequent invocations of make would trick it into thinking the file was appropriately generated.
    Reviewed-by: Ian Romanick <[email protected]>
* glsl, nir: Make ir_triop_bitfield_extract a vectorized operation. | Kenneth Graunke | 2016-01-13 | 6 | -15/+20
    We would like to be able to combine

        result.x = bitfieldExtract(src0.x, src1.x, src2.x);
        result.y = bitfieldExtract(src0.y, src1.y, src2.y);
        result.z = bitfieldExtract(src0.z, src1.z, src2.z);
        result.w = bitfieldExtract(src0.w, src1.w, src2.w);

    into a single ivec4 bitfieldExtract operation. This should be possible with most drivers.
    This patch changes the offset and bits parameters from scalar ints to ivecN or uvecN. The type of all three operands will be the same, for simplicity.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* glsl, nir: Make ir_quadop_bitfield_insert a vectorized operation. | Kenneth Graunke | 2016-01-13 | 7 | -18/+24
    We would like to be able to combine

        result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
        result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
        result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
        result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);

    into a single ivec4 bitfieldInsert operation. This should be possible with most drivers.
    This patch changes the offset and bits parameters from scalar ints to ivecN or uvecN. The type of all four operands will be the same, for simplicity.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes. | Kenneth Graunke | 2016-01-13 | 13 | -145/+16
    TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well.
    These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Fix constant evaluation of bfm. | Matt Turner | 2016-01-13 | 1 | -1/+1
    NIR's bfm, like Intel/AMD's hardware instructions and GLSL IR's ir_binop_bfm, takes <bits> as src0 and <offset> as src1.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs: Skip assertion on NaN. | Matt Turner | 2016-01-13 | 1 | -1/+2
    A shader in Unreal4 uses the result of divide by zero in its color output, producing NaN and triggering this assertion since NaN is not equal to itself.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93560
    Reviewed-by: Iago Toral Quiroga <[email protected]>
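The failure mode described above can be reproduced in a few lines of C. This is only a sketch of the general idea, not the i965 change itself: `same_constant` is a hypothetical helper that treats two NaNs as matching so that an equality-based assertion does not fire on them.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical comparison for float constants. Plain == is always false
 * when either operand is NaN, so NaN is handled before comparing. */
static bool same_constant(float a, float b)
{
    if (isnan(a) && isnan(b))
        return true;
    return a == b;
}

/* An assertion written as assert(value == value) would fail for NaN;
 * this form does not. */
static void check_constant_roundtrip(float value)
{
    assert(same_constant(value, value));
}
```

Calling `check_constant_roundtrip(NAN)` passes, whereas a direct `NAN == NAN` comparison evaluates to false.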
* i965/fs: Add debugging to constant combining pass. | Matt Turner | 2016-01-13 | 1 | -1/+20
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* meta: remove const qualifier on _mesa_meta_fb_tex_blit_begin() | Brian Paul | 2016-01-13 | 2 | -2/+2
    To silence a compiler warning about a const/non-const mismatch.
    Reviewed-by: Ian Romanick <[email protected]>
* st/mesa: fix incorrect buffer token passed to _mesa_BindFramebuffer() | Brian Paul | 2016-01-13 | 1 | -2/+2
    I added this code right at the end, and got it wrong. Only used by the WGL_ARB_render_texture code.
    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Charmaine Lee <[email protected]>
* docs: add news item and link release notes for 11.1.1 | Emil Velikov | 2016-01-13 | 2 | -0/+7
    Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 11.1.1 | Emil Velikov | 2016-01-13 | 1 | -1/+2
    Signed-off-by: Emil Velikov <[email protected]>
    (cherry picked from commit 4b2d9f29e9b75cbbeb76ccf753a256e11f07ee1a)
* docs: add release notes for 11.1.1 | Emil Velikov | 2016-01-13 | 1 | -0/+196
    Signed-off-by: Emil Velikov <[email protected]>
    (cherry picked from commit 330aa44a0da7548000a6b2fc2bb580e9c8e733cc)
* i965/gen9: Don't allow the RGBX formats for texturing/rendering | Neil Roberts | 2016-01-13 | 1 | -0/+28
    The RGBX surface formats aren't renderable so we internally remap them to RGBA when rendering. They are retained as RGBX when used as textures. However, since the previous patch, fast clears are disabled for surfaces that use a different format for rendering than for texturing. To avoid this situation we can just pretend not to support RGBX formats at all. This will cause the upper layers of mesa to pick an RGBA format internally instead. This should be safe because we always override the alpha component to 1.0 for RGBX in the texture swizzle anyway.
    We could also do this for all gens except that it's a bit more difficult when the hardware doesn't support texture swizzling. Gens using the blorp have further problems because that doesn't implement this swizzle override.
    Reviewed-by: Anuj Phogat <[email protected]>
* radeonsi: move POSITION and FACE fragment shader inputs to system values | Marek Olšák | 2016-01-13 | 3 | -45/+25
    And FACE becomes integer instead of float.
    Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: simplify gl_FragCoord behavior | Marek Olšák | 2016-01-13 | 1 | -23/+22
    It will become a system value, not an input.
    Reviewed-by: Edward O'Callaghan <[email protected]>
* glsl: add image_format check in cross_validate_globals() | Samuel Iglesias Gonsálvez | 2016-01-13 | 1 | -0/+6
    Fixes CTS test: ES31-CTS.shader_image_load_store.negative-linkErrors
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93410
    Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* mesa: do not validate io of non-compute and compute stage | Tapani Pälli | 2016-01-13 | 1 | -0/+7
    Fixes regression on SSO tests that have both non-compute and compute programs in a program pipeline.
    Signed-off-by: Tapani Pälli <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93532
    Reviewed-by: Marta Lofstedt <[email protected]>
* glsl: add packed varyings for outputs with single stage program | Tapani Pälli | 2016-01-13 | 1 | -7/+2
    Commit 8926dc8 added a check where we add packed varyings of output stage only when we have multiple stages, however duplicates are already handled by changes in commit 0508d950 and we want to add outputs also in case where we have only one stage.
    Fixes regression caused by 8926dc8 for following test: ES31-CTS.program_interface_query.separate-programs-vertex
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Marta Lofstedt <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* llvmpipe: (trivial) use cast wrapper for __m128d to __m128 casts | Roland Scheidegger | 2016-01-13 | 1 | -2/+2
    Some compiler was unhappy.
* llvmpipe: avoid most 64 bit math in rasterization | Roland Scheidegger | 2016-01-13 | 2 | -65/+143
    The trick here is to recognize that in the c + n * dcdx calculations, not only can the lower FIXED_ORDER bits not change (as the dcdx values have those all zero) but that this means the sign bit of the calculations cannot be different as well, that is sign(c + n*dcdx) == sign((c >> FIXED_ORDER) + n*(dcdx >> FIXED_ORDER)). That shaves off more than enough bits to never require 64bit masks. A shifted plane c value could still easily exceed 32 bits, however since we throw out planes which are trivial accept even before binning (and similarly don't even get to see tris for which there was a trivial reject plane) this is never a problem.
    The idea isn't all that revolutionary, in fact something similar was tried ages ago (9773722c2b09d5f0615a47cecf4347859474dc56) back when the values were only 32 bit anyway. I believe now it didn't quite work then because the adjustment needed for testing trivial reject / partial masks wasn't handled correctly.
    This still keeps the separate 32/64 bit paths for now, as the 32 bit one still looks minimally simpler (and also because if we'd pass in dcdx/dcdy/eo unscaled from setup which would be a good reason to ditch the 32 bit path, we'd need to change the special-purpose rasterization functions for small tris).
    This passes piglit triangle-rasterization (-fbo -auto -max_size -subpixelbits 8) and triangle-rasterization-overdraw (with some hacks to make it work correctly with large sizes) easily (full piglit as well of course, but most tests wouldn't use triangles large enough to be affected, that is tris with a bounding box over 128x128).
    The profiler says indeed time spent in rast_tri functions is reduced substantially, BUT of course only if the tris are large. I measured a 3% improvement in mesa gloss demo when supersized to twice the screen size...
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
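The sign identity quoted above can be checked with a small standalone sketch. This is not llvmpipe code: the value of FIXED_ORDER, the function names and the sweep ranges are assumptions chosen for the example, and the reduced test relies on `>>` of a negative value being an arithmetic shift (implementation-defined in C, though common compilers behave this way).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIXED_ORDER 8 /* assumed number of subpixel bits for this sketch */

/* Reference test in full precision: is c + n*dcdx negative? */
static bool negative_full(int64_t c, int64_t dcdx, int n)
{
    return c + (int64_t)n * dcdx < 0;
}

/* Reduced test on pre-shifted 32-bit values. Only valid when the low
 * FIXED_ORDER bits of dcdx are zero and the shifted values fit in 32 bits. */
static bool negative_shifted(int64_t c, int64_t dcdx, int n)
{
    int32_t c_hi = (int32_t)(c >> FIXED_ORDER);
    int32_t d_hi = (int32_t)(dcdx >> FIXED_ORDER);
    return c_hi + n * d_hi < 0;
}

/* Sweeps a small range of planes and step counts and reports whether the
 * two tests always agree. */
static bool signs_agree(void)
{
    for (int64_t c = -5000; c <= 5000; c += 7)
        for (int64_t d = -20; d <= 20; d++)
            for (int n = 0; n < 16; n++) {
                int64_t dcdx = d * (1 << FIXED_ORDER); /* low bits zero */
                if (negative_full(c, dcdx, n) != negative_shifted(c, dcdx, n))
                    return false;
            }
    return true;
}
```

The agreement follows because adding a multiple of 2^FIXED_ORDER never changes the low FIXED_ORDER bits of c, so the sum is negative exactly when its shifted upper part is negative. Note that only the "is negative" predicate is preserved; a shifted sum of zero can correspond to a small positive full-precision value.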
* llvmpipe: scale up bounding box planes to subpixel precision | Roland Scheidegger | 2016-01-13 | 3 | -30/+30
    Otherwise some planes we get in rasterization have subpixel precision, others not. Doesn't matter so far, but will soon. (OpenGL actually supports viewports with subpixel accuracy, so could even do bounding box calcs with that).
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: add sse code for fixed position calculation | Roland Scheidegger | 2016-01-13 | 1 | -8/+50
    This is quite a few fewer instructions, albeit it still does the 2 64bit muls with scalar C code (they'd need way more shuffles, plus fixup for the signed mul, so it totally doesn't seem worth it; x86 can do 32x32->64bit signed scalar muls natively just fine after all, even on 32bit).
    (This still doesn't have a very measurable performance impact in reality, although profiler seems to say time spent in setup indeed has gone down by 10% or so overall. Maybe good for a 3% or so improvement in openarena.)
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix key comparison with uninitialized value | Roland Scheidegger | 2016-01-13 | 2 | -6/+6
    Discovered by accident, valgrind was complaining (could have possibly caused us to create redundant geometry shader variants).
    v2: convinced by Brian and Jose, just use memset for both gs and vs keys, just as easy and less error prone.
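The memset approach mentioned in the v2 note can be sketched as follows. The struct and function names are hypothetical and are not the draw module's key types; the point is only that a key compared bytewise should be zeroed before its fields are filled in, so that padding and unset fields do not hold stale data.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical variant key; most ABIs insert padding after `flag`. */
struct variant_key {
    char flag;
    int  nr_elements;
};

/* Zero the whole key first so every byte, padding included, starts from a
 * known value, then fill in the fields. */
static void variant_key_init(struct variant_key *key, char flag, int nr)
{
    memset(key, 0, sizeof(*key));
    key->flag = flag;
    key->nr_elements = nr;
}

/* Bytewise comparison, as a variant cache lookup might do. */
static bool variant_key_equal(const struct variant_key *a,
                              const struct variant_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}

/* Two keys built from the same inputs are expected to compare equal. */
static bool keys_match_demo(void)
{
    struct variant_key a, b;
    variant_key_init(&a, 1, 4);
    variant_key_init(&b, 1, 4);
    return variant_key_equal(&a, &b);
}
```

Without the memset, two logically identical keys could differ in their uninitialized bytes, which would make the lookup miss and create a redundant variant, as the message suggests.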
* mesa: print the invalid enum when CreateShader fails | Timothy Arceri | 2016-01-13 | 1 | -1/+2
    Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Make read_from_write_only_variable_visitor ignore .length(). | Kenneth Graunke | 2016-01-12 | 1 | -0/+9
    .length() on an unsized SSBO variable doesn't actually read any data from the SSBO, and is allowed on variables marked 'writeonly'.
    Fixes compute shader compilation in Shadow of Mordor.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Mark TCS URB writes as having side effects. | Kenneth Graunke | 2016-01-12 | 1 | -0/+1
    This adds barrier dependencies around TCS_OPCODE_URB_WRITE, preventing reads and writes from being incorrectly scheduled.
    Fixes rendering in GFXBench 4.0's tessellation demo.
    For some reason, we haven't ever listed URB writes as having side-effects. This hasn't been a problem because in most stages, we never read from the URB, and only write to each location once.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93526
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* st/omx: Avoid segfault in deconstructor if constructor fails | Tom St Denis | 2016-01-12 | 1 | -0/+3
    If the constructor fails before the LIST_INIT calls the pointers will be null and the deconstructor will segfault.
    Signed-off-by: Tom St Denis <[email protected]>
    Reviewed-by: Leo Liu <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* vl: use preferred format for deinterlacing | Christian König | 2016-01-12 | 1 | -1/+7
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vl: improve motion adaptive deinterlacer | Christian König | 2016-01-12 | 2 | -22/+49
    Handle other formats than YV12 as well.
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* st/va: add BOB deinterlacing v2 | Christian König | 2016-01-12 | 2 | -11/+79
    Tested with MPV.
    v2: correctly handle compositor deinterlacing as well.
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* st/va: add NV12 -> NV12 post processing v2 | Christian König | 2016-01-12 | 2 | -37/+124
    Useful for mpv and GStreamer.
    v2: use common functionality for size adjustment.
    Signed-off-by: Indrajit-kumar Das <[email protected]>
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* st/va: use vl_video_buffer_adjust_size | Christian König | 2016-01-12 | 1 | -9/+4
    Use the new helper function instead of open coding it.
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* st/vdpau: use vl_video_buffer_adjust_size | Christian König | 2016-01-12 | 1 | -10/+3
    Use the new helper function instead of open coding it.
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vl/buffers: extract vl_video_buffer_adjust_size helper | Christian König | 2016-01-12 | 2 | -8/+20
    Useful for the state trackers as well.
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* st/va: make the implementation thread safe v2 | Christian König | 2016-01-12 | 7 | -54/+199
    Otherwise we might crash with MPV.
    v2: minor cleanups suggested on the list.
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Julien Isorce <[email protected]>
    Tested-by: Julien Isorce <[email protected]>
* mesa: use gl_shader_variable in program resource list | Tapani Pälli | 2016-01-12 | 3 | -28/+129
    Patch changes linker to allocate gl_shader_variable instead of using ir_variable. This makes it possible to get rid of ir_variables and ir in memory after linking.
    v2: check that we do not create duplicate entries with packed varyings
    v3: document 'patch' bit (Ilia Mirkin)
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* glsl: track total amount of uniform locations used | Tapani Pälli | 2016-01-12 | 1 | -2/+15
    Linker missed a check for situation where we exceed max amount of uniform locations with explicit + implicit locations. Patch adds this check to already existing iteration over uniforms in linker.
    Fixes following CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-negative-link-max-num-of-locations
    v2: use var->type->uniform_locations() (Timothy)
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>