Commit message | Author | Age | Files | Lines
...
* nvc0: respect edgeflag attribute width | Ilia Mirkin | 2015-11-05 | 1 | -7/+33
  The edgeflag comes in as ubyte with glEdgeFlagPointer but as float with plain immediate glEdgeFlag. Avoid reading bytes that weren't meant for the edgeflag in the pointer case.
  Fixes intermittent failures with gl-2.0-edgeflag piglit (and valgrind complaints about reading uninitialized memory).
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
  (cherry picked from commit e05021ff72abb7de6506c90dd70a9f7ab490bf90)
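The width issue described in this commit can be illustrated with a minimal sketch. It is not the nvc0 code; the enum and function names are hypothetical and only show the idea of reading exactly as many bytes as the attribute format provides.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag for how the edge flag attribute is stored. */
enum edgeflag_format {
   EDGEFLAG_UBYTE,  /* one byte per vertex, as with glEdgeFlagPointer */
   EDGEFLAG_FLOAT   /* one 32-bit float, as with immediate glEdgeFlag */
};

/* Read the edge flag using only the bytes that belong to it. */
static bool
read_edgeflag(const void *data, enum edgeflag_format format)
{
   if (format == EDGEFLAG_UBYTE) {
      uint8_t b;
      memcpy(&b, data, sizeof(b));   /* touches 1 byte only */
      return b != 0;
   } else {
      float f;
      memcpy(&f, data, sizeof(f));   /* touches 4 bytes */
      return f != 0.0f;
   }
}
```

Reading a ubyte flag through the float path would pull in three neighbouring bytes, which is the kind of access valgrind reports as uninitialized.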
* gallivm: disable f16c when not using AVX | Roland Scheidegger | 2015-11-05 | 1 | -0/+3
  The f16c intrinsic can only be emitted when AVX is used. So when we disable AVX due to forcing 128bit vectors we must not use this intrinsic (depending on llvm version, this worked previously because llvm used AVX even when we didn't tell it to, however I've seen this fail with llvm 3.3 since 718249843b915decf8fccec92e466ac1a6219934 which seems to have the side effect of disabling avx in llvm albeit it only touches sse flags really, but with ea421e919ae6e72e1319fb205c42a6fb53ca2f82 it's now really disabled).
  Although being able to use AVX with 128bit vectors would also have its uses, the code as is was really meant to emulate jit code creation for less capable cpus.
  v2: add some (ifdefed out) missing de-featuring options for simulating less capable cpus.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
  (cherry picked from commit 711489648bcce5cd8fcf14e73e5affe069010c01)
  Nominated-by: Roland Scheidegger <[email protected]>
* gallivm: Explicitly disable unsupported CPU features. | Jose Fonseca | 2015-11-05 | 1 | -38/+34
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92214
  CC: "10.6 11.0" <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  (cherry picked from commit ea421e919ae6e72e1319fb205c42a6fb53ca2f82)
* radeon/uvd: don't expose HEVC on old UVD hw (v3) | Alex Deucher | 2015-11-05 | 1 | -32/+18
  The section for UVD 2 and older was not updated when HEVC support was added. Reported by Kano on irc.
  v2: integrate the UVD2 and older checks into the main switch statement.
  v3: handle encode checking as well. Encode is already checked in the top case statement, so drop encode checks in the lower case statement.
  Reviewed-by: Christian König <[email protected]>
  Signed-off-by: Alex Deucher <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 7b636581253fe858ac883e3d3eec21173ac069d4)
* gallivm: Translate all util_cpu_caps bits to LLVM attributes. | Jose Fonseca | 2015-11-05 | 1 | -2/+34
  This should prevent disparity between features Mesa and LLVM believe are supported by the CPU.
  http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990
  Tested on an i7-3720QM w/ LLVM 3.3 and 3.6.
  v2: Increase SmallVector initial size as suggested by Gustaw Smolarczyk.
  Reviewed-by: Roland Scheidegger <[email protected]>
  CC: "10.6 11.0" <[email protected]>
  (cherry picked from commit 718249843b915decf8fccec92e466ac1a6219934)
* mesa/glformats: Undo code changes from _mesa_base_tex_format() move | Nanley Chery | 2015-11-05 | 1 | -142/+6
  The refactoring commit, c6bf1cd, accidentally reverted cd49b97 and 99b1f47. These changes caused more code to be added to the function and removed the existing support for ASTC. This patch reverts those modifications.
  v2: Actually include ASTC support again.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92221
  Cc: "11.0" <[email protected]>
  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit f1147a238ab35a56fa7d1c64f6025ff3b909dad8)
  [Emil Velikov]
  - Drop the KHR_texture_compression_astc_ldr check
  - Add texcompress.h include.
  Signed-off-by: Emil Velikov <[email protected]>
* osmesa: Expose GL entry points for Windows build via DEF file. | Nigel Stewart | 2015-11-05 | 2 | -0/+674
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92437
  CC: "10.6 11.0" <[email protected]>
  Signed-off-by: Jose Fonseca <[email protected]>
  (cherry picked from commit 04703762e544bc732f6f8b07033221dfbd58159f)
* cherry-ignore: ignore a possible wrong nomination | Emil Velikov | 2015-11-05 | 1 | -0/+2
  The commit base varies greatly between master and 11.0. It seems that the commit (in its current form) is not applicable for the branch.
  Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 11.0.4 | Emil Velikov | 2015-10-25 | 1 | -1/+2
  Signed-off-by: Emil Velikov <[email protected]>
* docs: add release notes for 11.0.4 [tag: mesa-11.0.4] | Emil Velikov | 2015-10-24 | 1 | -0/+167
  Signed-off-by: Emil Velikov <[email protected]>
* Update version to 11.0.4 | Emil Velikov | 2015-10-24 | 1 | -1/+1
  Signed-off-by: Emil Velikov <[email protected]>
* configure.ac: ensure RM is set | Jonathan Gray | 2015-10-21 | 1 | -0/+2
  GNU make predefines RM to rm -f but this is not required by POSIX so ensure that RM is set. This fixes "make clean" on OpenBSD.
  v2: use AC_CHECK_PROG
  Signed-off-by: Jonathan Gray <[email protected]>
  CC: "10.6 11.0" <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit 99c4079c37ac04a0dad4ead3117f786706c80aaf)
* mesa: fix ARRAY_SIZE query for GetProgramResourceiv | Tapani Pälli | 2015-10-21 | 3 | -43/+62
  The patch also refactors the name length queries, which were using the array size in their computation; this has to be done at the same time to avoid a regression in the arb_program_interface_query-resource-query Piglit test.
  Fixes the rest of the failures with ES31-CTS.program_interface_query.no-locations
  v2: make additional check only for GS inputs
  v3: create helper function for resource name length so that it gets calculated only in one place
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Martin Peres <[email protected]>
  (cherry picked from commit c0722be9f58ef89dae98d8c459ec4f9589f97748)
* i965: Remove early release of DRI2 miptree | Chris Wilson | 2015-10-21 | 1 | -1/+0
  intel_update_winsys_renderbuffer_miptree() will release the existing miptree when wrapping a new DRI2 buffer, so we can remove the early release and so prevent a NULL mt dereference should importing the new DRI2 name fail for any reason. (Reusing the old DRI2 name will result in the rendering going astray, to a stale buffer, and not shown on the screen, but it allows us to issue a warning and not crash much later in innocent code.)
  Signed-off-by: Chris Wilson <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86281
  Reviewed-by: Martin Peres <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
  (cherry picked from commit 70e91d61fde239e8ae58148cacd4ff891126e2aa)
* i965/vec4: fill src_reg type using the constructor type parameter | Alejandro Piñeiro | 2015-10-21 | 1 | -0/+2
  The src_reg constructor that received the glsl_type was using it only to build the swizzle, but not to fill this->type as dst_reg is doing.
  This caused some type mismatch between movs and alu operations on the NIR path, so copy propagation optimization was not applied to remove unneeded movs if negate modifier was involved. This was first detected on minus (negate+add) operations.
  Shader DB results (taking into account only vec4):
  total instructions in shared programs: 20019 -> 19934 (-0.42%)
  instructions in affected programs: 2918 -> 2833 (-2.91%)
  helped: 79
  HURT: 0
  GAINED: 0
  LOST: 0
  Reviewed-by: Matt Turner <[email protected]>
  (cherry picked from commit 4de86e1371b0d59a5b9a787b726be3d373024647)
  Nominated-by: Christoph Brill <[email protected]>
* i965/vec4: check writemask when bailing out at register coalesce | Alejandro Piñeiro | 2015-10-21 | 1 | -4/+6
  opt_register_coalesce stopped checking previous instructions to coalesce with as soon as another instruction wrote to the same destination. This can be refined to bail out only when another instruction writes to the same channels of the same destination, using the writemask.
  Shader DB results (taking into account only vec4):
  total instructions in shared programs: 1781593 -> 1734957 (-2.62%)
  instructions in affected programs: 1238390 -> 1191754 (-3.77%)
  helped: 12782
  HURT: 0
  GAINED: 0
  LOST: 0
  v2: removed some parentheses, fixed indentation, as suggested by Matt Turner
  v3: added brackets, for consistency, as suggested by Eduardo Lima
  Reviewed-by: Matt Turner <[email protected]>
  (cherry picked from commit d4e29af2344c06490913efc35430f93a966061bb)
  Nominated-by: Jason Ekstrand <[email protected]>
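The channel-level check this commit describes can be sketched as a bitmask overlap test. This is a simplified illustration, not the i965 implementation; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a vec4 instruction's destination. */
struct dst_write {
   int reg;             /* destination register number */
   unsigned writemask;  /* bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w */
};

/* True only when both writes target the same register and share a channel. */
static bool
writes_conflict(const struct dst_write *candidate,
                const struct dst_write *other)
{
   return candidate->reg == other->reg &&
          (candidate->writemask & other->writemask) != 0;
}
```

With a check of this shape, a write to `.zw` of a register no longer blocks coalescing of an instruction that only touches `.xy` of the same register.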
* mesa: fix incorrect opcode in save_BlendFunci() | Brian Paul | 2015-10-21 | 1 | -1/+1
  Fixes assertion failure with new piglit arb_draw_buffers_blend-state_set_get test.
  Cc: [email protected]
  Reviewed-by: Jose Fonseca <[email protected]>
  (cherry picked from commit e24d04e436ed48d4a0aac90590cbaa40da936208)
* gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT | Marek Olšák | 2015-10-21 | 16 | -1/+45
  This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now.
  I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver.
  Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720
  Cc: 11.0 10.6 <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  (cherry picked from commit 814f31457e9ae83d4f1e39236f704721b279b73d)
* st/omx/dec/h264: fix field picture type 0 poc disorder | Leo Liu | 2015-10-21 | 1 | -4/+8
  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
  (cherry picked from commit 867284a8f07b69887f8adb109fb6c71156668227)
* st/va: Used correct parameter to derive the value of the "h" variable in vlVaCreateImage | Indrajit Das | 2015-10-21 | 1 | -1/+1
  Cc: "11.0" <[email protected]>
  Reviewed-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit 381c17d695b39f9ab501f5aa5a3cc42c8519ac3b)
* radeonsi: fix a GS copy shader leak | Marek Olšák | 2015-10-21 | 1 | -1/+3
  Cc: [email protected]
  Reviewed-by: Michel Dänzer <[email protected]>
  (cherry picked from commit aa060e276c203baf4691d4a4722accd5bdbb8526)
  [Emil Velikov: si_shader_destroy() wants the ctx as first argument]
  Signed-off-by: Emil Velikov <[email protected]>
  Conflicts: src/gallium/drivers/radeonsi/si_shader.c
* st/mesa: fix clip state dependencies | Marek Olšák | 2015-10-21 | 1 | -1/+4
  This allows removing FLUSH_VERTICES in MatrixMode.
  Cc: [email protected]
  Reviewed-by: Brian Paul <[email protected]>
  (cherry picked from commit 3c6156a4a7b647cc55cbe3a4c13d53b5ffe505e6)
* mesa: Set api prefix to version string when overriding version | Tapani Pälli | 2015-10-21 | 1 | -1/+18
  Otherwise there are problems when the user overrides the version and an application such as Piglit wants to detect the API in use with glGetString(GL_VERSION). This currently makes it impossible to run glslparsertest tests for OpenGL ES when using the version override.
  Below is an example when using MESA_GLES_VERSION_OVERRIDE=3.1.
  Before: "3.1 Mesa 11.1.0-devel (git-24a1a15)"
  After: "OpenGL ES 3.1 Mesa 11.1.0-devel (git-78042ff)"
  v2: only include api prefix for OpenGL ES (Boyan Ding)
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Cc: "11.0" <[email protected]>
  (cherry picked from commit dc8c221e2890cc9913dfc99e1e0fcb73c89af52c)
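The before/after strings in this commit can be reproduced with a small sketch. The function name and parameters are hypothetical and this is not Mesa's version-override code; it only shows the prefix being added for OpenGL ES alone, as the v2 note describes.

```c
#include <stdbool.h>
#include <stdio.h>

/* Build a GL_VERSION-style string; only OpenGL ES gets an API prefix. */
static void
build_version_string(char *out, size_t size, bool is_gles,
                     int major, int minor, const char *mesa_version)
{
   snprintf(out, size, "%s%d.%d Mesa %s",
            is_gles ? "OpenGL ES " : "", major, minor, mesa_version);
}
```

An application that looks for the "OpenGL ES" prefix to detect the API can then do so even when the version number has been overridden.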
* freedreno/a3xx: cache-flush is needed after MEM_WRITE | Rob Clark | 2015-10-21 | 3 | -5/+14
  Otherwise the mem2gmem blit would see potentially bogus texture coordinates. Fixes an issue that shows up with glamor.
  CC: "11.0" <[email protected]>
  Signed-off-by: Rob Clark <[email protected]>
  (cherry picked from commit 6206da736c84c4f7316ab586c886b4865fda8805)
* nv30: include the header of ffs prototype | Chih-Wei Huang | 2015-10-21 | 1 | -0/+1
  It fixes a build error on the Android 6.0 64-bit target.
  Signed-off-by: Chih-Wei Huang <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 7599f8b167321cb8adb2ba51a53163752b668532)
* nv50/ir: use C++11 standard std::unordered_map if possible | Chih-Wei Huang | 2015-10-21 | 1 | -3/+17
  Note that Android versions before Lollipop are not supported.
  Signed-off-by: Chih-Wei Huang <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
  (cherry picked from commit d31005e3e5588b20760c774f14ac0ea80375a181)
* mesa: android: Fix the incorrect path of sse_minmax.c | Chih-Wei Huang | 2015-10-21 | 1 | -1/+1
  Cc: "10.6 11.0" <[email protected]>
  Fixes: 669cfc267a1 (android: mesa: fix the path of the SSE4_1 optimisations)
  Signed-off-by: Chih-Wei Huang <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit 67d8518a0e5a3df400a6e70de667d69e4b6ce9c5)
* st/fbo: use pipe_surface_release instead of pipe_surface_reference | Krzysztof Sobiecki | 2015-10-21 | 1 | -1/+1
  pipe_surface_reference has problems with deleted contexts, so use of pipe_surface_release might be more appropriate.
  Fixes Wasteland 2 Director's Cut crash on start.
  Cc: [email protected]
  Reviewed-by: Brian Paul <[email protected]>
  (cherry picked from commit 14f7ce42484c31a45fcb6aabdf503f7496a9a94c)
* vbo: fix incorrect switch statement in init_mat_currval() | Brian Paul | 2015-10-21 | 1 | -1/+1
  The variable 'i' is a value in [0, MAT_ATTRIB_MAX-1] so subtracting VERT_ATTRIB_GENERIC0 gave a bogus value and we executed the default switch clause for all loop iterations.
  This doesn't fix any known issues but was clearly incorrect.
  Cc: [email protected]
  Reviewed-by: Marek Olšák <[email protected]>
  (cherry picked from commit dd293d8aae324ac7b9d5297e33a1e732e1f3f4d3)
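The effect described in this commit, where subtracting an unrelated base pushes every selector into the default clause, can be shown with a toy loop. The constants and case values below are placeholders chosen for illustration and are not Mesa's real MAT_ATTRIB or VERT_ATTRIB values.

```c
/* Placeholder values for illustration only; not Mesa's real constants. */
enum { EX_GENERIC0 = 16, EX_MAT_MAX = 4 };

/* Pick a component count; anything other than 0 or 1 takes the default clause. */
static int
components_for(int selector)
{
   switch (selector) {
   case 0:  return 1;
   case 1:  return 3;
   default: return 4;
   }
}

/* Count loop iterations that land in the default clause. */
static int
count_default_hits(int subtract_offset)
{
   int hits = 0;
   for (int i = 0; i < EX_MAT_MAX; i++) {
      /* The buggy form subtracts an unrelated base and goes negative. */
      int selector = subtract_offset ? i - EX_GENERIC0 : i;
      if (components_for(selector) == 4)
         hits++;
   }
   return hits;
}
```

With the subtraction every iteration hits the default clause; without it only the selectors that have no explicit case do.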
* glsl: In later GLSL versions, sequence operator cannot be a constant expression | Ian Romanick | 2015-10-21 | 1 | -1/+42
  Fixes:
  ES3-CTS.shaders.negative.constant_sequence
  spec/glsl-es-3.00/compiler/global-initializer/from-sequence.vert
  spec/glsl-es-3.00/compiler/global-initializer/from-sequence.frag
  v2: Fix a couple of copy-and-paste mistakes in the spec quotations. Suggested by Matt.
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
  (cherry picked from commit 92635a84a7f464b827baa406578420dd6109e1a4)
* glsl: Add method to determine whether an expression contains the sequence operator | Ian Romanick | 2015-10-21 | 3 | -0/+97
  This will be used in the next patch to enforce some language semantics.
  v2: Fix inverted logic in ast_function_expression::has_sequence_subexpression. The method originally had a different name and a different meaning. I fixed the logic in ast_to_hir.cpp, but I only changed the names in ast_function.cpp.
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Marta Lofstedt <[email protected]> [v1]
  Reviewed-by: Matt Turner <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
  (cherry picked from commit 05e4601c6b9ce456cc4a4c395677a22125d889d2)
* glsl: Restrict initializers for global variables to constant expression in ES | Ian Romanick | 2015-10-21 | 1 | -3/+17
  v2: Combine this check with the existing const and uniform checks. This change depends on the previous patch (glsl: Only set ir_variable::constant_value for const-decorated variables).
  Fixes:
  ES2-CTS.shaders.negative.initialize
  ES3-CTS.shaders.negative.initialize
  spec/glsl-es-1.00/compiler/global-initializer/from-attribute.vert
  spec/glsl-es-1.00/compiler/global-initializer/from-uniform.vert
  spec/glsl-es-1.00/compiler/global-initializer/from-uniform.frag
  spec/glsl-es-1.00/compiler/global-initializer/from-global.vert
  spec/glsl-es-1.00/compiler/global-initializer/from-global.frag
  spec/glsl-es-1.00/compiler/global-initializer/from-varying.frag
  spec/glsl-es-3.00/compiler/global-initializer/from-uniform.vert
  spec/glsl-es-3.00/compiler/global-initializer/from-uniform.frag
  spec/glsl-es-3.00/compiler/global-initializer/from-in.vert
  spec/glsl-es-3.00/compiler/global-initializer/from-in.frag
  spec/glsl-es-3.00/compiler/global-initializer/from-global.vert
  spec/glsl-es-3.00/compiler/global-initializer/from-global.frag
  Note: spec/glsl-es-3.00/compiler/global-initializer/from-sequence.* still fail because the result of a sequence operator is still considered to be a constant expression.
  Signed-off-by: Ian Romanick <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92304
  Reviewed-by: Tapani Pälli <[email protected]> [v1]
  Reviewed-by: Iago Toral Quiroga <[email protected]> [v1]
  Reviewed-by: Matt Turner <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
  (cherry picked from commit bb329f2ff6e8bf8910a467b09f69a4d843689617)
* glsl: Only set ir_variable::constant_value for const-decorated variables | Ian Romanick | 2015-10-21 | 1 | -3/+6
  Right now we're also setting it for uniforms, and that doesn't seem to hurt things. The next patch will make general global variables in GLSL ES, and those definitely should not have constant_value set!
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
  (cherry picked from commit 3524d6df33b1e3716992f9a555ffb0f7b1ae2f4f)
* glsl: Use constant_initializer instead of constant_value to determine whether to keep an unused uniform | Ian Romanick | 2015-10-21 | 1 | -1/+1
  This even matches the comment "uniform initializers are precious, and could get used by another stage."
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
  (cherry picked from commit 5bc68f0f2b80b21997435742af74c49eb72891f7)
* glsl/linker: Use constant_initializer instead of constant_value to initialize uniforms | Ian Romanick | 2015-10-21 | 1 | -2/+2
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
  (cherry picked from commit 313372cae8e10e4b9a3de093d65c0a0d8954bb0d)
* ff_fragment_shader: Use binding to set the sampler unit | Ian Romanick | 2015-10-21 | 1 | -6/+4
  This is the way layout(binding=xxx) works from GLSL. The old method just happened to work (and significantly predated support for layout(binding=xxx)), but future changes will break this.
  v2: Remove some stale comments. Suggested by Matt and Chris Forbes.
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
  (cherry picked from commit 8acce5d53af44a3d1d05a26e69559fd35f835de4)
* glsl: Allow built-in functions as constant expressions in OpenGL ES 1.00 | Ian Romanick | 2015-10-21 | 1 | -5/+46
  In d4a24745 (August 2012), Paul made function calls not be constant expressions in GLSL ES 1.00. Since this feature was added in desktop GLSL 1.20, we believed that it was added in GLSL ES 3.00. That turns out to be completely wrong. Built-in functions have always been allowed as constant expressions in GLSL ES, and the patch adds the (many) spec quotations to prove it.
  While we never previously encountered this, a later patch enforces a GLSL ES 1.00 rule that global variable initializers must be constant expressions. Without this fix, several dEQP tests fail.
  Fixes:
  tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.frag
  tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.vert
  tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.frag
  tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.vert
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Cc: "10.0 10.1 10.2 10.3 10.4 10.5 10.6 11.0" <[email protected]>
  Yes, I know we don't maintain stable branches that far back, but that *is* how far back this bug goes!
  (cherry picked from commit 43b07eb60faba1c65fc6f7a99087d051b00e9c0f)
* u_vbuf: fix vb slot assignment for translated buffers | Nicolai Hähnle | 2015-10-21 | 1 | -0/+1
  Vertex attributes of different categories (constant/per-instance/per-vertex) go into different buffers for translation, and this is now properly reflected in the vertex buffers passed to the driver.
  Fixes e.g. piglit's point-vertex-id divisor test.
  Cc: [email protected]
  Reviewed-by: Marek Olšák <[email protected]>
  (cherry picked from commit 45ed627d894aa4d51682e8b07e7234bbc6e7c02d)
* mesa/uniforms: fix get_uniform for doubles (v2) | Dave Airlie | 2015-10-21 | 1 | -16/+37
  The initial glGetUniformdv support didn't cover all the casting cases that are apparently legal, and cts seems to test for them.
  I've updated the piglit test to cover these cases now.
  v2: fix indentation - it's all broken in this file (Ilia)
  fix src/dst index tracking in light of fp64 support (Ilia)
  cc: "11.0" <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
  (cherry picked from commit bcfaab38858fdcfbd8ffeaf6b0e3da8a726f02e6)
* mesa: Get rid of texture-dependent image unit derived state. | Francisco Jerez | 2015-10-21 | 4 | -33/+0
  The point is to avoid having to re-validate all image units when _NEW_TEXTURE is flagged, which can be expensive if the driver exposes a large number of image units. This has been reported to fix a 36% performance regression in the Synmark2 Multithread benchmark on the i965 driver which exposes 192 image units.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91788
  Reported-by: Wendy Wang <[email protected]>
  Tested-by: Ye Tian <[email protected]>
  CC: "11.0" <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  (cherry picked from commit 7e441bf025cf8c5d088430d546acb4c0ed58d27b)
* i965: Use _mesa_is_image_unit_valid() instead of gl_image_unit::_Valid. | Francisco Jerez | 2015-10-21 | 3 | -6/+10
  gl_image_unit::_Valid will be removed in a future commit.
  Tested-by: Ye Tian <[email protected]>
  CC: "11.0" <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  (cherry picked from commit 2d97a78b37ddf325d90e056f5eefee0548092530)
* mesa: Skip redundant texture completeness checking during image validation. | Francisco Jerez | 2015-10-21 | 1 | -1/+2
  The call to _mesa_test_texobj_completeness() is unnecessary if the texture is already known to be complete. If the texture object is dirtied in the meantime _BaseComplete and _MipmapComplete will be reset to false. _mesa_is_image_unit_valid() will start to be called more frequently in a future commit, so it seems desirable to avoid the unnecessary work.
  Tested-by: Ye Tian <[email protected]>
  CC: "11.0" <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  (cherry picked from commit 25d3338be37ddbfe676716034ec5f29e27323704)
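The caching pattern this commit relies on can be sketched as follows. The struct, its fields and the helper names are hypothetical stand-ins, and the exact condition Mesa uses may differ; the sketch only shows that the full check is skipped while the cached flags are set and runs again after the object is dirtied.

```c
#include <stdbool.h>

/* Hypothetical, simplified texture object with cached completeness flags. */
struct tex_obj {
   bool base_complete;
   bool mipmap_complete;
   int  checks_run;   /* counts how often the full test ran */
};

/* Stand-in for the expensive completeness test. */
static void
test_completeness(struct tex_obj *t)
{
   t->checks_run++;
   t->base_complete = true;
   t->mipmap_complete = true;
}

/* Any state change resets the cached flags. */
static void
mark_dirty(struct tex_obj *t)
{
   t->base_complete = false;
   t->mipmap_complete = false;
}

/* Run the full test only when the cached result is not available. */
static bool
ensure_complete(struct tex_obj *t)
{
   if (!t->base_complete || !t->mipmap_complete)
      test_completeness(t);
   return t->base_complete && t->mipmap_complete;
}
```

Calling `ensure_complete()` repeatedly on an unchanged texture costs one flag test per call rather than a full validation.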
* mesa: Expose function to calculate whether a shader image unit is valid. | Francisco Jerez | 2015-10-21 | 2 | -4/+15
  A future commit will remove all texture object-dependent derived state from the image unit struct to make validation unnecessary on texture state changes. Instead of checking gl_image_unit::_Valid drivers will be required to call this function when needed to find out whether an image unit is in a valid state and whether access from the shader is allowed.
  Tested-by: Ye Tian <[email protected]>
  CC: "11.0" <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  (cherry picked from commit 5152db415f4047569822d648fda09bdde4171d6d)
* i965: Don't tell the hardware about our UAV access. | Francisco Jerez | 2015-10-21 | 6 | -19/+41
  The hardware documentation relating to the UAV HW-assisted coherency mechanism and UAV access enable bits is scarce and sometimes contradictory, and there's quite some guesswork behind this commit, so let me summarize the background first:
  HSW and later hardware have infrastructure to support a stricter form of data coherency between shader invocations from separate primitives. The mechanism is controlled by the "Accesses UAV" bits on 3DSTATE_VS, _HS, _DS, _GS and _PS (or _PS_EXTRA on BDW+), and the "UAV Coherency Required" bit on the 3DPRIMITIVE command.
  Regardless of whether "UAV Coherency Required" is set, the hardware fixed-function units will increment a per-stage semaphore for each request received if "Accesses UAV" is set for the same or any lower stage. An implicit DC flush is emitted by the lowermost stage with "Accesses UAV" set once it's done processing the request, this also happens regardless of the value of "UAV Coherency Required". The completion of the DC flush will cause the same stage and all previous ones to decrement the semaphore, marking the UAV accesses for the primitive as coherent with L3.
  The "UAV Coherency Required" 3DPRIMITIVE bit will cause a pipeline stall before any threads are dispatched for the first FF stage with "Accesses UAV" set until the semaphore is cleared for the same stage. Effectively this guarantees that UAV memory accesses performed by previous primitives from any stage will be strictly ordered (and thanks to the implicit DC flush visible in memory) with UAV accesses from the following primitives.
  None of this is required by the usual image, atomic counter and SSBO GL APIs which have very relaxed cross-primitive coherency and ordering requirements, so we don't actually ever set the "UAV Coherency Required" bit -- Ordering with respect to shader invocations from previous stages on the same primitive where there is a data dependency is of course already guaranteed as the spec requires, regardless of this mechanism being enabled.
  We do set the "Accesses UAV" bits though since my commit ac7664e493655e290783c23a0412b9c70936da50 (which this patch partially reverts), mainly because of comments like the following from the BDW PRM:
  > 3DSTATE_GS
  >[...]
  > 12 Accesses UAV
  > Format: Enable
  > This field must be set when GS has a UAV access.
  There are similar comments in the documentation for the other 3DSTATE_*S commands. The "must" part is misleading and unjustified AFAIK. Most of the "Accesses UAV" bits don't seem to have any side effects other than the implicit DC flushes and the related book-keeping in anticipation for a subsequent primitive with "UAV Coherency Required" set, so in most cases they are unnecessary and may incur a performance penalty.
  There is an exception though. On Gen8+ the PS_EXTRA UAV access bit influences the calculation of the PS UAV-only and ThreadDispatchEnable signals which on previous generations were set explicitly by the driver, so we cannot always avoid enabling it on the PS stage.
  The primary motivation for this change is that in fact the hardware coherency mechanism is buggy and will cause a rather non-deterministic hang on Gen8 when VS is the only stage with "Accesses UAV" set and the processing of a request terminates immediately after the implicit DC flush is sent for a previous primitive with no additional vertices being emitted for the second primitive, which will cause the hardware to skip sending a second DC flush and cause the VS to stall indefinitely waiting for a response from the DC (BDWGFX HSD 1912017).
  This hardware bug can be reproduced on current master with the spec@arb_shader_image_load_store@host-mem-barrier@Indirect/RaW piglit subtest (if you have the patience to run it a few dozen times).
  The proposed workaround is to insert CS STALLs speculatively between 3DPRIMITIVE commands when "Accesses UAV" is enabled for the VS stage only. Because this would affect one of the hottest paths in the driver and likely decrease performance even further due to the unnecessary serialization, and because we don't actually need the implicit DC flushes, it seems better to just disable them.
  Cc: 11.0 <[email protected]>
  (cherry picked from commit 5346c1167064d6429c6338974c6342f8346fd34b)
* mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span | Tapani Pälli | 2015-10-21 | 1 | -0/+15
  The patch adds the missing type (used with NV_read_depth) so that it gets handled correctly. This fixes errors seen with the following CTS test:
  ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Cc: "11.0" <[email protected]>
  (cherry picked from commit d8d0e4a81e42678cc8c8b876dfee24d5c2f4ba38)
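For readers unfamiliar with the packed type, GL_UNSIGNED_INT_24_8 stores depth in the upper 24 bits of a 32-bit word and stencil in the low 8 bits. The following is a minimal sketch of packing a normalized depth value into that layout; the function name is hypothetical and this is not the code added to _mesa_pack_depth_span.

```c
#include <stdint.h>

/* Pack a normalized depth value into the upper 24 bits of a 24_8 word.
 * The low 8 (stencil) bits are left as zero in this sketch. */
static uint32_t
pack_depth_24_8(float depth)
{
   /* Clamp to the normalized range first. */
   if (depth < 0.0f) depth = 0.0f;
   if (depth > 1.0f) depth = 1.0f;

   /* Scale in double precision, then truncate to a 24-bit integer. */
   uint32_t z = (uint32_t)((double)depth * 0xffffff);
   return z << 8;
}
```

A depth of 1.0 maps to 0xffffff00 and a depth of 0.0 maps to 0, with the stencil byte untouched by the depth value.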
* nouveau: make sure there's always room to emit a fence | Ilia Mirkin | 2015-10-21 | 8 | -41/+32
  I started seeing a lot of situations on nv30 where fence emission wouldn't fit into the previous buffer (causing assertions). This ensures that whenever checking for space, we always leave a bit of extra room for the fence emission commands. Adjusts the nv30 and nvc0 fence emission logic to bypass the space checking as well.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
  Reviewed-by: Samuel Pitoiset <[email protected]>
  (cherry picked from commit 47d11990b2ca3eb666b8ac81fee7f7eb5019eba1)
  Squashed with commit
  nouveau: avoid emitting new fences unnecessarily
  Right now we emit on every kick, but this is only necessary if something will ever be able to observe that the fence completed. If there are no refs, leave the fence alone and emit it another day.
  This also happens to work around an issue for the kick handler -- a kick can be a result of e.g. nouveau_bo_wait or explicit kick, or it can be due to lack of space in the pushbuf. We want the emit to happen in the current batch, so we want there to always be enough space. However an explicit kick could take the reserved space for the implicitly-triggered kick's fence emission if it happened right after. With the new mechanism, hopefully there's no way to cause two fences to be emitted into the same reserved space.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
  Cc: [email protected]
  (cherry picked from commit 8053c9208f30964d89dc4e262fdf2148f0664696)
  Squashed with commit
  nv50,nvc0: don't base decisions on available pushbuf space
  We still have to push everything out, might as well kick earlier and flip pushbufs when we know we'll need it.
  This resolves some issues with the new policy of making sure that we always leave a bit of room at the end for fences.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
  Cc: [email protected]
  (cherry picked from commit 9fe458335ffd35366ef0f4b741aad0cdb3503783)
  Squashed with commit
  nouveau: avoid double-emitting fence
  The act of ensuring that there is space can cause a flush to happen, which will emit the current screen fence. If that is the fence we're trying to wait on, then it will have been emitted as a result of doing the PUSH_SPACE. Don't attempt to emit it a second time.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Fixes: 8053c9208f (nouveau: avoid emitting new fences unnecessarily)
  Cc: [email protected]
  (cherry picked from commit bf97f8d467ad1d485c2327da3f4fe1f9e1dc7379)
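The reservation policy described in the first of these squashed commits can be sketched roughly as below. The struct, the reserve size and the function names are hypothetical and do not come from the nouveau code; the sketch only shows ordinary space checks keeping a tail reserve that the fence emission path is allowed to use.

```c
#include <stdbool.h>

/* Hypothetical number of words kept free for a fence at the buffer's end. */
enum { FENCE_RESERVE = 8 };

/* Hypothetical, simplified push buffer: words used so far and capacity. */
struct pushbuf {
   unsigned cur;
   unsigned end;
};

/* Ordinary commands must leave the fence reserve untouched. */
static bool
has_space(const struct pushbuf *p, unsigned words)
{
   return p->cur + words + FENCE_RESERVE <= p->end;
}

/* Fence emission bypasses the reserve and may use the tail of the buffer. */
static bool
has_space_for_fence(const struct pushbuf *p, unsigned words)
{
   return p->cur + words <= p->end;
}
```

Because regular commands never consume the reserve, a fence of up to `FENCE_RESERVE` words still fits in the current batch when a kick happens.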
* docs: add sha256 checksums for 11.0.3 | Emil Velikov | 2015-10-10 | 1 | -1/+2
  Signed-off-by: Emil Velikov <[email protected]>
* docs: add release notes for 11.0.3 [tag: mesa-11.0.3] | Emil Velikov | 2015-10-10 | 1 | -0/+184
  Signed-off-by: Emil Velikov <[email protected]>
* Update version to 11.0.3 | Emil Velikov | 2015-10-10 | 1 | -1/+1
  Signed-off-by: Emil Velikov <[email protected]>
* Revert "nouveau: make sure there's always room to emit a fence" | Emil Velikov | 2015-10-10 | 4 | -8/+2
  This reverts commit 30570b262971c881366deab58caf8d8d48d7d79d.
  As mentioned by Ilia Mirkin:
    Please remove this one from your list of cherry-picked patches. While it fixes real issues on nv30 (and probably the other generations too), it appears to introduce some new ones on nvc0. I've figured out what's causing it, but haven't figured out a proper fix. Not sure I'll be able to before you do a release.