path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* glsl: Use hash tables for opt_constant_propagation() kill sets.Kenneth Graunke2015-09-111-18/+28
| | | | | | | | | | | | | | | | Cuts compile/link time of the fragment shader in #91857 by 19% (16.28 -> 13.05). I didn't bother with the acp sets because they're smaller, but it might be worth doing as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Tested-by: Tapani Pälli <[email protected]> (cherry picked from commit 4654439fdd766f79a78fe0d812fd916f5815e7e6) Nominated-by: Emil Velikov <[email protected]>
* i965: Use hash tables for brw_fs_vector_splitting().Kenneth Graunke2015-09-111-22/+22
| | | | | | | | | | | | | | | | | Cuts compile/link time of the fragment shader in #91857 by 25% (21.64 -> 16.28). v2: Drop unnecessary _mesa_hash_table_destroy call, and use refs.ht->entries == 0 rather than ad-hoc checking (suggested by Timothy Arceri). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Tested-by: Tapani Pälli <[email protected]> (cherry picked from commit e20f30eb5181cddf8286d2247cfaf7e0fac7e417) Nominated-by: Emil Velikov <[email protected]>
* glsl: Use hash tables in opt_constant_variable().Kenneth Graunke2015-09-111-18/+21
| | | | | | | | | | | | | | Cuts compile/link time of the fragment shader in bug #91857 by 31% (31.79 -> 21.64). It has over 8,000 variables so linked lists are terrible. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Tested-by: Tapani Pälli <[email protected]> (cherry picked from commit 2fc0ce293ac58237f02cc5dd2eee4e35abea06b5) Nominated-by: Emil Velikov <[email protected]>
* meta: Always bind the textureIan Romanick2015-09-111-3/+6
| | | | | | | | | | | | | | | | | We may have been called from glGenerateTextureMipmap with CurrentUnit still set to 0, so we don't know when we can skip binding the texture. Assume that _mesa_BindTexture will be fast if we're rebinding the same texture. v2: Remove currentTexUnitSave because it is now unused. Suggested by both Neil and Anuj. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91847 Cc: "11.0" <[email protected]> Reviewed-by: Neil Roberts <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 767c33e88138afa64443417860b264a494eba33d)
* r600g: use pipe_resource::width0 instead of pb_buffer::sizeMarek Olšák2015-09-112-6/+6
| | | | | | | | | | | | pb_buffer::size was aligned by 29aaab2b5f55cc6d9a84f58ce2bb8607e76a9dde, which broke the CMASK code I think. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91881 Cc: 11.0 <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> (cherry picked from commit 5c6c5b524649997805d0128d4df9dda5e8567cbb)
* radeonsi: enable VGPR spilling on VIMarek Olšák2015-09-111-3/+1
| | | | | | | | This fixes corruption in Unigine Heaven on VI Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> (cherry picked from commit 7956eae1c76e298ca1ded46679c1a9bf875ec4ee)
* winsys/amdgpu: calculate the maximum number of compute unitsMarek Olšák2015-09-111-2/+13
| | | | | | | | Required for register spilling. Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> (cherry picked from commit c6502e880bba00f8a68f004fe6be7a4bc275494a)
* clover: Avoid using typename to allow compilation of clover by clangAlbert Freeman2015-09-111-1/+1
| | | | | | | | | | | | | | | | | When parsing a variable declaration qualified with the typename keyword, clang attempted to declare a variable with the type of non type member "enum type type" of module::argument (within the header file clover/core/module.hpp) instead of the typed member of module::argument "enum type". Replaced "typename" with "enum" to force clang to declare the variable marg_type with type "enum type" of module::argument. CC: "11.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Signed-off-by: Albert Freeman <[email protected]> (cherry picked from commit 1691ead1b8ae4018a805af58977a43ef90af4203)
* i965: Advertise 65536 for GL_MAX_UNIFORM_BLOCK_SIZE.Kenneth Graunke2015-09-111-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Our old value of 16384 is the minimum value. DirectX apparently requires 65536 at a minimum; that's also what nVidia and the Intel Windows driver advertise. AMD advertises MAX_INT. Ilia Mirkin noticed that "Shadow Warrior" uses UBOs larger than 16k on Nouveau, which advertises 65536 bytes for this limit. Traces captured on Nouveau don't work on i965 because our lower limit causes the GLSL linker to reject the captured shaders. While this isn't important in and of itself, it does suggest that raising the limit would be beneficial. We can read linear buffers up to 2^27 bytes in size, so raising this should be safe; we could probably even go larger. For now, matching nVidia and Intel/Windows seems like a good plan. We have to reinitialize MaxCombinedUniformComponents as core Mesa will have set it based on a stale value for MaxUniformBlockSize. According to Tapani, there's an unreleased game that asserts on this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit bf58a2c362d5afdba512f40b3eb300154201c7f0)
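The need to reinitialize MaxCombinedUniformComponents follows from how the combined limit is derived from the block size. A minimal sketch in C, assuming the relationship used in the OpenGL limits tables (combined components = default-block components + blocks * block size / 4). The function name is hypothetical and the numbers in the note below are illustrative, not i965's actual limits:

```c
/* Hypothetical helper, not Mesa's code: derive the per-stage combined
 * uniform component limit from its inputs.
 *
 * components:       size of the default uniform block, in components
 * blocks:           number of uniform block binding points for the stage
 * block_size_bytes: GL_MAX_UNIFORM_BLOCK_SIZE; one component is 4 bytes
 */
static unsigned
combined_uniform_components(unsigned components, unsigned blocks,
                            unsigned block_size_bytes)
{
   return components + blocks * (block_size_bytes / 4);
}
```

With an assumed 4096 default components and 12 blocks, raising the block size from 16384 to 65536 moves the derived limit from 53248 to 200704, so a value computed from the old block size is stale and has to be recomputed.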
* nv50/ir: don't fold immediate into mad if registers are too highIlia Mirkin2015-09-111-0/+4
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91551 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit 74b86b971f3bf9b0482341b07c1cbc2e520fb1d0)
* nv50/ir: fix emission of 8-byte wide interp instructionIlia Mirkin2015-09-111-5/+6
| | | | | | | | | | This can come up if the target register number is > 63, which is fairly rare. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91551 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit ce28ca713364dbe83cb3c371ca034bc2c2947616)
* nv50/ir: r63 is only 0 if we are using less than 63 registersIlia Mirkin2015-09-111-1/+4
| | | | | | | | | | It is advantageous to use r63 instead of r127 since r63 can fit into the shorter encoding. However if we've RA'd over 63 registers, we must use r127 as the replacement instead. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit 641eda0c792e10c2792730b1833353564479a557)
* nv50/ir: make edge splitting fix up phi node sourcesIlia Mirkin2015-09-111-13/+77
| | | | | | | | | | | | | | | | | | | | | | Unfortunately nv50_ir phi nodes aren't directly connected to the CFG, so the mapping between source and the actual BB is by inbound edge order. So when manipulating edges one has to be extremely careful. We were insufficiently careful when splitting critical edges which resulted in the phi nodes being confused as to where their sources were coming from. This primarily manifests itself with the TXL-lowering logic on nv50, when it is inside of a conditional. I've been unable to trigger the issue anywhere else so far. This resolves rendering failures in a number of games like Two Worlds 2, Trine: Enchanted Edition, Trine 2, XCOM:Enemy Unknown, Stacking. It also improves the situation in Hearthstone, Sonic Generations, and The Raven: Legacy of a Master Thief. However more work needs to be done there (splitting a lot more edges solves it, so it's some other sort of RA-related issue). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90887 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit a072ef8748a65d286e9b542bb9ea6e020fdcc7f8)
* nvc0: remove BGRA4 format supportIlia Mirkin2015-09-111-0/+2
| | | | | | | | | | | | Something is wrong with the support somewhere. I couldn't get the blob driver to use it either, although it happily used RGB5_A1. teximage-colors works, but WoW seems to fail in the menus for drawing text. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]> (cherry picked from commit 342e68dc60eebb20ac1be9f47800ee9e604354f0)
* nvc0: keep track of cb bindings per buffer, use for upload settingsIlia Mirkin2015-09-117-12/+58
| | | | | | | | | | | | | | | | | | | | CB updates to bound buffers need to go through the CB_DATA endpoints, otherwise the shader may not notice that the updates happened. Furthermore, these updates have to go in to the same address as the bound buffer, otherwise, again, the shader may not notice updates. So we keep track of all the places where a constbuf is bound, and iterate over all of them when updating data. If a binding is found that encompasses the region to be updated, then we use the settings of that binding for the upload. Otherwise we upload as a regular data update. This fixes piglit 'arb_uniform_buffer_object-rendering offset' as well as blurriness in Witcher2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91890 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit e50c01d5af305e07110cb4a38d5a655437058f04)
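The lookup described above ("a binding that encompasses the region to be updated") can be sketched as a simple range-containment search. This is an illustration only: struct cb_binding and both function names are hypothetical and do not come from the nvc0 driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one place where a constant buffer is bound. */
struct cb_binding {
   unsigned offset;  /* start of the bound range within the buffer */
   unsigned size;    /* length of the bound range, in bytes */
};

/* True if the binding fully covers [offset, offset + size).
 * Written with subtractions so that no sum can overflow. */
static bool
binding_encompasses(const struct cb_binding *b, unsigned offset, unsigned size)
{
   return offset >= b->offset &&
          size <= b->size &&
          offset - b->offset <= b->size - size;
}

/* Walk every binding of the buffer and return the first one that covers
 * the update. NULL means no binding matched, in which case the caller
 * would fall back to a regular data upload. */
static const struct cb_binding *
find_binding_for_update(const struct cb_binding *bindings, size_t count,
                        unsigned offset, unsigned size)
{
   for (size_t i = 0; i < count; i++) {
      if (binding_encompasses(&bindings[i], offset, size))
         return &bindings[i];
   }
   return NULL;
}
```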
* nv30: Disable msaa unless requested from the env by NV30_MAX_MSAAHans de Goede2015-09-112-1/+21
| | | | | | | | | | | | | | | | | | | | | | Some modern apps try to use msaa without keeping in mind the restrictions on videomem of older cards. Resulting in dmesg saying: [ 1197.850642] nouveau E[soffice.bin[3785]] fail ttm_validate [ 1197.850648] nouveau E[soffice.bin[3785]] validating bo list [ 1197.850654] nouveau E[soffice.bin[3785]] validate: -12 Because we are running out of video memory, after which the program using the msaa visual freezes, and eventually the entire system freezes. To work around this we do not allow msaa visuals by default and allow the user to override this via NV30_MAX_MSAA. Signed-off-by: Hans de Goede <[email protected]> [imirkin: move env var lookup to screen so that it's only done once] Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]> (cherry picked from commit 3e9df0e3af7a8a84147ae48f588e9c435bf65b98)
* nv30: Fix color resolving for nv3x cardsHans de Goede2015-09-111-1/+37
| | | | | | | | | | | | | | We do not have a generic blitter on nv3x cards, so we must use the sifm object for color resolving. This commit divides the sources and dest surfaces in to tiles which match the constraints of the sifm object, so that color resolving will work properly on nv3x cards. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit ac066bf65cb585a4f6b4a2fb1d055b033f2b94ae)
* nouveau: android: add space before PRIx64 macroMauro Rossi2015-09-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | Otherwise the android build fails with error : unable to find string literal operator ‘operator"" PRIx64’ There are several resources referring to the problem, which is related to c++11, in our case used when building mesa for lollipop. http://comments.gmane.org/gmane.comp.graphics.opensg.user/5883 I've not investigated all the semantics, some people even suggested a bug in the gcc compiler, I just saw the building error was solved with one little space for lollipop and no side effect when c++11 not used. v2: [Emil Velikov] add an alternative commit message from Mauro. Cc: 11.0 <[email protected]> Reviewed-by: Emil Velikov <[email protected]> (cherry picked from commit e838d91b94c3d1d20db62a61bfd9163f675d3139)
* auxiliary: rework the python generated sources rulesEmil Velikov2015-09-111-12/+17
| | | | | | | | | | | | | | | | | | | There are a few bits this commit aims to resolve: One can generalise the mkdir rule to a simple MKDIR_P $(@D) which will expand appropriately even if we change the subdir name, and/or add new rules. We can also drop the explicit $(srcdir) prefix for the dependency rules, as they are not strictly required, nor used elsewhere in mesa. Finally replace $< with explicit filename to be consistent through the file, and honour PYTHON_FLAGS. v2: Add comprehensive commit summary/message (Ian, Matt) Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 0d39279448bbda6e824bcfd4997b4583bc0481af)
* glsl: build: remove bogus dependencyEmil Velikov2015-09-112-3/+2
| | | | | | | | | | v2: rebase on top of the previous commit - don't touch the LOCAL_PATH prefix for nir_constant_expressions.h Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit c373eaedfc09ff2af7002b64ba0ae8ba71df86a1)
* glsl: build: use makefile.sources variables when possibleEmil Velikov2015-09-113-18/+11
| | | | | | | | | | | | Rather than folding one variable within the other only to unwrap them, just use the ones we need. v2: bring back LOCAL_PATH prefix for nir_constant_expressions.h Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> (v1) (cherry picked from commit a3b05e04921a4fcc05cfc994e415e3ceb39fd184)
* glsl: automake: reuse $(NIR_GENERATED_FILES) where possibleEmil Velikov2015-09-111-5/+1
| | | | | | | Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit da5e4559ee3b239d2483645ed54b35aa6628fbaf)
* glsl: automake: rework the sources generation rulesEmil Velikov2015-09-111-16/+22
| | | | | | | | | | | | The glsl equivalent of "mesa: automake: rework the source generation rules". Plus let's make things consistent and always explicitly provide the header name. v2: Rebase on top of reverted "remove custom AM_V_LEX/YACC" (Matt) Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 9e0594418d8fa47e19bfe57450198d3fa7d087a0)
* mesa: automake: rework the source generation rulesEmil Velikov2015-09-111-27/+18
| | | | | | | | | | | | | | | | Same logic as previous commit applies. Additionally remove the odd (set -e/mv/INDENT) from the rules. The last one is the only one we remotely care about, if reading the generated sources. Upcoming work from DylanB which will replace the existing python scripts with ones that produce more readable output anyway. Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit fd913f47b7fcc724d8d191f2752f328d037abb20)
* mapi: automake: rework the source generation rulesEmil Velikov2015-09-111-19/+19
| | | | | | | | | | Same logic as previous commit applies. Also fix bogus MESA_MAPI_DIR - the sources are located in the source dir (duh). Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit 96509aa80429db1884a78fae95c169aa40641e84)
* mapi: automake: rework the *api/glapi_mapi_tmp.h rulesEmil Velikov2015-09-111-11/+12
| | | | | | | | | | | Same logic as previous commit applies. v2: Merge with "inline glapi_gen_mapi define" (Matt) Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit 449ce5d64f3d0e5840287040755df23e86ce6bb2)
* util: automake: rework the format_srgb.c ruleEmil Velikov2015-09-111-2/+4
| | | | | | | | | | | | | | | A handful of changes/cleanups paving the way to bmake support: - Remove optional $(srcdir)/ prefix for files in the prereq list. - Drop the space after the AM_V_GEN variable. - Using $< in a non-suffix rule is a GNU make idiom. - Use $(@D) over $(dir $@). The former is a POSIX standard. v2: Cosmetic tweaks in the commit summary. Cc: 11.0 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> (v1) (cherry picked from commit d65bd7a7be48d7805f68cd45218794f3e4590408)
* xmlpool: 'promote' LOCALEDIR variableEmil Velikov2015-09-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This is the only place in mesa that uses this construct, which seems to be a GNU make-ism. Attempting to build with POSIX make implementations (bmake) would fail as below. --- options.h --- LOCALEDIR := . sh: line 2: LOCALEDIR: command not found *** [options.h] Error code 127 So let's keep things consistent and compatible by making the variable non target specific. v2: - Bring back LOCALEDIR. - Reword the commit message - Change mesa-stable tag 10.6 > 11.0 Cc: 11.0 <[email protected]> Cc: Jonathan Gray <[email protected]> Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit c8984a7a4686c2045666d32fbe5733ff5a5c3bd8)
* r600: don't use shader key without verifying shader type (v2)Dave Airlie2015-09-111-7/+12
| | | | | | | | | | | | | | | | | | | | | | Since 7a32652231f96eac14c4bfce02afe77b4132fb77 r600: Turn 'r600_shader_key' struct into union we were accessing key fields that might be aliased in the union with other fields, so we should check what shader type we are compiling for before using key values from it. v1.1: make it compile v2: have caffeine, make it work - we don't set type until later, so don't reference it until we've set it. Reviewed-by: Edward O'Callaghan <[email protected]> Cc: "11.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit 6d2ceb10cd63b89892131a27d238620f00922dfb) Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/gallium/drivers/r600/r600_shader.c
* st/mesa: increase viewport bounds limits for GL4 hwIlia Mirkin2015-09-111-2/+7
| | | | | | | | | | According to the ARB_viewport_array spec, GL4 limit is higher than the GL3 limit. Also take this opportunity to fix the GL3 limit. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> (cherry picked from commit 458e55d7c5793b02af8b08ebec90906a829d3f65)
* nvc0: always emit a full shader colormaskIlia Mirkin2015-09-111-1/+1
| | | | | | | | | | | | | | | Indications are that if the colormask indicates a single bit set on fermi, that value will always be read from $r0 instead of a potentially higher register (if e.g. green is set). Not to upset the counting logic, always set the header up with a full color mask for each RT. Such a situation can basically only ever happen with generated blit shaders. Fixes the following piglit on Fermi (Kepler is unaffected): fbo-stencil blit GL_DEPTH32F_STENCIL8 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]> (cherry picked from commit 39df725f731f75f488c75a4910169beb352213fb)
* nv30: Fix max width / height checks in nv30 sifm codeHans de Goede2015-09-111-2/+2
| | | | | | | | | | | | | | The sifm object has a limit of 1024x1024 for its input size and 2048x2048 for its output. The code checking this was trying to be clever resulting in it seeing a surface of e.g 1024x256 being outside of the input size limit. This commit fixes this. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]> (cherry picked from commit 87073c69f3e253044bc235f34917aaa89041a63c)
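The limits quoted above amount to an independent check on each dimension. A minimal sketch in C, assuming only the 1024x1024 input and 2048x2048 output figures from the commit message. The macro and function names are hypothetical and this is not the nv30 driver code:

```c
#include <stdbool.h>

/* Limits quoted in the commit message for the sifm object. */
#define SIFM_MAX_SRC_DIM 1024
#define SIFM_MAX_DST_DIM 2048

/* Check width and height separately against the per-dimension limit, so
 * a 1024x256 source is accepted and a 1025x256 source is rejected. */
static bool
sifm_size_ok(unsigned src_w, unsigned src_h, unsigned dst_w, unsigned dst_h)
{
   return src_w <= SIFM_MAX_SRC_DIM && src_h <= SIFM_MAX_SRC_DIM &&
          dst_w <= SIFM_MAX_DST_DIM && dst_h <= SIFM_MAX_DST_DIM;
}
```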
* i965: Disallow fast blit paths for CopyTexImage with PixelTransfer opsChris Wilson2015-09-112-0/+8
| | | | | | | | | | | | | | | | | glCopyTexImage behaves similarly to glReadPixels with respect to the pixel transfer operations. Therefore if any are set we cannot use the simple blit-only fast paths. (Though it would be possible to relax the blorp path to handle pixel zoom, or we can just enhance meta.) Signed-off-by: Chris Wilson <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected] (cherry picked from commit be519c2d50f4aaa48fdb8b27707114cc5bfd348f)
* st/mesa: don't fall back to 16F when 32F is requestedIlia Mirkin2015-09-111-14/+8
| | | | | | | | | | | | Nothing in the spec allows for the reduced precision, and this also fixes st_QuerySamplesForFormat for nv50, which does not allow MS8 on RGBA32F. Now this will be respected instead of reporting MS8 as supported with an assumption that the format used will be RGBA16F. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> (cherry picked from commit e40f32d5626c87d9e77bbc261df3648cd54bd066)
* nouveau: don't mark full range as used on unmap with explicit flushIlia Mirkin2015-09-061-5/+7
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] (cherry picked from commit a778831735ea45f789c247c40677cd26adc78e3e)
* nv50: avoid using inline vertex data submit when gl_VertexID is usedIlia Mirkin2015-09-064-2/+14
| | | | | | | | | | | | The hardware only generates vertexid when vertices come from a VBO. This fixes: vertexid-drawelements vertexid-drawarrays Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit c830d193db5c90cf0af57ff73606e2aa12aed9a8)
* nv50: don't flush vertex arrays when index buffer changesIlia Mirkin2015-09-061-4/+0
| | | | | | | | | The index buffer is fed in inline over a pushbuf. It's not related to vertices or any caching that might be done on them. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] (cherry picked from commit 4a025c6bc835387a31007fdf30a130e612e54e19)
* nv50: rebind bo to bufctx when invalidating idxbuf storageIlia Mirkin2015-09-061-1/+5
| | | | | | | | | There is nothing to be done on a dirty idxbuf, but the bo may have changed, so we have to rebind it to the bufctx. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] (cherry picked from commit 1f62d36ae21043c472fc182fd4b738ec1d54a2d2)
* nv50: clear buffer status on all vertex bufs, not just the first oneIlia Mirkin2015-09-061-1/+0
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] (cherry picked from commit 114cc18b98b6e016ab1986577aa3df12acc22cca)
* nv50: fix drawing from tfb, direct-to-pushbuf submitsIlia Mirkin2015-09-064-14/+15
| | | | | | | | | | | | The stride was being set to 0, which is illegal (and also non-sensical). Also we must wait for the buffer to become available for reading as otherwise a wrong value may be prefetched. Since we must wait for the buffer anyways, and it's mapped and in GART, we may as well avoid the annoyance of the indirect pushbuf submit. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] (cherry picked from commit 75e34d1df8b0ab56e5e658b8ef90ff6057ec954e)
* llvmpipe: convert double to long long instead of unsigned long longOded Gabbay2015-09-061-1/+1
| | | | | | | | | | | | | | | | | | round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is an undefined behavior. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle conversion from that to unsigned long long. This bug manifests itself in POWER, where all IMM values of -1 are being converted to 0 implicitly, causing a wrong LLVM IR output. Signed-off-by: Oded Gabbay <[email protected]> CC: "10.6 11.0" <[email protected]> Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> (cherry picked from commit 4f2290d1612569686284609059d29a85c9de67cf)
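The two-step conversion described above can be sketched in isolation. This is an illustration under stated assumptions: the helper name is hypothetical, it is not the llvmpipe code, and its argument is assumed to be the already rounded result of round(val * dscale).

```c
#include <assert.h>

/* Hypothetical helper: turn a rounded double into the unsigned value an
 * API taking unsigned long long expects, without undefined behaviour. */
static unsigned long long
rounded_double_to_ull(double rounded)
{
   /* Precondition (assumed): the value fits in a long long. */
   assert(rounded >= -9.2e18 && rounded <= 9.2e18);

   /* double -> signed integer is defined for in-range values, including
    * negative ones. Converting a negative double directly to an unsigned
    * type is undefined. */
   long long as_signed = (long long) rounded;

   /* signed -> unsigned is always defined: the value wraps modulo
    * ULLONG_MAX + 1, so -1 becomes the all-ones bit pattern. */
   return (unsigned long long) as_signed;
}
```

Going through the signed type costs nothing at run time on common targets and makes an input of -1 produce the same all-ones result on every architecture.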
* nv30: Implement color resolve for msaaHans de Goede2015-09-062-14/+8
| | | | | | | | | | | | Note this is not ideal. Since the sifm can only do source sizes up to 1024x1024 we end up using the blitter on nv4x, which is not that fast. And on nv3x we end up using the cpu which is really slow. Cc: "10.6 11.0" <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (cherry picked from commit 3c6c4d4f298ec81fe57992790a68aaab2e573519)
* nv30: Fix creation of scanout buffersHans de Goede2015-09-061-0/+10
| | | | | | | | | | | | | | | | | | | | | | Scanout buffers on nv30 must always be non-swizzled and have special width alignment constraints. These constraints have been taken from the xf86-video-nouveau src/nv_accel_common.c: nouveau_allocate_surface() function. nouveau_allocate_surface() applies these width constraints only when a tiled attribute is set, which it sets for all surfaces allocated via dri, and this "tiling" is not the same as swizzling, scanout surfaces must be linear / have a uniform_pitch or only complete garbage is shown. This commit fixes dri3 on nv30 showing a garbled display, with dri3 the scanout buffers are allocated by mesa, rather than by the ddx, and the wrong stride of these buffers was causing the garbled display. Cc: "10.6 11.0" <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (cherry picked from commit 3329703eb116a7ad73bc694356b43e014532240b)
* vc4: Initialize pack field of qreg to 0 in qir_get_tempBoyan Ding2015-09-061-0/+1
| | | | | | | | | | | | | | This avoids generation of undefined packing in qir and qpu instructions, fixing a lot of rendering errors. Fixes 8b36d107fdd (vc4: Pack the unorm-packing bits into a src MUL instruction when possible.) Cc: [email protected] Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]> (cherry picked from commit 48de40ce9c45de154965490843f9e50407970c26)
* i965: Disallow PixelTransfer operations for tiled-memcpy TexImage/ReadPixelsChris Wilson2015-09-062-0/+8
| | | | | | | | | | | | | | | | The tiled memcpy fast paths perform a simple blit (with only a couple of trivial pixel conversion routines) and do not accommodate PixelTransfer operations. Therefore if any are set, fall back to the regular routines. Note that PixelTransfer only applies to TexImage and ReadPixels, not to GetTexImage. Signed-off-by: Chris Wilson <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected] (cherry picked from commit 099f5b3a62be1919add02a4cb887841c9f0f2fe4)
* i965: Fix copy propagation type changes.Kenneth Graunke2015-09-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | commit 472ef9a02f2e5c5d0caa2809cb736a0f4f0d4693 introduced code to change the types of SEL and MOV instructions for moves that simply "copy bits around". It didn't account for type conversion moves, however. So it would happily turn this: mov(8) vgrf6:D, -vgrf5:D mov(8) vgrf7:F, vgrf6:UD into this: mov(8) vgrf6:D, -vgrf5:D mov(8) vgrf7:D, -vgrf5:D which erroneously drops the conversion to float. Cc: "11.0 10.6" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit 2ace64fd598816fd1be9877962734242fc27b87b)
* winsys/radeon: remove exported buffers from the cacheMarek Olšák2015-09-061-0/+3
| | | | | | Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> (cherry picked from commit efea7c3a3f91219db6e2fa3588388b6be4ecfa40)
* winsys/amdgpu: remove exported buffers from the cacheMarek Olšák2015-09-061-0/+3
| | | | | | Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> (cherry picked from commit 54964c77510b060806615c842692c0f393e807e6)
* gallium/pb_bufmgr_cache: add a way to remove buffers from the cache explicitlyMarek Olšák2015-09-062-6/+41
| | | | | | | | | This must be done before exporting a buffer as dmabuf fds, because we lose track of who is using it and can't trust the reference counter. Cc: 11.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> (cherry picked from commit 35d0f12797237cdd38e7fd2c39d3c19e875875ca)
* glsl: Handle attribute aliasing in attribute storage limit check.Kenneth Graunke2015-09-061-28/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | In various versions of OpenGL and GLSL, it's possible to declare multiple VS input variables with aliasing attribute locations. So, when computing the storage requirements for vertex attributes, we can't simply add up the sizes. Instead, we need to look at the enabled slots. This patch begins tracking which attributes are double types that are larger than 128-bits (i.e. take up two vec4 slots). We then count normal attributes once, and count the double-size attributes a second time. Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests on i965, which regressed with commit ad208d975a6d3aebe14f7c2c16039ee20. No Piglit changes on llvmpipe (which actually supports dvecs). Cc: "10.6 11.0" <[email protected]> Tested-by: Mark Janes <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> (cherry picked from commit c3294ca5a13cf3f0eb3d9907a46ff8ce4bc2963b)
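The counting scheme described above (look at enabled slots, count normal attributes once and double-size attributes a second time) can be sketched with two bitmasks. This is a simplified illustration, not the linker code: the function names and the mask layout are assumptions.

```c
#include <stdint.h>

/* Count the set bits of a 64-bit mask without compiler builtins. */
static unsigned
count_bits(uint64_t mask)
{
   unsigned n = 0;
   while (mask) {
      mask &= mask - 1;   /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* used_mask:   bit i is set if at least one VS input occupies location i.
 *              Aliased inputs set the same bit, so they count only once.
 * double_mask: bit i is set if location i holds a double type wider than
 *              128 bits, which takes a second vec4 slot. */
static unsigned
attribute_slots_used(uint64_t used_mask, uint64_t double_mask)
{
   return count_bits(used_mask) + count_bits(used_mask & double_mask);
}
```

Two inputs aliased at location 3 give used_mask 0x8 and one slot, whereas summing their sizes would have reported two. A plain attribute at location 0 plus a double-size one at location 1 gives three slots.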