path: root/src
Commit message (Author, Age, Files, Lines)
* radeonsi: don't hang on shader compile failure (Marek Olšák, 2017-03-29, 1 file, -1/+1)
| | | | | | | Cc: 17.0 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> (cherry picked from commit 518d8341627ac80f8757fd09cc3cd5c2884f58e0)
* swr: [rasterizer jitter] fix llvm >= 5.0 build break (Tim Rowley, 2017-03-29, 3 files, -3/+3)
| | | | | | | | | Function::getArgumentList() doesn't exist anymore, switch to using arg_begin() (existed back to at least llvm-3.6.0). Reviewed-by: Vedran Miletić <[email protected]> CC: <[email protected]> (cherry picked from commit 08f864abd9e241c7db9c99212a66cdad69bdd4d8)
* anv/image: Return early when unbinding an image (Jason Ekstrand, 2017-03-29, 1 file, -4/+5)
| | | | | | | | | | Found by inspection. Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "17.0 13.0" <[email protected]> (cherry picked from commit c942faf8f37d14e7934a21c15ad2438dde2d501e)
* mesa/main: fix MultiDrawElements[BaseVertex] validation of primcount (Nicolai Hähnle, 2017-03-29, 2 files, -3/+23)
| | | | | | | | | | | | | | | | | | | | | | | | primcount must be a GLsizei as in the signature for MultiDrawElements or bad things can happen. Furthermore, an error should be flagged when primcount is negative. Curiously, this code used to work somewhat correctly even when primcount was negative, because the loop that checks count[i] would iterate out of bounds and almost certainly hit a negative value at some point. Found by an ASAN error in GL45-CTS.gtf32.GL3Tests.draw_elements_base_vertex.draw_elements_base_vertex_primcount Note that the OpenGL spec seems to have s/primcount/drawcount/ at some point, and the code still reflects the old language. v2: provide the correct spec quotes (pointed out by Ian) Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> (v1) Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit c11dcfb5e9b051b9036949b3e40a9dc15138bd97)
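The validation order described above can be sketched as follows. This is a minimal illustration, not the Mesa code: the type and enum names are hypothetical stand-ins, and the choice of an "invalid value" error for a negative primcount is an assumption, since the commit message only says that an error should be flagged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GLsizei and the GL error codes. */
typedef int sketch_sizei;   /* signed, like GLsizei */
enum sketch_error { SKETCH_NO_ERROR, SKETCH_INVALID_VALUE };

/* Reject a negative primcount before touching count[], then check each
 * count[i] strictly within bounds. */
static enum sketch_error
sketch_validate_multi_draw(const sketch_sizei *count, sketch_sizei primcount)
{
   if (primcount < 0)
      return SKETCH_INVALID_VALUE;

   for (sketch_sizei i = 0; i < primcount; i++) {
      if (count[i] < 0)
         return SKETCH_INVALID_VALUE;
   }
   return SKETCH_NO_ERROR;
}
```

Keeping primcount signed matters here: with an unsigned type, a negative value passed by the caller would turn into a very large loop bound instead of being rejected.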
* i965: Fall back to GL 4.2/4.3 on Haswell if the kernel isn't new enough. (Kenneth Graunke, 2017-03-29, 1 file, -2/+9)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit d2590eb65ff28a9cbd592353d15d7e6cbd2c6fc6 I enabled GL 4.5 on Haswell...but failed to check if we could do indirect compute shader dispatch...and query buffer objects. Indirect compute shader dispatch requires command parser version 5 (kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in Linux v4.4). On earlier kernels we would have disabled ARB_compute_shader, which is a mandatory part of OpenGL 4.3+. Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG, which mean command parser version 7 (Linux v4.8). On earlier kernels we would have disabled ARB_query_buffer_object, which is a mandatory part of OpenGL 4.4+. The new version support looks like: - Kernel 4.1 and older => OpenGL 3.3 - Kernel 4.2-4.3 => OpenGL 4.2 - Kernel 4.4-4.7 => OpenGL 4.3 - Kernel 4.8+ => OpenGL 4.5 Cc: "17.0" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> (cherry picked from commit 9b324e4dca4754801e5db59aba0ab559f2cf35ea)
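The version table in this commit message can be written as a small lookup. The sketch below is only an illustration: the function name is hypothetical, and it keys on the kernel version as the table does, whereas the commit message indicates the real check depends on the command parser version (5 and 7).

```c
#include <assert.h>

/* Returns the highest OpenGL version for Haswell as major * 10 + minor
 * (e.g. 45 for OpenGL 4.5), following the table in the commit message.
 * Assumes kernel_minor < 100. */
static int
sketch_haswell_max_gl_version(int kernel_major, int kernel_minor)
{
   int kernel = kernel_major * 100 + kernel_minor;

   if (kernel >= 408)   /* command parser version 7: query buffer objects */
      return 45;
   if (kernel >= 404)   /* command parser version 5: indirect compute dispatch */
      return 43;
   if (kernel >= 402)
      return 42;
   return 33;
}
```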
* intel: Correct the BDW surface state size (Nanley Chery, 2017-03-29, 2 files, -4/+3)
| | | | | | | | | | | | The PRMs state that this packet is 16 DWORDS long. Ensure that the last three DWORDS are zeroed as required by the hardware when allocating a null surface state. Cc: <[email protected]> Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 7c50f9903f58ef04ff393505a383d06f499f1fdc)
* anv/genX: Solve the vkCreateGraphicsPipelines crash (Xu,Randy, 2017-03-29, 1 file, -2/+2)
| | | | | | | | | | | | | The crash is due to NULL pColorBlendState, which is legal if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use any color attachments. Test: Sample subpasses from LunarG can run without crash Signed-off-by: Xu,Randy <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.0 13.0" <[email protected]> (cherry picked from commit 57595cb0739d50a3fbd0841d7475bd775f3e24f3)
* radv: fix primitive reset index emission (Dave Airlie, 2017-03-29, 1 file, -1/+1)
| | | | | | | | | | | This was meant to be checking the index type to get the correct index not the last emitted one. This fixes: dEQP-VK.pipeline.input_assembly.primitive_restart.index_type_uint32.triangle_strip_with_adjacency Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "13.0 17.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit d06e168b878be45029bf66c2ac627d16144a7823)
* st/mesa: set result writemask based on ir type (Ilia Mirkin, 2017-03-29, 1 file, -0/+1)
| | | | | | | | | | | This prevents textureQueryLevels, which maps as LODQ, from ending up with a xyzw writemask, which is illegal. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100061 Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> (cherry picked from commit dab88e9af7a35ebcdd0fc87df97f4b13e908552a)
* nvc0/ir: treat FMA like MAD for operand propagation (Karol Herbst, 2017-03-29, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | Helps mainly Feral-ported games, due to their use of fma() shader-db changes: total instructions in shared programs : 3901147 -> 3842505 (-1.50%) total gprs used in shared programs : 471258 -> 467359 (-0.83%) total local used in shared programs : 27405 -> 27361 (-0.16%) total bytes used in shared programs : 35749888 -> 35214176 (-1.50%) local gpr inst bytes helped 17 1829 4091 4091 hurt 4 44 3 3 Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: [email protected] (cherry picked from commit 09f16de7e624938d46a63b8285fc5b21050962e9)
* anv/GetQueryPoolResults: Actually implement the spec (Jason Ekstrand, 2017-03-29, 1 file, -16/+36)
  The Vulkan spec is fairly clear about when we should and should not write query pool results. We're also supposed to return VK_NOT_READY if VK_QUERY_RESULT_PARTIAL_BIT is not set and we come across any queries which are not yet finished. This fixes rendering corruptions on The Talos Principle where geometry flickers in and out due to bogus query results being returned by the driver. These issues are most noticeable on Sky Lake GT4 when running on "ultra" settings. Reviewed-By: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100182 Cc: "17.0 13.0" <[email protected]> (cherry picked from commit 08df015b9de8ccb16ce6db93890910f8a02be4c6) [Andres Gomez: use anv_query.c instead of genX_query.c] Signed-off-by: Andres Gomez <[email protected]> Conflicts: src/intel/vulkan/genX_query.c
* anv/query: Invalidate the correct range (Jason Ekstrand, 2017-03-29, 1 file, -2/+6)
| | | | | | | | | | | Reviewed-By: Lionel Landwerlin <[email protected]> Cc: "17.0 13.0" <[email protected]> (cherry picked from commit 81840130c0f147ed6ae4c26872c2f04a2167bc54) [Andres Gomez: use anv_query.c instead of genX_query.c] Signed-off-by: Andres Gomez <[email protected]> Conflicts: src/intel/vulkan/genX_query.c
* i965/gen8+: Do full stall when switching pipeline (Topi Pohjolainen, 2017-03-29, 1 file, -1/+2)
| | | | | | | | | | just as earlier gens do. CC: "17.0 13.0" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96743 Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]> (cherry picked from commit bd25d9670b466043cdb5d9668f82accbd587c889)
* Revert "radv: Emit cache flushes before CP DMA." (Bas Nieuwenhuizen, 2017-03-17, 1 file, -3/+0)
| | | | | | | | | This reverts commit cce43f6d8c40222099badaf52344d6a0eed993f3. Redundant, as the flush already happens at si_cp_dma_prepare. Acked-by: Dave Airlie <[email protected]> (cherry picked from commit ad4dee521d7968a88393dc3685e7c593d27efba5)
* radv/ac: Fix shared memory offset calculation (Alex Smith, 2017-03-17, 1 file, -1/+1)
| | | | | | | | | | | | The index passed to get_shared_memory_ptr is an attribute slot index, i.e. the index of a vec4 within LDS. Therefore this must be scaled by sizeof(vec4) to give the LDS byte offset. Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> CC: <[email protected]> (cherry picked from commit ce4058dafd2dd283addaa99e8d5b51e53f634f9b)
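The scaling described above amounts to one multiplication. The following is a minimal sketch with hypothetical names, assuming a vec4 of four 32-bit floats (16 bytes); it is not the LLVM IR that the driver actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of one attribute slot: a vec4 of four 32-bit floats. */
#define SKETCH_VEC4_SIZE ((uint32_t)(4 * sizeof(float)))

/* Convert an attribute slot index into an LDS byte offset. */
static uint32_t
sketch_lds_byte_offset(uint32_t slot_index)
{
   return slot_index * SKETCH_VEC4_SIZE;
}
```

Using the slot index directly as a byte offset would make slots 1, 2 and 3 all alias the first vec4.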
* radv: Fix using more than 4 bound descriptor sets (James Legg, 2017-03-17, 1 file, -1/+3)
| | | | | | | | | Avoid a buffer overflow in ac_nir_to_llvm.c's create_function when using more than 4 descriptor sets. radv claims support for 8. Cc: 17.0 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit e88cac1df03d01a9e8a1de1a4a2ee888149e727a)
* radeonsi: disable sinking common instructions down to the end block (Samuel Pitoiset, 2017-03-17, 1 file, -0/+10)
  Initially this was a workaround for a bug introduced in LLVM 4.0 in the SimplifyCFG pass that caused image intrinsics to disappear (because they were badly sunk). Finally, this is a win because it decreases SGPR spilling and increases the number of waves a bit. Although shader-db results are good, I think we might want to remove it in the future once the issue is fixed. For now, enable it for LLVM >= 4.0. This also fixes a rendering issue with the speedometer in Dirt Rally. More information can be found here https://reviews.llvm.org/D26348. Thanks to Dave Airlie for the patch. v2: - add a FIXME comment - use if (HAVE_LLVM >= 0x0400) instead Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988 Signed-off-by: Samuel Pitoiset <[email protected]> Cc: 17.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]> (cherry picked from commit 7751ed39e40e08e5aa0633d018c9f25ad17f9bb0) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
* radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer (Alex Smith, 2017-03-16, 1 file, -0/+2)
| | | | | | | | | | | | | | | | Need to flush before updating the buffer to ensure that the copy is ordered after previous accesses (assuming the app has performed the appropriate barriers). This fixes potential issues due to draws prior to an update reading the new buffer content, despite having the necessary barriers between them. Signed-off-by: Alex Smith <[email protected]> Cc: 17.0 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]> (cherry picked from commit e0cc32b85bd8cf5c2202037838a208983e2d793a)
* radv: Emit cache flushes before CP DMA. (Bas Nieuwenhuizen, 2017-03-16, 1 file, -0/+3)
| | | | | | | | | The flushes could be due to TRANSFER barriers. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Cc: 17.0 <[email protected]> Reviewed-by: Dave Airlie <[email protected]> (cherry picked from commit cce43f6d8c40222099badaf52344d6a0eed993f3)
* nir/intrinsics: Make load_barycentric_input take a 2-component coor (Jason Ekstrand, 2017-03-16, 1 file, -1/+3)
| | | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: "17.0 13.0" <[email protected]> (cherry picked from commit 60d1aac28a1f44ac166e72262e378e063155d6fd)
* anv/blorp: Only set a clear color for resolves if fast-cleared (Jason Ekstrand, 2017-03-16, 1 file, -1/+2)
| | | | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: "17.0" <[email protected]> (cherry picked from commit 678fd00f2f5b213d0317ba51a8163c4c5bd1f3dc)
* anv/blorp: Turn off AUX after doing a CCS_D resolve (Jason Ekstrand, 2017-03-16, 1 file, -0/+2)
  For render passes with multiple subpasses on gen7, we only fast-clear at the top but an input attachment use can cause us to do a resolve in the middle of the render pass. Once we've done so, we no longer have a fast-cleared surface so we can just set aux_usage to NONE. Reviewed-by: Topi Pohjolainen <[email protected]> Cc: "17.0" <[email protected]> (cherry picked from commit 273b720310863c2084c55f1371b2d27c2d96dbda)
* clover: Work around build failure with AltiVec. (Matt Turner, 2017-03-16, 1 file, -0/+3)
| | | | | | | Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504 Acked-by: Francisco Jerez <[email protected]> (cherry picked from commit 7d1195c1e4d071fe796bf5f210c468ea1cc86225)
* nvc0: increase alignment to 256 for texture buffers on fermi (Ilia Mirkin, 2017-03-16, 1 file, -1/+3)
| | | | | | | | | | | When binding as textures, the alignment can be 16. However when binding as an image, the address has to be aligned to 256. (Also when binding as an RT, but that can't happen with GL or current gallium APIs.) Reported-by: Roy Spliet <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Samuel Pitoiset <[email protected]> (cherry picked from commit 32dd8d59b6d1b6828e16e854d589d0f04536da14)
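To make the two alignment requirements concrete, here is a minimal sketch with hypothetical helper names. It only illustrates rounding an offset up to the stricter of the two values named above (16 for texture binding, 256 for image binding); it is not how the driver reports or enforces the limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment required for a buffer range, per the commit message:
 * 16 bytes when only bound as a texture, 256 when it may be bound
 * as an image. */
static uint64_t
sketch_texture_buffer_alignment(bool may_bind_as_image)
{
   return may_bind_as_image ? 256 : 16;
}

/* Round value up to a power-of-two alignment. */
static uint64_t
sketch_align_up(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

An offset of 300, for example, is acceptable for texture binding once rounded to 304, but has to move to 512 if the same range may later be bound as an image.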
* glapi: fix typo in count_scale (Gregory Hainaut, 2017-03-16, 1 file, -1/+1)
| | | | | | | | | 2*4=8 Signed-off-by: Gregory Hainaut <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit 2ab5eccf5de4a68d0d8d2668f6c5244cc6a41846)
* vulkan/wsi: Improve the DRI3 error message (Jacob Lifshay, 2017-03-16, 1 file, -10/+41)
| | | | | | | | | | | | | | | | | | This commit improves the message by telling them that they could probably enable DRI3. More importantly, it includes a little heuristic to check to see if we're running on AMD or NVIDIA's proprietary X11 drivers and, if we are, doesn't emit the warning. This way, users with both a discrete card and Intel graphics don't get the warning when they're just running on the discrete card. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715 Co-authored-by: Jason Ekstrand <[email protected]> Reviewed-by: Kai Wasserbäch <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Rene Lindsay <[email protected]> Acked-by: Dave Airlie <[email protected]> Cc: "17.0" <[email protected]> (cherry picked from commit 3d8feb38e8fdbc05b591164cb934b48a495adfbc)
* anv: Properly handle destroying NULL devices and instances (Jason Ekstrand, 2017-03-16, 1 file, -0/+6)
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "17.0 13.0" <[email protected]> (cherry picked from commit e3d33a23e6cbe2b73b412a56bb4fc4aa6852d081)
* anv/image: Remove extra dependency on HiZ-specific variable (Nanley Chery, 2017-03-16, 1 file, -2/+7)
| | | | | | | | | | | | | | | | | surf_usage is only useful to image views that may use HiZ buffers. Storage image views don't use HiZ buffers. v2: Update commit message and add an assertion. Fixes: 055ff2ec521 ("anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZ") Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 258af3a856328934d30b7cdf626d5fdba76852f2) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/intel/vulkan/anv_image.c
* radv: setup llvm target data layout (Dave Airlie, 2017-03-16, 1 file, -0/+7)
| | | | | | | | | | | | | | | | | | | | Ported from radeonsi, pointed out by Tom. "This prevents LLVM from using sext instructions for local memory offsets and allows the backend to fold immediate offsets into the instruction. This also prevents some incorrect code generation for ptrtoint and inttoptr instructions." Cc: "13.0 17.0" <[email protected]> Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit b8ee70384adc3286d18febba7a92047118cc0f0f) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/amd/common/ac_nir_to_llvm.c
* radeonsi: mark all bound shader buffer ranges as initialized (Marek Olšák, 2017-03-16, 1 file, -0/+3)
| | | | | | | | | This should prevent cases when a buffer was incorrectly mapped without synchronization just because this wasn't done. Cc: 13.0 17.0 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (cherry picked from commit 71a2e4e9452a6890197f8b629b2d8359bdd58913)
* anv: Stall before fast-clear operations (Jason Ekstrand, 2017-03-16, 1 file, -6/+19)
  During initial CCS bring-up, I discovered that you have to do a full CS stall prior to doing a CCS resolve as well as afterwards. It appears that the same is needed for fast-clears as well. This fixes rendering corruptions on The Talos Principle on Sky Lake GT4. The issue hasn't been demonstrated on any other hardware; however, given that this appears to be a "too many things in the pipe" problem, having it be easier to reproduce on a system with more EUs makes sense. The issue with resolves is demonstrable on a GT3 or GT2 so this is probably also a problem on all GTs. Reviewed-by: Topi Pohjolainen <[email protected]> Cc: "13.0 17.0" <[email protected]> (cherry picked from commit 6b644e571e2344691e4d58ff0bba3ddc059c1a5d)
* anv: Accurately advertise dynamic descriptor limits (Jason Ekstrand, 2017-03-16, 1 file, -2/+2)
| | | | | | | | | | | The number of dynamic descriptors is limited by both the number of descriptors and the total number of dynamic things. Because there isn't a single "maximum dynamic things" limit, we need to divide by two so that they can create the maximum of both UBOs and SSBOs. Reviewed-by: Eduardo Lima Mitev <[email protected]> Cc: "17.0 13.0" <[email protected]> (cherry picked from commit 5e44ef4a76d9a3681fb6be605319250d4ab800ee)
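The "divide by two" reasoning can be sketched as below. The struct, the function and the budget of 16 used in the usage note are hypothetical; the sketch only shows why halving a shared budget lets an application request the advertised maximum of both dynamic UBOs and dynamic SSBOs at the same time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of advertised limits for dynamic descriptors. */
struct sketch_dynamic_limits {
   uint32_t max_uniform_buffers_dynamic;
   uint32_t max_storage_buffers_dynamic;
};

/* Split one shared budget of dynamic slots evenly, so that using the
 * maximum of both kinds together never exceeds the budget. */
static struct sketch_dynamic_limits
sketch_split_dynamic_budget(uint32_t total_dynamic_slots)
{
   struct sketch_dynamic_limits limits = {
      .max_uniform_buffers_dynamic = total_dynamic_slots / 2,
      .max_storage_buffers_dynamic = total_dynamic_slots / 2,
   };
   return limits;
}
```

With a shared budget of 16 slots this advertises 8 of each kind; advertising 16 for both would let a valid application ask for 32 in total.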
* i965: move brw_define.h ifndef guard to the top (Emil Velikov, 2017-03-16, 1 file, -3/+3)
| | | | | | | | | | | | Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 077078ce77e8653725def01ed291eb486989a9ad) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/mesa/drivers/dri/i965/brw_defines.h
* radv: disable mip point pre clamping. (Dave Airlie, 2017-03-16, 1 file, -1/+1)
| | | | | | | | | | No idea what this does, but disabling it fixes a bunch of failing CTS tests in the lod area, so let's go with that. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "13.0 17.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit d81bd2f75462646d3803d683a28f6682a2ce3078)
* radv/ac: fix multiple descriptor sets with dynamic buffers (Fredrik Höglund, 2017-03-16, 1 file, -3/+5)
| | | | | | | | | | | The dynamic_offset_offset in the descriptor set binding layout is relative to the dynamic_offset_start for the set in the pipeline layout. Cc: 17.0 <[email protected]> Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit 162beb2abbe6b81d81863b3ac88ec8effcbf7c9d)
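The relationship between the two fields can be sketched as a simple sum. The field names below are taken from the commit message, but the struct layouts and the helper are hypothetical simplifications rather than the radv definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced views of the pipeline layout's per-set data and
 * of a binding inside a descriptor set layout. */
struct sketch_pipeline_set {
   uint32_t dynamic_offset_start;    /* first dynamic slot of this set */
};

struct sketch_binding_layout {
   uint32_t dynamic_offset_offset;   /* relative to the set's start */
};

/* Absolute index of a binding's dynamic buffer in the pipeline-wide array. */
static uint32_t
sketch_dynamic_buffer_index(const struct sketch_pipeline_set *set,
                            const struct sketch_binding_layout *binding)
{
   return set->dynamic_offset_start + binding->dynamic_offset_offset;
}
```

Dropping the per-set start only goes unnoticed for set 0, where the start is zero, which is consistent with the bug showing up once several sets carry dynamic buffers.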
* radv: fix the dynamic buffer index in vkCmdBindDescriptorSets (Fredrik Höglund, 2017-03-16, 1 file, -1/+1)
| | | | | | | | | | This fixes the wrong dynamic buffer descriptors being updated when firstSet > 0. Cc: 17.0 <[email protected]> Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit 0941d1a574f46c558b0037be81d9a57004f4290b)
* radv: Disable HTILE for textures with multiple layers/levels. (Bas Nieuwenhuizen, 2017-03-16, 1 file, -0/+3)
| | | | | | | | | | It has issues and the fix I'm working on is too complicated for stable, so disable for now. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]> CC: 13.0 17.0 <[email protected]> (cherry picked from commit 0ab2dd361fd80c3840b1547cb7e05b4361eaf928)
* radv: Emit pending flushes before executing a secondary command buffer (Alex Smith, 2017-03-16, 1 file, -0/+3)
| | | | | | | | | | | | | | | If we have any pending flushes on the primary command buffer, these must be performed before executing the secondary buffer. This fixes potential corruption when the contents of a subpass which clears any of its render targets are given in a secondary buffer: the flushes after a fast clear would not have been performed until the vkCmdEndRenderPass call. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: 13.0 17.0 <[email protected]> (cherry picked from commit 290d7e892dfa6d04767142f4f6d7ec689933a105)
* radv: drop Z24 support. (Dave Airlie, 2017-03-16, 1 file, -3/+0)
| | | | | | | | | | This isn't exposed in -pro, the hw docs say it is deprecated, so let's not bother with it. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "13.0 17.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit cc59e24a6bc9bf8b51a22785beb07089770bec8d)
* nvc0: take extra pushbuf space into account for pushbuf_space calls (Ilia Mirkin, 2017-03-16, 1 file, -2/+2)
| | | | | | | | | | | | | | | | | | See detailed explanation of why this is needed in commit eb60a89bc3a. This spot was missed/overlooked. Basically as a result of the fact that BEGIN_* ends up calling PUSH_SPACE, which in turn adds an extra 8 to the requested amount, we have to be mindful of that when doing bare nouveau_pushbuf_space calls. Reportedly this fixes some crashes when replaying a hitman trace taken on radeonsi. Fixes: eb60a89bc3a ("nouveau: take extra push space into account for pushbuf_space calls") Cc: "13.0 17.0" <[email protected]> Reported-by: Karol Herbst <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (cherry picked from commit 8e6d67685e10b001e07f92a7a6aaff4fe987b6f2)
* anv/pass: Avoid accessing attachment array out of bounds (Nanley Chery, 2017-03-16, 1 file, -9/+13)
| | | | | | | Cc: <[email protected]> Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 178f9e5f29f3fe83eb5af99a69d7c583c30d21d1)
* ralloc: Make sure ralloc() allocations match malloc()'s alignment. (Jonas Pfeil, 2017-03-16, 1 file, -1/+14)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The header of ralloc needs to be aligned, because the compiler assumes that malloc returns will be aligned to 8/16 bytes depending on the platform, leading to degraded performance or alignment faults with ralloc. Fixes SIGBUS on Raspberry Pi at high optimization levels. This patch is not perfect for MSVC, as maybe in the future the alignment for the most demanding data type might change to more than 8. v2: Commit message reword/typo fix, and add a bigger explanation in the code (by anholt) Signed-off-by: Jonas Pfeil <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: [email protected] (cherry picked from commit cd2b55e536dc806f9358f71db438dd9c246cdb14) Squashed with ralloc: don't leave out the alignment factor Experimentation shows that without alignment factor gcc and clang choose a factor of 16 even on IA-32, which doesn't match what malloc() uses (8). The problem is it makes gcc assume the pointer is 16 byte aligned, so with -O3 it starts using aligned SSE instructions that later fault, so always specify a suitable alignment factor. Cc: Jonas Pfeil <[email protected]> Fixes: cd2b55e5 "ralloc: Make sure ralloc() allocations match malloc()'s alignment." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100049 Signed-off-by: Grazvydas Ignotas <[email protected]> Tested by: Mike Lothian <[email protected]> Tested by: Jonas Pfeil <[email protected]> (cherry picked from commit ff494fe999510ea40e3ed5827e7818550b6de126)
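The alignment problem can be illustrated with a header-prefixed allocator. This is a minimal sketch under stated assumptions: it uses C11 `_Alignas(max_align_t)` rather than the attribute-based approach the commit messages describe, and every `sketch_` name is hypothetical. The point is only that padding the header to a multiple of the malloc() alignment keeps the payload pointer as aligned as a plain malloc() result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of every allocation. Aligning its first member
 * pads sizeof(struct sketch_header) to a multiple of that alignment. */
struct sketch_header {
   _Alignas(max_align_t) struct sketch_header *parent;
   size_t size;
};

/* Allocate size bytes of payload behind a header. The returned pointer is
 * malloc()'s pointer plus sizeof(header), so it keeps malloc()'s alignment. */
static void *
sketch_alloc(size_t size)
{
   struct sketch_header *header = malloc(sizeof(*header) + size);
   if (header == NULL)
      return NULL;

   header->parent = NULL;
   header->size = size;
   return header + 1;
}

/* Step back over the header to recover the pointer malloc() returned. */
static void
sketch_free(void *payload)
{
   if (payload != NULL)
      free((struct sketch_header *)payload - 1);
}
```

Without the explicit alignment, a header whose size is not a multiple of the malloc() alignment would shift every payload to a weaker boundary, which matches the faults the commit message reports once the compiler assumes malloc()-like alignment.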
* mesa: Avoid read of uninitialized variable (Robert Foss, 2017-03-15, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | The is_color_attachement variable is later read when handling two separate error cases, where only one of the cases results in the variable being initialized. This can be avoided by giving the variable a safe default value. Coverity-Id: 1398631 Cc: [email protected] Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]> (cherry picked from commit 88becf73022d780cfd0d7dbc5bb3911f8b0d2b11)
* egl: Ensure ResetNotificationStrategy matches for shared contexts. (Kenneth Graunke, 2017-03-15, 1 file, -0/+14)
| | | | | | | | | | | Fixes: dEQP-EGL.functional.robustness.negative_context.invalid_robust_shared_context_creation Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Cc: [email protected] (cherry picked from commit 4061bbccf2ad81612afbf8c3ded58c3b7146c5b2)
* st/mesa: inform the driver of framebuffer changes before compute dispatches (Nicolai Hähnle, 2017-03-15, 1 file, -1/+9)
  Even though compute shaders cannot access the framebuffer, there is a synchronization issue when a compute dispatch accesses a texture that was previously bound and drawn to as a framebuffer. Section 9.3 (Feedback Loops Between Textures and the Framebuffer) of the OpenGL 4.5 spec rather implicitly clarifies that undefined behavior results if the texture is still attached to the currently bound framebuffer. However, the feedback loop is broken when the application changes the framebuffer binding before a compute dispatch, and the state tracker needs to let the driver know about this. Fixes GL45-CTS.compute_shader.pipeline-post-fs on SI family Radeons. Cc: [email protected] Signed-off-by: Marek Olšák <[email protected]> (cherry picked from commit 40c77bbf83a369f21c5a95f14417348aae2dbe42)
* st/glsl_to_tgsi: avoid iterating past the head of the instruction list (Nicolai Hähnle, 2017-03-15, 1 file, -2/+9)
| | | | | | | | | | | | exec_node::get_prev() does not guard against going past the beginning of the list, so we need to add explicit checks here. Found by ASAN in piglit arb_shader_storage_buffer_object-rendering. Cc: [email protected] Signed-off-by: Marek Olšák <[email protected]> (cherry picked from commit 911391bd70fe30ad970c5e56632b2d7ccc29d955)
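The kind of explicit check mentioned above can be sketched on a simplified list. This is an assumption-laden illustration: it uses a single circular sentinel and hypothetical `sketch_` names, which differs from Mesa's exec_list, but it shows the same hazard, namely that stepping to "previous" from the first element lands on a node that is not a real instruction.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_node {
   struct sketch_node *prev;
   struct sketch_node *next;
};

/* Circular list with one sentinel that is never a real element. */
struct sketch_list {
   struct sketch_node sentinel;
};

static void
sketch_list_init(struct sketch_list *list)
{
   list->sentinel.prev = &list->sentinel;
   list->sentinel.next = &list->sentinel;
}

static void
sketch_list_push_tail(struct sketch_list *list, struct sketch_node *node)
{
   node->prev = list->sentinel.prev;
   node->next = &list->sentinel;
   list->sentinel.prev->next = node;
   list->sentinel.prev = node;
}

/* Guarded step backwards: returns NULL instead of handing out the
 * sentinel when node is already the first element. */
static struct sketch_node *
sketch_prev_or_null(const struct sketch_list *list,
                    const struct sketch_node *node)
{
   assert(node != &list->sentinel);
   return node->prev == &list->sentinel ? NULL : node->prev;
}
```

A caller that walks backwards can then stop on NULL rather than treating the sentinel's memory as an instruction.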
* i965/fs: emit MOV_INDIRECT with the source with the right register type (Samuel Iglesias Gonsálvez, 2017-03-15, 1 file, -1/+1)
| | | | | | | | | This was hiding bugs as it retyped the source to destination's type. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> (cherry picked from commit 0dddad5b1bb3b05190074a71d274c04c0b5ea700)
* i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles (Samuel Iglesias Gonsálvez, 2017-03-15, 1 file, -3/+3)
  When generating the MOV INDIRECT instruction, the source type is ignored and it is set to destination's type. However, this is going to change in a later patch, so we need to explicitly set the proper source type. brw_vec8_grf() creates a float-typed fs_reg by default, when the ICP handle is actually unsigned. This patch fixes these cases before applying the aforementioned patch. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> (cherry picked from commit d8122128bc6bd291ff0abcb7f2e52d9cdc631527)
* i965/fs: fix indirect load DF uniforms on BSW/BXT (Samuel Iglesias Gonsálvez, 2017-03-15, 1 file, -21/+20)
  The lowered BSW/BXT indirect move instructions had incorrect source types, which luckily wasn't causing incorrect assembly to be generated due to the bug fixed in the next patch, but would have confused the remaining back-end IR infrastructure due to the mismatch between the IR source types and the emitted machine code. v2: - Improve commit log (Curro) - Fix read_size (Curro) - Fix DF uniform array detection in assign_constant_locations() when it is accessed with 32-bit MOV_INDIRECTs in BSW/BXT. v3: - Move changes in assign_constant_locations() to other patch. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> (cherry picked from commit 56266df7ed9dbdf63acfd58944442893b4cd0c0b)
* i965/fs: detect different bit size accesses to uniforms to push them in proper locations (Samuel Iglesias Gonsálvez, 2017-03-15, 1 file, -16/+34)
  Previously, if we had accesses with different sizes to the same uniform, we might not push it aligned with the bigger one. This is a problem in BSW/BXT when we access an array of DF uniform with both direct and indirect addressing because for the latter we use 32-bit MOV INDIRECT instructions. However this problem can happen with other generations and bitsizes. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.0" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> (cherry picked from commit a497ab6838ae5a9898abfed82f7bc8295b490911)