path: root/src
Commit message | Author | Age | Files | Lines
* intel/fs: Initialize fs_visitor::grf_used on construction. | Francisco Jerez | 2017-12-21 | 1 | -0/+1
  This should shut up some Valgrind errors during pre-regalloc scheduling. The errors were harmless since they could only have led to the estimation of the bank conflict penalty of an instruction pre-regalloc, which is inaccurate at that point of the program compilation, but no less accurate than the intended "return 0" fall-back path. The scheduling pass is normally re-run after regalloc with a well-defined grf_used value and accurate bank conflict information.
  Fixes: acf98ff933d "intel/fs: Teach instruction scheduler about GRF bank conflict cycles."
  Reported-by: Eero Tamminen <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* intel/fs/bank_conflicts: Use posix_memalign() instead of overaligned new to obtain vector storage. | Francisco Jerez | 2017-12-21 | 1 | -6/+16
  The weight_vector_type constructor was inadvertently assuming C++17 semantics of the new operator applied on a type with alignment requirement greater than the largest fundamental alignment. Unfortunately on earlier C++ dialects the implementation was allowed to raise an allocation failure when the alignment requirement of the allocated type was unsupported, in an implementation-defined fashion. It's expected that a C++ implementation recent enough to implement P0035R4 would have honored allocation requests for such over-aligned types even if the C++17 dialect wasn't active, which is likely the reason why this problem wasn't caught by our CI system.
  A more elegant fix would involve wrapping the __SSE2__ block in a '__cpp_aligned_new >= 201606' preprocessor conditional and continue taking advantage of the language feature, but that would yield lower compile-time performance on old compilers not implementing it (e.g. GCC versions older than 7.0).
  Fixes: af2c320190f3c731 "intel/fs: Implement GRF bank conflict mitigation pass."
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104226
  Reported-by: Józef Kucia <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
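As a rough illustration of the approach this commit describes, the sketch below obtains over-aligned array storage with posix_memalign() instead of a new-expression. The vec16 type, its 16-byte alignment, and both helper names are hypothetical stand-ins and are not taken from the Mesa sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h>   // posix_memalign(), free() (POSIX)

// Hypothetical stand-in for a SIMD-friendly vector type; the name and the
// 16-byte alignment are illustrative assumptions.
struct alignas(16) vec16 {
   int16_t v[8];
};

// Allocate storage for n elements with the alignment vec16 requires,
// without relying on C++17 over-aligned operator new.
static vec16 *
alloc_vec16_array(size_t n)
{
   void *p = nullptr;
   // posix_memalign() returns 0 on success; the alignment must be a power
   // of two and a multiple of sizeof(void *).
   if (posix_memalign(&p, alignof(vec16), n * sizeof(vec16)) != 0)
      return nullptr;
   assert(reinterpret_cast<uintptr_t>(p) % alignof(vec16) == 0);

   // Value-initialize each element in place so the objects formally exist.
   vec16 *a = static_cast<vec16 *>(p);
   for (size_t i = 0; i < n; i++)
      new (a + i) vec16();
   return a;
}

static void
free_vec16_array(vec16 *p)
{
   free(p);   // memory from posix_memalign() is released with free()
}
```

Because the alignment is passed explicitly to the allocator, this behaves the same under any C++ dialect, which is the property the commit message is after.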
* Revert "spirv: consider bitsize when handling OpSwitch cases" | Mark Janes | 2017-12-21 | 1 | -11/+3
  This reverts commit 9702fac68e8bd07be8871f7925d7f9fb98da3699, which hangs vulkancts and crucible on all platforms.
  The patch is being reverted because it disables continuous integration testing. The patch from bug 104359 does not apply to master.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359
* radv: fix issue with multisample positions and interp_var_at_sample. | Dave Airlie | 2017-12-22 | 1 | -1/+2
  This fixes vmfaults seen on vega with:
    dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_single_sample_.128_128_1.samples_1
  These were triggered by the "don't allocate cmask" change, but that was just accidental. The actual problem was that the shader was trying to get the sample positions from a buffer, but the buffer was never configured to contain them, as the previous shader never needed them.
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Fixes: 1171b304f3 (radv: overhaul fragment shader sample positions.)
  Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix primitive topology when adjacency is used | Samuel Pitoiset | 2017-12-21 | 1 | -1/+1
  Found by inspection.
  Cc: 17.3 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl: disable vec3 packing/splitting in tfb separate mode | Brian Paul | 2017-12-20 | 1 | -1/+13
  This fixes a varying packing issue when using transform feedback in GL_SEPARATE_ATTRIBS mode.
  By the time we get to linking, we already know that the number of feedback attributes is under the GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit, so packing isn't as critical. In fact, packing/splitting vec3 attributes can cause trouble because splitting effectively creates another TFB output, which can exceed device limits. So, disable vec3 packing when it's not needed to avoid that issue.
  Fixes the Piglit ext_transform_feedback-separate test on VMware driver.
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: simply packing class comparison | Brian Paul | 2017-12-20 | 1 | -2/+3
  Handle comparing the packing class using the same method as we do for var->data.is_xfb_only
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: document varying_matches::assign_locations() params and return value | Brian Paul | 2017-12-20 | 1 | -2/+7
  And change *components to components[] as a reminder that it's an array.
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: remove some continue statements | Brian Paul | 2017-12-20 | 1 | -13/+11
  In some cases, I think loop code is easier to read without continue statements.
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: use bitwise operators in varying_matches::compute_packing_class() | Brian Paul | 2017-12-20 | 1 | -5/+10
  The mix of bitwise operators with * and + to compute the packing_class values was a little weird. Just use bitwise ops instead.
  v2: add assertion to make sure interpolation bits fit without collision, per Timothy. Basically, rewrite function to be simpler.
  Reviewed-by: Timothy Arceri <[email protected]>
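The idea of building the packing class purely from shifts and ORs, guarded by an assertion that the interpolation value fits in its bit field, can be sketched as follows. The varying_info struct, its field names, and the chosen bit positions are illustrative assumptions, not the actual Mesa definitions.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the per-variable qualifiers; the
// field names and bit positions are illustrative assumptions.
struct varying_info {
   unsigned interpolation;  // interpolation mode, assumed to fit in 3 bits
   bool centroid;
   bool sample;
   bool patch;
};

static unsigned
compute_packing_class(const varying_info &var)
{
   // Make sure the interpolation value cannot spill into the flag bits.
   assert(var.interpolation < (1u << 3));

   // Each qualifier occupies its own bit range, so two variables get the
   // same class only if all of their qualifiers match.
   return (var.interpolation << 0) |
          (unsigned(var.centroid) << 3) |
          (unsigned(var.sample) << 4) |
          (unsigned(var.patch) << 5);
}
```

Giving every field a disjoint bit range makes it easy to see at a glance that distinct qualifier combinations cannot collide, which is harder to verify when multiplication and addition are mixed in.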
* glsl: simplify loop in varying_matches::assign_locations() | Brian Paul | 2017-12-20 | 1 | -5/+5
  The use of break/continue was kind of weird/confusing.
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: minor simplification in assign_varying_locations() | Brian Paul | 2017-12-20 | 1 | -5/+3
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: make varying_matches::is_varying_packing_safe() const | Brian Paul | 2017-12-20 | 1 | -2/+2
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: trivial comment fixes in lower_packed_varyings.cpp | Brian Paul | 2017-12-20 | 1 | -1/+1
  Reviewed-by: Timothy Arceri <[email protected]>
* spirv: Makefile.nir.am: include vtn_gather_types_c.py script in tarball dist | Juan A. Suarez Romero | 2017-12-20 | 1 | -0/+1
  Fixes: bb1e6ff161c ("spirv: Add a prepass to set types on vtn_values")
  Reviewed-by: Emil Velikov <[email protected]>
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* st/dri: allow direct YUYV import | Lucas Stach | 2017-12-20 | 1 | -0/+7
  Push this format to the pipe driver unchanged.
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Philipp Zabel <[email protected]>
* spirv: consider bitsize when handling OpSwitch cases | Juan A. Suarez Romero | 2017-12-20 | 1 | -3/+11
  When walking over all the cases in an OpSwitch, take into account the bitsize of the literals to avoid getting wrong cases.
  Reviewed-by: Jason Ekstrand <[email protected]>
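To make the bitsize issue concrete: in SPIR-V, a case literal has the width of the selector type, and a 64-bit literal occupies two words with the low-order word first, so the stride between cases depends on that width. The sketch below decodes the case list under that rule. The function name and signature are hypothetical, and since the log shows this commit was later reverted, this is not the code that landed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Decode the (literal, target-label) pairs that follow the selector and
// default label of an OpSwitch. `words` points at the first case literal,
// `count` is the number of remaining words, and `bit_size` is the width of
// the selector type (assumed here to be at most 32, or exactly 64).
static std::vector<std::pair<uint64_t, uint32_t>>
decode_switch_cases(const uint32_t *words, size_t count, unsigned bit_size)
{
   assert(bit_size <= 32 || bit_size == 64);
   const size_t literal_words = bit_size == 64 ? 2 : 1;
   const size_t case_words = literal_words + 1;   // literal + label id
   assert(count % case_words == 0);

   std::vector<std::pair<uint64_t, uint32_t>> cases;
   for (size_t i = 0; i < count; i += case_words) {
      uint64_t literal = words[i];
      if (literal_words == 2)
         literal |= uint64_t(words[i + 1]) << 32;   // low-order word first
      cases.emplace_back(literal, words[i + literal_words]);
   }
   return cases;
}
```

Treating every case as two words regardless of the selector width would misread the high half of a 64-bit literal as a label, which is the kind of wrong case the commit message refers to.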
* drirc: set allow_glsl_cross_stage_interpolation_mismatch for more games | Tapani Pälli | 2017-12-20 | 1 | -0/+8
  Signed-off-by: Tapani Pälli <[email protected]>
  Suggested-by: Darius Spitznagel <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104288
  Acked-by: Kenneth Graunke <[email protected]>
* anv: disallow VK_REMAINING_ARRAY_LAYERS in vkCmdClearAttachments() | Samuel Iglesias Gonsálvez | 2017-12-20 | 1 | -0/+2
  Vulkan spec doesn't specify that VK_REMAINING_ARRAY_LAYERS is allowed in the passed VkClearRect struct.
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nvc0/ir: change textureGrad to always use lane 0 as the tex origin | Ilia Mirkin | 2017-12-19 | 1 | -14/+46
  Thanks to Karol Herbst for the debugging / tracing work that led to this change.
  Move to using lane 0 as the "work" lane for the texture. It is unclear why this helps, as that computation should be identical to doing it in the "correct" lane with the properly adjusted quadops.
  In order to be able to use the lane 0 result, we also have to ensure that lane 0 contains the proper array/indirect/shadow values.
  This applies to Fermi and Kepler. Maxwell+ may or may not need fixing, but that lowering logic is separate.
  Fixes KHR-GL45.texture_cube_map_array.sampling
  Signed-off-by: Ilia Mirkin <[email protected]>
* broadcom/vc5: Add missing setting of the UIF XOR disable flag in textures. | Eric Anholt | 2017-12-19 | 2 | -0/+4
  Most piglit textures happened to work out by RGBW not changing in that bit, but it did cause failures in RGBA16F fbo-generatemipmap-formats.
* broadcom/vc5: Clean up the comment and code around level 0 UIF. | Eric Anholt | 2017-12-19 | 1 | -14/+10
  I wrote this early in driver development, and our UIF handling is much better now.
* broadcom/vc5: Simplify the tiling calculations. | Eric Anholt | 2017-12-19 | 1 | -49/+11
  The mb_tile_layout table was just the utile_w/h times two, so reuse the utile code instead.
* broadcom/vc5: Return the depth in all components of depth textures. | Eric Anholt | 2017-12-19 | 1 | -6/+6
  Apparently gallium's u_blitter wants depth from at least the .z component, and other swizzling appears to apply on top of that.
  Fixes fbo-generatemipmap-formats failures with depth formats.
* broadcom/vc5: Enable decompressing RGTC for desktop GL support. | Eric Anholt | 2017-12-19 | 1 | -1/+1
  This matches freedreno's behavior.
* broadcom/vc5: Use u_transfer_helper for MSAA mappings. | Eric Anholt | 2017-12-19 | 2 | -98/+6
* broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT. | Eric Anholt | 2017-12-19 | 3 | -5/+94
  There may be some more RCL work to be done (I think I need to split my Z/S stores when doing separate stencil), but this gets piglit's "texwrap GL_ARB_depth_buffer_float" working.
  v2: Unwrap the z32f_wrapper before calling the helper, rather than having the helper have a callback.
  v3: Rebase on Rob Clark's u_transfer_helper instead
* freedreno: add debug flag to force high priority context | Rob Clark | 2017-12-19 | 3 | -1/+5
  Mainly for testing, FD_MESA_DEBUG=hiprio will force high priority contexts.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: context priority support | Rob Clark | 2017-12-19 | 3 | -2/+20
  For devices (and kernels) which support different priority ringbuffers, expose context priority support.
  Signed-off-by: Rob Clark <[email protected]>
* gallium: plumb context priority through to driver | Rob Clark | 2017-12-19 | 22 | -2/+71
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Andres Rodriguez <[email protected]>
  Reviewed-by: Wladimir J. van der Laan <[email protected]>
* intel/compiler/gen10: Disable push constants. | Rafael Antognolli | 2017-12-19 | 2 | -0/+16
  We still have GPU hangs on Cannonlake when using push constants, so disable them for now until we have a proper fix for these hangs.
  v2: Add warning message when creating context too.
  Signed-off-by: Rafael Antognolli <[email protected]>
  Cc: Ben Widawsky <[email protected]>
  Cc: Kenneth Graunke <[email protected]>
  Reviewed-by: Ben Widawsky <[email protected]>
* radv: properly load unused gl_LocalInvocationID/gl_WorkGroupID components | Samuel Pitoiset | 2017-12-19 | 2 | -5/+23
  F1 2017 looks good now.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not add extra SGPR when push constants are not used | Samuel Pitoiset | 2017-12-19 | 1 | -1/+2
  Just because the vertex stage needs some push constants does not mean that other stages need them too. This should reduce the number of loaded SGPRs in some situations.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: change the needs_push_constants logic | Samuel Pitoiset | 2017-12-19 | 1 | -4/+4
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store pipeline stages that need push constants | Samuel Pitoiset | 2017-12-19 | 2 | -0/+4
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove one useless check in ac_nir_shader_info_pass() | Samuel Pitoiset | 2017-12-19 | 1 | -4/+2
  pipeline->layout can't be NULL now.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove one useless check in radv_flush_constants() | Samuel Pitoiset | 2017-12-19 | 1 | -1/+2
  pipeline->layout can't be NULL now.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add assertions to make sure pipeline layout objects are valid | Samuel Pitoiset | 2017-12-19 | 1 | -0/+2
  The spec requires it.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: create pipeline layout objects for all meta operations | Samuel Pitoiset | 2017-12-19 | 4 | -2/+80
  They are dummy objects but the spec requires layout to not be NULL, this just makes sure we are creating valid pipeline layout objects.
  This will allow us to remove some useless checks.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use a sort for rebuilding the sparse buffer bo list. | Bas Nieuwenhuizen | 2017-12-19 | 1 | -21/+24
  It uses slightly more memory (though still bounded by the number of mapped ranges), but gives less quadratic behavior.
  Cuts 4 minutes from the runtime of the CTS *.sparse.* tests.
  Reviewed-by: Eric Engestrom <[email protected]>
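The trade-off described here (a temporary array no larger than the number of mapped ranges, in exchange for avoiding a duplicate scan per entry) can be sketched as a gather, sort, and de-duplicate pass. The mapped_range struct and rebuild_bo_list() are hypothetical names, and using handle 0 to mean "no backing BO" is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a mapped range of a sparse buffer; `bo` is an
// opaque handle to the backing buffer object, with 0 meaning "unbacked".
struct mapped_range {
   uint64_t offset;
   uint64_t size;
   uint32_t bo;
};

// Rebuild the list of distinct backing BOs: gather, sort, drop duplicates.
// This costs O(n log n) rather than the O(n^2) of checking every new entry
// against the list built so far.
static std::vector<uint32_t>
rebuild_bo_list(const std::vector<mapped_range> &ranges)
{
   std::vector<uint32_t> bos;
   bos.reserve(ranges.size());   // temporary storage bounded by range count
   for (const mapped_range &r : ranges) {
      if (r.bo != 0)
         bos.push_back(r.bo);
   }
   std::sort(bos.begin(), bos.end());
   bos.erase(std::unique(bos.begin(), bos.end()), bos.end());

   // After sort + unique, no two neighbouring entries may be equal.
   assert(std::adjacent_find(bos.begin(), bos.end()) == bos.end());
   return bos;
}
```

With many ranges sharing a handful of BOs, the sort dominates the cost, which is where a saving of the kind reported for the sparse CTS tests would come from.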
* freedreno/ir3: handle VTXID_BASE for indirect draws | Rob Clark | 2017-12-19 | 1 | -2/+41
  Need to do some gymnastics to copy the parameter from the indirect parameters buffer to uniform so shader sees the correct base-vertex-id.
  Fixes ./bin/arb_draw_indirect-vertexid on a5xx and probably a4xx too.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add ctx->mem_to_mem() | Rob Clark | 2017-12-19 | 4 | -14/+49
  For dealing with indirect-draw + gl_VertexID, we'll introduce another case where we need to use CP_MEM_TO_MEM. Rather than adding more if(a5xx)/else make this a ctx vfunc.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: use vertex_id_zero_base | Rob Clark | 2017-12-19 | 2 | -20/+1
  Cmdstream traces from blob make it clear that the blob driver devs *think* a5xx has a real (non-zero-based) vtxid. But reality claims differently.
  Fixes ./bin/gl-3.2-basevertex-vertexid and probably others.
  This means draw-indirect is going to need some gymnastics to copy base-vertex into uniform. (a4xx probably needs that too.)
  Signed-off-by: Rob Clark <[email protected]>
* r600: clear compressed flags in image state on unbind. | Dave Airlie | 2017-12-19 | 1 | -0/+2
  If we aren't binding an image, clear the compressed flags.
  This fixes a segfault seen with an apitrace.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104331
  Signed-off-by: Dave Airlie <[email protected]>
* swr: Account for index_bias in offsets | George Kyriazis | 2017-12-18 | 1 | -3/+3
  When calculating buffer offsets for client buffers, account for info.index_bias.
  Fixes the following piglit tests:
    arb_draw_elements_base_vertex-drawelements-user_varrays
    arb_draw_elements_base_vertex-negative-index-user_varrays
  Reviewed-By: Bruce Cherniak <[email protected]>
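As a loose illustration of why the bias matters for client (user-pointer) vertex arrays: the vertices actually fetched are index + index_bias, so the byte range read from the array shifts by index_bias * stride. The sketch below computes that range. The struct, the function, and all parameter names are hypothetical and do not mirror the swr code.

```cpp
#include <cassert>
#include <cstdint>

// Half-open byte range [start, end) relative to the client array pointer.
// Signed, because index_bias may be negative.
struct byte_range {
   int64_t start;
   int64_t end;
};

// Compute the range of a client vertex array touched by an indexed draw
// whose indices span [min_index, max_index] before the bias is applied.
static byte_range
client_buffer_range(uint32_t min_index, uint32_t max_index,
                    int32_t index_bias, uint32_t stride)
{
   assert(min_index <= max_index);
   const int64_t first = int64_t(min_index) + index_bias;
   const int64_t last = int64_t(max_index) + index_bias;
   return { first * stride, (last + 1) * stride };
}
```

Leaving the bias out would compute the range for the unbiased indices, so the wrong part of the client array would be covered.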
* r600: only reported tgsi ir compute support on evergreen+ | Dave Airlie | 2017-12-18 | 1 | -1/+3
  This fixes a crash on r600/r700.
  Signed-off-by: Dave Airlie <[email protected]>
* radv: Advertise sync fd import and export. | Bas Nieuwenhuizen | 2017-12-18 | 1 | -4/+15
  Passes dEQP-VK.*.sync_fd.*
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement sync file import/export for fences & semaphores. | Bas Nieuwenhuizen | 2017-12-18 | 1 | -28/+87
  Reviewed-by: Dave Airlie <[email protected]>
* radv/amdgpu: wrap sync fd import/export. | Bas Nieuwenhuizen | 2017-12-18 | 2 | -0/+26
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: fix lds store for patch outputs. | Dave Airlie | 2017-12-19 | 1 | -1/+1
  This wasn't calculating the correct value. This, along with a nir patch, fixes a regression in:
    dEQP-VK.tessellation.shader_input_output.barrier
  Fixes: 043d14db30a (ac/nir: don't write tcs outputs to LDS that aren't read back.)
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>