summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ac/radeonsi: add num_work_groups to the abiTimothy Arceri2018-02-074-6/+5
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac: implement nir_intrinsic_shader_clockTimothy Arceri2018-02-071-0/+3
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: create ac_build_shader_clock() helperTimothy Arceri2018-02-073-5/+11
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: add load_local_group_size() to the abiTimothy Arceri2018-02-073-0/+6
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add get_block_size() helperTimothy Arceri2018-02-071-20/+27
| | | | | | This will be reused by the nir backend in a later patch. Reviewed-by: Marek Olšák <[email protected]>
* ac: don't call emit_outputs() for computeTimothy Arceri2018-02-071-2/+3
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: add local_invocation_ids to the abiTimothy Arceri2018-02-074-6/+6
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: add workgroup_ids to the abiTimothy Arceri2018-02-074-11/+9
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: gather some compute info in si_nir_scan_shader()Timothy Arceri2018-02-071-6/+27
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: always set input_usage_mask as using all componentsTimothy Arceri2018-02-071-4/+10
| | | | | | | | | | | This fixes a regression for now, in the future we should gather the used components properly. V2: just set for VS and correctly handle doubles Fixes: be973ed21f6e "radeonsi: load the right number of components for VS inputs and TBOs" Reviewed-by: Marek Olšák <[email protected]>
* i965: remove unused brw_nir_lower_cs_shared()Timothy Arceri2018-02-072-9/+0
| | | | | | This has been unused since 8761a04d0d93. Reviewed-by: Elie Tournier <[email protected]>
* vulkan/wsi: Fix OOM behavior with prime images.Bas Nieuwenhuizen2018-02-061-1/+3
| | | | | Fixes: d50937f137 "vulkan/wsi: Implement prime in a completely generic way" Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: fix GS load input type.Bas Nieuwenhuizen2018-02-061-1/+1
| | | | | Fixes: df1d5174fc "ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load" Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa: Factor out _mesa_disable_vertex_array_attrib.Mathias Fröhlich2018-02-064-80/+75
| | | | | | | | And use it in the enable code path. Move _mesa_update_attribute_map_mode into its only remaining file. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* vbo: Move vbo_rebase into its only caller module tnl.Mathias Fröhlich2018-02-066-25/+55
| | | | | Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Use atomics for buffer objects reference counts.Mathias Fröhlich2018-02-063-21/+11
| | | | | | | | | | | | | | | | | | | The mutex is currently used for reference counting and updating the minmax index cache. The change uses atomics directly for reference counting and the mutex for the minmax cache. This is safe since the reference count is not modified beside in _mesa_reference_buffer_object where atomics aim to be used. While using the minmax cache, the calling code holds a reference to the buffer object. Thus unreferencing or even referencing the buffer object does not need to be serialized with accessing the minmax cache. The change reduces the time _mesa_reference_buffer_object_ takes by about a factor of two when looking at perf results for some of my favorite use cases. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* r600: fixup sparse color exports.Dave Airlie2018-02-073-1/+12
| | | | | | | | | | | | | | | If we have gaps in the shader mask we have to have 0x1 in them according to a comment in radeonsi, and this is required to fix the test at least on cayman. We also need to record the highest one written to write to the ps exports reg. This fixes: KHR-GL45.enhanced_layouts.fragment_data_location_api Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: work out target mask at framebuffer bind.Dave Airlie2018-02-073-4/+9
| | | | | | | If we only get 1,2,3,6 framebuffers we want a sparse target mask. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: work out shader export mask at shader build time (v1.1)Dave Airlie2018-02-076-3/+13
| | | | | | | | | | | Since enhanced layouts allows setting specific MRT outputs, we can get sparse outputs, so we have to calculate the shader mask earlier. v1.1: update checks for state update (Roland) Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: fix xfb stream check.Dave Airlie2018-02-071-1/+1
| | | | | | | | | This fixes: KHR-GL45.enhanced_layouts.xfb_vertex_streams Reviewed-by: Roland Scheidegger <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/compute: add render cond support.Dave Airlie2018-02-071-2/+5
| | | | | | | | | | Set render cond and emit atom. Fixes: KHR-GL45.compute_shader.conditional-dispatching Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: fix not-very indirect computeDave Airlie2018-02-071-12/+18
| | | | | | | | | | | | | We need to get the grid sizes earlier to fill in to the const buffer. Fixes: KHR-GL45.compute_shader.built-in-variables and KHR-GL45.compute_shader.dispatch-indirect Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: overhaul buffer resource query.Dave Airlie2018-02-071-7/+8
| | | | | | | | | | | | | This cleans up and fixes the previous fix even more. Buffers from textures start at max const, buffers from buffers/images come in from the 168 offset. This fixes a bunch of: KHR-GL45.shader_storage_buffer_object* Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/eg: fix buffer sizing.Dave Airlie2018-02-071-1/+3
| | | | | | | | | | | For buffers we want the size in bytes, For images we want it in elements. This fixes: KHR-GL45.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/images: set offset for compute shaders with number of declared samplersDave Airlie2018-02-071-1/+1
| | | | | | | | for frag shaders we get a value in the key, I expect I need to make compute work better Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/compute: only mark buffer/image state dirty for fragment shadersDave Airlie2018-02-071-2/+4
| | | | | | | | | | | The compute emission path always emits this currently, and emitting it on the fragment path breaks the blitter. This fixes gpu hangs in KHR-GL45.compute_shader.resource-texture Reviewed-by: Roland Scheidegger <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/atomic: fix ATOMCAS instruction.Dave Airlie2018-02-071-1/+31
| | | | | | | | | | This has 4 srcs. This fixes: KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/sb/cayman: fix indirect ubo access on caymanDave Airlie2018-02-071-1/+1
| | | | | | | | | | | | | | With sb enabled on cayman, this was overwriting the proper cf index value with random ones if the dst gpr was 2 or 3, only save the value for a MOVA instruction. Fixes: KHR-GL45.gpu_shader5.uniform_blocks_array_indexing (on cayman with sb) Cc: <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/eg: use texture target to pick array size not view target (v2)Dave Airlie2018-02-071-7/+10
| | | | | | | | | | | | | | This fixes a few CTS cases in : KHR-GL45.texture_view.view_sampling some multisample cases are still broken, but not sure this is the same problem. v2: fix more cases Cc: <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: don't support tc-compat on multisample d32s8 at all.Dave Airlie2018-02-061-2/+2
| | | | | | | | | | | | RX550 fails dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2 So increase the range of the workaround. Fixes: f4c534ef6 (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* winsys/amdgpu: allow non page-aligned size bo creation from pointerMichal Navratil2018-02-061-4/+7
| | | | | | | | | Fix INVALID_OPERATION caused by BufferData with target EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD when the buffer size is not page aligned. Signed-off-by: Marek Olšák <[email protected]> Cc: 17.3 18.0 <[email protected]>
* meson: ensure xmlpool/options.h is generated for libgalliumJon Turney2018-02-061-1/+1
| | | | | | | | | | | In file included from ../src/gallium/targets/dri/target.c:1: In file included from ../src/gallium/auxiliary/target-helpers/drm_helper.h:8: ../src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found See also 26bde1e3. Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* vbo: provide 64bits support to print_draw_arraysAndres Gomez2018-02-061-2/+19
| | | | | | | Cc: Mathias Fröhlich <[email protected]> Cc: Brian Paul <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]>
* vbo: take into account the size when printing VAO elementsAndres Gomez2018-02-061-1/+1
| | | | | | | | | | | | | When using print_draw_arrays for debugging, we were printing an "n" amount of vertex but that meant not to print all the size in the "n" vertex, depending on the stride used. Now we print the whole size in the "n" vertex. Cc: Mathias Fröhlich <[email protected]> Cc: Brian Paul <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]>
* vbo: print first element of the VAO when the binding stride is 0Andres Gomez2018-02-061-3/+4
| | | | | | | Cc: Mathias Fröhlich <[email protected]> Cc: Brian Paul <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]>
* anv/device: initialize the list of enabled extensions properlyIago Toral Quiroga2018-02-061-1/+1
| | | | | | | | | | | | | | | The loop goes through the list of enabled extensions marking them as enabled in the list, but this relies on every other extension being initialized to false by default. This bug would make us, for example, advertise certain device extension entry points as available even when the corresponding extensions had not been enabled. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Fixes: abc62282b5c "anv: Add a per-device table of enabled extensions" Cc: "18.0" <[email protected]>
* spirv: split constant initializers on in/out structsIago Toral Quiroga2018-02-061-0/+8
| | | | | | | | | The SPIR-V parser splits in/out struct variables and creates a separate variable for each first-level member of the struct. When the struct variable has an initializer this means that we also need to split the initializer. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/nir: do int64 lowering before optimizationIago Toral Quiroga2018-02-061-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | Otherwise loop unrolling will fail to see the actual cost of the unrolling operations when the loop body contains 64-bit integer instructions, and very specially when the divmod64 lowering applies, since its lowering is quite expensive. Without this change, some in-development CTS tests for int64 get stuck forever trying to register allocate a shader with over 50K SSA values. The large number of SSA values is the result of NIR first unrolling multiple seemingly simple loops that involve int64 instructions, only to then lower these instructions to produce a massive pile of code (due to the divmod64 lowering in the unrolled instructions). With this change, loop unrolling will see the loops with the int64 code already lowered and will realize that it is too expensive to unroll. v2: Run nir_algebraic first so we can hopefully get rid of some of the int64 instructions before we even attempt to lower them. Reviewed-by: Matt Turner <[email protected]>
* mesa: add OES_EGL_image_external_essl3 supportIlia Mirkin2018-02-066-2/+24
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* r600/fp64: Fix build.Vinson Lee2018-02-051-1/+1
| | | | | | | | | | | | | CC r600_shader.lo r600_shader.c: In function ‘egcm_int_to_double’: r600_shader.c:4543:12: error: ‘ctx’ is a pointer; did you mean to use ‘->’? if (ctx.bc->chip_class == CAYMAN) ^ -> Fixes: 35b430157776 ("r600/fp64: fix integer->double conversion") Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* r600/fp64: fix integer->double conversionDave Airlie2018-02-061-28/+93
| | | | | | | | | | | | | Doing a straight uint/int->fp32->fp64 conversion causes some precision issues, Roland suggested splitting the integer into two portions and doing two separate int->fp32->fp64 conversions then adding the results. This passes the tests in CTS and piglit. [airlied: fix cypress conversion opcodes] Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: remove emission of nir_op_fdivSamuel Pitoiset2018-02-051-5/+0
| | | | | | | | RadeonSI and RADV lower fdiv. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* travis: add macOS meson buildJon Turney2018-02-051-0/+5
| | | | | | | | v2: Simplify set of options now we have better defaults Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meson: osx ld doesn't support --build-idJon Turney2018-02-052-1/+5
| | | | | Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* meson: build src/glx/appleJon Turney2018-02-052-0/+65
| | | | | | Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meson: set apple glx definesDylan Baker2018-02-051-0/+2
| | | | | Reviewed-by: Jon Turney <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meson: better defaults for osx, windows and cygwinJon Turney2018-02-051-6/+15
| | | | | | | | | | | set suitable defaults for 'dri-drivers', 'gallium-drivers', 'vulkan-drivers' and 'platforms' options for osx, windows and cygwin, adding cygwin where appropriate. v2: error() for unknown OS Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* i965: Move mistakenly placed lineMatt Turner2018-02-051-1/+1
| | | | | | | Ken called this out in review, but it seems I forgot to make the change. I noticed that the control flow annotations in the fragment shader disassembly of tests/shaders/glsl-fs-loop-continue.shader_test were not correct, and moving this line to the correct place fixes it.
* glsl/linker: check same name is not used in block and outsideJuan A. Suarez Romero2018-02-051-23/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According with OpenGL GLSL 3.20 spec, section 4.3.9: "It is a link-time error if any particular shader interface contains: - two different blocks, each having no instance name, and each having a member of the same name, or - a variable outside a block, and a block with no instance name, where the variable has the same name as a member in the block." This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside") that covered this case, but did not take in account that precision qualifiers are ignored when comparing blocks with no instance name. With this commit, the original tests KHR-GL*.shaders.uniform_block.common.name_matching keep fixed, and also dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression is fixed, which was broken by previous commit. v2: use helper varibles (Matteo Bruni) Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777 CC: Mark Janes <[email protected]> CC: "18.0" <[email protected]> Tested-by: Matteo Bruni <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]>
* mesa: enable ASTC format for CompressedTexSubImage3DJuan A. Suarez Romero2018-02-051-8/+33
| | | | | | | | | | | | If extensions GL_KHR_texture_compression_astc_hdr or GL_KHR_texture_compression_astc_sliced_3d are implemented then ASTC format are supported in CompressedTex*Îmage3D. Fixes KHR-GLES2.texture_3d.* with this format. CC: Eric Anholt <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]>