summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* etnaviv: Put HALTI level in specsWladimir J. van der Laan2017-11-222-0/+23
| | | | | | | | | | The HALTI level is an indication of the gross architecture of the GPU. It determines for significant part what feature level the GPU has, what state (especially frontend state) is there, and where it is located. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* etnaviv: Const-correctness etnaviv_emit.hWladimir J. van der Laan2017-11-221-1/+1
| | | | | | | | | The relocation structure is never changed by submitting it. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Philipp Zabel <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* meson: add si_driinfo.h in libgallium_driJuan A. Suarez Romero2017-11-221-0/+1
| | | | | | v2: generate target conditionally (Dylan) Reviewed-by: Dylan Baker <[email protected]>
* nir/gather_info: recognize load_patch_vertices_in as a system valueIago Toral Quiroga2017-11-221-0/+1
| | | | | | | | This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is generated to load gl_PatchVerticesIn in the SPIR-V path for both Vulkan and OpenGL. Reviewed-by: Marek Olšák <[email protected]>
* i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=batJordan Justen2017-11-211-0/+24
| | | | | | | | This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated samplers & surfaces. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel/genxml: Add helpers for determining field typeKristian H. Kristensen2017-11-211-6/+17
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/fs: Check ADD/MAD with immediates in satprop unit testMatt Turner2017-11-211-1/+125
| | | | | | | | | The gen had to be changed from 4 to 6 so that we could test MAD, which is new on Gen6. mad_imm_float_neg_mov_sat tests the case fixed by the previous commit. Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Handle negating immediates on MADs when propagating saturatesMatt Turner2017-11-211-2/+8
| | | | | | | | | | | MADs don't take immediate sources, but we allow them in the IR since it simplifies a lot of things. I neglected to consider that case. Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate negations into MADs.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616 Reported-and-Tested-by: Ruslan Kabatsayev <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3DJuan A. Suarez Romero2017-11-211-1/+19
| | | | | | | | | | | | | | | | | | | | | From section 8.7, page 179 of OpenGL ES 3.2 spec: An INVALID_OPERATION error is generated by CompressedTexImage3D if internalformat is one of the the formats in table 8.17 and target is not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY or TEXTURE_3D. An INVALID_OPERATION error is generated by CompressedTexImage3D if internalformat is TEXTURE_CUBE_MAP_ARRAY and the “Cube Map Array” column of table 8.17 is not checked, or if internalformat is TEXTURE_3D and the “3D Tex.” column of table 8.17 is not checked. So far it was only considering TEXTURE_2D_ARRAY as valid target. But as "Cube Map Array" column is checked for all the cases, in practice we can consider also TEXTURE_CUBE_MAP_ARRAY. This fixes KHR-GLES32.core.texture_cube_map_array.etc2_texture Reviewed-by: Nanley Chery <[email protected]>
* intel: fix disasm_info memory leaksTapani Pälli2017-11-212-2/+2
| | | | | | | | Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code") Cc: Matt Turner <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* st/glsl_to_nir: don't generate nir twice for gsTimothy Arceri2017-11-211-8/+2
| | | | | | This was left out of c980a3aa3133 Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe: fix snorm blendingRoland Scheidegger2017-11-214-53/+191
| | | | | | | | | | | | | | | | | | | The blend math gets a bit funky due to inverse blend factors being in range [0,2] rather than [-1,1], our normalized math can't really cover this. src_alpha_saturate blend factor has a similar problem too. (Note that piglit fbo-blending-formats test is mostly useless for anything but unorm formats, since not just all src/dst values are between [0,1], but the tests are crafted in a way that the results are between [0,1] too.) v2: some formatting fixes, and fix a fairly obscure (to debug) issue with alpha-only formats (not related to snorm at all), where blend optimization would think it could simplify the blend equation if the blend factors were complementary, however was using the completely unrelated rgb blend factors instead of the alpha ones... Reviewed-by: Jose Fonseca <[email protected]>
* r600: add cull distance supportDave Airlie2017-11-218-7/+26
| | | | | | This passes all the tests in piglit. Signed-off-by: Dave Airlie <[email protected]>
* i965: Optimize bucket index calculationAravindan Muthukumar2017-11-201-8/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reducing Bucket index calculation to O(1). This algorithm calculates the index using matrix method. Assuming PAGE_SIZE is 4096, matrix arrangement is as below: 1*4096 2*4096 3*4096 4*4096 5*4096 6*4096 7*4096 8*4096 10*4096 12*4096 14*4096 16*4096 20*4096 24*4096 28*4096 32*4096 ... ... ... ... ... ... ... ... ... ... ... max_cache_size From this matrix its clearly seen that every row follows the below way: ... ... ... n n+(1/4)n n+(1/2)n n+(3/4)n 2n Row is calculated as log2(size/PAGE_SIZE) Column is calculated as converting the difference between the elements to fit into power size of two and indexing it. Final Index is (row*4)+(col-1) Tested with Intel Mesa CI. Improves performance of 3DMark on BXT by 0.705966% +/- 0.229767% (n=20) v4: Review comments on style and code comments implemented (Ian). v3: Review comments implemented (Ian). v2: Review comments implemented (Jason). Signed-off-by: Aravindan Muthukumar <[email protected]> Signed-off-by: Kedar Karanje <[email protected]> Reviewed-by: Yogesh Marathe <[email protected]> Signed-off-by: Ian Romanick <[email protected]>
* meson: Guard the gallium dri componenetDylan Baker2017-11-201-2/+4
| | | | | | | | Currently the target has a redundant guard, and the state tracker isn't properly guarded. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meson: don't build gallium subdir unless we're building galliumDylan Baker2017-11-201-1/+3
| | | | | | | This will allow us to simplify some guards within the gallium directory. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* broadcom/vc5: Align 1D texture miplevels to 64b.Eric Anholt2017-11-201-0/+2
| | | | Fixes tex-miplevel-selection GL2:texture() 1D
* broadcom/vc5: Clamp min lod to the last level.Eric Anholt2017-11-201-2/+3
| | | | | | Otherwise, the simulator would complain in tex-miplevel-selection that the min/max clamp was out of order. The actual HW seems to have clamped to the max anyway.
* broadcom/vc5: Increase simulator memory for tex-miplevel-selection.Eric Anholt2017-11-201-1/+1
| | | | | We were overflowing, because of all the little 4k allocations for CLs that were getting expanded to 128kb in the simulator due to the GMP alignment.
* swr/rast: Repair simd8 frontend code rotTim Rowley2017-11-201-1/+1
| | | | | | Keep non-default simd8 frontend code running for comparison purposes. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shaderTim Rowley2017-11-204-29/+220
| | | | | | Disabled for now. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Simplify GATHER* jit builder apiTim Rowley2017-11-204-48/+48
| | | | | | | General cleanup, and prep work for possibly moving to llvm masked gather intrinsic. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add alignment to transpose targetsTim Rowley2017-11-201-8/+8
| | | | | | | | Needed to ensure alignment for avx512. Fixes address sanitizer crash. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Cache eventmanagerTim Rowley2017-11-203-0/+9
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Enable AVX-512 targets in the jitterTim Rowley2017-11-202-10/+0
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Points with clipdistance can't go through simplepoints pathTim Rowley2017-11-201-1/+2
| | | | | | | Fixes piglit glsl-1.20:vs-clip-vertex-primitives and glsl-1.30:vs-clip-distance-primitives. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Code style change (NFC)Tim Rowley2017-11-201-2/+7
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Widen fetch shader to SIMD16Tim Rowley2017-11-205-3/+151
| | | | | | | Widen fetch shader to SIMD16, enable SIMD16 types in the jitter, and provide utility EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Support flexible vertex layout for DS outputTim Rowley2017-11-202-0/+3
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* gallium/u_threaded: avoid syncing in threaded_context_flushNicolai Hähnle2017-11-203-5/+17
| | | | | | | | We could always do the flush asynchronously, but if we're going to wait for a fence anyway and the driver thread is currently idle, the additional communication overhead isn't worth it. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: avoid syncing the driver thread in si_fence_finishNicolai Hähnle2017-11-203-37/+49
| | | | | | It is really only required when we need to flush for deferred fences. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: recompute the relative timeout after waiting for ready fenceNicolai Hähnle2017-11-201-0/+5
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ddebug: fix the hang detection timeout calculationNicolai Hähnle2017-11-201-2/+2
| | | | | Fixes: c9fefa062b36 ("ddebug: rewrite to always use a threaded approach") Reviewed-by: Marek Olšák <[email protected]>
* ddebug: fix use-after-free of streamout targetsNicolai Hähnle2017-11-201-1/+1
| | | | | Fixes: b47727a83ad6 ("ddebug: implement pipelined hang detection mode") Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_threaded: properly initialize fence unflushed tokensNicolai Hähnle2017-11-201-2/+1
| | | | | | | This got lost in a rebase but never hurt anything because we happened to always sync in fence_finish anyway... Reviewed-by: Marek Olšák <[email protected]>
* util/u_queue: really use futex-based fencesNicolai Hähnle2017-11-201-1/+1
| | | | | | | The relevant define changed in the final revision of the simple mutex patch. Reviewed-by: Marek Olšák <[email protected]>
* util/u_queue: fix timeout handling in util_queue_fence_wait_timeoutNicolai Hähnle2017-11-201-1/+1
| | | | | Fixes: e3a8013de8ca ("util/u_queue: add util_queue_fence_wait_timeout") Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: use asynchronous flushes in st_finishNicolai Hähnle2017-11-201-1/+1
| | | | | | | | | With threaded gallium, the driver may currently be running in another thread. In that case, we will execute all remaining commands in that thread instead of syncing, which should be better for cache locality. Reviewed-by: Andres Rodriguez <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: implement st_server_wait_sync properlyNicolai Hähnle2017-11-201-2/+24
| | | | | | | | | | | | | | | | | | | | | Asynchronous flushes require a proper implementation of st_server_wait_sync, because we could have the following with threaded Gallium: Context 1 app Context 1 driver Context 2 ------------- ---------------- --------- f = glFenceSync glFlush <-- app sync --> <-- app sync --> glWaitSync(f) .. draw calls .. pipe_context::flush for glFenceSync pipe_context::flush for glFlush Reviewed-by: Andres Rodriguez <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* u_threaded_gallium: remove synchronization in fence_server_syncNicolai Hähnle2017-11-203-3/+13
| | | | | | | | The whole point of fence_server_sync is that it can be used to avoid waiting in the application thread. Reviewed-by: Andres Rodriguez <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd: build addrlib with C++11Nicolai Hähnle2017-11-201-1/+1
| | | | | | | | | It is required for LLVM anyway. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103658 Fixes: 7f33e94e43a6 ("amd/addrlib: update to latest version") Tested-by: Vinson Lee <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: fix VM fault with fetched instance divisorsNicolai Hähnle2017-11-202-5/+12
| | | | | | | | | We need to account for SGPR locations in merged shaders. This case is exercised by KHR-GL45.enhanced_layouts.vertex_attrib_locations Fixes: 79c2e7388c7f ("radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON") Reviewed-by: Marek Olšák <[email protected]>
* radv: use a 16 bytes array for the sampled/storage image descriptorsSamuel Pitoiset2017-11-203-12/+8
| | | | | | | This allows to update them with only one memcpy(). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: do not add the query pool BO to the list in vkCmdEndQuery()Samuel Pitoiset2017-11-201-1/+3
| | | | | | | | | As per the spec, the query identified by queryPool and query must currently be active. Applications have to call vkCmdBeginQuery() before, and thus the query pool BO will already be in the list. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: only load needed depth clear regs for fast depth clearsSamuel Pitoiset2017-11-201-2/+12
| | | | | | | | | Similar to how the driver sets the depth clear regs after a fast depth clear. Most of the time, this will copy a 32-bit reg instead of a 64-bit reg. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not add the image BO in radv_set_depth_clear_regs()Samuel Pitoiset2017-11-201-2/+0
| | | | | | | | | | For the fast path, radv_fill_buffer() ensures that the BO is already in the list. For the slow path, the depth surface is part of the framebuffer which means the BO is added to the list when the framebuffer is emitted. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove useless assertion in emit_depthstencil_clear()Samuel Pitoiset2017-11-201-4/+0
| | | | | | | Already checked in emit_clear(). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove useless check in radv_set_depth_clear_regs()Samuel Pitoiset2017-11-201-1/+1
| | | | | | | | aspects can't be zero and there is an assertion that ensures it's not in emit_clear(). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* docs/features: mark some r600 extensions supportedDave Airlie2017-11-201-2/+2
| | | | | | These just looked to be missed when this file was updated. Signed-off-by: Dave Airlie <[email protected]>
* glsl: Catch subscripted calls to undeclared subroutinesGeorge Barrett2017-11-201-2/+7
| | | | | | | | | | generate_array_index fails to check whether the target of a subroutine call exists in the AST, potentially passing around null ir_rvalue pointers eventuating in abort/segfault. Fixes: fd01840c0bd3 ("glsl: add AoA support to subroutines") Reviewed-by: Timothy Arceri <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438