Commit message | Author | Age | Files | Lines
* anv: flag batch & instruction BOs for captureLionel Landwerlin2017-11-222-2/+6
| | | | | | | | | | | | | When the kernel supports flagging our BO, let's mark batch & instruction BOs for capture so they can be included in the error state. v2: Only add EXEC_CAPTURE if supported (Kristian) v3: Fix operator precedence issue (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
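The v3 note above mentions an operator precedence fix without showing the expression involved. As a minimal sketch of how such an issue can arise when a flag is OR-ed in only if supported, assuming hypothetical flag names and values (`BO_FLAG_48BIT`, `BO_FLAG_CAPTURE`) and hypothetical helpers that are not taken from the patch:

```c
#include <stdbool.h>

/* Hypothetical flag bits, for illustration only (not the kernel's values). */
#define BO_FLAG_48BIT   (1u << 0)
#define BO_FLAG_CAPTURE (1u << 1)

/* Buggy form: '|' binds tighter than '?:', so this parses as
 * ((has_48bit ? BO_FLAG_48BIT : 0u) | has_capture) ? BO_FLAG_CAPTURE : 0u
 * and the 48-bit flag is lost. */
static inline unsigned bo_flags_buggy(bool has_48bit, bool has_capture)
{
    return (has_48bit ? BO_FLAG_48BIT : 0u) | has_capture ? BO_FLAG_CAPTURE : 0u;
}

/* Fixed form: each conditional is parenthesized before OR-ing. */
static inline unsigned bo_flags_fixed(bool has_48bit, bool has_capture)
{
    return (has_48bit ? BO_FLAG_48BIT : 0u) |
           (has_capture ? BO_FLAG_CAPTURE : 0u);
}
```

With `has_48bit` set and `has_capture` clear, the buggy form yields only the capture flag, while the fixed form yields only the 48-bit flag, which is the intended result.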
* anv: setup BO flags at state_pool/block_pool creationLionel Landwerlin2017-11-227-22/+41
| | | | | | | | This will allow setting the flags on any anv_bo created/filled from a state pool or block pool later. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* r600/shader: Fix all warnings issed with "-Wall -Wextra"Gert Wollny2017-11-221-31/+36
| | | | | | | | | | | | - fix a number of -Wsign-compare warnings - fix two warnings for -Woverride-init because TGSI_OPCODE_CEIL == 83, and the corresponding field was defined twice. [airlied: don't use -1 with unsigned type, fix whitespace] Signed-off-by: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: Emit EOP for more CF instruction typesGert Wollny2017-11-224-7/+16
| | | | | | | | | | | | | | | | | | | | So far, on pre-cayman chipsets, for the CF instructions CF_OP_LOOP_END, CF_OP_CALL_FS, CF_OP_POP, and CF_OP_GDS an extra CF_NOP instruction was added to add the EOP flag, even though this is not actually needed, because all these instructions support the EOP flag. This patch removes the fixup code, adds setting the EOP flag for the corresponding instructions as well as others like CF_OP_TEX and CF_OP_VTX, and adds writing out EOP for this type of instruction in the disassembler. This also fixes a bug where shaders were created that didn't actually have the EOP flag set in the last CF instruction, which might have resulted in GPU lockups. [airlied: cleaned up a little] Signed-off-by: Gert Wollny <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* meson: replace with_*dri with with_dri_platformDylan Baker2017-11-223-7/+3
| | | | | | | | This fixes the windows and macos stubs to be consistent with the *nix path. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: add logic to select apple and windows driDylan Baker2017-11-221-2/+14
| | | | | | | | | | | | This is still not fully correct (Haiku and BSD are notably probably not correct), but Linux is not regressed and this should be correct for macOS and Windows. v2: - set the dri_platform to windows on Cygwin as well (Jon) v3: - Add a better todo for Hurd (Eric) Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: Fix LLVM requires for radeonsiDylan Baker2017-11-221-2/+2
| | | | | Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: convert llvm option to tristateDylan Baker2017-11-222-25/+28
| | | | | | | This option has been acting as a strange sort of half-tri state anyway. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: Convert platform to autoDylan Baker2017-11-222-2/+9
| | | | | | | | | This is necessary to support operating systems other than the *nix family (excluding macOS). For Linux nothing has changed, the defaults are still the same. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: Remove duplicate _GNU_SOURCEDylan Baker2017-11-221-1/+0
| | | | | | | | There is one provided unconditionally, and one guarded by platform == linux. Remove the unconditional one. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: Remove completed or irrelevant TODO commentsDylan Baker2017-11-221-15/+0
| | | | | | | | | These are all either done already, or are autotools specific. The misspelled gallium G3DVL is the autotools specific bit, meson is handling that via build_by_default. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: Fix TODO for missing dl_iterate_phdr functionDylan Baker2017-11-221-2/+4
| | | | | | | | | | This function is required for both the Intel "Anvil" vulkan driver and the i965 GL driver. Error out if either of those is enabled but this function isn't found. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: disable x86 asm in fewer cases.Dylan Baker2017-11-221-7/+10
| | | | | | | | | | | | | | | | | | | This patch allows building asm for x86 on x86_64 platforms, when the operating system is the same. Previously cross compile always turned off assembly. This allows using a cross file to cross compile x86 binaries on x86_64 with asm. This could probably be relaxed further thanks to meson's "exe_wrapper", which is a way to specify an emulator or compatibility layer (wine) that can run the foreign binaries on the build system. Since the meson build at this point only supports building on Linux I can't test this and I don't want to write/enable code that cannot even be build tested. v4: - set condition to build == x86_64 and host == x86 and build.system == host.system Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: Enable SSE4.1 optimizationsDylan Baker2017-11-222-4/+25
| | | | | | | | | | | | This patch checks for and then enables SSE4.1 optimizations if the host machine is x86/x86_64. v2: - Don't compile code, it's unnecessary since we require a compiler which always has SSE4.1 (Matt) v3: - x64 -> x86_64 (Matt) Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* broadcom/vc5: Fix BASE_LEVEL handling with txl.Eric Anholt2017-11-222-2/+8
| | | | | | | The HW doesn't add the base level anywhere (the min/max lod clamping is what does base level), so we need to add it manually in this case. Fixes piglit tex-miplevel-selection *Lod 2D.
* broadcom/vc5: Fix array texture layer count setup.Eric Anholt2017-11-221-1/+6
| | | | Fixes piglit array-texture.
* broadcom/vc5: Don't increment primitive queries while they're paused.Eric Anholt2017-11-221-1/+3
| | | | Fixes ext_transform_feedback-generatemipmap prims_generated
* broadcom/vc5: Fix incorrect padding of TF outputs.Eric Anholt2017-11-221-0/+2
| | | | | | After the first output, we were padding by an extra size of the previous output. Fixes piglit ext_transform_feedback-output-type mat4x3[2] and friends.
* broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes.Eric Anholt2017-11-221-2/+23
| | | | | | | | | | The HW was computing an implicit height for the surface based on the image size, but that may be smaller than the surface with ARB_fbo mismatched sizes. In that case, we need to tell it about the pad, either with the little 4-bit field in the RT config, or the extended field in CLEAR_COLORS_PART3. Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.
* etnaviv: Put HALTI level in specsWladimir J. van der Laan2017-11-222-0/+23
| | | | | | | | The HALTI level is an indication of the gross architecture of the GPU. It determines to a significant extent what feature level the GPU has, what state (especially frontend state) is there, and where it is located. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* etnaviv: Const-correctness etnaviv_emit.hWladimir J. van der Laan2017-11-221-1/+1
| | | | | | | | | The relocation structure is never changed by submitting it. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Philipp Zabel <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* meson: add si_driinfo.h in libgallium_driJuan A. Suarez Romero2017-11-221-0/+1
| | | | | | v2: generate target conditionally (Dylan) Reviewed-by: Dylan Baker <[email protected]>
* nir/gather_info: recognize load_patch_vertices_in as a system valueIago Toral Quiroga2017-11-221-0/+1
| | | | | | | | This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is generated to load gl_PatchVerticesIn in the SPIR-V path for both Vulkan and OpenGL. Reviewed-by: Marek Olšák <[email protected]>
* i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=batJordan Justen2017-11-211-0/+24
| | | | | | | | This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated samplers & surfaces. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel/genxml: Add helpers for determining field typeKristian H. Kristensen2017-11-211-6/+17
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/fs: Check ADD/MAD with immediates in satprop unit testMatt Turner2017-11-211-1/+125
| | | | | | | | | The gen had to be changed from 4 to 6 so that we could test MAD, which is new on Gen6. mad_imm_float_neg_mov_sat tests the case fixed by the previous commit. Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Handle negating immediates on MADs when propagating saturatesMatt Turner2017-11-211-2/+8
| | | | | | | | | | | MADs don't take immediate sources, but we allow them in the IR since it simplifies a lot of things. I neglected to consider that case. Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate negations into MADs.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616 Reported-and-Tested-by: Ruslan Kabatsayev <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3DJuan A. Suarez Romero2017-11-211-1/+19
| | | | | | | | | | | | | | | | | | | | | From section 8.7, page 179 of OpenGL ES 3.2 spec: An INVALID_OPERATION error is generated by CompressedTexImage3D if internalformat is one of the formats in table 8.17 and target is not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY or TEXTURE_3D. An INVALID_OPERATION error is generated by CompressedTexImage3D if internalformat is TEXTURE_CUBE_MAP_ARRAY and the “Cube Map Array” column of table 8.17 is not checked, or if internalformat is TEXTURE_3D and the “3D Tex.” column of table 8.17 is not checked. So far it was only considering TEXTURE_2D_ARRAY as valid target. But as "Cube Map Array" column is checked for all the cases, in practice we can consider also TEXTURE_CUBE_MAP_ARRAY. This fixes KHR-GLES32.core.texture_cube_map_array.etc2_texture Reviewed-by: Nanley Chery <[email protected]>
* intel: fix disasm_info memory leaksTapani Pälli2017-11-212-2/+2
| | | | | | | | Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code") Cc: Matt Turner <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* st/glsl_to_nir: don't generate nir twice for gsTimothy Arceri2017-11-211-8/+2
| | | | | | This was left out of c980a3aa3133 Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe: fix snorm blendingRoland Scheidegger2017-11-214-53/+191
| | | | | | | | | | | | | | | | | | | The blend math gets a bit funky due to inverse blend factors being in range [0,2] rather than [-1,1], our normalized math can't really cover this. src_alpha_saturate blend factor has a similar problem too. (Note that piglit fbo-blending-formats test is mostly useless for anything but unorm formats, since not just all src/dst values are between [0,1], but the tests are crafted in a way that the results are between [0,1] too.) v2: some formatting fixes, and fix a fairly obscure (to debug) issue with alpha-only formats (not related to snorm at all), where blend optimization would think it could simplify the blend equation if the blend factors were complementary, however was using the completely unrelated rgb blend factors instead of the alpha ones... Reviewed-by: Jose Fonseca <[email protected]>
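To make the range problem above concrete: for a signed-normalized factor f in [-1,1], the inverse factor 1 - f lies in [0,2], which a snorm value cannot hold. A minimal sketch, assuming hypothetical helper names and the usual 8-bit snorm decoding (value / 127, clamped to -1); it is not taken from the llvmpipe patch:

```c
#include <stdint.h>

/* Decode an 8-bit signed-normalized value to float; -128 clamps to -1.0. */
static inline float snorm8_to_float(int8_t v)
{
    float f = (float)v / 127.0f;
    return f < -1.0f ? -1.0f : f;
}

/* Inverse blend factor (1 - f), computed in float where it may reach 2.0. */
static inline float inv_factor(float f)
{
    return 1.0f - f;
}

/* Nonzero if x fits the snorm range [-1, 1] without clamping. */
static inline int fits_snorm(float x)
{
    return x >= -1.0f && x <= 1.0f;
}
```

For the most negative factor, `inv_factor(snorm8_to_float(-127))` is 2.0, and `fits_snorm` reports that it is out of range, so the inverse factor has to be handled outside plain normalized math.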
* r600: add cull distance supportDave Airlie2017-11-218-7/+26
| | | | | | This passes all the tests in piglit. Signed-off-by: Dave Airlie <[email protected]>
* i965: Optimize bucket index calculationAravindan Muthukumar2017-11-201-8/+39
| Reducing Bucket index calculation to O(1).
|
| This algorithm calculates the index using a matrix method. Assuming PAGE_SIZE is 4096, matrix arrangement is as below:
|
|  1*4096   2*4096   3*4096   4*4096
|  5*4096   6*4096   7*4096   8*4096
| 10*4096  12*4096  14*4096  16*4096
| 20*4096  24*4096  28*4096  32*4096
|   ...      ...      ...      ...
|   ...      ...      ...      ...
|   ...      ...      ...   max_cache_size
|
| From this matrix it's clearly seen that every row follows the below way:
|
|    ...        ...        ...        n
| n+(1/4)n   n+(1/2)n   n+(3/4)n     2n
|
| Row is calculated as log2(size/PAGE_SIZE).
| Column is calculated by converting the difference between the elements to fit into a power-of-two size and indexing it.
|
| Final Index is (row*4)+(col-1)
|
| Tested with Intel Mesa CI.
|
| Improves performance of 3DMark on BXT by 0.705966% +/- 0.229767% (n=20)
|
| v4: Review comments on style and code comments implemented (Ian).
| v3: Review comments implemented (Ian).
| v2: Review comments implemented (Jason).
|
| Signed-off-by: Aravindan Muthukumar <[email protected]>
| Signed-off-by: Kedar Karanje <[email protected]>
| Reviewed-by: Yogesh Marathe <[email protected]>
| Signed-off-by: Ian Romanick <[email protected]>
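The row/column scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the function name `bucket_index_for_size` and the macro `BUCKET_PAGE_SIZE` are hypothetical, the row is derived with the GCC/Clang builtin `__builtin_clz` rather than an explicit log2, and sizes are limited to 4 GiB to keep page counts within 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_PAGE_SIZE 4096u  /* page size assumed by the commit message */

/* Hypothetical helper: map a size in bytes to its bucket index in O(1).
 * Bucket sizes in pages: row 0: 1 2 3 4, row 1: 5 6 7 8,
 * row 2: 10 12 14 16, row 3: 20 24 28 32, ... */
static inline unsigned bucket_index_for_size(uint64_t size)
{
    /* Keep the sketch within 32-bit page counts. */
    assert(size > 0 && size <= (UINT64_C(1) << 32));

    /* Round the size up to whole pages. */
    const unsigned pages =
        (unsigned)((size + BUCKET_PAGE_SIZE - 1) / BUCKET_PAGE_SIZE);

    /* Row: OR-ing with 3 gives all of row 0 (1..4 pages) the same
     * leading-zero count; each later row ends at the next power of two. */
    const unsigned row = 30u - (unsigned)__builtin_clz((pages - 1) | 3u);

    /* Largest bucket of the previous row, in pages (0 for row 0). */
    const unsigned prev_row_max = row == 0 ? 0u : (2u << row);

    /* Step between adjacent buckets in this row, as a power of two. */
    const unsigned col_shift = row > 1 ? row - 1 : 0u;

    /* Column (1..4): distance past the previous row, rounded up to a step. */
    const unsigned col =
        (pages - prev_row_max + ((1u << col_shift) - 1)) >> col_shift;

    return row * 4 + (col - 1);
}
```

For example, a 9-page request falls in row 2 (buckets of 10, 12, 14 and 16 pages), rounds up to the first column, and so maps to index 8. A caller would still need to compare the result against the number of buckets actually allocated.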
* meson: Guard the gallium dri componenetDylan Baker2017-11-201-2/+4
| | | | | | | | Currently the target has a redundant guard, and the state tracker isn't properly guarded. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meson: don't build gallium subdir unless we're building galliumDylan Baker2017-11-201-1/+3
| | | | | | | This will allow us to simplify some guards within the gallium directory. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* broadcom/vc5: Align 1D texture miplevels to 64b.Eric Anholt2017-11-201-0/+2
| | | | Fixes tex-miplevel-selection GL2:texture() 1D
* broadcom/vc5: Clamp min lod to the last level.Eric Anholt2017-11-201-2/+3
| | | | | | Otherwise, the simulator would complain in tex-miplevel-selection that the min/max clamp was out of order. The actual HW seems to have clamped to the max anyway.
* broadcom/vc5: Increase simulator memory for tex-miplevel-selection.Eric Anholt2017-11-201-1/+1
| | | | | We were overflowing, because of all the little 4k allocations for CLs that were getting expanded to 128kb in the simulator due to the GMP alignment.
* swr/rast: Repair simd8 frontend code rotTim Rowley2017-11-201-1/+1
| | | | | | Keep non-default simd8 frontend code running for comparison purposes. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shaderTim Rowley2017-11-204-29/+220
| | | | | | Disabled for now. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Simplify GATHER* jit builder apiTim Rowley2017-11-204-48/+48
| | | | | | | General cleanup, and prep work for possibly moving to llvm masked gather intrinsic. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add alignment to transpose targetsTim Rowley2017-11-201-8/+8
| | | | | | | | Needed to ensure alignment for avx512. Fixes address sanitizer crash. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Cache eventmanagerTim Rowley2017-11-203-0/+9
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Enable AVX-512 targets in the jitterTim Rowley2017-11-202-10/+0
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Points with clipdistance can't go through simplepoints pathTim Rowley2017-11-201-1/+2
| | | | | | | Fixes piglit glsl-1.20:vs-clip-vertex-primitives and glsl-1.30:vs-clip-distance-primitives. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Code style change (NFC)Tim Rowley2017-11-201-2/+7
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Widen fetch shader to SIMD16Tim Rowley2017-11-205-3/+151
| | | | | Widen fetch shader to SIMD16, enable SIMD16 types in the jitter, and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Support flexible vertex layout for DS outputTim Rowley2017-11-202-0/+3
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* gallium/u_threaded: avoid syncing in threaded_context_flushNicolai Hähnle2017-11-203-5/+17
| | | | | | | | We could always do the flush asynchronously, but if we're going to wait for a fence anyway and the driver thread is currently idle, the additional communication overhead isn't worth it. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: avoid syncing the driver thread in si_fence_finishNicolai Hähnle2017-11-203-37/+49
| | | | | | It is really only required when we need to flush for deferred fences. Reviewed-by: Marek Olšák <[email protected]>