Commit message, Author, Age, Files, Lines
* spirv: handle OpUndef as part of the variable parsing passLionel Landwerlin2017-01-262-0/+7
| Looking at the following bit of SPIRV shader:
|
|     ...
|     %zero = OpConstant %i32 0
|     %ivec3_0 = OpConstantComposite %ivec3 %zero %zero %zero
|     %vec3_undef = OpUndef %ivec3
|     %sc_0 = OpSpecConstant %i32 0
|     %sc_1 = OpSpecConstant %i32 0
|     %sc_2 = OpSpecConstant %i32 0
|     ...
|
| Our compiler currently stops parsing variables & types on the OpUndef and switches to instructions, leaving the following sc_[0-2] variables untreated.
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Cc: "17.0 13.0" <[email protected]>
* anv: fix descriptor pool internal size allocationLionel Landwerlin2017-01-261-4/+4
| The size of the pool is slightly smaller than the size of the structure containing the whole pool. We need to take that into account when setting up the internals.
|
| Fixes a crash due to out-of-bounds memory access in:
|     dEQP-VK.api.descriptor_pool.out_of_pool_memory
|
| v2: Drop debug traces (Lionel)
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Cc: "17.0 13.0" <[email protected]>
* i965: Make intelEmitCopyBlit not truncate large strides.Kenneth Graunke2017-01-262-11/+7
| | | | | | | | | | | | | | | | | | | When trying to blit larger tiled surfaces, the pitch can be larger than 32768 bytes, which means it won't fit in a GLshort. Passing it in will truncate the stride to 0, which has...surprising results. The pitch can be up to 32,768 DWords, or 128kB. We measure it in bytes, but divide by 4 when programming it. So we need to handle values up to 131,072. Switch from GLshort to int32_t to avoid the truncation. Fixes GL45-CTS.gtf30.GL3Tests.depth_texture.depth_texture_copyteximage at widths greater than 8192. v2: Use int32_t as negative values can be used (Jason). Cc: "17.0" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
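The truncation described in this entry can be reproduced in isolation. The following is a minimal sketch, not the driver code: `pitch_as_short` and `pitch_as_int32` are hypothetical helpers, and `int16_t` stands in for GLshort on the assumption that GLshort is a 16-bit signed integer.

```c
#include <stdint.h>

/* Hypothetical helper: squeeze a pitch in bytes through a 16-bit signed
 * value, as a GLshort parameter would. Converting an out-of-range value
 * is implementation-defined in C; common compilers keep the low 16 bits. */
int16_t pitch_as_short(int32_t pitch_bytes)
{
    return (int16_t)pitch_bytes;
}

/* Hypothetical helper: the same pitch carried in an int32_t, which can
 * hold 131,072 (32,768 DWords * 4 bytes) as well as negative pitches. */
int32_t pitch_as_int32(int32_t pitch_bytes)
{
    return pitch_bytes;
}
```

With a maximum pitch of 32,768 DWords, `pitch_as_short(32768 * 4)` yields 0 on typical two's-complement targets, while the `int32_t` version preserves 131072.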
* i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.Kenneth Graunke2017-01-261-1/+1
| | | | | | | | | | | | SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message, using a source of g127 for the single register. With a UD type, this supposedly could read g128, which doesn't exist, causing the simulator to get cranky. Use a UW type to avoid this. Cc: "17.0" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* anv/lower_input_attachments: honor sample index parameter to subpassLoad()Iago Toral Quiroga2017-01-261-4/+1
| According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is:
|
|     gvec4 subpassLoad(gsubpassInput subpass);
|     gvec4 subpassLoad(gsubpassInputMS subpass, int sample);
|
| So the multisampled case always receives an explicit sample index that we should use. The current implementation was ignoring this parameter and using gl_SampleID value instead.
|
| Fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Cc: "17.0" <[email protected]>
* i965: Fix fast depth clears for surfaces with a dimension of 16384.Kenneth Graunke2017-01-251-0/+12
| | | | | | | | | | | | | | I hadn't bothered to set this bit because I figured it would just paper over us getting the rectangle wrong. But it turns out that there is a legitimate reason to use it, so let's do so. The alternative would be to chop up 16k clears to multiple 8k clears, which is pointlessly painful. Cc: "17.0" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Implement VK_KHR_get_physical_device_properties2Chad Versace2017-01-253-0/+161
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Refactor anv_GetPhysicalDeviceQueueFamilyProperties()Chad Versace2017-01-251-9/+17
| Add a helper function, anv_get_queue_family_properties(), which fills the struct. This patch reduces churn in the following patch that implements vkGetPhysicalDeviceQueueFamilyProperties2KHR.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Refactor anv_GetPhysicalDeviceFormatProperties()Chad Versace2017-01-251-28/+49
| Add a helper function, anv_get_image_format_properties(), which does all the work and has a VkPhysicalDeviceImageFormatInfo2KHR parameter. This patch reduces churn in the following patch that implements vkGetPhysicalDeviceImageFormatProperties2KHR.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Revive struct anv_commonChad Versace2017-01-251-0/+5
| The struct was deleted by:
|
|     commit efe9d1cde3340d3a9d17e5560b609a4fb839d43d
|     Author: Edward O'Callaghan <[email protected]>
|     Subject: anv: Clean up some unused variables
|
| Unlike the original anv_common, the new one has a non-const pNext pointer because we will use it for the output structs of VK_KHR_get_physical_device_properties2.
|
| v2:
|   - Retype pNext from void* to struct anv_common*.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Define macro anv_debug()Chad Versace2017-01-251-0/+2
| This is a printf-like macro that prints a debug message to stderr when built with DEBUG. Without DEBUG, it does nothing.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* mesa: Fix copy-and-paste bug in _mesa_(Program|)Uniform[1234](i|ui)64vARB functionsIan Romanick2017-01-251-16/+16
| All of the functions were passing 1 to _mesa_uniform instead of passing count.
|
| Fixes 16 unused parameter warnings like:
|
|     main/uniforms.c: In function ‘_mesa_Uniform1i64vARB’:
|     main/uniforms.c:1692:47: warning: unused parameter ‘count’ [-Wunused-parameter]
|      _mesa_Uniform1i64vARB(GLint location, GLsizei count, const GLint64 *value)
|      ^~~~~
|
| This is why I build with extra warnings enabled. Unfortunately, there are so many unused parameter warnings in Mesa that I didn't notice these added warnings for over 6 months. :(
|
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
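The shape of this bug is easy to reproduce outside Mesa. The sketch below is only an illustration: `upload_values`, `set_uniform_buggy` and `set_uniform_fixed` are hypothetical names, not the real `_mesa_uniform` interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared upload routine: copies `count`
 * values and reports how many were stored. */
size_t upload_values(int64_t *dst, const int64_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
    return count;
}

/* Buggy wrapper: `count` is accepted but never used, so only one value
 * is ever uploaded. -Wunused-parameter flags exactly this. */
size_t set_uniform_buggy(int64_t *dst, size_t count, const int64_t *value)
{
    return upload_values(dst, value, 1);
}

/* Fixed wrapper: forwards the caller's count. */
size_t set_uniform_fixed(int64_t *dst, size_t count, const int64_t *value)
{
    return upload_values(dst, value, count);
}
```

Because the buggy wrapper still compiles and still works for single-element uploads, the unused-parameter warning is the only cheap signal that something is wrong.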
* spirv: bump headers to SPIRV 1.1Lionel Landwerlin2017-01-253-9/+86
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add default handler for new enumsLionel Landwerlin2017-01-252-0/+15
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: fix typosLionel Landwerlin2017-01-251-3/+3
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: set command buffer to NULL when allocations failLionel Landwerlin2017-01-251-1/+4
| The spec section 5.2 says:
|
|     "vkAllocateCommandBuffers can be used to create multiple command buffers. If the creation of any of those command buffers fails, the implementation must destroy all successfully created command buffer objects from this command, set all entries of the pCommandBuffers array to VK_NULL_HANDLE and return the error."
|
| Fixes:
|     dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
|     dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Cc: "13.0 17.0" <[email protected]>
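The all-or-nothing rule quoted from the spec can be sketched with plain heap objects. Everything here is an assumption made for illustration: `struct cmd_buffer`, `alloc_all_or_nothing` and the `fail_at` knob are hypothetical and unrelated to the actual anv code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object standing in for a command buffer. */
struct cmd_buffer { int id; };

/* Allocates `count` objects into `out`. `fail_at` simulates an allocation
 * failure at that index; pass a negative value for no simulated failure. */
bool alloc_all_or_nothing(struct cmd_buffer **out, size_t count, int fail_at)
{
    size_t created;

    for (created = 0; created < count; created++) {
        out[created] = ((int)created == fail_at)
                           ? NULL
                           : malloc(sizeof(struct cmd_buffer));
        if (out[created] == NULL)
            break;
        out[created]->id = (int)created;
    }
    if (created == count)
        return true;

    /* Failure: destroy everything created so far... */
    for (size_t i = 0; i < created; i++)
        free(out[i]);
    /* ...and clear every entry, including the ones never reached. */
    for (size_t i = 0; i < count; i++)
        out[i] = NULL;
    return false;
}
```

Clearing the whole array, rather than only the entries that were created, is what lets a caller treat the output as uniformly empty after an error.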
* vulkan/wsi: Lower the maximum image sizesJason Ekstrand2017-01-252-2/+4
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "17.0" <[email protected]>
* vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModesJason Ekstrand2017-01-251-3/+5
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "17.0" <[email protected]>
* vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormatsJason Ekstrand2017-01-251-7/+9
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "17.0" <[email protected]>
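The two Wayland WSI entries above concern Vulkan's count-then-fill enumeration pattern, where a query returns VK_INCOMPLETE if the caller's array is too small. The following is a minimal sketch of that contract using stand-in names: `enum result`, `get_formats` and the `available_formats` table are hypothetical and do not use the real Vulkan headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for VK_SUCCESS and VK_INCOMPLETE. */
enum result { RESULT_SUCCESS = 0, RESULT_INCOMPLETE = 1 };

/* Hypothetical list of values the implementation could report. */
static const int available_formats[] = { 10, 20, 30 };

enum result get_formats(uint32_t *count, int *formats)
{
    const uint32_t total =
        (uint32_t)(sizeof(available_formats) / sizeof(available_formats[0]));

    /* First call: no output array, just report how many entries exist. */
    if (formats == NULL) {
        *count = total;
        return RESULT_SUCCESS;
    }

    /* Second call: write at most *count entries and report how many were
     * written; signal INCOMPLETE if some did not fit. */
    uint32_t n = (*count < total) ? *count : total;
    memcpy(formats, available_formats, n * sizeof(available_formats[0]));
    *count = n;
    return (n < total) ? RESULT_INCOMPLETE : RESULT_SUCCESS;
}
```

A caller typically queries the count with a NULL array, allocates, and calls again; a short array is not an error but yields a truncated list plus the INCOMPLETE status.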
* swr: Update fs texture & sampler state logicGeorge Kyriazis2017-01-251-2/+5
| | | | | | | | In swr_update_derived() update texture and sampler state on a new fragment shader. GALLIUM_HUD can update fs using a previously bound texture and sampler. Reviewed-by: Bruce Cherniak <[email protected]>
* gallium/radeon: add a new HUD query for the number of mapped buffersSamuel Pitoiset2017-01-259-0/+18
| | | | | | | | | | | Useful when debugging applications which map a ton of buffers and also because we used to run into Linux's limit on the number of simultaneous mmap() calls. v2: - update the commit message Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* spirv: handle gl_SampleMaskIago Toral Quiroga2017-01-251-2/+6
| | | | | | | | | | | | | SPIR-V maps both gl_SampleMask and gl_SampleMaskIn to the same builtin (SampleMask). The only way to tell which one we are dealing with is to check if it is an input or an output. Fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.write.* Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: acknowledge multisampled input attachmentsIago Toral Quiroga2017-01-251-3/+8
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* radv: program a default point size.Dave Airlie2017-01-251-1/+2
| Along the lines of what commit 3b804819 ("anv: Default PointSize to 1.0 if not written by the shader") does for anv, program a default point size of 1.0 in the hw.
|
| This preemptively fixes a bunch of geom shader tests.
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Cc: "17.0" <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: handle first_non_void correctly in si_create_vertex_elementsMarek Olšák2017-01-241-3/+3
| | | | | | | | This fixes R11G11B10_FLOAT, because it's in the category of "OTHER", meaning that it doesn't have any channel description. Cc: 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: destroy pipe_context before destroying st_context (v2)Marek Olšák2017-01-241-6/+7
| | | | | | | | | | | | | | | | | If radeonsi starts compiling an optimized shader variant asynchronously with a GL debug callback set and the application destroys the GL context, radeonsi crashes when trying to write shader stats into the debug output of a non-existent context after compilation, because st/mesa was destroyed before pipe_context. Firefox with WebGL2 enabled hits this bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456 v2: protect against a double destroy in st_create_context_priv and callers. Cc: 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: bump loop max unroll limitTimothy Arceri2017-01-251-1/+1
| The original number was chosen in an attempt to match the limits applied to GLSL IR. A look at the git history of why these limits were chosen for GLSL IR shows it had more to do with the slow speed of unrolling large loops in GLSL IR than anything else. The speed of loop unrolling in NIR is not a problem, so we may wish to bump this even higher in future.
|
| No shader-db change; however, a future change that disables the GLSL IR optimisation loop in the i965 backend results in 4 loops from The Talos Principle failing to unroll. Bumping the limit allows them to unroll, which results in the instruction count matching the previous output from when the GLSL IR opts were still enabled.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: lower constant arrays to uniform arrays before optimisation loopTimothy Arceri2017-01-251-13/+26
| Previously the constant array would not get copy propagated until the backend did its GLSL IR opt loop. I plan on removing that from i965 shortly which caused huge regressions in Deus-ex and Tomb Raider which have large constant arrays. Moving lowering before the opt loop in the GLSL linker fixes this and unexpectedly improves some compute shaders also.
|
| shader-db results BDW:
|
| instructions helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 204 -> 194 (-4.90%)
| instructions helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1010 -> 741 (-26.63%)
| instructions helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 542 -> 385 (-28.97%)
|
| cycles helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1831382 -> 1818492 (-0.70%)
| cycles helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 216238 -> 206180 (-4.65%)
| cycles helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 18484 -> 16644 (-9.95%)
|
| total instructions in shared programs: 13060313 -> 13059877 (-0.00%)
| instructions in affected programs: 1756 -> 1320 (-24.83%)
| helped: 3
| HURT: 0
|
| total cycles in shared programs: 256586698 -> 256561910 (-0.01%)
| cycles in affected programs: 2066104 -> 2041316 (-1.20%)
| helped: 3
| HURT: 0
|
| V3: only call the opt loop if lowering progressed (Suggested by Eric)
| V2: call opts before and after lowering (Suggested by Ken)
|
| Reviewed-by: Eric Anholt <[email protected]>
* mesa: Don't advertise GL_OES_read_format in core profileIan Romanick2017-01-241-1/+1
| OpenGL ES implementations are not allowed to ship ARB extensions, and OpenGL implementations are not allowed to ship OES extensions. The functionality is also included in GL_ARB_ES2_compatibility. Every OpenGL core-profile driver currently exposes both extensions. I don't know of any applications that explicitly check for GL_OES_read_format, so removing it seems very unlikely to cause problems. No functionality is removed.
|
| I have left this extension in place for compatibility profile. There are still OpenGL 1.x drivers in Mesa, and adding code to check for compatibility profile and not GL_ARB_ES2_compatibility for GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT just feels dumb.
|
| Three other alternatives considered:
|
| - Remove the string from compatibility profile drivers but leave the functionality in place.
|
| - Add a flag to expose the extension string, and set it in every OpenGL driver that does not expose GL_ARB_ES2_compatibility (and those drivers only). I tried this. You can't have two instances of an extension in the extension table (one dummy_true for ES1 and one with a flag for compatibility profile), so the implementation requires a bit of effort.
|
| - Only expose the extension in compatibility if the version is less than 2.0. I didn't see an easy way to do this.
|
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Cc: [email protected]
* docs: fix incorrect link to 12.0.6 release notesBrian Paul2017-01-241-1/+1
| | | | Trivial.
* anv: Expose VK_KHR_maintenance1Jason Ekstrand2017-01-241-1/+5
| | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Return better errors from AllocateDescriptorSetsJason Ekstrand2017-01-241-2/+7
| | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Allow selecting the slice of a 3D imageJason Ekstrand2017-01-241-1/+1
| | | | | | | | As per VK_KHR_maintenance1, clients can render to a slice of a 3D image by creating a VK_IMAGE_VIEW_TYPE_2D view of it. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Report FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHRJason Ekstrand2017-01-241-1/+13
| | | | | | | | | As of VK_KHR_maintenance1, these are supposed to be reported for any formats on which we support transfer operations. For us, this is anything that we can texture from. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Add trivial support for TrimCommandPoolKHRJason Ekstrand2017-01-241-0/+8
| | | | | | | | Our command buffers already efficiently use a global pool so trimming doesn't really need to do anything. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Set viewport extents correctly when height is negativeJason Ekstrand2017-01-241-2/+2
| As per VK_KHR_maintenance1, setting a negative height in the viewport can be used to get flipped coordinates. This is, apparently, very useful when porting D3D apps to Vulkan. All we need to do to support this is to make sure we actually set the min and max correctly.
|
| Reviewed-by: Iago Toral Quiroga <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
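Setting the min and max correctly for a flipped viewport comes down to ordering the two Y edges. This is a minimal sketch under stated assumptions: `struct viewport`, `struct extents` and `viewport_extents` are hypothetical and do not mirror the hardware state the driver actually programs.

```c
/* Hypothetical viewport description; a negative height means flipped Y. */
struct viewport { float x, y, width, height; };

/* Hypothetical extents: min must not exceed max on either axis. */
struct extents { float xmin, xmax, ymin, ymax; };

struct extents viewport_extents(struct viewport vp)
{
    const float y0 = vp.y;
    const float y1 = vp.y + vp.height;
    struct extents e;

    e.xmin = vp.x;
    e.xmax = vp.x + vp.width;
    /* With a negative height y1 < y0, so pick min and max explicitly
     * instead of assuming y0 is the smaller edge. */
    e.ymin = (y0 < y1) ? y0 : y1;
    e.ymax = (y0 < y1) ? y1 : y0;
    return e;
}
```

For a viewport with `y = 600` and `height = -600`, this gives `ymin = 0` and `ymax = 600`, the same extents as the unflipped viewport with `y = 0` and `height = 600`.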
* vulkan: Don't install vk_platform.h or vulkan.h.Matt Turner2017-01-241-3/+5
| | | | | | These files belong to the vulkan loader. Reviewed-by: Emil Velikov <[email protected]>
* glsl: fix compile errors with mingw due to missing PRIx64 definitionsRoland Scheidegger2017-01-242-0/+4
| define __STDC_FORMAT_MACROS and include <inttypes.h> (same as ir_builder_print_visitor.cpp already does).
|
| Otherwise, some mingw builds error out (since 8e7e1ae0365ddc7edb0d4d98250ab46728e6c14a and bbce1c538dc0cb8bf3769510283d11847dc07540 presumably) with:
|
|     src/compiler/glsl/ir_print_visitor.cpp:479:40: error: expected ‘)’ before ‘PRIu64’
|      case GLSL_TYPE_UINT64:fprintf(f, "%" PRIu64, ir->value.u64[i]); break;
|
| (Note even with that fix I get other format specifier warnings:
|
|     src/compiler/glsl/ir_print_visitor.cpp:473:47: warning: unknown conversion type character ‘a’ in format [-Wformat=]
|      fprintf(f, "%a", ir->value.f[i]);
|      ^
|     src/compiler/glsl/ir_print_visitor.cpp:473:47: warning: too many arguments for format [-Wformat-extra-args]
|
| but it still compiles at least)
|
| Reviewed-by: Jose Fonseca <[email protected]>
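The fix pattern named in this entry looks roughly like the sketch below. It is written in C for consistency; as far as I know the `__STDC_FORMAT_MACROS` define only matters to some older C++ toolchains and is harmless in C. `format_u64` is a hypothetical helper, not code from ir_print_visitor.cpp.

```c
/* Some older C++ toolchains only expose the PRI* macros when this is
 * defined before <inttypes.h> is first included. */
#define __STDC_FORMAT_MACROS
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a 64-bit unsigned value portably. PRIu64
 * expands to the right length modifier for uint64_t on the platform. */
int format_u64(char *buf, size_t size, uint64_t value)
{
    return snprintf(buf, size, "%" PRIu64, value);
}
```

Without a visible `PRIu64` definition the adjacent string literals no longer concatenate, which is consistent with the "expected ‘)’ before ‘PRIu64’" error quoted above.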
* gallivm: don't try to use fast rcp for fdivRoland Scheidegger2017-01-241-1/+3
| | | | | | | | The use of fast rcp instruction is disabled, and will always fall back to use a division instead (1 / x). Hence, if we get a division opcode, it doesn't make much sense trying to split that into rcp/mul. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: (trivial) fix ddiv cpu implementationRoland Scheidegger2017-01-241-1/+0
| | | | | | | | | | | we can't use the cpu implementation of fdiv, as this one uses different lp_build_context, which causes assertion failure. Just use default fdiv action (there is no fast rcp for doubles which we could potentially use anyway). Cc: 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* tgsi: implement ddiv opcodeRoland Scheidegger2017-01-241-0/+14
| | | | | | | | | softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64, so we really need to support that opcode. Cc: 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* i965/blorp: Use the correct ISL format for combined depth/stencilJason Ekstrand2017-01-241-0/+2
| In brw_blorp_copyteximage, we use the format from the render buffer. This could be a combined depth/stencil format. In this case, we handle stencil properly but we give blorp the wrong ISL format. Specifically, we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT, which is the wrong size and was causing GPU hangs.
|
| Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage
|
| Reviewed-by: Chad Versace <[email protected]>
| Reviewed-by: Topi Pohjolainen <[email protected]>
| Cc: "13.0 17.0" <[email protected]>
* st/glsl_to_tgsi: fix compilation warnings since int64 typesSamuel Pitoiset2017-01-241-3/+3
| | | | | | | | | | state_tracker/st_glsl_to_tgsi.cpp:302:28: warning: ‘glsl_to_tgsi_instruction::tex_type’ is too small to hold all values of ‘enum glsl_base_type’ glsl_base_type tex_type:4; Fixes: 8ce53d4a2f3 ("glsl: Add basic ARB_gpu_shader_int64 types") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: undef the very specific UPDATE_COUNTER macroSamuel Pitoiset2017-01-241-5/+9
| | | | | | | Also, wrap this into a do { ... } while (0). Suggested by Nicolai. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
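The reason for wrapping a multi-statement macro in do { ... } while (0), and for undefining it once it is no longer needed, can be shown with a small sketch. The macros and `struct counters` below are hypothetical and are not the actual UPDATE_COUNTER definition.

```c
/* Hypothetical counters, for illustration only. */
struct counters { unsigned draws; unsigned flushes; };

/* Unsafe: expands to two statements, so only the first one is governed
 * by an unbraced `if`. */
#define BUMP_BOTH_UNSAFE(c) (c)->draws++; (c)->flushes++

/* Safe: the do/while wrapper makes the expansion a single statement. */
#define BUMP_BOTH(c) do { (c)->draws++; (c)->flushes++; } while (0)

void bump_if_unsafe(struct counters *c, int enabled)
{
    if (enabled)
        BUMP_BOTH_UNSAFE(c);   /* flushes++ runs even when !enabled */
}

void bump_if(struct counters *c, int enabled)
{
    if (enabled)
        BUMP_BOTH(c);          /* both increments are guarded */
}

/* Undefine the helpers once the code that needs them is done, so such
 * specific names do not leak into the rest of the translation unit. */
#undef BUMP_BOTH_UNSAFE
#undef BUMP_BOTH
```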
* i965/blorp: Add also depth and stencil buffers to render cacheTopi Pohjolainen2017-01-241-0/+4
| | | | | | | | | | v2 (Jason, Curro): Add stencil also even though it is not enabled yet. Cc: 17.0 <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* gbm: Fix width height getters return type (trivial)Ben Widawsky2017-01-231-2/+2
| v2: Other way round... to make them consistent, make both return types the fixed-width uint32_t.
|
| Cc: Daniel Stone <[email protected]>
| Signed-off-by: Ben Widawsky <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Acked-by: Daniel Stone <[email protected]>
* gbm: Move getters to match order in header file (trivial)Ben Widawsky2017-01-231-11/+11
| | | | | | | | | | | | Other things are out of order, but I need to add a getter so I'm just fixing those. This helps people adding to GBM know where the right place to put things is. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Acked-by: Daniel Stone <[email protected]>
* docs: add news item and link release notes for 12.0.6Emil Velikov2017-01-242-0/+12
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: use correct year for the 12.0.6 release notesEmil Velikov2017-01-241-1/+1
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 13953f012dfc7f89dbb07f1eda856aa5353347cc)
* docs: add sha256 checksums for 12.0.6Emil Velikov2017-01-241-1/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 36e3f2542d3cde1fe4f7ca0be83dc49d941cb988)