Commit message | Author | Age | Files | Lines
* st/mesa: Skip serializing driver_cache_blob if it exists | Jordan Justen | 2018-07-09 | 1 | -0/+3
  Previously the mesa core code would not call to serialize the driver_cache_blob if it existed. We will update it to always call to serialize the driver_cache_blob, meaning we should avoid re-serializing it under mesa/state_tracker.
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Add disk shader cache driver blob callback | Jordan Justen | 2018-07-09 | 2 | -0/+23
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* intel/compiler: emit actual barriers for working-group level barriers | Iago Toral Quiroga | 2018-07-10 | 1 | -23/+2
  Until now we have assumed that we could skip emitting these barriers in the general case, based on empirical testing and a few assumptions detailed in a comment in the driver code. However, recent CTS tests have shown that we actually need them to produce correct behavior.
  Reviewed-by: Jason Ekstrand <[email protected]>
* radv: add some cxxflags for new c++ file | Dave Airlie | 2018-07-10 | 1 | -0/+4
  Looks like I broke intel CI compiles.
  Fixes: 6f3aee40f9 (radv: using tls to store llvm related info and speed up compiles (v10))
  Tested-by: Clayton Craft <[email protected]>
* anv,radv: Add support for VK_KHR_get_display_properties2 | Jason Ekstrand | 2018-07-09 | 6 | -16/+301
  Reviewed-by: Keith Packard <[email protected]>
* intel/aubinator_error_decode: Allow for more sections | Jason Ekstrand | 2018-07-09 | 1 | -11/+13
  Error states coming from actual Vulkan applications tend to have fairly long command buffers and lots of chained batches. 30 total BOs isn't nearly enough. This commit bumps it to 256, makes some things use the actual number of sections instead of the #define, and adds asserts if we ever go over 256 sections.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch_decoder: Recurse for all 2nd level batches | Jason Ekstrand | 2018-07-09 | 1 | -14/+5
  Our attempt to restart the loop with the second level batch worked at one point but was later broken. It was too fragile anyway, and we're not likely to have enough secondaries to actually overflow the stack, so we may as well recurse in both cases.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* virgl/vtest: add support to vtest for new cap getting. | Dave Airlie | 2018-07-10 | 2 | -4/+28
  The vtest protocol is pretty simple but also pretty dumb, and the v1 caps query was fixed size, with no nice way to expand it. However, the server also ignores any command it doesn't understand. So we can query v2 caps by sending a v2 followed by a v1: if the v2 is ignored we know it's an old vtest server, and if we get a v2 answer then we can just read the v1 answer and discard it.
  Acked-by: Jakob Bornecrantz <[email protected]> (sounds good)
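The probing scheme described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: the command ids, the simulated server, and the function names are hypothetical and are not taken from the vtest sources.

```c
#include <stddef.h>

/* Hypothetical command ids; the real vtest protocol values are not given
 * in the commit message. */
enum cmd { CMD_GET_CAPS_V1 = 1, CMD_GET_CAPS_V2 = 2 };

/* Simulated server: answers every command it understands, in order, and
 * silently ignores the rest. Returns the number of replies written. */
static size_t server_handle(int understands_v2, const enum cmd *cmds,
                            size_t n, enum cmd *replies)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (cmds[i] == CMD_GET_CAPS_V1 ||
            (cmds[i] == CMD_GET_CAPS_V2 && understands_v2))
            replies[count++] = cmds[i];
    }
    return count;
}

/* Client side: send v2 followed by v1 and inspect the first reply.
 * Returns the caps version the client ends up using (1 or 2), or 0 if
 * the replies do not match either expected pattern. */
static int negotiate_caps(int server_understands_v2)
{
    const enum cmd request[2] = { CMD_GET_CAPS_V2, CMD_GET_CAPS_V1 };
    enum cmd replies[2];
    size_t n = server_handle(server_understands_v2, request, 2, replies);

    if (n == 2 && replies[0] == CMD_GET_CAPS_V2)
        return 2;   /* new server: keep v2, read and discard the v1 reply */
    if (n == 1 && replies[0] == CMD_GET_CAPS_V1)
        return 1;   /* old server: the v2 query was ignored */
    return 0;
}
```

The point of sending both queries is that the client never has to wait for a reply that might not come: an old server still produces exactly one answer.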
* i965/icl: Don't set float blend optimization bit in CACHE_MODE_SS | Anuj Phogat | 2018-07-09 | 1 | -4/+0
  CACHE_MODE_SS is not listed in the gfxspecs table of user-mode non-privileged registers, so making any changes from Mesa will do nothing. The kernel already sets this bit in the CACHE_MODE_SS register, which is saved/restored to/from the HW context image.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/icl: Don't set float blend optimization bit in CACHE_MODE_SS | Anuj Phogat | 2018-07-09 | 1 | -12/+0
  CACHE_MODE_SS is not listed in the gfxspecs table of user-mode non-privileged registers, so making any changes from Mesa will do nothing. The kernel already sets this bit in the CACHE_MODE_SS register, which is saved/restored to/from the HW context image.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement VK_EXT_vertex_attribute_divisor | Jason Ekstrand | 2018-07-09 | 3 | -0/+21
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv/pipeline: Add a per-VB instance divisor | Jason Ekstrand | 2018-07-09 | 4 | -12/+20
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv/pipeline: Use a per-VB struct instead of separate arrays | Jason Ekstrand | 2018-07-09 | 4 | -8/+11
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage | Jose Maria Casanova Crespo | 2018-07-10 | 3 | -0/+13
  Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+, using the VK_KHR_get_physical_device_properties2 functionality to expose whether the extension is supported.
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv/nir: Add support for SPV_KHR_8bit_storage | Jose Maria Casanova Crespo | 2018-07-10 | 2 | -0/+7
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Include headers and grammar for SPV_KHR_8bit_storage | Jose Maria Casanova Crespo | 2018-07-10 | 2 | -7/+40
  Updates headers and grammar to ff684ffc6a35d2a58f0f63108877d0064ea33feb
  Acked-by: Jason Ekstrand <[email protected]>
* i965/fs: Enable store_ssbo for 8-bit types. | Jose Maria Casanova Crespo | 2018-07-10 | 1 | -7/+8
  v2: Update comment according to this patch. (Jason Ekstrand)
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: relax brw_eu_validate for byte raw movs | Jose Maria Casanova Crespo | 2018-07-10 | 1 | -3/+5
  When the destination is a BYTE type, allow raw movs even if the stride is not an exact multiple of the destination type and the exec type. For these movs the execution type is Word and its size is 2, so this restriction was only allowing stride==2 destinations for 8-bit types.
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Enable conversions to 8-bit integers | Jose Maria Casanova Crespo | 2018-07-10 | 1 | -0/+2
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Support for 8-bit base types in helper functions | Jose Maria Casanova Crespo | 2018-07-10 | 2 | -1/+14
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Register allocator shouldn't use grf127 for sends dest | Jose Maria Casanova Crespo | 2018-07-10 | 1 | -0/+25
  Since Gen8+, the Intel PRM states that "r127 must not be used for return address when there is a src and dest overlap in send instruction."

  This patch implements this restriction by creating a new grf127_send_hack_node in the register allocator. This node has a fixed assignment to grf127.

  For vgrfs that are used as the destination of send messages we create node interferences with the grf127_send_hack_node, so the register allocator will never assign these vgrfs a register that involves grf127.

  If dispatch_width > 8 we don't create these interferences, because all instructions already have node interferences between sources and destination. That is enough to avoid the r127 restriction.

  This fixes CTS tests that raised this issue as they were executed as SIMD8:
  dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom

  Shader-db results on Skylake:
  total instructions in shared programs: 7686798 -> 7686797 (<.01%)
  instructions in affected programs: 301 -> 300 (-0.33%)
  helped: 1
  HURT: 0
  total cycles in shared programs: 337092322 -> 337091919 (<.01%)
  cycles in affected programs: 22420415 -> 22420012 (<.01%)
  helped: 712
  HURT: 588

  Shader-db results on Broadwell:
  total instructions in shared programs: 7658574 -> 7658625 (<.01%)
  instructions in affected programs: 19610 -> 19661 (0.26%)
  helped: 3
  HURT: 4
  total cycles in shared programs: 340694553 -> 340676378 (<.01%)
  cycles in affected programs: 24724915 -> 24706740 (-0.07%)
  helped: 998
  HURT: 916
  total spills in shared programs: 4300 -> 4311 (0.26%)
  spills in affected programs: 333 -> 344 (3.30%)
  helped: 1
  HURT: 3
  total fills in shared programs: 5370 -> 5378 (0.15%)
  fills in affected programs: 274 -> 282 (2.92%)
  helped: 1
  HURT: 3

  v2: Avoid duplicating register classes without grf127. Let's use a node with a fixed assignment to grf127 and create interferences to send message vgrf destinations. (Eric Anholt)
  v3: Update reference to CTS VK_KHR_8bit_storage failing tests. (Jose Maria Casanova)
  Reviewed-by: Jason Ekstrand <[email protected]>
  Cc: 18.1 <[email protected]>
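The interference idea from this commit can be illustrated with a toy graph. This is a minimal sketch, not the i965 register allocator: `toy_graph`, `toy_inst`, and the function names are hypothetical, and only the SIMD8 condition and the "send destination interferes with the grf127 node" rule come from the commit message.

```c
#include <stdbool.h>

#define MAX_NODES 16

/* Toy interference graph: adj[a][b] is true when nodes a and b must not
 * be assigned the same register. */
struct toy_graph {
    bool adj[MAX_NODES][MAX_NODES];
};

static void add_interference(struct toy_graph *g, int a, int b)
{
    g->adj[a][b] = true;
    g->adj[b][a] = true;
}

/* Hypothetical instruction record with only the fields this sketch needs. */
struct toy_inst {
    bool is_send;
    int dst_node;   /* graph node of the destination vgrf */
};

/* For SIMD8 programs, make every send destination interfere with the node
 * pinned to grf127. Wider dispatch widths are skipped because, per the
 * commit message, sources and destinations already interfere there. */
static void add_grf127_send_interference(struct toy_graph *g,
                                         const struct toy_inst *insts,
                                         int count, int grf127_node,
                                         int dispatch_width)
{
    if (dispatch_width > 8)
        return;
    for (int i = 0; i < count; i++) {
        if (insts[i].is_send)
            add_interference(g, insts[i].dst_node, grf127_node);
    }
}
```

Using one pinned node keeps the existing register classes untouched, which matches the v2 note about avoiding duplicated classes.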
* intel/compiler: grf127 can not be dest when src and dest overlap in send | Jose Maria Casanova Crespo | 2018-07-10 | 1 | -0/+11
  Implement in brw_eu_validate the restriction from the Intel Broadwell PRM, vol 07, section "Instruction Set Reference", subsection "EUISA Instructions", Send Message (page 990): "r127 must not be used for return address when there is a src and dest overlap in send instruction."
  v2: Style fixes (Matt Turner)
  Reviewed-by: Matt Turner <[email protected]>
  Cc: 18.1 <[email protected]>
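One way to read the quoted PRM rule as a check is sketched below. It assumes that "return address" means the destination register range of the send and that registers are described by a start number and a length; the function names are hypothetical and this is not the brw_eu_validate code.

```c
#include <stdbool.h>

/* True when register ranges [a, a + a_len) and [b, b + b_len) overlap. */
static bool ranges_overlap(int a, int a_len, int b, int b_len)
{
    return a < b + b_len && b < a + a_len;
}

/* Assumed interpretation of the rule: a send is rejected when its
 * destination range includes r127 and also overlaps the source payload. */
static bool send_dst_is_valid(int dst_reg, int dst_len,
                              int src_reg, int src_len)
{
    bool dst_uses_r127 = dst_reg <= 127 && dst_reg + dst_len > 127;
    return !(dst_uses_r127 &&
             ranges_overlap(dst_reg, dst_len, src_reg, src_len));
}
```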
* radv: using tls to store llvm related info and speed up compiles (v10) | Dave Airlie | 2018-07-10 | 8 | -28/+199
  This uses the common compiler passes abstraction to help radv avoid fixed-cost compiler overheads. It uses a linked list per thread stored in thread local storage, with an entry in the list for each target machine.

  This should remove all the fixed setup costs of creating the pass manager each time. It takes a demo app's time to compile the radv meta shaders on nocache and exit from 1.7s to 1s. It also has been reported to take the startup time of uncached shaders on RoTR from 12m24s to 11m35s (Alex).

  v2: fix llvm6 build, inline emit function, handle multiple targets in one thread
  v3: rebase and port onto new structure
  v4: rename some vars (Bas)
  v5: drag all code into radv for now, we can refactor it out later for radeonsi if we make it shareable
  v6: use a bit more C++ in the wrapper
  v7: logic bugs fixed so it actually runs again.
  v8: rebase on top of radeonsi changes.
  v9: drop some C++ headers, cleanup list entry
  v10: use pop_back (didn't have enough caffeine)
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
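The caching pattern named in this commit (a per-thread linked list with one entry per target) can be sketched in plain C11. Everything here is an assumption for illustration: the struct, the `setup_count` field standing in for expensive compiler state, and the target names are hypothetical, and the actual radv code wraps LLVM objects in C++.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-target cache entry; "setup_count" stands in for the
 * expensive state that should be built only once per thread and target. */
struct target_entry {
    char name[32];
    int setup_count;
    struct target_entry *next;
};

/* One list head per thread (C11 thread-local storage). */
static _Thread_local struct target_entry *tls_entries;

/* Returns this thread's entry for the named target, creating it on first
 * use. Names longer than 31 characters are truncated in this sketch. */
static struct target_entry *get_target_entry(const char *name)
{
    for (struct target_entry *e = tls_entries; e; e = e->next) {
        if (strncmp(e->name, name, sizeof(e->name) - 1) == 0)
            return e;   /* reuse the state this thread already built */
    }

    struct target_entry *e = calloc(1, sizeof(*e));
    if (!e)
        return NULL;
    strncpy(e->name, name, sizeof(e->name) - 1);
    e->setup_count = 1;   /* the one-time setup cost is paid here */
    e->next = tls_entries;
    tls_entries = e;
    return e;
}
```

Because the list lives in thread-local storage, lookups need no locking; the trade-off is that each thread pays the setup cost once for every target it uses.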
* swrast: Fix eglMakeCurrent(dpy, NULL, NULL, ctx) (v2) | Adam Jackson | 2018-07-09 | 1 | -21/+20
  Fixes 14 piglits, mostly in egl_khr_create_context.
  v2: Also short-circuit the same-context-no-drawables case (Eric Anholt)
  Fixes: https://github.com/anholt/libepoxy/issues/177
  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Adam Jackson <[email protected]>
* intel: tools: dump_gpu: fix ppgtt mapping | Lionel Landwerlin | 2018-07-09 | 1 | -23/+23
  We were not properly writing page tables when the virtual address range spans multiple subtrees of the tables.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
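To make "spans multiple subtrees" concrete, the sketch below computes table indices for an address and counts how many tables at a given level a range touches. The layout (4 KiB pages, 9 index bits per level) is an assumption chosen for the example, and the function names are hypothetical; none of this is taken from the dump_gpu tool.

```c
#include <stdint.h>

/* Assumed layout for this sketch: 4 KiB pages and 9 index bits per level,
 * so one level-1 table maps 2 MiB and each higher level 512 times more. */
#define PAGE_SHIFT 12
#define LEVEL_BITS 9

/* Index into the table at `level` (1 = lowest) for a virtual address. */
static unsigned table_index(uint64_t addr, int level)
{
    int shift = PAGE_SHIFT + LEVEL_BITS * (level - 1);
    return (unsigned)((addr >> shift) & ((1u << LEVEL_BITS) - 1));
}

/* Number of distinct tables at `level` touched by the inclusive range
 * [start, end]. A result above 1 means the range crosses into another
 * subtree, so more than one table has to be written. */
static uint64_t tables_spanned(uint64_t start, uint64_t end, int level)
{
    int shift = PAGE_SHIFT + LEVEL_BITS * level;
    return (end >> shift) - (start >> shift) + 1;
}
```

For example, a two-page range starting at 0x1FF000 ends at 0x200FFF and therefore touches two level-1 tables under this layout, even though it is only 8 KiB long.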
* v3d: Implement noperspective varyings on V3D 4.x. | Eric Anholt | 2018-07-09 | 7 | -4/+40
  Fixes a bunch of piglit interpolation tests, and reduces my concern about some MSAA blit shaders with noperspective varyings.
* v3d: Refactor flat shade/centroid flag emission. | Eric Anholt | 2018-07-09 | 1 | -64/+76
  The logic was duplicated in a pretty gross way, when what we really need is just a helper function for stuffing the values in the packet. This will make implementing noperspective easier.
* v3d: Fix typo in dither mode offset. | Eric Anholt | 2018-07-09 | 1 | -1/+1
  We weren't using the field yet, so it didn't affect anything.
  Fixes: c0476d964abb ("v3d: Express dithering mode in the same way that the CLIF parser does.")
* glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2 | zhaowei yuan | 2018-07-09 | 1 | -2/+2
  "sampler2DRect" and "sampler2DRectShadow" are specified as reserved since GLSL 1.1 and GLSL ES 1.0.
  Signed-off-by: zhaowei yuan <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106906
  Reviewed-by: Eric Anholt <[email protected]>
  Fixes: 34f7e761bc61 ("glsl/parser: Track built-in types using the glsl_type directly")
* st/wgl: check for NULL piAttribList in wglCreatePbufferARB() | Charmaine Lee | 2018-07-06 | 1 | -39/+41
  The Java2D OpenGL pipeline passes a NULL piAttribList to wglCreatePbufferARB(), so skip parsing the attribute list if it is NULL.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Neha Bhende <[email protected]>
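The guard described here amounts to treating a NULL attribute list as an empty one. A minimal sketch, assuming a zero-terminated list of attribute/value pairs; the function name and the attribute values used with it are hypothetical, and this is not the st/wgl code:

```c
#include <stddef.h>

/* Counts the attribute/value pairs in a zero-terminated list. A NULL list
 * is treated as empty instead of being dereferenced. */
static int count_attrib_pairs(const int *attrib_list)
{
    int pairs = 0;

    if (attrib_list == NULL)
        return 0;   /* nothing to parse */

    for (const int *p = attrib_list; *p != 0; p += 2)
        pairs++;
    return pairs;
}
```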
* anv: Add support for VK_KHR_create_renderpass2 | Jason Ekstrand | 2018-07-09 | 3 | -0/+165
  The implementation of CreateRenderPass2 uses the helpers we broke out in previous commits. The implementations of the new vkCmd functions just call the old versions.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Make subpass::depth_stencil_attachment a pointer | Jason Ekstrand | 2018-07-09 | 8 | -25/+28
  This makes certain checks a bit easier and means that we don't have the attachment information duplicated in the attachment list and in depth_stencil_attachment.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pass: Move implicit dependency setup to anv_render_pass_compile | Jason Ekstrand | 2018-07-09 | 1 | -70/+63
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pass: Move some dependency setup into a helper | Jason Ekstrand | 2018-07-09 | 1 | -18/+34
  This new helper takes a VkSubpassDependency2KHR for future-proofing.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pass: Move a bunch of analysis into a separate "compile" stage | Jason Ekstrand | 2018-07-09 | 1 | -50/+64
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pass: Use a designated initializer for attachments | Jason Ekstrand | 2018-07-09 | 1 | -11/+11
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Bump the advertised patch version to 80 | Jason Ekstrand | 2018-07-09 | 1 | -1/+1
  Reviewed-by: Lionel Landwerlin <[email protected]>
* glx: Don't allow glXMakeContextCurrent() with only one valid drawable | Adam Jackson | 2018-07-09 | 1 | -0/+7
  Drawable and readable need to either both be None or both be non-None.
  Cc: <[email protected]>
  Signed-off-by: Adam Jackson <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
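The rule stated in this commit reduces to a single comparison. A minimal sketch, with stand-in types so it builds without the GLX headers (`drawable_id` and `DRAWABLE_NONE` are hypothetical substitutes for GLXDrawable and None):

```c
#include <stdbool.h>

typedef unsigned long drawable_id;   /* stand-in for GLXDrawable */
#define DRAWABLE_NONE 0UL            /* stand-in for None */

/* True when the pair is acceptable: either both are None or neither is. */
static bool drawables_are_consistent(drawable_id draw, drawable_id read)
{
    return (draw == DRAWABLE_NONE) == (read == DRAWABLE_NONE);
}
```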
* mesa: verify MaxVertexAttribStride for GLES 3.1 | Erik Faye-Lund | 2018-07-09 | 1 | -0/+1
  The OpenGL ES 3.1 specification, Table 20.41 ("Implementation Dependent Values"), defines the minimum-maximum value for MAX_VERTEX_ATTRIB_STRIDE to be 2048. So we shouldn't enable OpenGL ES 3.1 on implementations where this isn't the case. Let's add a check for this.
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: verify MaxVertexAttribStride for GL 4.4 | Erik Faye-Lund | 2018-07-09 | 1 | -0/+1
  The OpenGL 4.4 specification, Table 23.55 ("Implementation Dependent Values"), defines the minimum-maximum value for MAX_VERTEX_ATTRIB_STRIDE to be 2048. So we shouldn't enable OpenGL 4.4 on implementations where this isn't the case. Let's add a check for this.
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
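The check added by these two commits can be pictured as one more condition in the version gating. The sketch below is illustrative only: the struct and function names are hypothetical, and only the 2048 minimum comes from the commit messages.

```c
#include <stdbool.h>

/* Minimum value an implementation must report for
 * MAX_VERTEX_ATTRIB_STRIDE, per the commit messages above. */
#define MIN_MAX_VERTEX_ATTRIB_STRIDE 2048

/* Hypothetical subset of a driver's reported limits. */
struct toy_limits {
    int max_vertex_attrib_stride;
};

/* True when the reported stride limit is high enough for the API version
 * to be exposed. */
static bool meets_stride_requirement(const struct toy_limits *limits)
{
    return limits->max_vertex_attrib_stride >= MIN_MAX_VERTEX_ATTRIB_STRIDE;
}
```

A driver reporting 2047 would fail this check, which is the situation the following r600 commit works around.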
* r600: report incorrect max-vertex-attrib for GL 4.4 | Erik Faye-Lund | 2018-07-09 | 1 | -1/+2
  OpenGL 4.4 requires a max vertex attrib stride of 2048 or higher, but r600 only supports 2047. Technically, this makes it a GL 4.3 GPU, but it's currently exposing GL 4.4. To avoid regressing the GL version supported in the following patches, let's just lie and pretend that we support 2048. Any applications using 2048 are already broken anyway.
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* intel/fs: use uint type for per_slot_offset at GS | Jose Maria Casanova Crespo | 2018-07-09 | 1 | -1/+1
  This helps us to compact the original instruction:
  mul(8) g3<1>D g6<8,8,1>UD 0x00000006UD { align1 1Q };
  So now we emit:
  mul(8) g3<1>UD g6<8,8,1>UD 0x00000006UD { align1 1Q compacted };
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* radv: add the trace BO to the list when starting a new cmdbuf | Samuel Pitoiset | 2018-07-09 | 1 | -4/+7
  That might reduce CPU overhead a little bit when using RADV_TRACE_FILE.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: reduce CPU overhead in radv_flush_descriptors() | Samuel Pitoiset | 2018-07-09 | 3 | -11/+8
  The number of enabled descriptors for a given pipeline stage can be computed at compile time.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* intel/compiler: remove unused function | Iago Toral Quiroga | 2018-07-09 | 2 | -31/+0
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pipeline: honor the pipeline_cache_enabled run-time flag | Iago Toral Quiroga | 2018-07-09 | 1 | -1/+1
  v2: merge both conditions to reduce the diff (Lionel)
  Reviewed-by: Lionel Landwerlin <[email protected]>
* r600/sb: fix crash in fold_alu_op3 | Roland Scheidegger | 2018-07-09 | 1 | -0/+2
  fold_assoc() called from fold_alu_op3() can lower the number of src to 2, which then leads to an invalid access to n.src[2]->gvalue(). This didn't seem to have caused much harm in the past, but on Fedora 28 it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although with libstdc++ 4.8.5 this didn't do anything; -D_GLIBCXX_DEBUG was needed to show the issue).

  An alternative fix would be to instead call fold_alu_op2() from within fold_assoc() when the number of src is reduced, and always return TRUE from fold_assoc() in this case, with the only actual difference being the return value from fold_alu_op3() then. I'm not sure what the return value actually should be in this case (or whether it even can make a difference).

  https://bugs.freedesktop.org/show_bug.cgi?id=106928
  Cc: [email protected]
  Reviewed-by: Dave Airlie <[email protected]>
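The hazard described here is a helper that shrinks the operand list before the caller reads the third operand. The toy below shows the shape of the guard in C; the types, the `simplify()` stand-in for fold_assoc(), and the choice to return false after shrinking are all assumptions for illustration (the commit message itself leaves the correct return value open), and the real code is C++ in r600/sb.

```c
#include <stdbool.h>

/* Toy three-operand node; num_src is the logical operand count. */
struct toy_node {
    int num_src;
    int src[3];
};

/* Stand-in for a simplification pass that may reduce the node to two
 * operands (here: when the third operand is zero). */
static void simplify(struct toy_node *n)
{
    if (n->num_src == 3 && n->src[2] == 0)
        n->num_src = 2;
}

/* Folds src[0] * src[1] + src[2] into *out. After simplify() the third
 * operand may no longer exist, so the count is re-checked before it is
 * read. Returns true only when a three-operand fold was performed. */
static bool fold_op3(struct toy_node *n, int *out)
{
    simplify(n);
    if (n->num_src < 3)
        return false;   /* src[2] must not be touched any more */

    *out = n->src[0] * n->src[1] + n->src[2];
    return true;
}
```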
* vulkan: Update the XML and headers to 1.1.80 | Jason Ekstrand | 2018-07-08 | 1 | -49/+231
  Acked-by: Lionel Landwerlin <[email protected]>
* i965: fix clear color bo address relocation | Lionel Landwerlin | 2018-07-07 | 1 | -1/+1
  Fixes: 7987d041fda0c9 ("i965/surface_state: Emit the clear color address instead of value.")
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* radv: winsys/amdgpu: include missing pthread.h header | Mauro Rossi | 2018-07-07 | 1 | -0/+1
  pthread types are used in some files without explicitly including pthread.h. This leads to compile errors on Android 7.x nougat-x86, e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:

  In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31:
  In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32:
  external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t'
          pthread_mutex_t global_bo_list_lock;
          ^
  1 error generated.

  Including pthread.h explicitly fixes the build error.
  Signed-off-by: Mauro Rossi <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>