mesa.git

	Commit message	Author	Age	Files	Lines
*	intel/eu: Provide desc immediate argument up front to brw_send_indirect_message().	Francisco Jerez	2018-07-09	4	-11/+13
	The current approach of returning a setup instruction where additional descriptor fields can be specified is still supported in order to keep things working, but it will be removed later in this series. Reviewed-by: Kenneth Graunke <[email protected]>
*	TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add().	Francisco Jerez	2018-07-09	1	-5/+6
	Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls.	Francisco Jerez	2018-07-09	3	-86/+68
	This replaces brw_set_message_descriptor() with the composition of brw_set_desc() and a new inline helper function that packs the common message descriptor controls into an integer. The goal is to represent all message descriptors as a 32-bit integer which is written at once into the instruction, which is more flexible (SENDS anyone?), robust (see d2eecf0b0b24d203d0f171807681dffd830d54de fixing an issue ultimately caused by some bits of the extended message descriptor being left undefined) and future-proof than the current approach of specifying the individual descriptor fields directly into the instruction. This approach also seems more self-documenting, since it will allow removing calls to functions with way too many arguments like brw_set_*_message() and brw_send_indirect_message(), and instead provide a single descriptor argument constructed from an appropriate combination of brw_*_desc() helpers. Note that because brw_set_message_descriptor() was (conditionally?) overriding fields of the instruction which strictly speaking weren't part of the message descriptor, this involves calling brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition to brw_set_desc(). v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <[email protected]>
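The idea of packing the common descriptor controls into one 32-bit integer can be sketched as follows. This is a minimal illustration, not the Mesa helper itself: the names `example_field()`, `example_message_desc()` and the extractor functions are hypothetical, and the bit positions (message length in 28:25, response length in 24:20, header-present in bit 19) are assumed here for the sake of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Place `value` into bits high..low of a 32-bit word (hypothetical helper). */
static inline uint32_t example_field(uint32_t value, unsigned high, unsigned low)
{
   assert(high >= low && high < 32);
   const uint32_t mask =
      (uint32_t)(((UINT64_C(1) << (high - low + 1)) - 1) << low);
   const uint64_t shifted = (uint64_t)value << low;
   /* Catch values that do not fit in the field. */
   assert((shifted & ~(uint64_t)mask) == 0);
   return (uint32_t)shifted;
}

/* Pack the common controls into a single descriptor word.
 * The bit layout is an assumption made for this sketch. */
static inline uint32_t example_message_desc(unsigned msg_length,
                                            unsigned response_length,
                                            bool header_present)
{
   return example_field(msg_length, 28, 25) |
          example_field(response_length, 24, 20) |
          example_field(header_present ? 1 : 0, 19, 19);
}

/* Extract fields again, mirroring the layout assumed above. */
static inline unsigned example_desc_msg_length(uint32_t desc)
{
   return (desc >> 25) & 0xf;
}

static inline unsigned example_desc_response_length(uint32_t desc)
{
   return (desc >> 20) & 0x1f;
}
```

A caller would OR the result with whatever function-specific bits the message needs and hand the single integer to the instruction encoder, instead of setting each field on the instruction separately.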
*	intel/eu: Define SET_BITS helper more easily reusable than SET_FIELD.	Francisco Jerez	2018-07-09	1	-0/+7
	Allows specifying a bitfield based on its upper and lower bounds instead of a symbolic field definition, much as the current GET_BITS macro relates to GET_FIELD. Reviewed-by: Kenneth Graunke <[email protected]>
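A bounds-based bitfield helper of this kind might look like the sketch below. It is written as inline functions rather than macros, and the names `example_bit_mask()`, `example_set_bits()` and `example_get_bits()` are hypothetical; it is not the definition added by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits high..low inclusive. */
static inline uint32_t example_bit_mask(unsigned high, unsigned low)
{
   assert(high >= low && high < 32);
   return (uint32_t)(((UINT64_C(1) << (high - low + 1)) - 1) << low);
}

/* Shift `value` into bits high..low, asserting that it fits. */
static inline uint32_t example_set_bits(uint32_t value, unsigned high,
                                        unsigned low)
{
   const uint64_t field = (uint64_t)value << low;
   assert((field & ~(uint64_t)example_bit_mask(high, low)) == 0);
   return (uint32_t)field;
}

/* Read bits high..low back out of `data`. */
static inline uint32_t example_get_bits(uint32_t data, unsigned high,
                                        unsigned low)
{
   return (data & example_bit_mask(high, low)) >> low;
}
```

Taking the bounds as plain numbers means a caller does not need a pair of symbolic shift and mask definitions for every field it wants to write.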
*	intel/eu: Define helper to specify the descriptor immediates of a SEND instruction.	Francisco Jerez	2018-07-09	2	-0/+26
	Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/eu: Add brw_inst.h helpers for the SEND(C) descriptor and extended descriptor.	Francisco Jerez	2018-07-09	1	-0/+78
	This introduces helpers that can be used to specify or extract the whole descriptor of a SEND message instruction at once. Because the instruction encoding of these is rather awkward on some generations, using the generic brw_inst.h macros doesn't seem like an option. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Support saving the gen program with glGetProgramBinary	Jordan Justen	2018-07-09	1	-6/+66
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Add flag_state param to brw_search_cache	Jordan Justen	2018-07-09	12	-45/+35
	This allows brw_search_cache to be used to find programs without causing extra state to be emitted in the case where the program isn't being made active. (For example, to find the program to save out with the ARB_get_program_binary interface.) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	mesa: Add gl_shader_program param to ProgramBinarySerializeDriverBlob	Jordan Justen	2018-07-09	8	-4/+45
	This might be required because some stages might generate different programs depending on the other stages in the program. For example, the i965 driver's tessellation control stage depends on the tessellation evaluation shader. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Add brw_populate_default_key	Jordan Justen	2018-07-09	12	-73/+195
	We will need to populate the default key for ARB_get_program_binary to allow us to retrieve the default gen program to store in the program binary. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Replace brw_setup_tex_for_precompile brw with devinfo	Jordan Justen	2018-07-09	8	-9/+8
	Trying to make sure the setup of the default program key is not dependent on the GL state. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Regenerate blob without gen program for shader cache	Jordan Justen	2018-07-09	1	-1/+63
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	compiler/blob: Add blob_skip_bytes	Jordan Justen	2018-07-09	2	-0/+13
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Add support for driver cache blob containing the gen program	Jordan Justen	2018-07-09	1	-0/+41
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Use brw_prog_key_set_id in disk cache load/store code	Jordan Justen	2018-07-09	1	-16/+8
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Add brw_prog_key_set_id helper to set the program id on any stage	Jordan Justen	2018-07-09	2	-0/+19
	For saving programs (shader cache; get program binary) it is useful to set the id to 0, with the stage being a parameter. For restoring programs it is useful to set the id to the id allocated to the program at creation time. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Add brw_stage_cache_id to map gl stages to brw cache_ids	Jordan Justen	2018-07-09	2	-0/+17
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Add brw_(read|write)_blob_program_data functions	Jordan Justen	2018-07-09	3	-41/+61
	We will want to use these for both the disk shader cache and ARB_get_program_binary. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Add brw_program_deserialize_driver_blob	Jordan Justen	2018-07-09	3	-21/+48
	brw_program_deserialize_driver_blob will be a more generic form of brw_program_deserialize_nir. In addition to nir, it will also be able to extract gen binaries and upload them to the program cache. In this commit, it continues to only support nir. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Move brw_program_*serialize_nir to brw_program_binary.c	Jordan Justen	2018-07-09	2	-37/+37
	This will allow get_program_binary to add the gen program into its serialization in addition to just the nir program. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	mesa: Always call ProgramBinarySerializeDriverBlob	Jordan Justen	2018-07-09	1	-10/+8
	The driver may prefer to have a different blob for ARB_get_program_binary compared to the version saved out for the disk shader cache. Since they both use the driver_cache_blob field, we need to always give the driver the opportunity to fill in the driver_cache_blob when saving the program binary. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Use ShaderCacheSerializeDriverBlob driver function	Jordan Justen	2018-07-09	3	-11/+7
	This function is called just before the gl_program::driver_cache_blob is saved out as part of the gl_program serialization. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	st/mesa: Use ShaderCacheSerializeDriverBlob driver function	Jordan Justen	2018-07-09	1	-0/+2
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	st/mesa: Skip serializing driver_cache_blob if it exists	Jordan Justen	2018-07-09	1	-0/+3
	Previously the mesa core code would not call to serialize the driver_cache_blob if it existed. We will update it to always call to serialize the driver_cache_blob, meaning we should avoid re-serializing it under mesa/state_tracker. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	mesa: Add disk shader cache driver blob callback	Jordan Justen	2018-07-09	2	-0/+23
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	intel/compiler: emit actual barriers for working-group level barriers	Iago Toral Quiroga	2018-07-10	1	-23/+2
	Until now we have assumed that we could skip emitting these barriers in the general case, based on empirical testing and a few assumptions detailed in a comment in the driver code. However, recent CTS tests have shown that we actually need them to produce correct behavior. Reviewed-by: Jason Ekstrand <[email protected]>
*	radv: add some cxxflags for new c++ file	Dave Airlie	2018-07-10	1	-0/+4
	Looks like I broke intel CI compiles. Fixes: 6f3aee40f9 (radv: using tls to store llvm related info and speed up compiles (v10)) Tested-by: Clayton Craft <[email protected]>
*	anv,radv: Add support for VK_KHR_get_display_properties2	Jason Ekstrand	2018-07-09	6	-16/+301
	Reviewed-by: Keith Packard <[email protected]>
*	intel/aubinator_error_decode: Allow for more sections	Jason Ekstrand	2018-07-09	1	-11/+13
	Error states coming from actual Vulkan applications tend to have fairly long command buffers and lots of chained batches. 30 total BOs isn't nearly enough. This commit bumps it to 256, makes some things use the actual number of sections instead of the #define, and adds asserts if we ever go over 256 sections. Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/batch_decoder: Recurse for all 2nd level batches	Jason Ekstrand	2018-07-09	1	-14/+5
	Our attempt to restart the loop with the second level batch worked at one point but was later broken. It was too fragile anyway, and we're not likely to have enough secondaries to actually overflow the stack, so we may as well recurse in both cases. Reviewed-by: Lionel Landwerlin <[email protected]>
*	virgl/vtest: add support to vtest for new cap getting.	Dave Airlie	2018-07-10	2	-4/+28
	The vtest protocol is pretty simple but also pretty dumb, and the v1 caps query was fixed size, with no nice way to expand it. However, the server also ignores any command it doesn't understand, so we can query v2 caps by sending a v2 query followed by a v1 query: if the v2 is ignored we know it's an old vtest server, and if we get a v2 answer then we can just read the v1 answer and discard it. Acked-by: Jakob Bornecrantz <[email protected]> (sounds good)
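The probing scheme described in this message can be sketched with a toy in-memory server. Everything here is hypothetical (the command enum, `example_server()`, `example_query_caps()`); it only models the "send v2 then v1, see which answer comes back first" logic, not the real vtest wire format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command identifiers, not the real vtest values. */
enum example_cmd {
   EXAMPLE_CMD_CAPS_V1 = 1,
   EXAMPLE_CMD_CAPS_V2 = 2,
};

struct example_reply {
   enum example_cmd cmd; /* which query this reply answers */
};

/* Toy server: answers each command it understands, in order, and
 * silently ignores the rest. Returns the number of replies written. */
static size_t example_server(bool understands_v2,
                             const enum example_cmd *cmds, size_t num_cmds,
                             struct example_reply *replies)
{
   size_t count = 0;
   for (size_t i = 0; i < num_cmds; i++) {
      if (cmds[i] == EXAMPLE_CMD_CAPS_V2 && !understands_v2)
         continue; /* an old server drops the unknown command */
      replies[count++].cmd = cmds[i];
   }
   return count;
}

/* Client: send v2 followed by v1 and return the caps version obtained. */
static int example_query_caps(bool server_understands_v2)
{
   const enum example_cmd cmds[] = { EXAMPLE_CMD_CAPS_V2, EXAMPLE_CMD_CAPS_V1 };
   struct example_reply replies[2];
   const size_t n = example_server(server_understands_v2, cmds, 2, replies);

   assert(n >= 1);
   if (replies[0].cmd == EXAMPLE_CMD_CAPS_V2) {
      /* New server: the v1 answer follows; read it and discard it. */
      assert(n == 2 && replies[1].cmd == EXAMPLE_CMD_CAPS_V1);
      return 2;
   }
   /* Old server: the v2 query was ignored and only v1 was answered. */
   return 1;
}
```

The v1 query acts as a guaranteed reply, so the client never blocks waiting for an answer that an old server will not send.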
*	i965/icl: Don't set float blend optimization bit in CACHE_MODE_SS	Anuj Phogat	2018-07-09	1	-4/+0
	CACHE_MODE_SS is not listed in the gfxspecs table of user-mode non-privileged registers, so any changes made from Mesa will do nothing. The kernel already sets this bit in the CACHE_MODE_SS register, which is saved/restored to/from the HW context image. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv/icl: Don't set float blend optimization bit in CACHE_MODE_SS	Anuj Phogat	2018-07-09	1	-12/+0
	CACHE_MODE_SS is not listed in the gfxspecs table of user-mode non-privileged registers, so any changes made from Mesa will do nothing. The kernel already sets this bit in the CACHE_MODE_SS register, which is saved/restored to/from the HW context image. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv: Implement VK_EXT_vertex_attribute_divisor	Jason Ekstrand	2018-07-09	3	-0/+21
	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	anv/pipeline: Add a per-VB instance divisor	Jason Ekstrand	2018-07-09	4	-12/+20
	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	anv/pipeline: Use a per-VB struct instead of separate arrays	Jason Ekstrand	2018-07-09	4	-8/+11
	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage	Jose Maria Casanova Crespo	2018-07-10	3	-0/+13
	Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+, using the VK_KHR_get_physical_device_properties2 functionality to expose whether the extension is supported. Reviewed-by: Jason Ekstrand <[email protected]>
*	spirv/nir: Add support for SPV_KHR_8bit_storage	Jose Maria Casanova Crespo	2018-07-10	2	-0/+7
	Reviewed-by: Jason Ekstrand <[email protected]>
*	spirv: Include headers and grammar for SPV_KHR_8bit_storage	Jose Maria Casanova Crespo	2018-07-10	2	-7/+40
	Updates headers and grammar to ff684ffc6a35d2a58f0f63108877d0064ea33feb Acked-by: Jason Ekstrand <[email protected]>
*	i965/fs: Enable store_ssbo for 8-bit types.	Jose Maria Casanova Crespo	2018-07-10	1	-7/+8
	v2: Update comment according to this patch. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: relax brw_eu_validate for byte raw movs	Jose Maria Casanova Crespo	2018-07-10	1	-3/+5
	When the destination is a BYTE type, allow raw movs even if the stride is not an exact multiple of the destination type and exec type; the execution type is Word and its size is 2. This restriction was only allowing stride==2 destinations for 8-bit types. Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/fs: Enable conversions to 8-bit integers	Jose Maria Casanova Crespo	2018-07-10	1	-0/+2
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Support for 8-bit base types in helper functions	Jose Maria Casanova Crespo	2018-07-10	2	-1/+14
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/fs: Register allocator shouldn't use grf127 for sends dest	Jose Maria Casanova Crespo	2018-07-10	1	-0/+25
	Since Gen8+, the Intel PRM states that "r127 must not be used for return address when there is a src and dest overlap in send instruction."
	This patch implements this restriction by creating a new grf127_send_hack_node in the register allocator. This node has a fixed assignment to grf127. For vgrfs that are used as the destination of send messages, we create node interferences with the grf127_send_hack_node, so the register allocator will never assign these vgrfs a register that involves grf127.
	If dispatch_width > 8 we don't create these interferences, because all instructions have node interferences between sources and destination. That is enough to avoid the r127 restriction.
	This fixes CTS tests that raised this issue as they were executed as SIMD8: dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom
	Shader-db results on Skylake:
	total instructions in shared programs: 7686798 -> 7686797 (<.01%)
	instructions in affected programs: 301 -> 300 (-0.33%)
	helped: 1 HURT: 0
	total cycles in shared programs: 337092322 -> 337091919 (<.01%)
	cycles in affected programs: 22420415 -> 22420012 (<.01%)
	helped: 712 HURT: 588
	Shader-db results on Broadwell:
	total instructions in shared programs: 7658574 -> 7658625 (<.01%)
	instructions in affected programs: 19610 -> 19661 (0.26%)
	helped: 3 HURT: 4
	total cycles in shared programs: 340694553 -> 340676378 (<.01%)
	cycles in affected programs: 24724915 -> 24706740 (-0.07%)
	helped: 998 HURT: 916
	total spills in shared programs: 4300 -> 4311 (0.26%)
	spills in affected programs: 333 -> 344 (3.30%)
	helped: 1 HURT: 3
	total fills in shared programs: 5370 -> 5378 (0.15%)
	fills in affected programs: 274 -> 282 (2.92%)
	helped: 1 HURT: 3
	v2: Avoid duplicating register classes without grf127. Let's use a node with a fixed assignment to grf127 and create interferences to send message vgrf destinations. (Eric Anholt)
	v3: Update reference to CTS VK_KHR_8bit_storage failing tests. (Jose Maria Casanova)
	Reviewed-by: Jason Ekstrand <[email protected]> Cc: 18.1 <[email protected]>
*	intel/compiler: grf127 can not be dest when src and dest overlap in send	Jose Maria Casanova Crespo	2018-07-10	1	-0/+11
	Implement in brw_eu_validate the restriction from the Intel Broadwell PRM, vol 07, section "Instruction Set Reference", subsection "EUISA Instructions", Send Message (page 990): "r127 must not be used for return address when there is a src and dest overlap in send instruction." v2: Style fixes (Matt Turner) Reviewed-by: Matt Turner <[email protected]> Cc: 18.1 <[email protected]>
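A validation check for this rule could be sketched as below. This is a simplified model, not the brw_eu_validate code: the `example_reg_range` struct and both functions are hypothetical, and the sketch assumes that "r127 used for return address" means the destination register range includes register 127.

```c
#include <assert.h>
#include <stdbool.h>

/* A contiguous run of GRF registers: [first, first + count). */
struct example_reg_range {
   unsigned first;
   unsigned count;
};

/* True if the two register ranges share at least one register. */
static bool example_ranges_overlap(struct example_reg_range a,
                                   struct example_reg_range b)
{
   if (a.count == 0 || b.count == 0)
      return false;
   return a.first < b.first + b.count && b.first < a.first + a.count;
}

/* True if a send with this destination and source breaks the rule:
 * the destination touches r127 while also overlapping the source. */
static bool example_send_violates_r127_rule(struct example_reg_range dst,
                                            struct example_reg_range src)
{
   const bool dst_uses_r127 =
      dst.count > 0 && dst.first <= 127 && 127 < dst.first + dst.count;
   return dst_uses_r127 && example_ranges_overlap(dst, src);
}
```

A validator would report an error for any send instruction where the check returns true, while the register allocator change in the previous entry avoids producing such instructions in the first place.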
*	radv: using tls to store llvm related info and speed up compiles (v10)	Dave Airlie	2018-07-10	8	-28/+199
	This uses the common compiler passes abstraction to help radv avoid fixed cost compiler overheads. This uses a linked list per thread stored in thread local storage, with an entry in the list for each target machine. This should remove the fixed setup cost of creating the pass manager each time. This reduces the time for a demo app to compile the radv meta shaders with no cache and exit from 1.7s to 1s. It has also been reported to reduce the startup time of uncached shaders on RoTR from 12m24s to 11m35s (Alex)
	v2: fix llvm6 build, inline emit function, handle multiple targets in one thread
	v3: rebase and port onto new structure
	v4: rename some vars (Bas)
	v5: drag all code into radv for now, we can refactor it out later for radeonsi if we make it shareable
	v6: use a bit more C++ in the wrapper
	v7: logic bugs fixed so it actually runs again.
	v8: rebase on top of radeonsi changes.
	v9: drop some C++ headers, cleanup list entry
	v10: use pop_back (didn't have enough caffeine)
	Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	swrast: Fix eglMakeCurrent(dpy, NULL, NULL, ctx) (v2)	Adam Jackson	2018-07-09	1	-21/+20
	Fixes 14 piglits, mostly in egl_khr_create_context. v2: Also short-circuit the same-context-no-drawables case (Eric Anholt) Fixes: https://github.com/anholt/libepoxy/issues/177 Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Adam Jackson <[email protected]>
*	intel: tools: dump_gpu: fix ppgtt mapping	Lionel Landwerlin	2018-07-09	1	-23/+23
	We were not properly writing page tables when the virtual address range spans multiple subtrees of the tables. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
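Why a range can span several subtrees is easiest to see from the index arithmetic. The sketch below assumes a 4-level table with 4 KiB pages and 9 index bits per level; both functions are hypothetical and are not taken from the dump_gpu tool.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12     /* 4 KiB pages (assumed) */
#define EXAMPLE_BITS_PER_LEVEL 9  /* 512 entries per table (assumed) */

/* Index of `addr` within the table at `level`
 * (1 = leaf page table, 4 = top-level table). */
static unsigned example_table_index(uint64_t addr, unsigned level)
{
   assert(level >= 1 && level <= 4);
   const unsigned shift =
      EXAMPLE_PAGE_SHIFT + (level - 1) * EXAMPLE_BITS_PER_LEVEL;
   return (unsigned)((addr >> shift) & ((1u << EXAMPLE_BITS_PER_LEVEL) - 1));
}

/* Number of distinct tables at `level` needed to map [start, end). */
static uint64_t example_tables_touched(uint64_t start, uint64_t end,
                                       unsigned level)
{
   assert(level >= 1 && level <= 4 && start < end);
   /* Address span covered by a single table at this level. */
   const unsigned shift =
      EXAMPLE_PAGE_SHIFT + level * EXAMPLE_BITS_PER_LEVEL;
   return ((end - 1) >> shift) - (start >> shift) + 1;
}
```

Under these assumptions a leaf table covers 2 MiB, so a two-page mapping that straddles a 2 MiB boundary already needs entries in two different leaf tables, and a mapping loop has to move to the next subtree when the index wraps to zero.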
*	v3d: Implement noperspective varyings on V3D 4.x.	Eric Anholt	2018-07-09	7	-4/+40
	Fixes a bunch of piglit interpolation tests, and reduces my concern about some MSAA blit shaders with noperspective varyings.
*	v3d: Refactor flat shade/centroid flag emission.	Eric Anholt	2018-07-09	1	-64/+76
	The logic was duplicated in a pretty gross way, when what we really need is just a helper function for stuffing the values in the packet. This will make implementing noperspective easier.