mesa.git

	Commit message	Author	Age	Files	Lines
*	intel/compiler: don't compact 3-src instructions with Src1Type or Src2Type bits	Iago Toral Quiroga	2019-04-18	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We are now using these bits, so don't assert that they are not set. In gen8, if these bits are set compaction is not possible. On gen9 and CHV platforms set_3src_control_index() checks these bits (and others) against a table to validate if the particular bit combination is eligible for compaction or not. v2 - Add more detail in the commit message explaining the situation for SKL+ and CHV (Jason) Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	intel/compiler: add new half-float register type for 3-src instructions	Iago Toral Quiroga	2019-04-18	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This is available since gen8. v2: restore previously existing assertion. v3: don't use separate tables for gen7 and gen8, just assert that we don't use half-float before gen8 (Matt) Reviewed-by: Topi Pohjolainen <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: add instruction setters for Src1Type and Src2Type.	Iago Toral Quiroga	2019-04-18	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original SrcType is a 3-bit field that takes a subset of the types supported by the hardware for 3-source instructions. Since gen8, when the half-float type was added, 3-source floating point operations can use mixed precision mode, where not all the operands have the same floating-point precision. While the precision for the first operand is taken from the type in SrcType, the bits in Src1Type (bit 36) and Src2Type (bit 35) define the precision for the other operands (0: normal precision, 1: half precision). Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Matt Turner <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
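The Src1Type/Src2Type description above can be illustrated with a minimal sketch. Only the two bit positions (36 and 35) and the 0/1 meaning come from the commit message; the function names are hypothetical, the instruction is modelled as a plain Python integer, and this is not how the Mesa setters are actually written.

```python
# Bit positions as quoted in the commit message; everything else is hypothetical.
SRC1_TYPE_BIT = 36
SRC2_TYPE_BIT = 35


def set_bit(inst: int, bit: int, value: int) -> int:
    """Return a copy of the instruction word with one bit set to 0 or 1."""
    if value not in (0, 1):
        raise ValueError("value must be 0 or 1")
    return (inst & ~(1 << bit)) | (value << bit)


def get_bit(inst: int, bit: int) -> int:
    """Read a single bit from the instruction word."""
    return (inst >> bit) & 1


def set_mixed_precision(inst: int, src1_half: bool, src2_half: bool) -> int:
    """Mark source 1 and/or source 2 as half precision (1) or normal (0)."""
    inst = set_bit(inst, SRC1_TYPE_BIT, int(src1_half))
    inst = set_bit(inst, SRC2_TYPE_BIT, int(src2_half))
    return inst


def src_type_bits_allow_compaction_gen8(inst: int) -> bool:
    """Necessary condition only: per the log, gen8 cannot compact if either bit is set."""
    return get_bit(inst, SRC1_TYPE_BIT) == 0 and get_bit(inst, SRC2_TYPE_BIT) == 0
```

Clearing both bits again returns the word to its original value, which is why the gen8 check in this sketch only needs to test the two bits rather than keep separate state.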
*	intel/compiler: drop unnecessary temporary from 32-bit fsign implementation	Iago Toral Quiroga	2019-04-18	1	-3/+2
\| \| \| \|	Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: implement 16-bit fsign	Iago Toral Quiroga	2019-04-18	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \|	v2: - make 16-bit be its own separate case (Jason) v3: - Drop the result_int temporary (Jason) Reviewed-by: Topi Pohjolainen <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: handle extended math restrictions for half-float	Iago Toral Quiroga	2019-04-18	3	-12/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Extended math with half-float operands is only supported since gen9, but it is limited to SIMD8. In gen8 we lower it to 32-bit. v2: squashed together the following patches (Jason): - intel/compiler: allow extended math functions with HF operands - intel/compiler: lower 16-bit extended math to 32-bit prior to gen9 - intel/compiler: extended Math is limited to SIMD8 on half-float Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> (allow extended math functions with HF operands, extended Math is limited to SIMD8 on half-float)
*	intel/compiler: lower some 16-bit float operations to 32-bit	Iago Toral Quiroga	2019-04-18	1	-0/+5
\| \| \| \| \| \| \|	The hardware doesn't support half-float for these. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: assert restrictions on conversions to half-float	Iago Toral Quiroga	2019-04-18	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \|	There are some hardware restrictions that brw_nir_lower_conversions should have taken care of before we get here. v2: - rebased on top of regioning lowering pass Reviewed-by: Topi Pohjolainen <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: handle b2i/b2f with other integer conversion opcodes	Iago Toral Quiroga	2019-04-18	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	Since we handle booleans as integers this makes more sense. v2: - rebased to incorporate new boolean conversion opcodes v3: - rebased on top regioning lowering pass Reviewed-by: Jason Ekstrand <[email protected]> (v1) Reviewed-by: Topi Pohjolainen <[email protected]> (v2)
*	intel/compiler: split float to 64-bit opcodes from int to 64-bit	Iago Toral Quiroga	2019-04-18	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \|	Going forward having these split is a bit more convenient since these two groups have different restrictions. v2: - Rebased on top of new regioning lowering pass. Reviewed-by: Topi Pohjolainen <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: add a NIR pass to lower conversions	Iago Toral Quiroga	2019-04-18	5	-0/+175
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some conversions are not directly supported in hardware and need to be split in two conversion instructions going through an intermediary type. Doing this at the NIR level simplifies a bit the complexity in the backend. v2: - Consider fp16 rounding conversion opcodes - Properly handle swizzles on conversion sources. v3 - Run the pass earlier, right after nir_opt_algebraic_late (Jason) - NIR alu output types already have the bit-size (Jason) - Use 'is_conversion' to identify conversion operations (Jason) v4: - Be careful about the intermediate types we use so we don't lose range and avoid incorrect rounding semantics (Jason) Reviewed-by: Topi Pohjolainen <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
*	Add no_aos_sampling GALLIVM_PERF option	Dominik Drees	2019-04-17	3	-4/+11
\| \| \| \| \|	This forces using general sampling and should improve precision and performance in some cases.
*	ac: use struct/raw store intrinsics for 8-bit/16-bit int with LLVM 9+	Samuel Pitoiset	2019-04-17	1	-14/+34
\| \| \| \| \| \| \| \|	This change requires LLVM r356465. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	ac: use struct/raw load intrinsics for 8-bit/16-bit int with LLVM 9+	Samuel Pitoiset	2019-04-17	1	-12/+38
\| \| \| \| \| \| \| \|	This change requires LLVM r356465. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	ac: add support for more types with struct/raw LLVM intrinsics	Samuel Pitoiset	2019-04-17	1	-20/+26
\| \| \| \| \| \| \| \|	LLVM 9+ now supports 8-bit and 16-bit types. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	radv: add VK_KHR_shader_atomic_int64 but disable it for now	Samuel Pitoiset	2019-04-17	3	-0/+12
\| \| \| \| \| \| \|	No support for 64-bit compare&swap atomic operations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: add 64-bit SSBO atomic operations support	Samuel Pitoiset	2019-04-17	1	-3/+7
\| \| \| \| \| \| \| \|	Except compare&swap which is still buggy. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap	Samuel Pitoiset	2019-04-17	1	-13/+18
\| \| \| \| \| \| \| \| \| \|	Use the raw version (i.e. IDXEN=0) because vindex is unused. Use the old intrinsic for compare&swap because the new one hangs the GPU for some reason. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	gallivm: fix saturated signed add / sub with llvm 9	Roland Scheidegger	2019-04-17	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	llvm 8 removed saturated unsigned add / sub x86 sse2 intrinsics, and now llvm 9 removed the signed versions as well - they were proposed for removal earlier, but the pattern to recognize those was very complex, so it wasn't done then. However, instead of these arch-specific intrinsics, there are now arch-independent intrinsics for saturated add / sub, both for signed and unsigned, so use these. They should have only advantages (work with arbitrary vector sizes, optimal code for all archs), although I don't know how well they work in practice for other archs (at least for x86 they do the right thing). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110454 Reviewed-by: Brian Paul <[email protected]>
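For readers unfamiliar with saturated arithmetic, the following is a minimal sketch of the semantics the commit above relies on: results clamp to the representable range instead of wrapping. The arch-independent LLVM intrinsics are presumably the `llvm.sadd.sat`, `llvm.uadd.sat`, `llvm.ssub.sat` and `llvm.usub.sat` family; the Python helpers below are hypothetical scalar models, not gallivm code, and the 8-bit default width is an assumption chosen for illustration.

```python
def _signed_range(bits: int) -> tuple[int, int]:
    """Smallest and largest value of a two's-complement integer of this width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def sadd_sat(a: int, b: int, bits: int = 8) -> int:
    """Signed saturating add: clamp a + b to the signed range."""
    lo, hi = _signed_range(bits)
    return max(lo, min(hi, a + b))


def ssub_sat(a: int, b: int, bits: int = 8) -> int:
    """Signed saturating subtract: clamp a - b to the signed range."""
    lo, hi = _signed_range(bits)
    return max(lo, min(hi, a - b))


def uadd_sat(a: int, b: int, bits: int = 8) -> int:
    """Unsigned saturating add: clamp a + b to the maximum unsigned value."""
    return min((1 << bits) - 1, a + b)


def usub_sat(a: int, b: int, bits: int = 8) -> int:
    """Unsigned saturating subtract: clamp a - b at zero."""
    return max(0, a - b)
```

With the default 8-bit width, `sadd_sat(100, 100)` clamps to 127 and `usub_sat(5, 10)` clamps to 0, where wrapping arithmetic would give -56 and 251 respectively.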
*	meson: Add dependency on genxml to anvil genfiles	Juan A. Suarez Romero	2019-04-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This fixes a race condition where anv_gen_files are executed before genxml files, which causes a build failure. v2: add dependency on idep_genxml (Lionel) Fixes: d1992255bb29054fa51763376d125183a9f602f ("meson: Add build Intel "anv" vulkan driver") Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/perf: constify accumulator parameter	Lionel Landwerlin	2019-04-17	2	-3/+3
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
*	intel/perf: drop counter size field	Lionel Landwerlin	2019-04-17	4	-9/+26
\| \| \| \| \| \| \|	We can deduce the size from another field, let's just save some space. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
*	i965: perf: add mdapi pipeline statistics queries on gen10/11	Lionel Landwerlin	2019-04-17	2	-1/+10
\| \| \| \| \| \| \| \| \|	The Gen10+ expected format adds an additional counter which we can't disclose yet. We can still make the size of the expected query result match. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
*	intel/perf: stub gen10/11 missing definitions	Lionel Landwerlin	2019-04-17	1	-0/+4
\| \| \| \|	Reviewed-by: Mark Janes <[email protected]>
*	i965: move mdapi guid into intel/perf	Lionel Landwerlin	2019-04-17	2	-2/+4
\| \| \| \| \| \| \|	One more thing we want to share between the different APIs. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
*	i965: move mdapi result data format to intel/perf	Lionel Landwerlin	2019-04-17	7	-98/+138
\| \| \| \| \| \| \|	We want to reuse this in Anv. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
*	i965: move brw_timebase_scale to device info	Lionel Landwerlin	2019-04-17	6	-19/+22
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
*	i965: move OA accumulation code to intel/perf	Lionel Landwerlin	2019-04-17	5	-199/+229
\| \| \| \| \| \| \|	We'll want to reuse this in our Vulkan extension. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
*	i965: move mdapi data structure to intel/perf	Lionel Landwerlin	2019-04-17	3	-97/+128
\| \| \| \| \| \| \|	We'll want to reuse those structures later on. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
*	i965: extract performance query metrics	Lionel Landwerlin	2019-04-17	31	-866/+1098
\| \| \| \| \| \| \| \| \| \|	We would like to reuse performance query metrics in other APIs. Let's make the query code dealing with the processing of raw counters into human readable values API agnostic. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: store device revision in gen_device_info	Lionel Landwerlin	2019-04-17	4	-6/+5
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/compiler/icl: Use tcs barrier id bits 24:30 instead of 24:27	Topi Pohjolainen	2019-04-17	1	-7/+17
\| \| \| \| \| \| \| \| \|	Similarly to 1cc17fb731466c68586915acbb916586457b19bc Fixes gpu hangs with dEQP-VK.tessellation.shader_input_output.barrier Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
*	virgl: document potentially failing blit	Erik Faye-Lund	2019-04-17	1	-0/+6
\| \| \| \| \| \| \| \| \|	This blit can fail, but this is not new; in the old version we didn't even try to blit in this case. So let's just document the limitation for now, and leave this for another day. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: do color-conversion when mapping transfer	Erik Faye-Lund	2019-04-17	1	-10/+70
\| \| \| \| \| \| \| \| \| \| \| \| \|	When running on OpenGL ES, we can't just map any format for reading, because of limitations on glReadPixels. So let's fall back to the blit code-path, and translate the pixels to the correct format in the end. This fixes the remaining failures of KHR-GL32.packed_pixels.* apart from the sRGB tests. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: only blit if resource is read	Erik Faye-Lund	2019-04-17	1	-2/+5
\| \| \| \| \|	Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: get readback-formats from host	Erik Faye-Lund	2019-04-17	3	-0/+44
\| \| \| \| \|	Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	gallium/util: support translating between uint and sint formats	Erik Faye-Lund	2019-04-17	1	-0/+62
\| \| \| \| \| \| \| \|	Without this, we can't for instance convert between r8_sint and r8g8b8a8_sint. But that's pretty useful, so let's support it as well. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: make sure bind is set for non-buffers	Erik Faye-Lund	2019-04-17	1	-0/+3
\| \| \| \| \| \| \|	Otherwise, virglrenderer will reject the resource. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: support write-back with staged transfers	Erik Faye-Lund	2019-04-17	2	-22/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently don't support writing to resources that use a temporary staging-resource to resolve the pixels. If a write-bit was set, we forgot to perform a blit back to the old resource, followed by trying to update the wrong resource, which lacks backing-storage. The end-result would be that nothing useful happened. This approach also fixes a few smaller bugs, like using the wrong box (without x, y and z zeroed out), which means a partial update of a multisampled texture could result in the wrong part of the texture being updated. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: use pipe_box for blit dst-rect	Erik Faye-Lund	2019-04-17	1	-5/+12
\| \| \| \| \|	Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: rewrite core of virgl_texture_transfer_map	Erik Faye-Lund	2019-04-17	1	-36/+58
\| \| \| \| \|	Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: return error if allocating resolve_tmp fails	Erik Faye-Lund	2019-04-17	1	-0/+4
\| \| \| \| \|	Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: wait for the right resource	Erik Faye-Lund	2019-04-17	1	-1/+1
\| \| \| \| \| \| \| \|	In case we're resolving, we need to wait for the resolved resource instead of the original one. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: check for readback on correct resource	Erik Faye-Lund	2019-04-17	1	-1/+1
\| \| \| \| \|	Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: make unmap queuing a bit more straight-forward	Erik Faye-Lund	2019-04-17	1	-5/+7
\| \| \| \| \| \| \| \| \|	It's hard to read the code that decides if we want to queue up an unmap or destroy the transfer right away. So let's make it a bit simpler, by setting a bool in case we want to queue it. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: simplify virgl_texture_transfer_unmap logic	Erik Faye-Lund	2019-04-17	1	-13/+9
\| \| \| \| \| \| \| \|	There's no reason to keep an extra indentation level here, let's merge the two if-conditions. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: track full virgl_resource instead of just virgl_hw_res	Erik Faye-Lund	2019-04-17	1	-5/+5
\| \| \| \| \|	Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: tmp_resource -> templ	Erik Faye-Lund	2019-04-17	1	-4/+3
\| \| \| \| \| \| \| \|	This isn't the temporary resource itself, it's the template that we'll create the resource from. So let's name it appropriately. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	virgl: remove pointless transfer-counter	Erik Faye-Lund	2019-04-17	4	-4/+2
\| \| \| \| \| \| \|	This is only written to, never read. Let's just get rid of it. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
*	radeonsi/nir: fix scanning of bindless images	Timothy Arceri	2019-04-17	1	-38/+37
\| \| \| \|	Fixes: d62d434fe920 ("ac/nir_to_llvm: add image bindless support")