mesa.git

	Commit message	Author	Age	Files	Lines
*	meson: move libsensors dependency to libgallium	Dylan Baker	2018-01-11	1	-1/+1
	This simplifies the build by removing the need to link targets against libsensors. Suggested-by: Emil Velikov <[email protected]> Signed-off-by: Dylan Baker <[email protected]> Acked-by: Eric Engestrom <[email protected]>
*	nvc0: enable bindless on kepler	Ilia Mirkin	2018-01-07	1	-3/+3
	All the functionality is in. Maxwell will take a little bit more enablement work. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: add bindless image support for kepler	Ilia Mirkin	2018-01-07	11	-75/+272
	A part of the driver constbuf area is allocated for bindless images. Any update requires uploading to all driver constbufs. This also extends the driver constbuf to 64KB, up from 2KB. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: add support for bindless textures on kepler+	Ilia Mirkin	2018-01-07	10	-5/+183
	This keeps a list of resident textures (per context), and dumps that list into the active buffer list when submitting. We also treat bindless texture fetches slightly differently, wrt the meaning of indirect, and not requiring the SAMPLER file to be used. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: use the image info in the instruction rather than decl	Ilia Mirkin	2018-01-07	1	-52/+24
	In preparation for bindless images, we have to retrieve the target/format info from the instruction directly, as there will be no declaration. Furthermore, for bound images, this information is still available in the instruction, so we can drop the declaration-based mechanism entirely. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: safen up lowering logic against overwriting reused values	Ilia Mirkin	2018-01-07	1	-2/+4
	I'm fairly sure both of the changed sites are OK as-is, but they're fragile, so this is just safening them up. Since this is happening pre-ssa, we don't want to be overwriting values that may potentially get used later on. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: update tic in-place when buffer address changes	Ilia Mirkin	2018-01-07	2	-14/+21
	This is helpful for bindless, where changing TIC id's is undesirable. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: ensure that pushbuf keeps ref to old text/tls bos	Ilia Mirkin	2018-01-07	1	-0/+13
	If we free the bo, then the PTE may get deallocated immediately. We have to make sure that the submission includes a ref to the old bo so that it remains mapped for the duration of the command execution. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	nv50/ir: Fix unused var warnings in release build	Rhys Kidd	2017-12-29	2	-2/+4
	v2: Add preventative comment (Ilia Mirkin) Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Signed-off-by: Rhys Kidd <[email protected]>
*	nvc0: Fix unused var warnings in release build	Rhys Kidd	2017-12-29	1	-3/+4
	Reviewed-by: Pierre Moreau <[email protected]> Signed-off-by: Rhys Kidd <[email protected]>
*	nv50: Fix unused var warning in release build	Rhys Kidd	2017-12-29	1	-1/+2
	Reviewed-by: Pierre Moreau <[email protected]> Signed-off-by: Rhys Kidd <[email protected]>
*	gm107/ir: use lane 0 for manual textureGrad handling	Ilia Mirkin	2017-12-22	1	-21/+34
	This is parallel to the pre-SM50 change which does this. Adjusts the shuffles / quadops to make the values correct relative to lane 0, and then splat the results to all lanes for the final move into the target register. Signed-off-by: Ilia Mirkin <[email protected]> Tested-By: Karol Herbst <[email protected]>
*	nvc0/ir: change textureGrad to always use lane 0 as the tex origin	Ilia Mirkin	2017-12-19	1	-14/+46
	Thanks to Karol Herbst for the debugging / tracing work that led to this change. Move to using lane 0 as the "work" lane for the texture. It is unclear why this helps, as that computation should be identical to doing it in the "correct" lane with the properly adjusted quadops. In order to be able to use the lane 0 result, we also have to ensure that lane 0 contains the proper array/indirect/shadow values. This applies to Fermi and Kepler. Maxwell+ may or may not need fixing, but that lowering logic is separate. Fixes KHR-GL45.texture_cube_map_array.sampling Signed-off-by: Ilia Mirkin <[email protected]>
*	gallium: plumb context priority through to driver	Rob Clark	2017-12-19	3	-0/+3
	Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Andres Rodriguez <[email protected]> Reviewed-by: Wladimir J. van der Laan <[email protected]>
*	meson: define driver dependencies	Dylan Baker	2017-12-04	1	-0/+5
	This allows us to encapsulate the compiler and linkage requirements of each driver in a reusable way. The result will be that each target that needs a specific driver can simply add `driver_<name>` to its dependencies line and the necessary libraries and compiler args will be added. This will allow for a lot of code de-duplication between gallium targets. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
*	nvc0/ir: Properly lower 64-bit shifts when the shift value is >32	Pierre Moreau	2017-12-04	1	-1/+1
	Fixes: 61d7676df77 "nvc0/ir: add support for 64-bit shift lowering on SM20/SM30"
	Fixes fs-shift-scalar-by-scalar.shader_test from piglit for the current set-up:
	    uniform int64_t ival -0x7dfcfefbdf6536ff # bit pattern: 0x82030104209ac901
	    uniform uint64_t uval 0x1400000085010203
	    uniform int shl 36
	    uniform int shr 36
	    uniform int64_t iexpected_shl 0x09ac901000000000
	    uniform int64_t iexpected_shr -0x7dfcff0 # bit pattern: 0xfffffffff8203010
	    uniform uint64_t uexpected_shl 0x5010203000000000
	    uniform uint64_t uexpected_shr 0x0000000001400000
	    draw rect ortho 12 0 4 4
	Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
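To make the test values above concrete, here is a minimal C sketch of how a 64-bit shift can be built from 32-bit halves, including the case where the shift amount is 32 or more. The function names are hypothetical and this is not the nouveau lowering code; it only illustrates the arithmetic that the expected values in the test encode.

```c
#include <stdint.h>

/* Logical left shift of a 64-bit value using only 32-bit operations. */
static uint64_t shl64_via_32(uint64_t v, unsigned s)
{
    uint32_t lo = (uint32_t)v, hi = (uint32_t)(v >> 32);
    uint32_t rlo, rhi;
    s &= 63;
    if (s == 0) {
        rlo = lo; rhi = hi;
    } else if (s < 32) {
        rhi = (hi << s) | (lo >> (32 - s));
        rlo = lo << s;
    } else {                      /* s >= 32: low word moves into high word */
        rhi = lo << (s - 32);
        rlo = 0;
    }
    return ((uint64_t)rhi << 32) | rlo;
}

/* Right shift; 'arith' selects sign fill (signed) or zero fill (unsigned).
 * The signed value is passed and returned as its 64-bit bit pattern. */
static uint64_t shr64_via_32(uint64_t v, unsigned s, int arith)
{
    uint32_t lo = (uint32_t)v, hi = (uint32_t)(v >> 32);
    uint32_t fill = (arith && (hi & 0x80000000u)) ? 0xffffffffu : 0u;
    uint32_t rlo, rhi;
    s &= 63;
    if (s == 0) {
        rlo = lo; rhi = hi;
    } else if (s < 32) {
        rlo = (lo >> s) | (hi << (32 - s));
        rhi = (hi >> s) | (fill << (32 - s));
    } else if (s == 32) {
        rlo = hi; rhi = fill;
    } else {                      /* s > 32: high word moves into low word */
        rlo = (hi >> (s - 32)) | (fill << (64 - s));
        rhi = fill;
    }
    return ((uint64_t)rhi << 32) | rlo;
}
```

With the uniforms from the test, `shl64_via_32(0x1400000085010203, 36)` gives `0x5010203000000000` and `shr64_via_32(0x82030104209ac901, 36, 1)` gives the bit pattern `0xfffffffff8203010`.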
*	meson: Add lmsensors support	Dylan Baker	2017-12-01	1	-1/+1
	v2: - Make -Dlmsensors=false work - Simplify auto and true cases Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
*	nouveau/compiler: Allow to omit line numbers when printing instructions	Tobias Klausmann	2017-11-26	5	-4/+13
	This comes in handy when checking "NV50_PROG_DEBUG=1" outputs with diff!
	V2: - Use environmental variable (Karol Herbst)
	V3: - Use the already populated nv50_ir_prog_info to forward information to the print pass (Pierre Moreau)
	V4: - get rid of default value in PrintPass constructor
	Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nv50/ir: move LateAlgebraicOpt to the very end	Ilia Mirkin	2017-11-26	1	-1/+1
	Memory loads can take offsets, but the SHLADD will often attempt to consume the offsets too. As there may be multiple memory loads with the same base but different offsets, those would end up in a SHLADD instead of the offset of the memory operation. This moves the pass after we've had a chance to attempt to propagate immediate adds into the indirect offset.
	    total instructions in shared programs : 6580681 -> 6567716 (-0.20%)
	    total gprs used in shared programs    : 944261 -> 943375 (-0.09%)
	    total shared used in shared programs  : 0 -> 0 (0.00%)
	    total local used in shared programs   : 15328 -> 15328 (0.00%)
	    total bytes used in shared programs   : 60339896 -> 60221504 (-0.20%)
	              local  shared  gpr  inst  bytes
	    helped        0       0  555  2698   2698
	    hurt          0       0  138   336    336
	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: when merging immediates/consts, load directly	Ilia Mirkin	2017-11-26	1	-1/+21
	When a MERGE operation gets its constraint moves added, it substantially extends live ranges to be reusing an immediate from earlier in the program (not to mention the silliness of loading an immediate into a register, and then moving into another register). We detect these scenarios and insert moves that take the immediate or constbuf load directly into the register. If it's the last use, then we can just move that operation to the closer location. With SM35 (255 regs) we get these results:
	    total instructions in shared programs : 6583670 -> 6580681 (-0.05%)
	    total gprs used in shared programs    : 950818 -> 944261 (-0.69%)
	    total shared used in shared programs  : 0 -> 0 (0.00%)
	    total local used in shared programs   : 15328 -> 15328 (0.00%)
	    total bytes used in shared programs   : 60367456 -> 60339896 (-0.05%)
	              local  shared   gpr  inst  bytes
	    helped        0       0  4584  3186   3186
	    hurt          0       0    55   968    968
	I suspect they will be better for SM20 and SM30. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: add optimization for modulo by a non-power-of-2 value	Ilia Mirkin	2017-11-26	1	-0/+15
	We can still use the optimized division methods which make use of multiplication with overflow. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]>
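As a rough illustration of the idea in this message, the sketch below computes an unsigned modulo by a non-power-of-2 constant from a division done with a high multiply. The divisor 10, the helper names, and the choice of an unsigned 32-bit operand are assumptions made for the example; the actual pass in nv50/ir handles the general case and may be structured differently.

```c
#include <stdint.h>

/* High 32 bits of a 32x32 -> 64-bit unsigned multiply
 * (the "multiplication with overflow" part). */
static uint32_t mulhi_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* a / 10 without a divide instruction: 0xCCCCCCCD is ceil(2^35 / 10),
 * so (a * 0xCCCCCCCD) >> 35 equals a / 10 for every 32-bit a. */
static uint32_t udiv10(uint32_t a)
{
    return mulhi_u32(a, 0xCCCCCCCDu) >> 3;
}

/* a % 10 follows from the quotient: a - (a / 10) * 10. */
static uint32_t umod10(uint32_t a)
{
    return a - udiv10(a) * 10u;
}
```

The modulo costs one extra multiply and subtract on top of the division, which is why the existing division optimization can be reused.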
*	nv50/ir: optimize signed integer modulo by pow-of-2	Ilia Mirkin	2017-11-25	2	-10/+29
	It's common to use signed int modulo in GLSL. As it happens, the GLSL specs allow the result to be undefined, but that seems fairly surprising. It's not that much more effort to get it right, at least for positive modulo operators. Signed-off-by: Ilia Mirkin <[email protected]>
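One way to picture a signed modulo by a positive power of two is sketched below in C. It follows C's truncating semantics (the result takes the sign of the dividend), which is an assumption for the example: the message only says GLSL leaves the result undefined and that the pass aims to get it right for positive divisors. The function name is hypothetical.

```c
#include <stdint.h>

/* a % (1 << k) for signed a and 0 <= k <= 30, without a divide.
 * Mask the magnitude, then restore the sign of the dividend. */
static int32_t smod_pow2(int32_t a, unsigned k)
{
    uint32_t mask = (1u << k) - 1u;
    /* Unsigned negation avoids overflow for INT32_MIN. */
    uint32_t mag = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    int32_t r = (int32_t)(mag & mask);
    return a < 0 ? -r : r;
}
```

For example, `smod_pow2(-7, 2)` yields `-3`, matching `-7 % 4` in C, whereas a bare `a & 3` would yield `1`.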
*	meson: don't use build_by_default for specific gallium drivers	Dylan Baker	2017-11-13	1	-1/+0
	Using build_by_default : false is convenient for dependencies that can be pulled in by various diverse components of the build system; the gallium hardware/software drivers and state trackers do not fit that description. Instead, these should be guarded using the variable that tracks whether that driver should be enabled. This leaves a few helper libraries (trace, rbug, etc.) and the generic winsys bits as `build_by_default : false`, because there are a large number of gallium components that pull them in.
	v2: - remove build_by_default from winsys convenience libs as well.
	v3: - Always put drivers before winsys for consistency
	Signed-off-by: Dylan Baker <[email protected]> Tested-by: Lionel Landwerlin <[email protected]> (v1) Reviewed-by: Eric Anholt <[email protected]>
*	gallium: add CAPs to support HW atomic counters. (v3)	Dave Airlie	2017-11-10	3	-0/+6
	This looks like an evergreen specific feature, but with atomic counters AMD have hw specific counters they use instead of operating on buffers directly. These are separate to the buffer atomics, so require different limits and code paths. I've left the CAP for atomic type extensible in case someone else has a variant on this sort of thing (freedreno maybe?) and needs to change it. This adds all the CAPs required to add support for those atomic counters, along with a related CAP for limiting the number of output resources. I'd like to land this and the st patch then I can start to upstream the evergreen support for these and other GL4.x features.
	v2: drop the ATOMIC_COUNTER_MODE cap, just use the return from the HW counters. If 0 we use the current mode.
	v3: fix some rebase errors (Gert Wollny)
	Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Tested-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	util: move os_time.[ch] to src/util	Nicolai Hähnle	2017-11-09	2	-2/+2
	Reviewed-by: Marek Olšák <[email protected]>
*	nv50: make blending work so that zero wins in a multiplication	Ilia Mirkin	2017-11-08	1	-0/+5
	This matches nvc0 behavior, tested with the fbo-float-nan piglit. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]>
*	gallium: add PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET	Marek Olšák	2017-11-06	3	-0/+3
*	nv50,nvc0: Display shared memory usage in pipe_debug_message	Pierre Moreau	2017-11-04	2	-6/+8
	Signed-off-by: Pierre Moreau <[email protected]>
*	nv50,nvc0: Copy shared memory per block to the program info structure and back	Pierre Moreau	2017-11-04	2	-0/+4
	In OpenCL/CUDA kernels, shared memory usage can be defined within the kernel code. That usage will only be picked up while parsing the SPIR-V, during the translation phase of the program. Signed-off-by: Pierre Moreau <[email protected]>
*	nv50/ir: Store shared memory per block in nv50_ir_prog_info	Pierre Moreau	2017-11-04	1	-0/+1
	Signed-off-by: Pierre Moreau <[email protected]>
*	gallium: add cap for driver specified max combined shader resources.	Dave Airlie	2017-11-01	3	-0/+3
	Some hw (evergreen) has a limit on how many combined (images/buffers/mrts) a fragment shader can access. Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	gallium: s/unsigned/enum pipe_prim_type/	Brian Paul	2017-10-27	1	-1/+1
	In the vbuf_render::set_primitive() functions. Reviewed-by: Roland Scheidegger <[email protected]>
*	meson: build nouveau (gallium) driver	Dylan Baker	2017-10-16	1	-0/+224
	Tested with a GK107.
	v2: - Add target for nouveau standalone compiler. This target is not built by default.
	v3: - Add nouveau to list of drivers built by default
	Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <eric at anholt.net>
*	nv50,nvc0: fix push hint logic in presence of a start offset	Ilia Mirkin	2017-10-11	2	-7/+5
	Previously buffer offsets were passed in explicitly as an offset, which had to be added to the resource address. Now they are passed in via an increased 'start' parameter. As a result, we were double-adding the start offset in this kind of situation. This condition was triggered by piglit's draw-elements test which has a requisite glMultiDrawElements in combination with a small enough number of vertices to go through the immediate push path. Fixes: 330d0607ed6 ("gallium: remove pipe_index_buffer and set_index_buffer") Reported-by: Karol Herbst <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
*	gallium: Create a new PIPE_CAP_TILE_RASTER_ORDER for vc4.	Eric Anholt	2017-10-10	3	-0/+3
	Because vc4 can control the order that tiles are rasterized in, we can use it to implement overlapping blits using normal drawing and GL_ARB_texture_barrier, as long as we can tell the kernel what order to render the tiles in. This commit introduces the core gallium support, vc4 changes will follow.
	v2: Fix on the simulator.
	v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
	v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
	v5: Drop vc4 changes from this commit, for clarity.
	Reviewed-by: Nicolai Hähnle <[email protected]> (v3)
*	nv50/ir: fix 64-bit integer shifts	Ilia Mirkin	2017-10-09	1	-1/+3
	TGSI was adjusted to always pass in 64-bit integers but nouveau was left with the old semantics. Update to the new thing. Fixes: d10fbe5159 (st/glsl_to_tgsi: fix 64-bit integer bit shifts) Reported-by: Karol Herbst <[email protected]> Cc: [email protected]
*	gallium: add PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS	Marek Olšák	2017-10-06	3	-0/+3
	Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium: Remove util_format_s3tc_init()	Matt Turner	2017-10-02	1	-2/+0
	Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
*	gallium: add LDEXP TGSI instruction and corresponding cap	Nicolai Hähnle	2017-09-29	3	-0/+4
	Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
*	gallium: Add PIPE_SHADER_CAP_INT64_ATOMICS	Jan Vesely	2017-09-21	3	-0/+3
	Denotes availability of 64-bit int atomic instructions. Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallium: Add PIPE_SHADER_CAP_FP16	Jan Vesely	2017-09-18	3	-0/+4
	Denotes native half precision float operations capability
	v2: PIPE_CAP_HALFS -> PIPE_SHADER_CAP_FP16
	    fix indentation
	Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	nvc0: fix compile error	Benedikt Schemmer	2017-09-18	1	-1/+1
	Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE") Signed-off-by: Benedikt Schemmer <[email protected]> Previously-pointed-out-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE	Nicolai Hähnle	2017-09-18	5	-2/+15
	To be able to properly distinguish between GL_ANY_SAMPLES_PASSED and GL_ANY_SAMPLES_PASSED_CONSERVATIVE. This patch goes through all drivers, having them treat the two query types identically, except:
	1. radeon incorrectly enabled conservative mode on PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only on PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE.
	2. st/mesa uses the new query type.
	Fixes dEQP-GLES31.functional.fbo.no_attachments.* Reviewed-by: Marek Olšák <[email protected]>
*	gallium: introduce PIPE_CAP_LOAD_CONSTBUF	Timothy Arceri	2017-09-15	3	-0/+3
	Reviewed-by: Marek Olšák <[email protected]>
*	nvc0/ir: propagate immediates to CALL input MOVs	Tobias Klausmann	2017-08-31	1	-2/+19
	When using builtin functions we have to move the inputs to registers $0 and $1. If one of the input values is an immediate, we fail to propagate the immediate:
	    ...
	    mov u32 $r477 0x00000003 (0)
	    ...
	    mov u32 $r0 %r473 (0)
	    mov u32 $r1 $r477 (0)
	    call abs BUILTIN:0 (0)
	    mov u32 %r495 $r1 (0)
	    ...
	With this patch the immediate is propagated, potentially causing the first MOV to be superfluous, which we'd remove in that case:
	    ...
	    mov u32 $r0 %r473 (0)
	    mov u32 $r1 0x00000003 (0)
	    call abs BUILTIN:0 (0)
	    mov u32 %r495 $r1 (0)
	    ...
	Shaderdb stats:
	    total instructions in shared programs : 4893460 -> 4893324 (-0.00%)
	    total gprs used in shared programs    : 582972 -> 582881 (-0.02%)
	    total local used in shared programs   : 17960 -> 17960 (0.00%)
	              local  gpr  inst  bytes
	    helped        0   91   112    112
	    hurt          0    0     0      0
	v2: implement some changes proposed by imirkin. The manual deletion of the dead mov is necessary after ea22ac23e0 ("nvc0/ir: unlink values pre- and post-call to division function"), as the potentially dead mov is unlinked properly, causing later passes to not notice the mov op at all and thus not clean it up. That makes up a big chunk of the regression the above commit caused. Keep the deletion of the op where it is; deleting it later unnecessarily blows up the size of the change.
	Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: write 0 to pipeline_statistics.cs_invocations	Karol Herbst	2017-08-31	1	-0/+1
	cs_invocations are currently unsupported, but leaving the field uninitialized is even worse.
	fixes on nvc0:
	 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
	 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries
	Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: [email protected]
*	nv50/ir: properly set sType for TXF ops to U32	Ilia Mirkin	2017-08-24	1	-0/+3
	All of the coordinates and LOD args are integers for TXF. This mostly doesn't matter, except for converting into a levelZero=true operation by removing an explicit zero LOD. For the comparison against zero to work properly, the sType of the instruction has to be set correctly. Fixes: KHR-GL45.robust_buffer_access_behavior.texel_fetch Reported-by: Karol Herbst <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
*	gallium: remove TGSI opcode SCS	Marek Olšák	2017-08-22	3	-32/+0
	use COS+SIN instead. Reviewed-by: Roland Scheidegger <[email protected]> Acked-by: Jose Fonseca <[email protected]>
*	gallium: remove TGSI opcode XPD	Marek Olšák	2017-08-22	5	-39/+0
	use MUL+MAD+MOV instead. Reviewed-by: Roland Scheidegger <[email protected]>
*	gallium: remove TGSI opcode DPH	Marek Olšák	2017-08-22	3	-16/+0
	use DP4 or DP3 + ADD. Reviewed-by: Roland Scheidegger <[email protected]>
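The replacements named in the XPD and DPH removals can be sketched in plain C as follows. The `vec4` type and function names are hypothetical, and the opcode semantics (XPD writing 1.0 to `w`, DPH adding `src1.w` to a three-component dot product) are stated from memory of the TGSI documentation rather than from these commits, so treat them as assumptions.

```c
typedef struct { float x, y, z, w; } vec4;

/* XPD dst, a, b expressed as MUL + MAD + MOV. */
static vec4 xpd_lowered(vec4 a, vec4 b)
{
    vec4 tmp, dst;
    /* MUL tmp.xyz, a.zxy, b.yzx */
    tmp.x = a.z * b.y;
    tmp.y = a.x * b.z;
    tmp.z = a.y * b.x;
    tmp.w = 0.0f;
    /* MAD dst.xyz, a.yzx, b.zxy, -tmp.xyz */
    dst.x = a.y * b.z - tmp.x;
    dst.y = a.z * b.x - tmp.y;
    dst.z = a.x * b.y - tmp.z;
    /* MOV dst.w, 1.0 */
    dst.w = 1.0f;
    return dst;
}

/* DPH dst, a, b expressed as DP3 + ADD. */
static float dph_lowered(vec4 a, vec4 b)
{
    float dp3 = a.x * b.x + a.y * b.y + a.z * b.z;  /* DP3 */
    return dp3 + b.w;                               /* ADD with b.w */
}
```

The alternative mentioned for DPH, a single DP4, gives the same result when the first operand's `w` component is known to be 1.0.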