mesa.git

	Commit message	Author	Age	Files	Lines
*	glsl/nir: Fill in the Parameters in NIR linker	Caio Marcelo de Oliveira Filho	2019-09-10	5	-3/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The parameter lists were not being created nor filled since i965 doesn't use them. In Gallium they are used for uniform handling, so add a way to fill them. The gl_uniform_storage struct got two new fields that let us go - from a Parameter to the matching UniformStorage and, - from the variable to the first UniformStorage without relying on names -- since they are optional for ARB_gl_spirv. Later patches will make use of them. v2: Do not fill parameters for i965. (Timothy) Use uint32_t for the new attributes. (Marek) v3: Serialize the new fields. (Timothy) Reviewed-by: Timothy Arceri <[email protected]>
*	mesa: Pack gl_program_parameter struct	Caio Marcelo de Oliveira Filho	2019-09-10	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The gl_register_file doesn't need 16 bits, so shorten it and use the extra room for 'Padded' (also mark it as a single bit). This shrinks the struct size from 32 bytes to 24 bytes. See also 4794fbc86e3 ("mesa: reduce the size of gl_program_parameter"), which shrank it from 40 to 24 bytes, and later 7536af670b7 ("glsl: fix shader cache for packed param list"), which added `Padded`. v2: Use just 5 bits for gl_register_file. (Timothy) Reviewed-by: Timothy Arceri <[email protected]>
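The packing idea in the message above can be sketched as follows. Both structs are hypothetical stand-ins, not the real gl_program_parameter definition: the field names and the number of state indexes are assumptions, and the exact sizes depend on the compiler and ABI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "before" layout: the register file takes a full 16 bits
 * and the flag is a separate bool. */
struct param_wide {
   const char *name;
   uint16_t file;
   uint16_t data_type;
   uint16_t size;
   bool padded;
   uint16_t state_indexes[5];
};

/* Hypothetical "after" layout: 5 bits for the register file and a single
 * bit for the flag, sharing one storage unit with data_type. */
struct param_packed {
   const char *name;
   unsigned file : 5;      /* values 0..31 */
   unsigned padded : 1;    /* single-bit flag */
   unsigned data_type : 16;
   uint16_t size;
   uint16_t state_indexes[5];
};
```

Comparing sizeof(struct param_wide) with sizeof(struct param_packed) on the target ABI shows whether the narrower bit-fields actually remove padding there.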
*	compiler: Add glsl_contains_opaque() helper	Caio Marcelo de Oliveira Filho	2019-09-10	2	-0/+7
\| \| \| \|	Reviewed-by: Alejandro Piñeiro <[email protected]>
*	mesa/st: Do not rely on name to identify special uniforms	Caio Marcelo de Oliveira Filho	2019-09-10	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \|	Every uniform that has a "gl_" name also has some state slots. So use the state_slots like we did in 57b61849310 ("i965: account for NIR uniforms without name"). This removes the dependency on names, which are optional when using ARB_gl_spirv. Reviewed-by: Alejandro Piñeiro <[email protected]>
*	glsl/nir: Avoid overflow when setting max_uniform_location	Caio Marcelo de Oliveira Filho	2019-09-10	1	-1/+2
\| \| \| \| \| \| \| \| \|	Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned max_uniform_location. Those unmapped uniforms don't have to be accounted for at this point. Fixes: 7a9e5cdfbb9 ("nir/linker: Add gl_nir_link_uniforms()") Reviewed-by: Alejandro Piñeiro <[email protected]>
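A minimal sketch of the overflow being avoided, assuming a plain array of signed locations. The helper name max_mapped_location and the array layout are hypothetical; only the -1 sentinel and the unsigned maximum come from the message above.

```c
#include <stddef.h>

/* Sentinel for uniforms without a location; the commit message gives -1. */
#define UNMAPPED_UNIFORM_LOC (-1)

/* Hypothetical helper: returns the highest mapped location, or 0 when no
 * uniform in the array is mapped. */
static unsigned
max_mapped_location(const int *locations, size_t count)
{
   unsigned max_loc = 0;

   for (size_t i = 0; i < count; i++) {
      /* Skip the sentinel: converting -1 to unsigned would give UINT_MAX
       * and silently become the "maximum". */
      if (locations[i] == UNMAPPED_UNIFORM_LOC)
         continue;

      if ((unsigned)locations[i] > max_loc)
         max_loc = (unsigned)locations[i];
   }

   return max_loc;
}
```

Without the early continue, a single unmapped uniform would push the result to UINT_MAX regardless of the real locations.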
*	meson: build getopt when using msvc	Dylan Baker	2019-09-10	2	-0/+34
\| \| \| \| \| \| \| \|	v4: - Don't wrap a single file in a list to match mesa style - Use null_dep instead of empty list Reviewed-by: Eric Anholt <[email protected]> (v3) Reviewed-by: Eric Engestrom <[email protected]>
*	glapi: export glapi_destroy_multithread when building shared-glapi on windows	Dylan Baker	2019-09-10	2	-1/+2
\| \| \| \| \| \| \| \| \|	Which will allow meson to build a shared glapi build with mingw. v2: - Add symbol to symbol check test Reviewed-by: Eric Anholt <[email protected]> (v1) Reviewed-by: Eric Engestrom <[email protected]>
*	meson: don't build glapi_static_check_table on windows	Dylan Baker	2019-09-10	1	-1/+3
\| \| \| \| \| \| \|	It doesn't compile due to undefined symbols, which are in libglapi_static, so I don't understand the problem. Reviewed-by: Eric Engestrom <[email protected]>
*	meson: don't try to generate i18n translations on windows	Dylan Baker	2019-09-10	1	-2/+4
\| \| \| \|	Reviewed-by: Eric Engestrom <[email protected]>
*	glsl/tests: Handle windows \r\n new lines	Dylan Baker	2019-09-10	1	-1/+1
\| \| \| \| \| \| \| \| \|	Currently the parser for s expressions assumes that newlines will be \n, resulting in incorrect parsing on windows, where the newline is \r\n. This patch just adds \r? to the regular expression used to parse the s expressions, which fixes at least 1 test on windows. Reviewed-by: Eric Engestrom <[email protected]>
*	iris: Fix constant buffer sizes for non-UBOs	Kenneth Graunke	2019-09-10	1	-3/+4
\| \| \| \| \| \| \| \| \|	Since the system value refactor, we've accidentally only been setting cbuf->buffer_size in the UBO case, and not in the uploaded-constants case. We use cbuf->buffer_size to fill out the SURFACE_STATE entry, so it needs to be initialized in both cases. Fixes: 3b6d787e404 ("iris: move sysvals to their own constant buffer")
*	radv/gfx10: declare a LDS symbol for the NGG emit space	Samuel Pitoiset	2019-09-10	3	-32/+19
\| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes some interactions when NGG GS is enabled. It fixes: - dEQP-VK.clipping.user_defined.clip_cull_distance_dynamic_index.geom - dEQP-VK.tessellation.geometry_interaction.passthrough.* For some reason, using the computed ESGS ring size randomly hangs with CTS. For now, just use the maximum LDS size for ESGS. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: calculate GFX9 GS and GFX10 NGG states before compiling shader variants	Samuel Pitoiset	2019-09-10	1	-35/+48
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: store the ESGS ring size as part of gfx10_ngg_info	Samuel Pitoiset	2019-09-10	2	-1/+3
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: store GFX10 NGG state as part of the shader info	Samuel Pitoiset	2019-09-10	2	-44/+46
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: store GFX9 GS state as part of the shader info	Samuel Pitoiset	2019-09-10	2	-31/+33
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: fill shader info for all stages in the pipeline	Samuel Pitoiset	2019-09-10	4	-20/+130
\| \| \| \| \| \| \| \| \|	This shouldn't be in NIR->LLVM because ACO also needs the shader info. This will also help for computing some NGG values that are necessary for declaring LDS symbols. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: do not pass all compiler options to the shader info pass	Samuel Pitoiset	2019-09-10	3	-28/+33
\| \| \| \| \| \| \|	Only the pipeline layout and the shader keys are needed. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi: remove redundant si_texture offset and size fields	Marek Olšák	2019-09-09	7	-123/+106
\| \| \| \|	Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi: move texture storage allocation outside of radeonsi	Marek Olšák	2019-09-09	4	-51/+97
\| \| \| \| \| \|	possible code sharing with radv Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi: move HTILE allocation outside of radeonsi	Marek Olšák	2019-09-09	4	-91/+93
\| \| \| \| \| \| \|	ac_surface computes it for amdgpu. radeon_drm_surface computes it for radeon. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi: handle NO_DCC early	Marek Olšák	2019-09-09	1	-5/+7
\| \| \| \|	Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	ac/surface: add RADEON_SURF_NO_FMASK	Marek Olšák	2019-09-09	4	-12/+14
\| \| \| \| \| \|	This controls FMASK and CMASK computation for MSAA. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	r300,r600,radeonsi: set winsys_handle::stride,offset in drivers, not winsyses	Marek Olšák	2019-09-09	6	-20/+12
\| \| \| \|	Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	r300,r600,radeonsi: read winsys_handle::stride,offset in drivers, not winsyses	Marek Olšák	2019-09-09	6	-47/+20
\| \| \| \|	Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi/gfx10: fix wave occupancy computations	Marek Olšák	2019-09-09	4	-21/+49
\| \| \| \| \|	Cc: 19.2 <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi: only support at most 1024 threads per block	Marek Olšák	2019-09-09	1	-8/+2
\| \| \| \| \| \|	LLVM 10 won't support 2048. Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi: disable DCC when importing a texture from an incompatible driver	Marek Olšák	2019-09-09	1	-4/+12
\| \| \| \| \| \|	and unify the code. Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts	Marek Olšák	2019-09-09	1	-1/+1
\| \| \| \| \| \| \|	This fixes a crash. Cc: 19.2 <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi/gfx10: use fma for TGSI_OPCODE_FMA	Marek Olšák	2019-09-09	3	-5/+16
\| \| \| \|	Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	ac: use fma on gfx10	Marek Olšák	2019-09-09	2	-1/+9
\| \| \| \|	Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	ac: enable LLVM atomic optimizations	Marek Olšák	2019-09-09	1	-1/+9
\|
*	virgl: Fix pipe_resource leaks under multi-sample.	Lepton Wu	2019-09-10	1	-1/+3
\| \| \| \| \| \| \|	Fixes: 900a80f9e4f ("virgl: virgl_transfer should own its virgl_resource") Signed-off-by: Lepton Wu <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
*	iris: Avoid flushing for cache history on transfer range flushes	Kenneth Graunke	2019-09-09	2	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The VBO module maps a buffer with GL_MAP_FLUSH_EXPLICIT, and keeps appending data, and calling glFlushMappedBufferRange(). We were invalidating the VF cache each time it flushed a new range, which results in a ton of VF flushes. If the contents of the destination in the target range are undefined (never even possibly written), this patch makes us assume that it's likely not in the cache and so no cache invalidations are required. If the destination range is defined, we continue cache flushing as we may need to expunge stale data. This eliminates 88% of the VF cache invalidates on Manhattan 3.0. Improves performance in Manhattan 3.0 on my Icelake 8x8 with the GPU frequency locked to 700MHz by 0.376724% +/- 0.0989183% (n=10).
*	iris: Optimize out redundant sampler state binds	Kenneth Graunke	2019-09-09	1	-2/+8
\| \| \| \| \| \|	This cuts roughly 85% of the 3DSTATE_SAMPLER_STATE_POINTERS_PS calls in the J2DBench images test. For some reason, the state tracker is calling bind_sampler_state with the same sampler state in a bunch of cases.
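The general technique of skipping redundant binds can be sketched like this. Everything here is hypothetical (the sampler_bindings struct, MAX_SAMPLERS, and bind_samplers are illustrative names, not iris code); the point is only that a bind which changes nothing should not mark state dirty.

```c
#include <stdbool.h>

#define MAX_SAMPLERS 16   /* illustrative limit, not a driver constant */

/* Hypothetical per-stage record of the currently bound sampler objects. */
struct sampler_bindings {
   const void *samplers[MAX_SAMPLERS];
};

/* Stores the new bindings and returns true only if at least one slot
 * actually changed, so the caller can skip re-emitting state otherwise. */
static bool
bind_samplers(struct sampler_bindings *b, unsigned start, unsigned count,
              const void *const *states)
{
   bool dirty = false;

   for (unsigned i = 0; i < count && start + i < MAX_SAMPLERS; i++) {
      if (b->samplers[start + i] != states[i]) {
         b->samplers[start + i] = states[i];
         dirty = true;
      }
   }

   return dirty;
}
```

A caller would flag the sampler state for re-emission only when bind_samplers() returns true.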
*	iris: Add support for the always_flush_cache=true debug option.	Kenneth Graunke	2019-09-09	7	-0/+39
\| \| \| \|	This can be useful for debugging missing flushes.
*	mesa: Eliminate gl_config::rgbMode	Adam Jackson	2019-09-09	8	-68/+31
\| \| \| \| \|	Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	mesa: Eliminate gl_config::have{Accum,Depth,Stencil}Buffer	Adam Jackson	2019-09-09	13	-46/+18
\| \| \| \| \|	Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	mesa: Remove unused gl_config::indexBits	Adam Jackson	2019-09-09	5	-7/+1
\| \| \| \| \|	Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallium/xlib: Fix an obvious thinko	Adam Jackson	2019-09-09	1	-1/+1
\| \| \| \| \|	x == !GLX_DIRECT_COLOR is a fancy way of writing x == 0, which is clearly not what was meant.
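A small sketch of why the expression is a thinko. The constant values are the ones I recall from GL/glx.h and are repeated here only to keep the example standalone; the "intended" comparison is an assumption about what the original code meant, not a quote of the fix.

```c
#include <stdbool.h>

/* Values as recalled from GL/glx.h; treat them as assumptions. */
#define GLX_TRUE_COLOR   0x8002
#define GLX_DIRECT_COLOR 0x8003

/* The buggy form: logical NOT of a nonzero constant is 0, so this is
 * just x == 0. */
static bool
matches_buggy(int x)
{
   return x == !GLX_DIRECT_COLOR;
}

/* Presumably intended form (assumption): x is anything but DirectColor. */
static bool
matches_intended(int x)
{
   return x != GLX_DIRECT_COLOR;
}
```

The two predicates disagree for every nonzero visual class other than GLX_DIRECT_COLOR, for example GLX_TRUE_COLOR.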
*	iris: Ignore line stipple information if it's disabled	Kenneth Graunke	2019-09-09	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The line stipple pattern and factor only matter if line stippling is actually enabled. Otherwise, we can safely ignore them. PBO upload may give us zero for line stipple information, while normal drawing tends to give us an actual stipple pattern such as 0xffff. This was causing us to flag IRIS_DIRTY_LINE_STIPPLE way too often, leading to useless 3DSTATE_LINE_STIPPLE commands, which are non-pipelined and thus very expensive. Improves performance in Manhattan 3.0 on Skylake GT4e by 0.149261% +/- 0.0380796% (n=210). On an Icelake 8x8 with the GPU frequency locked at 700MHz, improves by 0.423756% +/- 0.222843% (n=3).
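The check described above can be sketched roughly as follows. The rast_state struct and line_stipple_dirty() are hypothetical names for illustration, not the iris implementation; the sketch only shows the pattern and factor being ignored while stippling is disabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of rasterizer state relevant to line stippling. */
struct rast_state {
   bool line_stipple_enable;
   uint16_t line_stipple_pattern;
   unsigned line_stipple_factor;
};

/* Returns true when the stipple state needs to be re-emitted. */
static bool
line_stipple_dirty(const struct rast_state *old_state,
                   const struct rast_state *new_state)
{
   /* Toggling the enable always counts as a change. */
   if (old_state->line_stipple_enable != new_state->line_stipple_enable)
      return true;

   /* While disabled, differing pattern or factor values are irrelevant. */
   if (!new_state->line_stipple_enable)
      return false;

   return old_state->line_stipple_pattern != new_state->line_stipple_pattern ||
          old_state->line_stipple_factor != new_state->line_stipple_factor;
}
```

With this shape, switching between a state carrying a zero pattern and one carrying 0xffff does not report a change as long as both have stippling disabled.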
*	lima/ppir: drop fge/flt/feq/fne options	Vasily Khoruzhick	2019-09-09	1	-4/+0
\| \| \| \| \| \| \| \| \|	These are supposed to be lowered into sge/slt/seq/sne equivalents. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	lima: run opt_algebraic between int_to_float and bool_to_float for vs	Vasily Khoruzhick	2019-09-09	1	-4/+5
\| \| \| \| \| \| \| \| \|	int_to_float emits ftrunc and ftrunc lowering generates bool ops. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: fix warning in gpir disassembler	Vasily Khoruzhick	2019-09-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes following warning: ../src/gallium/drivers/lima/ir/gp/disasm.c: In function ‘print_src’: ../src/gallium/drivers/lima/ir/gp/disasm.c:241:20: warning: array subscript 28 is above array bounds of ‘char[5]’ [-Warray-bounds] 241 \| "xyzw"[src - gpir_codegen_src_attrib_x]); Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: lower fceil	Vasily Khoruzhick	2019-09-09	1	-0/+1
\| \| \| \| \| \| \| \| \|	GP doesn't support fceil so we need to lower it. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Disallow moves for schedule_first nodes	Connor Abbott	2019-09-09	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \|	The entire point of schedule_first is that the node has to be scheduled as soon as possible without any moves because it doesn't produce a proper floating-point value, or its value changes depending on where you read it. We were still introducing a move for preexp2 in some cases though, even if it got scheduled as soon as possible, which broke some exp() tests. Fix that. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Fix fake dep handling for schedule_first nodes	Connor Abbott	2019-09-09	2	-10/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	The whole point of schedule_first nodes is that they need to be scheduled as soon as possible, so if a schedule_first node is the successor in a fake dependency that prevents it from being scheduled after its parent, that can cause problems. We need to add these fake dependencies to the parent as well, and we need to guarantee that the pre-RA scheduler puts schedule_first nodes right before their parents in order to prevent this from adding cycles to the dependency graph. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Fix schedule_first insertion logic	Connor Abbott	2019-09-09	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \|	The idea was to make sure schedule_first nodes were always first in the ready list. I made sure they were inserted first, but not that other nodes wouldn't later be scheduled ahead of them. Fixes [email protected]@execution@built-in-functions@vs-exp-float and probably others. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Ignore unscheduled successors in can_use_complex()	Connor Abbott	2019-09-09	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	The point of the function is to avoid creating a complex move which is used by certain slots in the next instruction, but unscheduled successors will never be in the next instruction. Found while debugging a crash that the previous commit fixed. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Do all lowerings before rsched	Connor Abbott	2019-09-09	3	-23/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The scheduler assumes that load nodes are always duplicated so that they can always be scheduled eventually and therefore they never need to be spilled. But some lowerings were running after the pre-RA scheduler, whereas duplication has to happen before then since it's needed for the scheduler to do a better job reducing register pressure. This meant that lowerings were introducing multiple uses of a load instruction, which broke the scheduler's expectation and resulted in infinite loops in situations where the only nodes available to spill were load nodes. Spilling load nodes would be silly, so we want to fix the lowerings rather than the scheduler. Just do all lowerings before the pre-RA scheduler, which also helps with reducing pressure since the scheduler can more accurately compute the pressure. Fixes lima/mesa#104. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>