mesa.git

	Commit message	Author	Age	Files	Lines
*	i965/blorp: Get rid of brw_blorp_surface_info::map_stencil_as_y_tiled	Jason Ekstrand	2016-08-17	3	-39/+26
	Now that we're carrying around the isl_surf, we can just modify it directly instead of passing an extra bit around. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/blorp: Remove compute_tile_offsets	Jason Ekstrand	2016-08-17	2	-34/+5
	We have a handy little function in ISL that does exactly the same thing. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/blorp: Create the isl_surf up-front	Jason Ekstrand	2016-08-17	2	-11/+19
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/blorp/clear: Initialize surface info after allocating an MCS	Jason Ekstrand	2016-08-17	1	-6/+6
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	isl/state: Use a valid alignment for 1-D textures	Jason Ekstrand	2016-08-17	1	-1/+1
	The alignment we use doesn't matter (see the comment) but it should at least be an alignment we can represent with the enums. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/miptree: Remove the stencil_as_y_tiled parameter from get_tile_masks	Jason Ekstrand	2016-08-17	4	-10/+8
	It's only used to stomp the tiling to Y and it's only used by blorp so there's no reason why blorp can't do it itself. Reviewed-by: Topi Pohjolainen <[email protected]>
*	isl: Fix the parameter names for get_intratile_offset	Jason Ekstrand	2016-08-17	1	-4/+4
	It's been in elements for a while but, for whatever reason, the parameter names in the header file never got updated. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
*	util: try to use SSE instructions with MSVC and 32-bit gcc	Brian Paul	2016-08-17	1	-3/+4
	The lrint() and lrintf() functions are pretty slow and make some texture transfers very inefficient. This patch makes a better effort at using those intrinsics for 32-bit gcc and MSVC. Note, this patch doesn't address the use of SSE4.1 with MSVC. v2: get rid of the ROUND_WITH_SSE symbol, per Matt. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	svga: fix src/dst typo in can_blit_via_copy_region_vgpu10()	Brian Paul	2016-08-17	1	-1/+1
	The function was always returning false because of this typo. Retested with piglit. There are some sRGB-related blit failures, but that seems unrelated. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
*	svga: initialize a variable to silence a gcc warning	Brian Paul	2016-08-17	1	-1/+1
	Reviewed-by: Charmaine Lee <[email protected]>
*	glsl: Pull enum ir_expression_operation out to its own file	Ian Romanick	2016-08-17	3	-317/+342
	No change except to the copyright symbol. The next patch will generate this file with Python, and Unicode + Python = pure rage. v2: Massive rebase... I guess a lot can change in a year. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	glsl: Make the generated sources build rules more like NIR	Ian Romanick	2016-08-17	3	-6/+5
	Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	mesa/st: use llabs instead of abs for long args (v2)	Francesco Ansanelli	2016-08-17	1	-1/+1
	v2: long is 32 bits on Windows (Marek) Signed-off-by: Francesco Ansanelli <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
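The note about `long` on Windows is the reason a 64-bit argument needs `llabs`: a narrower absolute-value function would only see the low 32 bits. The sketch below imitates that truncation with `ctypes` fixed-width integers. It is an illustration of the hazard, not the state tracker's code, and `abs_via_32bit` / `abs_via_64bit` are hypothetical names.

```python
import ctypes


def abs_via_32bit(value):
    """Imitate passing a 64-bit value to a function that takes a 32-bit integer."""
    truncated = ctypes.c_int32(value).value  # keeps only the low 32 bits
    return abs(truncated)


def abs_via_64bit(value):
    """Imitate a function that takes a 64-bit integer, as llabs does."""
    widened = ctypes.c_int64(value).value
    return abs(widened)


big = -(2**32 + 5)
print(abs_via_32bit(big))  # 5: the high bits were silently dropped
print(abs_via_64bit(big))  # 4294967301
```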
*	radeonsi: fix up buffer descriptor upper-bound checking	Marek Olšák	2016-08-17	1	-1/+1
	st/mesa does this too, so we're safe. Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium: change pipe_image_view::first_element/last_element -> offset/size	Marek Olšák	2016-08-17	9	-50/+27
	This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Acked-by: Ilia Mirkin <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
*	gallium: change pipe_sampler_view::first_element/last_element -> offset/size	Marek Olšák	2016-08-17	24	-82/+81
	This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97305 Acked-by: Ilia Mirkin <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
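The RGBA32F example in the two commits above can be made concrete. An element-based range can only start on a multiple of the element size (16 bytes for RGBA32F), so a 4-byte offset has no first_element/last_element equivalent. The sketch below assumes last_element is inclusive; both helper names are hypothetical and are not part of the gallium interface.

```python
RGBA32F_BYTES = 16  # four 32-bit floats per element


def element_range_to_bytes(first_element, last_element, element_size):
    """Convert an inclusive element range to a byte (offset, size) pair."""
    offset = first_element * element_size
    size = (last_element - first_element + 1) * element_size
    return offset, size


def bytes_to_element_range(offset, size, element_size):
    """Return (first, last), or None if the byte range is not element-aligned."""
    if size <= 0 or offset % element_size or size % element_size:
        return None
    first = offset // element_size
    return first, first + size // element_size - 1


print(element_range_to_bytes(2, 5, RGBA32F_BYTES))   # (32, 64)
print(bytes_to_element_range(32, 64, RGBA32F_BYTES))  # (2, 5)
print(bytes_to_element_range(4, 64, RGBA32F_BYTES))   # None: offset 4 is unaligned
```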
*	gallium/radeon: assign the highest priority to scratch; make rings second	Marek Olšák	2016-08-17	2	-4/+6
	just FYI, the kernel receives priority/4 Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/winsys: re-number winsys priority flags	Marek Olšák	2016-08-17	1	-16/+13
	free 60..63, move CP_DMA up Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: mark shader rings as highest-priority buffers	Marek Olšák	2016-08-17	5	-7/+7
	and rename the enum Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: set SHADER_RW_BUFFER priority for streamout buffers	Marek Olšák	2016-08-17	2	-4/+6
	Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: use current context for DCC feedback-loop decompress, fixes Elemental	Marek Olšák	2016-08-17	4	-16/+38
	This is just a workaround. The problem is described in the code. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96541 v2: say that it's only between the current context and aux_context Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
*	radeonsi: simplify CB_TARGET_MASK logic	Marek Olšák	2016-08-17	1	-14/+7
	we can now rely on CB_COLORn_INFO to disable empty slots. Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: don't set CB_COLOR1_INFO for dual src blending	Marek Olšák	2016-08-17	1	-7/+0
	Vulkan doesn't do this. The reason may be that CB_COLOR1_INFO.SOURCE_FORMAT from NI was moved to SPI_SHADER_COL_FORMAT for SI. I asked CB guys about this 2 days ago and they still haven't replied. Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: eliminate PS OUT[1] if dual src blending is off and CB1 is not bound	Marek Olšák	2016-08-17	2	-11/+7
	All VP DX9 ports benefit from this. Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: use unflushed fences for PIPE_QUERY_GPU_FINISHED	Marek Olšák	2016-08-17	1	-2/+2
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: use lp_build_alloca_undef	Nicolai Hähnle	2016-08-17	1	-13/+4
	Avoid building all those store 0 / store undef instruction pairs that end up getting removed anyway. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallivm: add lp_build_alloca_undef	Nicolai Hähnle	2016-08-17	2	-0/+24
	Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallivm: add create_builder_at_entry helper function	Nicolai Hähnle	2016-08-17	1	-23/+22
	Reduces code duplication. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: protect against out of bounds temporary array accesses	Nicolai Hähnle	2016-08-17	1	-0/+15
	They can lead to VM faults and worse, which goes against the GL robustness promises. Reviewed-by: Marek Olšák <[email protected]>
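The commit message does not say how out-of-bounds indices are handled, so the following is only a sketch of one common approach: mask the index when the array length is a power of two, otherwise clamp it to the last element. `bound_index` is a hypothetical helper written for illustration; it is not the driver's `radeon_llvm_bound_index`, and it assumes an unsigned (non-negative) index.

```python
def bound_index(index, num):
    """Force a non-negative index into the range [0, num - 1]."""
    if num <= 0:
        raise ValueError("array length must be positive")
    if index < 0:
        raise ValueError("index is assumed to be unsigned")
    max_index = num - 1
    if (num & max_index) == 0:
        # Power-of-two length: a single AND keeps the index in range.
        return index & max_index
    # Otherwise clamp to the last valid element.
    return min(index, max_index)


print(bound_index(3, 6))  # 3: in range, unchanged
print(bound_index(9, 6))  # 5: clamped to the last element
print(bound_index(9, 8))  # 1: masked with 7
```

Either branch guarantees the access stays inside the array, which is the property the robustness guarantee needs; the value read for a bad index is otherwise arbitrary.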
*	gallium/radeon: add radeon_llvm_bound_index for bounds checking	Nicolai Hähnle	2016-08-17	3	-18/+34
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: reduce alloca of temporaries based on usagemask	Nicolai Hähnle	2016-08-17	2	-10/+54
	v2: take actual writemasks into account Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: use tgsi_scan_arrays for temp arrays	Nicolai Hähnle	2016-08-17	3	-5/+10
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: allocate temps array info in radeon_llvm_context_init	Nicolai Hähnle	2016-08-17	3	-36/+47
	Also, prepare for using tgsi_array_info. This also opens the door for properly handling allocation failures, but I'm leaving that for a separate change. Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: always do the full store in store_value_to_array	Nicolai Hähnle	2016-08-17	1	-49/+28
	Doing the write-back of the temporary vector in radeon_llvm_emit_store makes no sense. This also allows us to get rid of get_alloca_for_array. Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: extract common getelementptr logic into get_pointer_into_array	Nicolai Hähnle	2016-08-17	1	-39/+66
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: pass indirect register info into get_alloca_for_array	Nicolai Hähnle	2016-08-17	1	-5/+6
	To have the same signature as get_array_range. Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: extract common lookup code into get_temp_array function	Nicolai Hähnle	2016-08-17	1	-33/+40
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: clarify the comment on the array alloca heuristic	Nicolai Hähnle	2016-08-17	1	-10/+19
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: more descriptive names for LLVM temporaries in debug builds	Nicolai Hähnle	2016-08-17	1	-2/+12
	Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: simplify radeon_llvm_emit_store for direct array addressing	Nicolai Hähnle	2016-08-17	1	-7/+0
	We can use the pointer stored in the temps array directly. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: simplify radeon_llvm_emit_fetch for direct array addressing	Nicolai Hähnle	2016-08-17	1	-5/+0
	We can use the pointer stored in the temps array directly. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: clean up emit_declaration for temporaries	Nicolai Hähnle	2016-08-17	1	-9/+18
	In the alloca'd array case, no longer create redundant and unused allocas for the individual elements; create getelementptrs instead. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	st_glsl_to_tgsi: use calloc the way it's meant to be used	Nicolai Hähnle	2016-08-17	1	-1/+1
	Reviewed-by: Marek Olšák <[email protected]>
*	tgsi/scan: add tgsi_scan_arrays	Nicolai Hähnle	2016-08-17	2	-0/+93
	Reviewed-by: Marek Olšák <[email protected]>
*	glsl: Add missing ir_quadop_vector constant evaluation for Boolean types	Ian Romanick	2016-08-17	1	-0/+3
	Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	glsl: Fix typo in ir_unop_f2u implementation	Ian Romanick	2016-08-17	1	-1/+1
	This won't affect the output, but it was, technically, wrong. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	glsl: Fix typo in ir_unop_b2i implementation	Ian Romanick	2016-08-17	1	-1/+1
	This won't affect the output, but it was, technically, wrong. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	glsl: Don't support integer types for operations that can't handle them	Ian Romanick	2016-08-17	2	-14/+2
	ir_unop_fract already forbade integer types in ir_validate. ir_unop_rcp, ir_unop_rsq, and ir_unop_sqrt should also forbid them in ir_validate. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	glsl: Don't support ir_unop_abs or ir_unop_sign for unsigned integers	Ian Romanick	2016-08-17	2	-6/+9
	Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	nir/algebraic: Optimize common array indexing sequence	Ian Romanick	2016-08-17	1	-0/+11
	Some shaders include code that looks like:

	    uniform int i;
	    uniform vec4 bones[...];

	    foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]);

	CSE would do some work on this:

	    x = i * 3
	    foo(bones[x], bones[x + 1], bones[x + 2]);

	The compiler may then add '<< 4 + base' to the index calculations. This results in expressions like

	    x = i * 3
	    foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]);

	Just rearranging the math to produce (i * 48) + 16 saves an instruction, and it allows CSE to do more work.

	    x = i * 48;
	    foo(bones[x], bones[x + 16], bones[x + 32]);

	So, ~6 instructions becomes ~3.

	Some individual shader-db results look pretty bad. However, I have a really, really hard time believing the change in estimated cycles in, for example, 3dmmes-taiji/51.shader_test after looking at that change in the generated code.

	G45
	total instructions in shared programs: 4020840 -> 4010070 (-0.27%)
	instructions in affected programs: 177460 -> 166690 (-6.07%)
	helped: 894
	HURT: 0

	total cycles in shared programs: 98829000 -> 98784990 (-0.04%)
	cycles in affected programs: 3936648 -> 3892638 (-1.12%)
	helped: 894
	HURT: 0

	Ironlake
	total instructions in shared programs: 6418887 -> 6408117 (-0.17%)
	instructions in affected programs: 177460 -> 166690 (-6.07%)
	helped: 894
	HURT: 0

	total cycles in shared programs: 143504542 -> 143460532 (-0.03%)
	cycles in affected programs: 3936648 -> 3892638 (-1.12%)
	helped: 894
	HURT: 0

	Sandy Bridge
	total instructions in shared programs: 8357887 -> 8339251 (-0.22%)
	instructions in affected programs: 432715 -> 414079 (-4.31%)
	helped: 2795
	HURT: 0

	total cycles in shared programs: 118284184 -> 118207412 (-0.06%)
	cycles in affected programs: 6114626 -> 6037854 (-1.26%)
	helped: 2478
	HURT: 317

	Ivy Bridge
	total instructions in shared programs: 7669390 -> 7653822 (-0.20%)
	instructions in affected programs: 388234 -> 372666 (-4.01%)
	helped: 2795
	HURT: 0

	total cycles in shared programs: 68381982 -> 68263684 (-0.17%)
	cycles in affected programs: 1972658 -> 1854360 (-6.00%)
	helped: 2458
	HURT: 307

	Haswell
	total instructions in shared programs: 7082636 -> 7067068 (-0.22%)
	instructions in affected programs: 388234 -> 372666 (-4.01%)
	helped: 2795
	HURT: 0

	total cycles in shared programs: 68282020 -> 68164158 (-0.17%)
	cycles in affected programs: 1891820 -> 1773958 (-6.23%)
	helped: 2459
	HURT: 261

	Broadwell
	total instructions in shared programs: 9002466 -> 8985875 (-0.18%)
	instructions in affected programs: 658784 -> 642193 (-2.52%)
	helped: 2795
	HURT: 5

	total cycles in shared programs: 78503092 -> 78450404 (-0.07%)
	cycles in affected programs: 2873304 -> 2820616 (-1.83%)
	helped: 2275
	HURT: 415

	Skylake
	total instructions in shared programs: 9156978 -> 9140387 (-0.18%)
	instructions in affected programs: 682625 -> 666034 (-2.43%)
	helped: 2795
	HURT: 5

	total cycles in shared programs: 75591392 -> 75550574 (-0.05%)
	cycles in affected programs: 3192120 -> 3151302 (-1.28%)
	helped: 2271
	HURT: 425

	Signed-off-by: Ian Romanick <[email protected]>
	Reviewed-by: Thomas Helland <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
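The rearrangement in the nir/algebraic commit message rests on the identity (i * 3 + k) << 4 == i * 48 + k * 16. A minimal numeric check, using plain Python integers rather than NIR; `index_shifted` and `index_rearranged` are hypothetical names chosen for this illustration.

```python
def index_shifted(i, k):
    """Index as in the CSE'd form: (i * 3 + k) << 4."""
    x = i * 3
    return (x + k) << 4


def index_rearranged(i, k):
    """Same index after distributing the shift: i * 48 + k * 16."""
    x = i * 48
    return x + 16 * k


# The two forms agree for every element of the three-vector group.
for i in range(64):
    for k in range(3):
        assert index_shifted(i, k) == index_rearranged(i, k)

print([index_rearranged(2, k) for k in range(3)])  # [96, 112, 128]
```

In the rearranged form the multiply is shared by all three accesses and each access needs only one add of a constant, which matches the instruction savings the message describes.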