mesa.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	ilo: update SF_CLIP_VIEWPORT for Gen8	Chia-I Wu	2015-02-12	3	-14/+40
\|
*	ilo: update streamout related functions for Gen8	Chia-I Wu	2015-02-12	3	-44/+78
\|
*	ilo: update 3DSTATE_{DS,HS,GS} for Gen8	Chia-I Wu	2015-02-12	1	-8/+24
\|
*	ilo: update 3DSTATE_CONSTANT_x for Gen8	Chia-I Wu	2015-02-12	1	-3/+16
\|
*	ilo: update 3DSTATE_URB_x for Gen8	Chia-I Wu	2015-02-12	1	-1/+8
\|
*	ilo: update 3DSTATE_PUSH_CONSTANT_ALLOC_x for Gen8	Chia-I Wu	2015-02-12	1	-7/+8
\|
*	ilo: update render engine common helpers for Gen8	Chia-I Wu	2015-02-12	4	-34/+91
\|
*	ilo: update BLT helpers for Gen8	Chia-I Wu	2015-02-12	1	-25/+58
\|
*	ilo: update MI helpers for Gen8	Chia-I Wu	2015-02-12	2	-30/+59
\|
*	ilo: add functions for Gen8 relocs	Chia-I Wu	2015-02-12	1	-6/+39
\| \| \| \| \|	Extend ilo_builder_writer_reloc() for Gen8 memory addressing. Add new wrappers, ilo_builder_surface_reloc64() and ilo_builder_batch_reloc64().
*	ilo: update the toy compiler for Gen8	Chia-I Wu	2015-02-12	5	-91/+501
\| \| \| \|	Based on what we know from the classic driver.
*	ilo: update genhw headers	Chia-I Wu	2015-02-12	19	-529/+1704
\| \| \| \| \| \| \|	Accumulated changes for various renames and additions, including Gen8 definitions. Some of the dynamic state __SIZE no longer means the size of an element, but the size of an array of elements. The changes can be seen in ilo_render_dynamic.c.
*	ilo: clean up ilo_gpe_init_dsa()	Chia-I Wu	2015-02-12	1	-54/+82
\| \| \| \| \|	Add dsa_get_stencil_enable_gen6(), dsa_get_depth_enable_gen6(), and dsa_get_alpha_enable_gen6() to be called from ilo_gpe_init_dsa().
*	ilo: clean up ilo_gpe_init_blend()	Chia-I Wu	2015-02-12	3	-87/+106
\| \| \| \|	Make ilo_blend_state more space efficient and forward-looking.
*	ilo: clean up sample patterns	Chia-I Wu	2015-02-12	5	-68/+71
\| \| \| \| \|	Use signed int for sample positions and add helpers to access them. Call them patterns instead of positions.
*	glsl: Optimize (f2i(trunc x)) into (f2i x).	Matt Turner	2015-02-11	1	-0/+9
\| \| \| \| \| \|	total instructions in shared programs: 5950326 -> 5949286 (-0.02%) instructions in affected programs: 88264 -> 87224 (-1.18%) helped: 692
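A minimal sketch of why the rewrite above is safe, assuming f2i drops the fractional part (rounds toward zero) the way a float-to-int conversion does. The helper names `f2i` and `trunc` below are illustrative stand-ins, not Mesa code.

```python
import math

def f2i(x: float) -> int:
    # Model of float-to-int conversion: the fractional part is dropped.
    return int(x)

def trunc(x: float) -> float:
    # Model of trunc(): round toward zero, result stays a float.
    return float(math.trunc(x))

# The explicit trunc() is redundant in front of f2i for these samples.
samples = [-2.75, -0.5, 0.0, 0.5, 1.999, 42.0]
redundant = all(f2i(trunc(x)) == f2i(x) for x in samples)
```

Because the conversion already rounds toward zero, dropping the inner trunc() removes one instruction without changing the result.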
*	glsl: Optimize round-half-up pattern.	Matt Turner	2015-02-11	1	-0/+33
\| \| \| \| \|	Hurts some Psychonauts shaders, but after the next patch (which this enables) they're fewer instructions than before this patch.
*	glsl: Add trunc() to ir_builder.	Matt Turner	2015-02-11	2	-0/+6
\|
*	i965: Add LINTERP/CINTERP to can_do_cmod().	Matt Turner	2015-02-11	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	LINTERP is implemented as a PLN instruction or a LINE+MAC. PLN and MAC can do conditional mod. CINTERP is just a MOV. total instructions in shared programs: 5952103 -> 5950284 (-0.03%); instructions in affected programs: 324573 -> 322754 (-0.56%); helped: 1819. We lose the SIMD16 in one Unigine Heaven shader which appears six times in shader-db.
*	program: Remove _mesa_nop_vertex_program/_mesa_nop_fragment_program.	Matt Turner	2015-02-11	2	-97/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Dead since commit 284ce20901b0c2cfab1d952cc129b8f3cd068f12 (Author: Eric Anholt <eric@anholt.net>; Date: Fri Aug 20 10:52:14 2010 -0700; "Remove remnants of the old glsl compiler."). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	nir: Recognize open-coded fmin/fmax.	Matt Turner	2015-02-11	1	-0/+2
\| \| \| \| \| \| \| \| \|	And unfortunately other shaders do the same thing but with >=/<= which we can't apply this optimization to because of NaNs. instructions in affected programs: 23309 -> 22938 (-1.59%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
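One way to see the NaN problem mentioned above, as a hedged sketch: the two open-coded forms of "minimum" below agree for ordinary numbers but pick different operands when one input is NaN, because every comparison with NaN is false. The function names are hypothetical and do not correspond to NIR opcodes.

```python
import math

def min_lt(a: float, b: float) -> float:
    # Open-coded minimum using a strict comparison: (a < b) ? a : b
    return a if a < b else b

def min_ge(a: float, b: float) -> float:
    # Open-coded minimum using the inverted comparison: (a >= b) ? b : a
    return b if a >= b else a

nan = float("nan")

# Ordinary inputs: both forms agree.
agree_ordinary = min_lt(1.0, 2.0) == min_ge(1.0, 2.0) == 1.0

# NaN input: the strict form returns b, the inverted form returns a.
lt_result = min_lt(nan, 1.0)   # comparison is false, so b is selected
ge_result = min_ge(nan, 1.0)   # comparison is false, so a (NaN) is selected
```

Since the two forms diverge on NaN, treating both as the same operation could change results for shaders that produce NaNs.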
*	nir: Add algebraic opt for int comparisons with identical operands.	Eric Anholt	2015-02-11	1	-0/+9
\| \| \| \| \| \| \| \| \|	No change on shader-db on i965. v2: Reword the comment due to feedback from Erik Faye-Lund Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)
*	nir: Fix load_const comparisons for CSE.	Eric Anholt	2015-02-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want the size of a float per component, not the size of a whole vec4. NIR instructions on i965: total instructions in shared programs: 1261937 -> 1261929 (-0.00%) instructions in affected programs: 114 -> 106 (-7.02%) Looking at one of these examples (tesseract), it's from vec4 load_consts for a MRT solid fill, which do get CSEed now that we don't memcmp off the end of the const value and into the SSA def. For the 1-component loads that are common in i965, we were only memcmping off into the rest of the usually zero-filled const_value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
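A rough model of the comparison-size bug described above, not the NIR code itself. It assumes 4-byte floats and represents "whatever follows the const value in memory" as arbitrary trailing bytes; `pack_const` and `same_const` are hypothetical helpers.

```python
import struct

FLOAT_SIZE = 4              # bytes per float component
VEC4_SIZE = 4 * FLOAT_SIZE  # bytes in a whole vec4

def pack_const(values, trailing: bytes) -> bytes:
    # The constant's components followed by unrelated bytes that happen to
    # sit after it in memory.
    return struct.pack(f"<{len(values)}f", *values) + trailing

def same_const(a: bytes, b: bytes, num_components: int, bytes_per_component: int) -> bool:
    # memcmp-style comparison of the first num_components * bytes_per_component bytes.
    n = num_components * bytes_per_component
    return a[:n] == b[:n]

# Two identical vec4 constants followed by different unrelated data.
a = pack_const([1.0, 0.0, 0.0, 1.0], b"\x11" * 48)
b = pack_const([1.0, 0.0, 0.0, 1.0], b"\x22" * 48)

fixed = same_const(a, b, 4, FLOAT_SIZE)  # compares 16 bytes: the value only
buggy = same_const(a, b, 4, VEC4_SIZE)   # compares 64 bytes: runs past the value
```

In this model the oversized comparison reports the two equal constants as different, which is consistent with the vec4 load_consts only being CSEed after the fix.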
*	i965/fs: Remove conditional mod when optimizing a SEL into a MOV.	Matt Turner	2015-02-11	1	-0/+1
\| \| \| \|	Missed in commit ca675b73, but got right in the companion commit 3c28b2c0.
*	darwin: build fix	Jeremy Huddleston Sequoia	2015-02-10	1	-0/+5
\| \| \| \| \| \| \| \| \|	xfont.c:237:14: error: implicit declaration of function 'GetGLXDRIDrawable' is invalid in C99 [-Werror,-Wimplicit-function-declaration], reported at "glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);". Fixes regression from 291be28476ea60c6fb1eb2a882e2e25def5d3735. Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
*	darwin: build fix	Jeremy Huddleston Sequoia	2015-02-10	1	-0/+1
\| \| \| \| \| \|	../../../src/mesa/main/compiler.h:47:10: fatal error: 'util/macros.h' file not found Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
*	glsl: Optimize 1/exp(x) into exp(-x).	Matt Turner	2015-02-10	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Lots of shaders divide by exp2(...) which we turn into a multiplication by the reciprocal. We can avoid the reciprocal by simply negating exp2's argument. total instructions in shared programs: 5947154 -> 5946695 (-0.01%) instructions in affected programs: 118661 -> 118202 (-0.39%) helped: 380 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
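The identity behind this optimization is 1 / 2^x = 2^(-x). A small numeric sketch follows, using hypothetical helper names; `before` models the multiply-by-reciprocal form and `after` the negated-argument form.

```python
import math

def exp2(x: float) -> float:
    # Base-2 exponential.
    return 2.0 ** x

def before(y: float, x: float) -> float:
    # y / exp2(x), lowered to a multiplication by the reciprocal.
    return y * (1.0 / exp2(x))

def after(y: float, x: float) -> float:
    # Same value without the reciprocal: negate exp2's argument instead.
    return y * exp2(-x)

close = all(
    math.isclose(before(y, x), after(y, x))
    for y in (0.25, 1.0, 3.0)
    for x in (-4.0, -0.5, 0.0, 1.5, 10.0)
)
```

The two forms may differ in the last bits of floating-point rounding, which is why the check uses a tolerance rather than exact equality.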
*	nir: Remove casts from void*.	Matt Turner	2015-02-10	4	-14/+13
\| \| \| \|	Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
*	nir: Replace assert(0) with unreachable().	Matt Turner	2015-02-10	1	-7/+7
\| \| \| \|	Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
*	nir: Remove unused has_indirect variable.	Matt Turner	2015-02-10	1	-4/+0
\| \| \| \|	Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
*	i965/vec4: Emit MADs from (x + abs(y * z)).	Matt Turner	2015-02-10	1	-3/+15
\| \| \| \| \| \| \| \| \| \|	Same as commit 3654b6d4 to the fs backend. total instructions in shared programs: 5945788 -> 5945787 (-0.00%) instructions in affected programs: 36 -> 35 (-2.78%) helped: 1 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/vec4: Emit MADs from (x + -(y * z)).	Matt Turner	2015-02-10	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Same as commit c4fab711 to the fs backend. total instructions in shared programs: 5945998 -> 5945788 (-0.00%) instructions in affected programs: 74665 -> 74455 (-0.28%) helped: 399 HURT: 180 It hurts some programs because we make no attempts in the vec4 backend to avoid MADs if they have constant (or vector uniform) arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/skl: Implement WaDisable1DDepthStencil	Neil Roberts	2015-02-10	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \|	Skylake+ doesn't support setting a depth buffer to a 1D surface but it does allow pretending it's a 2D texture with a height of 1 instead. This fixes the GL_DEPTH_COMPONENT_* tests of the copyteximage piglit test (and also seems to avoid a subsequent GPU hang). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89037 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/gen7-8: Implement glMemoryBarrier().	Francisco Jerez	2015-02-10	2	-0/+41
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Generalize the update_null_renderbuffer_surface vtbl hook to non-renderbuffers.	Francisco Jerez	2015-02-10	4	-56/+55
\| \| \| \| \| \| \| \| \| \| \| \| \|	Null surfaces are going to be useful to have something to point unbound image units to, as the ARB_shader_image_load_store extension requires us to behave deterministically in cases where some shader tries to access an unbound image unit: Invalid stores and atomics are supposed to be discarded and invalid loads are supposed to return zero, which is precisely what the null surface does. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Allocate binding table space for shader images.	Francisco Jerez	2015-02-10	2	-0/+12
\| \| \| \| \| \| \|	v2: Bump the number of supported image uniforms to 32 (Ken). Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Don't tile 1D miptrees.	Francisco Jerez	2015-02-10	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \|	It doesn't really improve locality of texture fetches; quite the opposite, it's a waste of memory bandwidth and space due to tile alignment. v2: Check mt->logical_height0 instead of mt->target (Ken). Add short comment explaining why they shouldn't be tiled. Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/vec4: Don't set any dependency control bits for F32TO16 on Gen8.	Francisco Jerez	2015-02-10	1	-0/+5
\| \| \| \| \| \|	It's expanded to several instructions. Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Handle negated unsigned immediate values in constant propagation.	Francisco Jerez	2015-02-10	3	-19/+19
\| \| \| \| \| \| \| \| \|	Negation of UD/UW sources behaves the same as for D/W sources, taking the two's complement of the source, except for bitwise logical operations on Gen8 and up which take the one's complement. Fixes crash in a GLSL shader with subtraction of two unsigned values. Reviewed-by: Matt Turner <mattst88@gmail.com>
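The two negation behaviours named above can be sketched on 32-bit unsigned values as follows. This is an illustration of two's versus one's complement only, not the constant-propagation code; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep results in the 32-bit unsigned (UD) range

def negate_twos(x: int) -> int:
    # Two's complement: arithmetic negation modulo 2**32.
    return (-x) & MASK32

def negate_ones(x: int) -> int:
    # One's complement: every bit inverted.
    return ~x & MASK32

def sub_u32(a: int, b: int) -> int:
    # Unsigned subtraction expressed as an add with a negated source.
    return (a + negate_twos(b)) & MASK32

twos = negate_twos(1)        # 0xFFFFFFFF
ones = negate_ones(1)        # 0xFFFFFFFE
wrapped = sub_u32(5, 7)      # 0xFFFFFFFE, i.e. (5 - 7) modulo 2**32
```

Folding a negated immediate therefore needs to know which of the two interpretations applies to the instruction that consumes it.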
*	i965/vec4: Take into account non-zero reg_offset during register allocation.	Francisco Jerez	2015-02-10	1	-1/+3
\| \| \| \|	Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Add register classes up to MAX_VGRF_SIZE.	Francisco Jerez	2015-02-10	3	-7/+9
\| \| \| \| \| \| \|	In preparation for some send from GRF instructions that will require larger payloads. Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Init mlen for several send from GRF instructions.	Francisco Jerez	2015-02-10	3	-5/+11
\| \| \| \|	Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Don't infer MRF dependencies for send from GRF instructions.	Francisco Jerez	2015-02-10	1	-14/+18
\| \| \| \|	Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Fix the scheduler to take into account reads and writes of multiple registers.	Francisco Jerez	2015-02-10	3	-5/+29
\| \| \| \| \| \| \| \|	v2: Avoid nested ternary operators in vec4_instruction::regs_read(). (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Make vec4_visitor::implied_mrf_writes() return zero for sends from GRF.	Francisco Jerez	2015-02-10	1	-1/+1
\| \| \| \| \| \|	Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Pass dst register to the vec4_instruction constructor.	Francisco Jerez	2015-02-10	1	-7/+5
\| \| \| \| \| \|	So regs_written gets initialized with a sensible value. Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Initialize vec4_instruction::predicate and ::predicate_inverse.	Francisco Jerez	2015-02-10	1	-0/+2
\| \| \| \|	Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Implement equals() method for dst_reg too.	Francisco Jerez	2015-02-10	2	-0/+18
\| \| \| \|	Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.	Francisco Jerez	2015-02-10	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	Scalar registers are required to have zero stride; fix the regs_written calculation so that it does not assume the instruction writes zero registers in that case. v2: Rename CEILING() to DIV_ROUND_UP(). (Matt, Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
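A hedged sketch of the kind of calculation involved, not the actual fs_inst code. It assumes a 32-byte register and that the written size is width * stride elements; the clamp to at least one element is an illustrative way to keep a zero-stride scalar destination from rounding to zero registers.

```python
REG_SIZE = 32  # assumed register size in bytes

def div_round_up(a: int, b: int) -> int:
    # Integer division rounding up, in the spirit of DIV_ROUND_UP().
    return (a + b - 1) // b

def regs_written_naive(width: int, stride: int, type_size: int) -> int:
    # A zero stride makes the byte count zero, so the result is zero registers.
    return div_round_up(width * stride * type_size, REG_SIZE)

def regs_written_fixed(width: int, stride: int, type_size: int) -> int:
    # Treat a zero-stride (scalar) destination as writing at least one element.
    elements = max(width * stride, 1)
    return div_round_up(elements * type_size, REG_SIZE)

scalar_naive = regs_written_naive(8, 0, 4)   # 0 registers
scalar_fixed = regs_written_fixed(8, 0, 4)   # 1 register
simd16 = regs_written_fixed(16, 1, 4)        # 2 registers
```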
*	i965/fs: Fix stack allocation of fs_inst and stop stealing src array provided on construction.	Francisco Jerez	2015-02-10	2	-37/+39
\| \| \| \| \| \| \| \| \| \|	Using 'ralloc*(this, ...)' is wrong if the object has automatic storage or was allocated through any other means. Use normal dynamic memory instead. Reviewed-by: Matt Turner <mattst88@gmail.com>