mesa.git

	Commit message	Author	Age	Files	Lines
*	r600g/compute: Defrag the pool at the same time as we grow it	Bruno Jiménez	2014-07-25	2	-23/+19
\| \| \| \| \| \| \| \| \| \| \| \|	This allows us two things: we now need fewer item copies when we have to defrag+grow the pool (just one copy per item) and, even in the case where we don't need to defrag the pool, we reduce the data copied to just the useful data that the items use. Note: The fallback path is a bit ugly now, but hopefully we won't need it much. Reviewed-by: Tom Stellard <[email protected]>
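As a rough illustration of the "one copy per item" idea in the message above, the following sketch copies each item straight from the old pool into a larger new pool and packs the items back to back. The `pool_item` struct and `defrag_into` function are hypothetical names made up for this example; they are not the r600g data structures, and plain memory buffers stand in for GPU resources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical item record: where the item lives in the pool and how big it is. */
struct pool_item {
    size_t offset;  /* start of the item inside the pool, in bytes */
    size_t size;    /* bytes actually used by the item */
};

/*
 * Copy every item from old_pool into new_pool, packing them back to back.
 * Each item is copied exactly once, and the gaps between items are never
 * copied. Returns the number of bytes used in new_pool.
 */
static size_t defrag_into(const unsigned char *old_pool,
                          unsigned char *new_pool, size_t new_size,
                          struct pool_item *items, size_t count)
{
    size_t next = 0;

    for (size_t i = 0; i < count; i++) {
        assert(next + items[i].size <= new_size);
        memcpy(new_pool + next, old_pool + items[i].offset, items[i].size);
        items[i].offset = next;  /* the item now lives at its packed position */
        next += items[i].size;
    }
    return next;
}
```

Because the source and destination are different buffers, no item has to be staged in a temporary location first, which is where the saving over "defrag in place, then grow" would come from.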
*	r600g/compute: Try to use a temporary resource when growing the pool	Bruno Jiménez	2014-07-25	1	-18/+43
\| \| \| \| \| \| \| \| \| \| \| \| \|	Now, before moving everything to host memory, we try to create a new resource to use as a pool. If we succeed we just use this resource and delete the previous one. If we fail we fall back to using the shadow. This should make growing the pool faster, and we can also save 64KB of memory that were allocated for the 'shadow', even if they weren't used. Reviewed-by: Tom Stellard <[email protected]>
*	freedreno: fix typo in gpu version check	Rob Clark	2014-07-25	1	-1/+1
\| \| \| \| \| \| \|	Oops, I should use larger fonts, I guess. Reported-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: split out shader compiler from a3xx	Rob Clark	2014-07-25	25	-477/+580
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the bits we want to share between generations from fd3_program to ir3_shader. So the overall structure is: fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3 \|- ... \- ir3_shader_variant -> ir3 So the ir3_shader becomes the topmost generation neutral object, which manages the set of variants each of which generates, compiles, and assembles its own ir. There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc. Keep the split between the gallium level stateobj and the shader helper object because it might be a good idea to pre-compute some generation specific register values (ie. anything that is independent of linking). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx/compiler: rename ir3_shader to ir3	Rob Clark	2014-07-25	12	-55/+55
\| \| \| \| \| \| \| \|	First step of the reorganization to split out the compiler (so it can be shared between a3xx and a4xx). Rename ir3_shader -> ir3 (since we'll want the name ir3_shader for a higher level object). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx/compiler: scheduler vs pred reg	Rob Clark	2014-07-25	2	-3/+51
\| \| \| \| \| \| \|	The scheduler also needs to be aware of predicate register (p0) in addition to address register (a0). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx/compiler: little cleanups	Rob Clark	2014-07-25	4	-39/+19
\| \| \| \| \| \|	Remove some obsolete comments, rename deref->addr. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: enable/disable wa's based on patch-level	Rob Clark	2014-07-25	5	-9/+35
\| \| \| \| \| \| \| \|	It seems like for the most part, different behaviors, workarounds, etc, should be conditional on GPU patch revision (ie. a320.0 vs a320.2) rather than GPU id (a320 vs a330). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx/compiler: make IR heap dyanmic	Rob Clark	2014-07-25	2	-8/+43
\| \| \| \| \| \| \| \| \| \| \| \|	The fixed size heap is a remnant of the fdre-a3xx assembler. Yet it is convenient for being able to free the entire data structure in one shot without worrying about leaking nodes. Change it to dynamically grow the heap size (adding chunks) as needed so we don't have an artificial upper limit on shader size (other than hw limits) and don't always have to allocate worst-case size. Signed-off-by: Rob Clark <[email protected]>
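The chunked heap described above can be sketched roughly as follows. This is only an illustration of the general technique (grow by adding chunks, free everything in one pass); `struct heap`, `heap_alloc`, `heap_destroy` and the `CHUNK_SIZE` value are hypothetical and not taken from the freedreno source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096  /* illustrative chunk size, not taken from the driver */

struct chunk {
    struct chunk *next;
    size_t used;
    unsigned char data[CHUNK_SIZE];
};

struct heap {
    struct chunk *head;   /* most recently added chunk */
    unsigned nchunks;
};

/* Allocate size bytes, adding a new chunk when the current one is full. */
static void *heap_alloc(struct heap *h, size_t size)
{
    /* round up so every allocation stays pointer-aligned */
    size = (size + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
    assert(size <= CHUNK_SIZE);

    if (h->head == NULL || h->head->used + size > CHUNK_SIZE) {
        struct chunk *c = calloc(1, sizeof(*c));
        if (c == NULL)
            return NULL;
        c->next = h->head;
        h->head = c;
        h->nchunks++;
    }

    void *ptr = h->head->data + h->head->used;
    h->head->used += size;
    return ptr;
}

/* Free every chunk in one pass; individual nodes are never freed on their own. */
static void heap_destroy(struct heap *h)
{
    while (h->head != NULL) {
        struct chunk *next = h->head->next;
        free(h->head);
        h->head = next;
    }
    h->nchunks = 0;
}
```

A small shader only ever pays for one chunk, while a large one keeps adding chunks until it hits some other limit, and teardown is still a single call.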
*	r600g/compute: Fix singed/unsigned comparison compiler warnings.	Jan Vesely	2014-07-25	1	-7/+7
\| \| \| \| \| \| \|	The iteration variables go from 0 anyway. Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
*	clover: Query the device to see if images are supported	Tom Stellard	2014-07-25	3	-1/+8
\| \| \| \|	Reviewed-by: Francisco Jerez <[email protected]>
*	gallium: Add PIPE_CAP_COMPUTE_IMAGES_SUPPORTED	Tom Stellard	2014-07-25	3	-1/+11
\| \| \| \| \|	Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
*	r600g/compute: Allow compute_memory_defrag to defragment between resources	Bruno Jiménez	2014-07-25	2	-5/+7
\| \| \| \| \| \|	This will be used in the following patch to avoid duplicated code Reviewed-by: Tom Stellard <[email protected]>
*	r600g/compute: Allow compute_memory_move_item to move items between resources	Bruno Jiménez	2014-07-25	2	-16/+16
\| \| \| \| \| \|	v2: Remove unnecessary variables Reviewed-by: Tom Stellard <[email protected]>
*	gbm: Search LIBGL_DRIVERS_PATH if GBM_DRIVERS_PATH is not set	Dylan Baker	2014-07-24	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The GBM_DRIVERS_PATH environment variable is not documented, and only used to set the location of gbm drivers, while LIBGL_DRIVERS_PATH is used for everything else, and is documented. Generally this split leads to confusion as to why gbm doesn't work. This patch will read LIBGL_DRIVERS_PATH as a fallback if GBM_DRIVERS_PATH is not set. The comments clearly indicate that using LIBGL_DRIVERS_PATH is preferred over GBM_DRIVERS_PATH. v2: - Use GBM_DRIVERS_PATH as a fallback v3: [[email protected]] - Make LIBGL_DRIVERS_PATH the fallback Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
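The lookup order described by the final (v3) version of this patch could look something like the sketch below: GBM_DRIVERS_PATH is consulted first, and LIBGL_DRIVERS_PATH is read only when it is unset. The helper names are hypothetical, and any extra checks the real gbm loader performs are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the value of primary, or of fallback when primary is unset. */
static const char *lookup_with_fallback(const char *primary,
                                        const char *fallback)
{
    assert(primary != NULL && fallback != NULL);

    const char *value = getenv(primary);
    if (value == NULL)
        value = getenv(fallback);
    return value;  /* NULL when neither variable is set */
}

/* Hypothetical wrapper naming the two variables from the commit message. */
static const char *gbm_driver_search_path(void)
{
    return lookup_with_fallback("GBM_DRIVERS_PATH", "LIBGL_DRIVERS_PATH");
}
```

With this order, a user who sets only the documented LIBGL_DRIVERS_PATH gets working gbm drivers, while an existing GBM_DRIVERS_PATH setting still takes effect.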
*	winsys/radeon: fix indentation	Jerome Glisse	2014-07-24	1	-29/+29
\| \| \| \| \| \| \|	Can we please keep it clean and avoid ending up in a messy situation like ddx. Signed-off-by: Jérôme Glisse <[email protected]>
*	Add an accelerated version of F_TO_I for x86_64	Jason Ekstrand	2014-07-24	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	According to a quick micro-benchmark, this new version is 20% faster on my Haswell laptop. v2: Removed the XXX note about x86_64 from the comment v3: Use an intrinsic instead of an __asm__ block. This should give us MSVC support for free. v4: Enable it for all x86_64 builds, not just with USE_X86_64_ASM Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
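The log does not show which intrinsic the patch uses, so the following is only a sketch of the approach, assuming the SSE conversion intrinsic `_mm_cvtss_si32` from `<xmmintrin.h>`. The function name `f_to_i` and the non-x86 fallback are illustrative.

```c
#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>
#endif

/* Convert a float to the nearest int. */
static inline int f_to_i(float f)
{
#if defined(__x86_64__) || defined(_M_X64)
    /* cvtss2si rounds according to the current MXCSR rounding mode,
     * which is round-to-nearest-even by default */
    return _mm_cvtss_si32(_mm_set_ss(f));
#else
    /* portable fallback; note that it rounds halfway cases away from
     * zero, unlike the default SSE rounding mode */
    return (int)(f >= 0.0f ? f + 0.5f : f - 0.5f);
#endif
}
```

An intrinsic, unlike an `__asm__` block, is understood by GCC, Clang and MSVC alike, which matches the "MSVC support for free" remark in v3.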
*	i965/fs: Decide predicate/predicate_inverse outside of the for loop.	Matt Turner	2014-07-24	1	-9/+14
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Swap if/else conditions in SEL peephole.	Matt Turner	2014-07-24	1	-3/+3
\| \| \| \| \| \|	Will make the next commit easier to read. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Improve dead control flow elimination.	Matt Turner	2014-07-24	1	-10/+15
\| \| \| \| \| \| \| \|	... to eliminate an ELSE instruction followed immediately by an ENDIF. instructions in affected programs: 704 -> 700 (-0.57%) Reviewed-by: Kenneth Graunke <[email protected]>
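The ELSE-then-ENDIF case can be pictured with a toy pass over a flat instruction array. The real i965 pass works on the driver's own instruction and control flow structures; the `opcode` enum and `drop_empty_else` function here are hypothetical and only illustrate the pattern being removed.

```c
#include <stddef.h>

enum opcode { OP_IF, OP_ELSE, OP_ENDIF, OP_MOV };

/*
 * Remove every ELSE that is immediately followed by an ENDIF, compacting
 * the array in place. Returns the new instruction count.
 */
static size_t drop_empty_else(enum opcode *insts, size_t count)
{
    size_t out = 0;

    for (size_t i = 0; i < count; i++) {
        if (insts[i] == OP_ELSE && i + 1 < count && insts[i + 1] == OP_ENDIF)
            continue;  /* the else block is empty, so the ELSE is dead */
        insts[out++] = insts[i];
    }
    return out;
}
```

Each removed ELSE saves one instruction, which is consistent with the small 704 -> 700 change quoted in the message.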
*	nvc0/ir: support 2d constbuf indexing	Ilia Mirkin	2014-07-24	1	-0/+14
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	gm107/ir: emit LDC subops	Ilia Mirkin	2014-07-24	1	-0/+1
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	gk110/ir: emit load constant subop	Ilia Mirkin	2014-07-24	1	-0/+1
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	mesa/st: add support for interpolate_at_* ops	Ilia Mirkin	2014-07-24	1	-3/+9
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	nv50/ir: fix phi/union sources when their def has been merged	Ilia Mirkin	2014-07-24	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	In a situation where double-register values are used, the phi nodes can still end up being u32 values. They all get merged into one RA node though. When fixing up the merge (which comes after the phi node), the phi node's def would get fixed, but not its sources which would remain at the low register value. This maintains the invariant that a phi node's defs and sources are allocated the same register. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: fix hard-coded TYPE_U32 sized register	Ilia Mirkin	2014-07-24	1	-3/+4
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: mark shader header if fp64 is used	Ilia Mirkin	2014-07-24	1	-0/+2
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: keep track of whether the program uses fp64	Ilia Mirkin	2014-07-24	2	-2/+7
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: make sure that the local memory allocation is aligned to 0x10	Ilia Mirkin	2014-07-24	1	-1/+1
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
*	mesa: add ARB_clear_texture.xml to file list, remove duplicate decls	Ilia Mirkin	2014-07-24	2	-12/+1
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
*	ilo: check the tilings of imported handles	Chia-I Wu	2014-07-24	1	-30/+36
\| \| \| \|	Just to be cautious.
*	ilo: clean up resource bo renaming	Chia-I Wu	2014-07-24	4	-51/+63
\| \| \| \| \|	s/alloc_bo/rename_bo/ as that is what the functions do. Simplify bo allocation and move the complexity to bo renaming.
*	ilo: share some code between {tex,buf}_create_bo	Chia-I Wu	2014-07-24	1	-59/+55
\| \| \| \| \|	Add resource_get_bo_name() and resource_get_bo_initial_domain() for use by both functions.
*	ilo: use native 3-component vertex formats on GEN7.5+	Chia-I Wu	2014-07-24	2	-1/+6
\| \| \| \|	GEN7.5 gains support for those formats natively.
*	ilo: allow for device-dependent format translation	Chia-I Wu	2014-07-24	5	-32/+39
\| \| \| \|	Pass ilo_dev_info to all format translation functions.
*	i965: Accelerate uploads of RGBA and BGRA GL_UNSIGNED_INT_8_8_8_8_REV textures	Jason Ekstrand	2014-07-23	1	-1/+5
\| \| \| \| \| \| \| \| \| \|	Since intel is always going to be little-endian, GL_UNSIGNED_INT_8_8_8_8_REV is the same as GL_UNSIGNED_BYTE for RGBA and BGRA textures, so the same acceleration code will work. We might as well use it. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
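The equivalence claimed above can be checked with a short, self-contained sketch. GL_UNSIGNED_INT_8_8_8_8_REV stores the first component in the low 8 bits of a 32-bit word, so on a little-endian host the bytes in memory come out in the same order as four GL_UNSIGNED_BYTE components. All function names below are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Pack RGBA as GL_UNSIGNED_INT_8_8_8_8_REV does: first component in the low byte. */
static uint32_t pack_8888_rev(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return (uint32_t)r | ((uint32_t)g << 8) |
           ((uint32_t)b << 16) | ((uint32_t)a << 24);
}

/* Returns 1 when the host stores the least significant byte first. */
static int host_is_little_endian(void)
{
    const uint32_t one = 1;
    uint8_t first;

    memcpy(&first, &one, 1);
    return first == 1;
}

/* Returns 1 when the packed word has the same memory layout as four
 * GL_UNSIGNED_BYTE components stored in R, G, B, A order. */
static int packed_matches_bytes(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    const uint32_t packed = pack_8888_rev(r, g, b, a);
    const uint8_t bytes[4] = { r, g, b, a };

    return memcmp(&packed, bytes, 4) == 0;
}
```

On a little-endian machine `packed_matches_bytes` returns 1, which is why the existing GL_UNSIGNED_BYTE fast path can be reused; on a big-endian machine the layouts differ and it returns 0.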
*	mesa: Fix the name in the error message	Ian Romanick	2014-07-23	1	-1/+1
\| \| \| \| \| \| \|	Obvious copy-and-paste bug. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	glsl: Fix some bad indentation	Ian Romanick	2014-07-23	1	-3/+3
\| \| \| \| \|	Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/fs: Set LastRT on the final FB write on Broadwell.	Kenneth Graunke	2014-07-23	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In Piglit's EXT_framebuffer_multisample/alpha-to-coverage-dual-src-blend test, key->nr_color_regions == 2, but the dual source blend FB write has ir->target set to 0. So we failed to set "Last Render Target Select" on any FB write message. We only emit one FB write per render target, so my comment about setting LastRT on every FB write directed at the last color region is a bit... misinformed. According to the documentation, depth buffer writes and scoreboard updates happen on the FB write with LastRT set, so I believe we want to set it only once. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
*	i965: Port INTEL_DEBUG=optimizer to the vec4 backend.	Kenneth Graunke	2014-07-23	1	-6/+36
\| \| \| \| \| \| \|	Largely via copy and paste. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Save the gl_shader_stage enum in backend_visitor.	Kenneth Graunke	2014-07-23	2	-1/+4
\| \| \| \| \| \| \| \| \|	This will be useful for INTEL_DEBUG=optimizer in the vec4 backend, which needs to know whether it's currently processing a VS or GS. It isn't worth adding virtual methods for this case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Don't print WE_normal in disassembly.	Kenneth Graunke	2014-07-23	1	-1/+1
\| \| \| \| \| \| \| \| \|	Dropping this helps most lines fit in an 80 column terminal. The absence of WE_normal also helps call attention to WE_all, where something unusual is going on. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	freedreno/a3xx/compiler: fix p0 (kill, etc)	Rob Clark	2014-07-23	1	-1/+2
\| \| \| \| \| \| \|	Don't assert (debug builds) or assign random uninitialized value for predicate register (p0).. that screws up kill, etc. Signed-off-by: Rob Clark <[email protected]>
*	Revert "r600g/compute: Fix warnings"	Tom Stellard	2014-07-23	2	-16/+12
\| \| \| \| \| \|	This reverts commit 467f1585e28adba0e94ef593de131bc327f098bb. This breaks the build on some systems.
*	radeon/llvm: fix formatting	Grigori Goronzy	2014-07-23	1	-10/+14
\| \| \| \| \| \| \|	Use K&R and same indent as most other code. No functional change intended. Reviewed-by: Tom Stellard <[email protected]>
*	radeon/llvm: enable unsafe math for graphics shaders	Grigori Goronzy	2014-07-23	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	Accuracy of some operations was recently improved in the R600 backend, at the cost of slower code. This is required for compute shaders, but not for graphics shaders. Add unsafe-fp-math hint to make LLVM generate faster but possibly less accurate code. Piglit didn't indicate any regressions. Reviewed-by: Tom Stellard <[email protected]>
*	r600g/compute: Fix warnings	Tom Stellard	2014-07-23	2	-12/+16
*	r600g: Use hardware sqrt instruction	Glenn Kennard	2014-07-23	2	-7/+4
\| \| \| \| \| \| \|	Piglit quick tests including sqrt pass, no other regressions, tested on radeon 6670. Reviewed-by: Alex Deucher <[email protected]>
*	r600g/compute: Remove unneeded code from compute_memory_promote_item	Bruno Jiménez	2014-07-23	2	-36/+12
\| \| \| \| \| \| \| \| \| \| \| \|	Now that we know that the pool is defragmented, we positively know that allocated + unallocated will be the total size of the current pool plus all the items that will be promoted. So we only need to grow the pool once. This will allow us to just add the new items to the end of the item_list without needing to look for a place for each new item. Reviewed-by: Tom Stellard <[email protected]>
*	r600g/compute: Quick exit if there's nothing to add to the pool	Bruno Jiménez	2014-07-23	1	-0/+4
\| \| \| \| \| \| \| \|	This way we can avoid defragmenting the pool, even when it would otherwise need defragmenting, and avoid looping again through the list of unallocated items. Reviewed-by: Tom Stellard <[email protected]>