mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	i965: blit_texture_to_pbo() now accepts TEXTURE_CUBE_MAP.	Laura Ekstrand	2015-01-08	1	-0/+1
\| \| \| \| \| \|	ARB_DIRECT_STATE_ACCESS permits the user to use TEXTURE_CUBE_MAP as a target. Reviewed-by: Anuj Phogat <[email protected]>
*	main: Added utility function _mesa_lookup_texture_err().	Laura Ekstrand	2015-01-08	2	-0/+19
\| \| \| \| \| \| \| \| \|	Most ARB_DIRECT_STATE_ACCESS functions take an object's ID and use it to look up the object in its hash table. If the user passes a fake object ID (ie. a non-generated name), the implementation should throw INVALID_OPERATION. This is a convenience function for texture objects. Reviewed-by: Anuj Phogat <[email protected]>
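The lookup-with-error pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration, not Mesa's code: every `sketch_`/`SKETCH_` name is hypothetical, and a fixed-size array stands in for the real hash table. Only the numeric value of GL_INVALID_OPERATION (0x0502) comes from the OpenGL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not Mesa's types. */
#define SKETCH_GL_NO_ERROR          0
#define SKETCH_GL_INVALID_OPERATION 0x0502 /* numeric value of GL_INVALID_OPERATION */
#define SKETCH_MAX_TEXTURES         16

struct sketch_texture {
   unsigned id;
};

struct sketch_context {
   /* Toy "hash table": slot i holds the texture generated with name i. */
   struct sketch_texture *textures[SKETCH_MAX_TEXTURES];
   unsigned last_error;
};

/* Plain lookup: returns NULL for names that were never generated. */
static struct sketch_texture *
sketch_lookup_texture(const struct sketch_context *ctx, unsigned id)
{
   assert(ctx != NULL);
   if (id == 0 || id >= SKETCH_MAX_TEXTURES)
      return NULL;
   return ctx->textures[id];
}

/* Convenience wrapper: same lookup, but records INVALID_OPERATION
 * when the caller passes a name that does not map to an object. */
static struct sketch_texture *
sketch_lookup_texture_err(struct sketch_context *ctx, unsigned id)
{
   struct sketch_texture *tex = sketch_lookup_texture(ctx, id);
   if (tex == NULL)
      ctx->last_error = SKETCH_GL_INVALID_OPERATION;
   return tex;
}
```

A caller would check the returned pointer and return early on NULL, since the error has already been recorded.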
*	glapi: Added ARB_direct_state_access.xml file.	Laura Ekstrand	2015-01-08	4	-1/+18
\| \| \| \| \| \|	main: Added ARB_direct_state_access to extensions.c as dummy_false. Reviewed-by: Anuj Phogat <[email protected]>
*	st/wgl: Ignore ulVersion in DrvValidateVersion.	José Fonseca	2015-01-08	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \|	We never used ulVersion for proper version checks. Most 3rd party drivers use version 1, but recently NVIDIA OpenGL driver started using a different version number, so the handy trick of renaming Mesa's ICDs as nvoglv32.dll on Windows machines with NVIDIA hardware for quick testing of Mesa software renderers stopped working. Reviewed-by: Brian Paul <[email protected]>
*	mesa: Address `assignment makes integer from pointer without a cast` gcc warning.	José Fonseca	2015-01-08	1	-2/+2
\| \| \| \| \| \| \|	Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/skl: Always use a header for SIMD4x2 sampler messages	Kristian Høgsberg	2015-01-08	5	-11/+54
\| \| \| \| \| \| \| \| \| \| \| \|	SKL+ overloads the SIMD4x2 SIMD mode to mean either SIMD8D or SIMD4x2 depending on bit 22 in the message header. If the bit is 0 or there is no header we get SIMD8D. We always want SIMD4x2 in vec4 and for fs pull constants, so use a message header in those cases and set bit 22 there. Based on an initial patch from Ken. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Kristian Høgsberg <[email protected]>
*	i965/skl: Report more accurate number of samples for format	Kristian Høgsberg	2015-01-07	1	-0/+2
\| \| \| \| \|	Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	freedreno/ir3: fix pos_regid > max_reg	Rob Clark	2015-01-07	4	-41/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can't (or don't know how to) turn this off. But it can end up being stored to a higher reg # than what the shader uses, leading to corruption. Also we currently aren't clever enough to turn off frag_coord/frag_face if the input is dead-code, so just fixup max_reg/max_half_reg. Re-org this a bit so both vp and fp reg footprint fixup are called by a common fxn used also by ir3_cmdline. Also add a few more output lines for ir3_cmdline to make it easier to see what is going on. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: start on indirect gpr reads	Rob Clark	2015-01-07	3	-8/+146
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Handle TEMP[ADDR[]] src registers by generating a fanin to group array elements, similarly to how texture fetch instructions work. NOTE: For all the scalar instructions generated for a single tgsi vector operation which uses an array src (or possibly even uses the same array as multiple srcs), re-use the same fanin node. Since a vector operation operates on all components at the same time, it should never see more than one version of the same array. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: make reg array dynamic	Rob Clark	2015-01-07	4	-13/+50
\| \| \| \| \| \| \| \| \|	To use fanin's to group registers in an array, we can potentially have a much larger array of registers. Rather than continuing to bump up the array size, just make it dynamically allocated when the instruction is created. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: simplify RA	Rob Clark	2015-01-07	8	-777/+622
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Group inputs/outputs, in addition to fanin/fanout, as they must also exist in sequential scalar registers. This lets us simplify RA by working in terms of neighbor groups. NOTE: has the slight problem that it can't optimize out mov's for things like: MOV OUT[n], IN[m] To avoid this, instead of trying to figure out what mov's we can eliminate, we first remove all mov's prior to grouping, and then re-insert mov's as needed while grouping inputs/outputs/fanins. Eventually we'd prefer the frontend to not insert extra mov's in the first place (so we don't have to bother removing them). This is the plan for an eventual NIR based frontend, so separate out the instr grouping (which will still be needed for NIR frontend) from the mov elimination (which won't). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: regmask support for relative addr	Rob Clark	2015-01-07	2	-17/+51
\| \| \| \| \| \| \| \|	For temp arrays, a 32bit mask won't be sufficient.. but otoh we don't need to support an arbitrary mask. So for this case use a simple size field rather than a bitmask. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: split up ssa_src	Rob Clark	2015-01-07	1	-23/+34
\| \| \| \| \| \| \|	Slight bit of refactoring that will be needed for indirect gpr addressing (TEMP[ADDR[]]). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: drop instr_clone() stuff	Rob Clark	2015-01-07	2	-49/+17
\| \| \| \| \| \| \|	Unnecessary and overly complicated. And gets in the way for temp arrays (TEMP[ADDR[]]). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: runtime enable RA debug for DEBUG builds	Rob Clark	2015-01-07	1	-1/+6
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: handle relative addr in ir3_dump	Rob Clark	2015-01-07	1	-1/+8
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: legalize vs unused sam dst components	Rob Clark	2015-01-07	2	-2/+9
\| \| \| \| \| \| \| \|	We probably could be more clever elsewhere and mask out components that are not used. But either way, legalize should realize that there is also a write-after-write hazard with texture sample instructions. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: hack for old compiler	Rob Clark	2015-01-07	1	-0/+23
\| \| \| \| \| \| \|	Old compiler doesn't have ir3_block's.. so we need a special path. This hack can be dropped when ir3_compiler_old is retired. Signed-off-by: Rob Clark <[email protected]>
*	tgsi: track max array per file	Rob Clark	2015-01-07	2	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	NOTE IN[] and OUT[] don't need (have?) ArrayID's.. and TEMP[] can optionally have them. So we implicitly assume that ArrayID==0 always exists for each file. This is why array_max[file] is never less than zero. You can tell from indirect_files(_read/written) if the legacy array-id zero was actually used. Signed-off-by: Rob Clark <[email protected]>
*	tgsi: keep track of read vs written indirects	Rob Clark	2015-01-07	2	-0/+8
\| \| \| \| \| \| \| \| \| \|	At least temporarily, I need to fallback to old compiler still for relative dest (for freedreno), but I can do relative src temp. Only a temporary situation, but seems easy/reasonable for tgsi-scan to track this. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	Revert "radeonsi: reduce the size of si_pm4_state"	Marek Olšák	2015-01-08	2	-3/+12
\| \| \| \| \| \|	This reverts commit 9141d8855555e45a057970e78969e1518ad3617d. It broke OpenCL.
*	radeonsi: Fix crash when destroying si_screen	Tom Stellard	2015-01-07	1	-2/+4
\| \| \| \| \| \| \| \| \|	We were invalidating si_screen:tm by calling r600_destroy_common_screen() which frees the si_screen object. This caused the driver to crash in LLVMDisposeTargetMachine() since we were passing it an invalid pointer. https://bugs.freedesktop.org/show_bug.cgi?id=88170
*	mesa: Don't use _mesa_generic_nop on Windows.	José Fonseca	2015-01-07	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It doesn't work on Windows because of the STDCALL calling convention: it's the callee's responsibility to pop the arguments, and the number of arguments varies with the prototype, so the stack pointer ends up getting corrupted. This is just a non-invasive stop-gap fix. A proper fix would be more elaborate, and require either (a) a variation of __glapi_noop_table which sets the GL_INVALID_OPERATION error, or (b) no longer using APIENTRY on all internal _mesa_* functions. Tested with piglit gl-1.0-beginend-coverage (it now fails instead of crashing). VMware PR1350505 Reviewed-by: Brian Paul <[email protected]>
*	glapi: Force frame pointer elimination on Windows.	José Fonseca	2015-01-07	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \|	To catch mismatches in cdecl vs stdcall calling convention. See code comment for more detailed explanation. Tested with piglit gl-1.0-beginend-coverage (it now also crashes on debug builds.) VMware PR1350505. Reviewed-by: Brian Paul <[email protected]>
*	radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders	Marek Olšák	2015-01-07	3	-4/+12
\| \| \| \| \| \| \|	v2: complete rewrite Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
*	radeonsi: emit SURFACE_SYNC last	Marek Olšák	2015-01-07	1	-23/+35
\| \| \| \| \| \| \|	This fixes a case where a transform feedback buffer is fed back as an index buffer, because SURFACE_SYNC must be after VS_PARTIAL_FLUSH. Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: flush all CB/DB caches unconditionally when changing the framebuffer	Marek Olšák	2015-01-07	1	-11/+7
\| \| \| \| \| \|	This is easier to read and will work better with shader image stores. Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: change TC cache flushing strategy for textures	Marek Olšák	2015-01-07	2	-4/+6
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: improve and fix streamout flushing	Marek Olšák	2015-01-07	3	-10/+40
\| \| \| \| \| \| \| \| \| \| \|	- we don't usually need to flush TC L2 - we should flush KCACHE (not really an issue now since we always flush KCACHE when updating descriptors, but it could be a problem if we used CE, which doesn't require flushing KCACHE) - add an explicit VS_PARTIAL_FLUSH flag Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: use TC L2 for CP DMA operations with shader resources on CIK	Marek Olšák	2015-01-07	3	-10/+39
\| \| \| \| \| \| \| \| \|	So that TC L2 doesn't need to be flushed. The only problem is with index buffers, which don't use TC. A simple solution is added that flushes TC L2 before a draw call (TC_L2_dirty). Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: use TC L2 for updating descriptors on CIK	Marek Olšák	2015-01-07	2	-5/+10
\| \| \| \| \| \|	This allows not flushing TC L2 on CIK later. Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: don't use TC L2 for updating descriptors on SI	Marek Olšák	2015-01-07	2	-2/+14
\| \| \| \| \| \| \| \| \| \| \| \|	It's causing problems, because we mix uncached CP DMA with cached WRITE_DATA when updating the same memory. The solution for SI is to use uncached access here, because CP DMA doesn't support cached access. CIK will be handled in the next patch. Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: only flush the right set of caches for CP DMA operations	Marek Olšák	2015-01-07	9	-34/+48
\| \| \| \| \| \| \| \|	That's either framebuffer caches or caches for shader resources. The motivation is that framebuffer caches need to be flushed very rarely here. Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: implement separate ICACHE and KCACHE flush for SI	Marek Olšák	2015-01-07	1	-9/+17
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: add a combined flag for flushing a framebuffer	Marek Olšák	2015-01-07	3	-20/+10
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: rename flush flags, split the TC flag into L1 and L2	Marek Olšák	2015-01-07	7	-91/+109
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	r600g,radeonsi: separate cache flush flags	Marek Olšák	2015-01-07	5	-26/+39
\| \| \| \| \| \|	I will rename them for radeonsi. Reviewed-by: Michel Dänzer <[email protected]>
*	r600g: move r6xx-specific streamout flush flagging into r600g	Marek Olšák	2015-01-07	2	-9/+7
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: only set BC_OPTIMIZE_DISABLE when necessary	Marek Olšák	2015-01-07	2	-6/+15
\| \| \| \| \| \|	SPI_PS_IN_CONTROL is moved into the SPI mapping state. Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: do not define FACE as an ordinary PS input	Marek Olšák	2015-01-07	1	-1/+2
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: remove flatshade from the shader key	Marek Olšák	2015-01-07	3	-7/+7
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: remove special handling of TGSI_INTERPOLATE_COLOR in shader codegen	Marek Olšák	2015-01-07	1	-6/+10
\| \| \| \| \| \| \| \|	It doesn't do anything useful. And colors are floating-point, so we can use fs.interp, remove "flatshade" from the shader key, and rely on the FLAT_SHADE state only (in the next patch). Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: implement VERTEXID_NOBASE and BASEVERTEX system values	Marek Olšák	2015-01-07	1	-0/+10
\| \| \| \| \| \| \| \|	Only done for completeness. Not used by anything yet. Tested by advertising PIPE_CAP_VERTEXID_NOBASE. Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: fix VertexID for OpenGL	Marek Olšák	2015-01-07	1	-2/+5
\| \| \| \| \| \| \|	This fixes all failing piglit VertexID tests. Cc: 10.4 <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: clarify a hw bug in shader exports	Marek Olšák	2015-01-07	1	-5/+10
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: use ordered compares for SSG and face selection	Marek Olšák	2015-01-07	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	Ordered compares are what you have in C. Unordered compares are the result of negating ordered compares (they return true if either argument is NaN). That special NaN behavior is completely useless here, and unordered compares produce horrible code with all stable LLVM versions. (I think that has been fixed in LLVM git) Reviewed-by: Michel Dänzer <[email protected]>
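The difference between ordered and unordered compares described above can be shown in plain C. This is only an illustrative sketch: the function names are hypothetical and this is not the driver's LLVM codegen. It assumes IEEE-754 semantics, i.e. the code is not built with fast-math style options.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Ordered "less than": the ordinary C comparison.
 * Evaluates to false whenever either operand is NaN. */
static bool
ordered_lt(float a, float b)
{
   return a < b;
}

/* Unordered "less than": the negation of the ordered ">=".
 * Evaluates to true whenever either operand is NaN. */
static bool
unordered_lt(float a, float b)
{
   return !(a >= b);
}

/* Sign function written with ordered compares only.
 * A NaN input fails both tests and falls through to 0. */
static float
sign_ordered(float x)
{
   if (x > 0.0f)
      return 1.0f;
   if (x < 0.0f)
      return -1.0f;
   return 0.0f;
}
```

For non-NaN inputs the two predicates agree; they differ only when a NaN is involved, which is the case the commit message calls useless for SSG and face selection.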
*	radeonsi: remove unused and not useful variables	Marek Olšák	2015-01-07	3	-6/+1
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: remove init config from states	Marek Olšák	2015-01-07	6	-5/+4
\| \| \| \| \| \|	It really doesn't do anything there. Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: reduce the size of si_pm4_state	Marek Olšák	2015-01-07	2	-12/+3
\| \| \| \| \| \| \| \|	- the relocs array is unused, remove it - ndw is at most 115 (init), set 140 as the maximum - compute needs 4 buffers per state, graphics only needs 1; set 4 as the maximum Reviewed-by: Michel Dänzer <[email protected]>
*	tgsi: add uses_centroid into tgsi_shader_info	Marek Olšák	2015-01-07	2	-0/+4