mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	mesa: Add mesa SHA-1 functions	Carl Worth	2015-01-16	5	-0/+504
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The upcoming shader cache uses the SHA-1 algorithm for cryptographic naming. These new mesa_sha1 functions are implemented with any one of several different cryptographic libraries. This code was copied from the xserver repository, (where it has apparently been functioning well on a variety of operating systems), and comes licensed with a license identical to that of Mesa. Bug fixes by José Fonseca <[email protected]>: Fix to put conditional assignment in Makefile.am, not Makefile.sources to avoid breaking scons build. Fix include file for CryptoAPI section. Fix missing cast in openssl section. Reviewed-by: Matt Turner <[email protected]>
*	configure: Add copyright and license block to configure.ac	Carl Worth	2015-01-16	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to copying in code from the xserver configure.ac file, it makes sense to have the license of this file clearly marked, (to show that it's licensed identically to the configure.ac file from the xserver repository). And since the text of the license refers to "the above copyright notice" it also makes sense to have an actual copyright attribution in place. I generated this list of names by looking at the output of: git shortlog -n --format=%aD -- configure.ac (and arbitrarily stopping for contributors with fewer than 15 commits). Then for each name, I looked for existing Copyright attributions in the mesa source tree with the same name, (and using "Intel Corporation" as the copyright holder where I knew that was appropriate).
*	glsl: Add unit tests for blob.c	Carl Worth	2015-01-16	3	-0/+328
\| \| \| \| \| \|	In addition to exercising all of the functions in blob.h, this includes a stress test that forces some reallocing, and also tests to verify the alignment and overrun-detection code in blob.c.
*	glsl: Add blob_overwrite_bytes and blob_overwrite_uint32	Tapani Pälli	2015-01-16	2	-0/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These functions are useful when serializing an unknown number of items to a blob. The caller can first save the current offset, write a placeholder uint32, write out (and count) the items, then use blob_overwrite_uint32 with the saved offset to replace the placeholder value. Then, when deserializing, the reader will first read the count and know how many subsequent items to expect. (I wrote this code after reading a very similar patch written by Tapani when he wrote serialization code for IR. Since I re-used the idea of his code so directly, I've credited him as the author of this code. --Carl) Reviewed-by: Jason Ekstrand <[email protected]>
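The placeholder-then-overwrite pattern described above can be sketched as follows. This is a minimal illustration only: the `sketch_writer` type, its fixed 64-byte buffer, and all `sketch_*` names are assumptions made for the example and are not the blob.h API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-capacity writer; the real blob grows its storage. */
struct sketch_writer {
   uint8_t data[64];
   size_t size;
};

/* Append a uint32 and return the offset it was written at. */
static size_t
sketch_write_uint32(struct sketch_writer *w, uint32_t value)
{
   size_t offset = w->size;
   assert(offset + sizeof(value) <= sizeof(w->data));
   memcpy(w->data + offset, &value, sizeof(value));
   w->size += sizeof(value);
   return offset;
}

/* Replace a previously written uint32 without changing the size. */
static void
sketch_overwrite_uint32(struct sketch_writer *w, size_t offset, uint32_t value)
{
   assert(offset + sizeof(value) <= w->size);
   memcpy(w->data + offset, &value, sizeof(value));
}

/* Serialize only the non-zero items, preceded by how many were written. */
static void
sketch_write_nonzero(struct sketch_writer *w, const uint32_t *items, size_t n)
{
   size_t count_offset = sketch_write_uint32(w, 0);   /* placeholder */
   uint32_t count = 0;

   for (size_t i = 0; i < n; i++) {
      if (items[i] != 0) {
         sketch_write_uint32(w, items[i]);
         count++;
      }
   }

   /* The count is only known now, so patch the placeholder. */
   sketch_overwrite_uint32(w, count_offset, count);
}
```

A reader of this stream would read the leading count first and then exactly that many items.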
*	glsl: Add blob.c---a simple interface for serializing data	Carl Worth	2015-01-16	3	-0/+548
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new interface allows for writing a series of objects to a chunk of memory (a "blob"). The allocated memory is maintained within the blob itself, (and re-allocated by doubling when necessary). There are also functions for reading objects from a blob. If code attempts to read beyond the available memory, the read functions return 0 values (or its moral equivalent) without reading past the allocated memory. Once the caller is done with the reads, it can check blob->overrun to determine whether any invalid values were previously returned due to attempts to read too far. Reviewed-by: Jason Ekstrand <[email protected]>
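The read-side contract (return 0 on a short read, record the failure in a sticky flag) might look roughly like this. The `sketch_reader` struct and `sketch_read_uint32` are hypothetical names for illustration; only the `overrun` idea comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader over a caller-owned buffer. */
struct sketch_reader {
   const uint8_t *data;
   size_t size;
   size_t offset;
   bool overrun;   /* sticky: set once any read would pass the end */
};

static uint32_t
sketch_read_uint32(struct sketch_reader *r)
{
   uint32_t value = 0;

   assert(r->offset <= r->size);

   /* Never touch memory past the end; hand back 0 and remember it. */
   if (r->overrun || r->size - r->offset < sizeof(value)) {
      r->overrun = true;
      return 0;
   }

   memcpy(&value, r->data + r->offset, sizeof(value));
   r->offset += sizeof(value);
   return value;
}
```

The caller can issue all of its reads unconditionally and test `overrun` once at the end, rather than checking after every call.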
*	mesa: Add iterate method for string_to_uint_map	Tapani Pälli	2015-01-16	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The upcoming shader cache needs this to be able to cache hash data from the gl_shader_program structure. Edited-by: Carl Worth <[email protected]>: There is an internal implementation detail that the hash table underlying the struct string_to_uint_map stores each value internally as (value+1). The user needn't be very concerned with this (other than knowing that a value of UINT_MAX cannot be stored) since put() adds 1 and get() subtracts 1. So in this commit, rather than call the user's function directly with hash_table_call_foreach, we call through a wrapper that fixes up the off-by-one values before the caller's callback sees them. And with this wrapper in place, we also give a better signature to the callback function being passed to iterate(), so that this callback function can actually expect a char* and an unsigned argument, (rather than a couple of void* ). Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
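The off-by-one wrapper can be illustrated with a toy map. This sketch assumes a flat array instead of a real hash table, does not handle duplicate keys, and uses invented `sketch_*` names throughout; it only demonstrates that values are stored as (value+1) and fixed up before the typed callback sees them.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_MAP_SLOTS 8

/* Hypothetical stand-in for the underlying table. */
struct sketch_map {
   const char *keys[SKETCH_MAP_SLOTS];
   unsigned stored[SKETCH_MAP_SLOTS];   /* holds value + 1 */
   size_t count;
};

/* Typed callback: a string key and an unsigned value, not two void*. */
typedef void (*sketch_map_cb)(const char *key, unsigned value, void *closure);

static void
sketch_map_put(struct sketch_map *m, const char *key, unsigned value)
{
   assert(value != UINT_MAX);            /* value + 1 would wrap to 0 */
   assert(m->count < SKETCH_MAP_SLOTS);
   m->keys[m->count] = key;
   m->stored[m->count] = value + 1;
   m->count++;
}

/* Undo the +1 here so the caller's callback never sees the raw storage. */
static void
sketch_map_iterate(const struct sketch_map *m, sketch_map_cb cb, void *closure)
{
   for (size_t i = 0; i < m->count; i++)
      cb(m->keys[i], m->stored[i] - 1, closure);
}

/* Example callback: sum all values into an unsigned accumulator. */
static void
sketch_sum_cb(const char *key, unsigned value, void *closure)
{
   (void)key;
   *(unsigned *)closure += value;
}
```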
*	util: Make unreachable at least be an assert	Carl Worth	2015-01-16	1	-1/+1
\| \| \| \| \| \| \| \|	Previously, if __builtin_unreachable() was unavailable, the unreachable macro was defined to do nothing. We do better here, by at least still making it an assert. Reviewed-by: Ian Romanick <[email protected]>
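A sketch of the idea, assuming a compiler check on `__GNUC__` in place of whatever configure-time test the tree actually uses; the macro name `SKETCH_UNREACHABLE` and the helper function are made up for the example.

```c
#include <assert.h>

#if defined(__GNUC__)
/* Assert in debug builds, and also give the optimizer the hint. */
#define SKETCH_UNREACHABLE(str)   \
   do {                           \
      assert(!str);               \
      __builtin_unreachable();    \
   } while (0)
#else
/* No builtin available: still fail loudly in debug builds. */
#define SKETCH_UNREACHABLE(str) assert(!str)
#endif

/* Example use: every valid input returns before the macro is reached. */
static int
sketch_sign_char(int sign)
{
   switch (sign) {
   case -1: return '-';
   case 0:  return '0';
   case 1:  return '+';
   }
   SKETCH_UNREACHABLE("invalid sign");
   return 0;
}
```

Because `!str` is always false for a string literal, the assert fires whenever control reaches the macro, and the message text shows up in the failure output.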
*	glsl: Add convenience function get_sampler_instance	Carl Worth	2015-01-16	2	-0/+120
\| \| \| \| \| \| \| \| \| \|	This is similar to the existing functions get_instance, get_array_instance, etc. for getting a type singleton. The new get_sampler_instance() function will be used by the upcoming shader cache. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Fix some oddities in FB_WRITE register width and execution size.	Kenneth Graunke	2015-01-16	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we generated this for FB writes in SIMD16 mode: load_payload(16) vgrf5@8+0.0:F, vgrf1:F, vgrf2:F, vgrf3:F, vgrf4:F fb_write(8) (null):UD, vgrf5@8+0.0:F 1sthalf The LOAD_PAYLOAD's destination had its register width set to 8, and the FB_WRITE had its execution size set to 8. This seems wrong, and while it probably doesn't affect anything, we should fix it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/fs: Make lower_load_payload etc. appear in INTEL_DEBUG=optimizer.	Kenneth Graunke	2015-01-16	1	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to support calling lower_load_payload() inside a condition, this patch makes OPT() a statement expression: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html We recently did the equivalent change in the vec4 backend (commit 9b8bd67768769b685c25e1276e053505aede5f93). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	format_utils: Use a more precise conversion when decreasing bits	Neil Roberts	2015-01-16	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When converting to a format that has fewer bits the previous code was just shifting off the bits. This doesn't provide very accurate results. For example when converting from 8 bits to 5 bits it is equivalent to doing this: x * 32 / 256 This works as if it's taking a value from a range where 256 represents 1.0 and scaling it down to a range where 32 represents 1.0. However this is not correct because it is actually 255 and 31 that represent 1.0. We can do better with a formula like this: (x * 31 + 127) / 255 The +127 is to make it round correctly. The new code has a special case to use uint64_t when the result of the multiplication would overflow an unsigned int. This function is inline and only ever called with constant values so hopefully the if statements will be folded. The main incentive to do this is to make the CPU conversion path pick the same values as the hardware would if it did the conversion. This fixes failures with the ‘texsubimage pbo’ test when using the patches from here: http://lists.freedesktop.org/archives/mesa-dev/2015-January/074312.html v2: Use 64-bit arithmetic when src_bits+dst_bits > 32 Reviewed-by: Jason Ekstrand <[email protected]>
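The two formulas above can be compared directly. In this sketch the function names are invented, and the rescale always uses 64-bit arithmetic for simplicity, whereas the commit only switches to uint64_t when the product would overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: shift off the low bits (8 -> 5 bits is x * 32 / 256). */
static inline unsigned
sketch_unorm_shift_down(unsigned x, unsigned src_bits, unsigned dst_bits)
{
   assert(dst_bits <= src_bits);
   return x >> (src_bits - dst_bits);
}

/* Rounded rescale: (x * dst_max + src_max / 2) / src_max.
 * For 8 -> 5 bits this is (x * 31 + 127) / 255.
 */
static inline unsigned
sketch_unorm_rescale_down(unsigned x, unsigned src_bits, unsigned dst_bits)
{
   assert(dst_bits >= 1 && dst_bits <= src_bits && src_bits <= 32);
   const uint64_t src_max = (UINT64_C(1) << src_bits) - 1;
   const uint64_t dst_max = (UINT64_C(1) << dst_bits) - 1;
   return (unsigned)(((uint64_t)x * dst_max + src_max / 2) / src_max);
}
```

For example, x = 248 is 248/255 ≈ 0.973 of full scale, which is about 30.15 on a 0..31 scale: the rescale gives 30, while the shift gives 31. At the low end, x = 7 gives 1 with the rescale and 0 with the shift.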
*	i965/gen6: Fix crash with VS+TF after rendering with GS	Iago Toral Quiroga	2015-01-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rendering with a GS and then using transform feedback with a program that does not have a GS can crash in gen6. The reason for this is that brw_begin_transform_feedback checks brw->geometry_program to decide if there is a GS program, but this is not correct: brw->geometry_program is updated when issuing drawing commands, so after rendering with a GS it will be non-NULL until we draw again with a program that does not have a GS. If the next program uses TF, we will call glBeginTransformFeedback before issuing the drawing command and hence brw->geometry_program will be non-NULL if the previous rendering used a GS. The right thing to do here is to check ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY] instead. This is what the gen7 code path does too. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=87694 Reviewed-by: Tapani Pälli <[email protected]>
*	nir/live_variables: Use a worklist	Jason Ekstrand	2015-01-15	1	-55/+75
\| \| \| \| \| \| \| \| \| \|	This is a rework of the liveness algorithm using a worklist as suggested by Connor. Doing so reduces the number of times we walk over the instructions because we don't have to do an entire pointless walk over the instructions just to figure out it's time to stop. Also, the stuff after the last loop in the function will only ever get visited once. Reviewed-by: Connor Abbott <[email protected]>
*	nir: Add a worklist helper structure	Jason Ekstrand	2015-01-15	3	-0/+237
\| \| \| \| \| \| \|	A worklist is a common concept in optimizations. This adds a structure that we can reuse for many different types of optimizations. Reviewed-by: Connor Abbott <[email protected]>
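One common shape for such a helper is a FIFO of block indices plus a "present" flag per block, so that a block already queued is not queued twice. The sketch below assumes that shape; the fixed capacity and every `sketch_*` name are illustrative and not taken from the NIR sources.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_BLOCKS 16

/* Hypothetical worklist of block indices in [0, SKETCH_MAX_BLOCKS). */
struct sketch_worklist {
   unsigned ring[SKETCH_MAX_BLOCKS];   /* circular FIFO of indices */
   bool present[SKETCH_MAX_BLOCKS];    /* is this block queued? */
   unsigned start;
   unsigned count;
};

/* Queue a block unless it is already waiting to be processed. */
static void
sketch_worklist_push(struct sketch_worklist *w, unsigned block)
{
   assert(block < SKETCH_MAX_BLOCKS);
   if (w->present[block])
      return;

   /* count never exceeds the capacity because entries are unique. */
   w->ring[(w->start + w->count) % SKETCH_MAX_BLOCKS] = block;
   w->count++;
   w->present[block] = true;
}

/* Remove and return the oldest queued block. */
static unsigned
sketch_worklist_pop(struct sketch_worklist *w)
{
   assert(w->count > 0);
   unsigned block = w->ring[w->start];
   w->start = (w->start + 1) % SKETCH_MAX_BLOCKS;
   w->count--;
   w->present[block] = false;
   return block;
}
```

A dataflow pass would seed the list with every block, then loop while `count` is non-zero, pushing a block's neighbours only when that block's result changed.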
*	nir: fix incorrect argument passed to validate_src() in validate_tex_instr()	Brian Paul	2015-01-15	1	-1/+1
\| \| \| \| \| \|	Silences a compiler warning. Reviewed-by: Connor Abbott <[email protected]>
*	nir: silence compiler warning from visit_src() call	Brian Paul	2015-01-15	1	-1/+1
\| \| \| \| \| \|	v2: use proper argument Reviewed-by: Connor Abbott <[email protected]>
*	mesa: move GET_CURRENT_CONTEXT() to top of _mesa_init_renderbuffer()	Brian Paul	2015-01-15	1	-1/+2
\| \| \| \| \| \|	To fix MSVC build. Reviewed-by: Matt Turner <[email protected]>
*	mesa: Fix render buffer initial internal format in GLES 3	Mike Mason	2015-01-15	1	-1/+18
\| \| \| \| \| \| \| \| \| \|	Changes the initial internal format of a render buffer to GL_RGBA4 in GLES 3. This fixes a failure in the following DrawElements test: dEQP-GLES3.functional.state_query.rbo.renderbuffer_internal_format Reviewed-by: Chad Versace <[email protected]>
*	util/hash_set: Rework the API to know about hashing	Jason Ekstrand	2015-01-15	15	-132/+145
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, the set API required the user to do all of the hashing of keys as it passed them in. Since the hashing function is intrinsically tied to the comparison function, it makes sense for the hash set to know about it. Also, it makes for a somewhat clumsy API as the user is constantly calling hashing functions many of which have long names. This is especially bad when the standard call looks something like _mesa_set_add(ht, _mesa_pointer_hash(key), key); In the above case, there is no reason why the hash set shouldn't do the hashing for you. We leave the option for you to do your own hashing if it's more efficient, but it's no longer needed. Also, if you do do your own hashing, the hash set will assert that your hash matches what it expects out of the hashing function. This should make it harder to mess up your hashing. This is analogous to 94303a0750 where we did this for hash_table. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	util: Move main/set to util/hash_set	Jason Ekstrand	2015-01-15	9	-9/+8
\| \| \| \| \|	Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	hash_table: Rename insert_with_hash to insert_pre_hashed	Jason Ekstrand	2015-01-15	5	-10/+10
\| \| \| \| \| \| \|	We already have search_pre_hashed. This makes the APIs match better. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Don't consider null dst instructions as matching non-null dst.	Matt Turner	2015-01-15	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When performing common subexpression elimination on instructions with non-null destinations we emit a MOV to copy the result to a new register that must have no other uses. In the case of: cmp.g.f0.0(8) null:D, vgrf43:F, 0.500000f ... cmp.g.f0.0(8) vgrf113:D, vgrf43:F, 0.500000f we put the first instruction in the AEB and decided that we could reuse its result when we found the second. Unfortunately, that meant that we'd emit a MOV from the first's destination, which is null. Don't do anything if the entry's destination is null and the instruction's destination is non-null. Tested-by: Tapani Pälli <[email protected]>
*	i965/vec4: Make sure that imm writes are to registers in the same file.	Matt Turner	2015-01-15	1	-2/+8
\| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87887
*	i965/fs: Emit MADs from (x + abs(y * z)).	Matt Turner	2015-01-15	1	-3/+15
\| \| \| \| \| \| \| \| \|	Just use the abs source modifier on both of the multiplicand arguments. instructions in affected programs: 300 -> 296 (-1.33%) Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/fs: Emit MADs from (x + -(y * z)).	Matt Turner	2015-01-15	1	-0/+12
\| \| \| \| \| \| \| \| \| \|	Just use the negation source modifier on one of the multiplicand arguments. total instructions in shared programs: 5889529 -> 5880016 (-0.16%) instructions in affected programs: 600846 -> 591333 (-1.58%) Reviewed-by: Kristian Høgsberg <[email protected]>
*	nir/algebraic: Only replace an instruction once	Jason Ekstrand	2015-01-15	1	-1/+3
\| \| \| \| \| \| \| \| \|	Without the break, it was possible that an instruction would match multiple expressions. If this happened, you could end up trying to replace it multiple times and get a segfault. This makes it so that, after a successful replacement, it moves on to the next instruction. Reviewed-by: Connor Abbott <[email protected]>
*	i965/nir: Do a final copy lowering pass before lowering locals to regs	Jason Ekstrand	2015-01-15	1	-0/+3
\| \| \| \|	Reviewed-by: Connor Abbott <[email protected]>
*	nir/vars_to_ssa: Use the copy lowering from lower_var_copies	Jason Ekstrand	2015-01-15	1	-152/+46
\| \| \| \|	Reviewed-by: Connor Abbott <[email protected]>
*	nir: Add a pass for lowering copy instructions	Jason Ekstrand	2015-01-15	3	-0/+227
\| \| \| \|	Reviewed-by: Connor Abbott <[email protected]>
*	nir/vars_to_ssa: Refactor get_deref_node	Jason Ekstrand	2015-01-15	1	-20/+25
\| \| \| \| \| \| \| \|	This refactor allows you to more easily get the deref node associated with a given variable. We then use that new functionality in the deref_may_be_aliased function instead of creating a 1-element deref chain. Reviewed-by: Connor Abbott <[email protected]>
*	nir: Rename lower_variables to lower_vars_to_ssa	Jason Ekstrand	2015-01-15	4	-6/+6
\| \| \| \| \| \| \| \|	The original name wasn't particularly descriptive. This one indicates that it actually gives you SSA values as opposed to the old pass which lowered variables to registers. Reviewed-by: Connor Abbott <[email protected]>
*	nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array	Jason Ekstrand	2015-01-15	7	-42/+50
\| \| \| \| \| \| \| \|	This solves a number of problems. First is the ability to change the number of sources that a texture instruction has. Second, it solves the dilemma that may occur if a texture instruction has more than 4 sources. Reviewed-by: Connor Abbott <[email protected]>
*	nir/validate: Only build in debug mode	Jason Ekstrand	2015-01-15	2	-0/+11
\| \| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
*	nir/lower_variables: Improve documentation	Jason Ekstrand	2015-01-15	1	-27/+79
\| \| \| \| \| \| \| \|	Additional description was added to a variety of places. Also, we no longer use the term "leaf" to describe fully-qualified direct derefs. Instead, we simply use the term "direct" or spell it out completely. Reviewed-by: Connor Abbott <[email protected]>
*	nir/lower_variables: Use a for loop for get_deref_node	Jason Ekstrand	2015-01-15	1	-58/+48
\| \| \| \|	Reviewed-by: Connor Abbott <[email protected]>
*	nir: Use the actual FNV-1a hash for hashing derefs	Jason Ekstrand	2015-01-15	2	-90/+79
\| \| \| \| \| \|	We also switch to using loops rather than recursion. Reviewed-by: Connor Abbott <[email protected]>
*	util/hash_table: Pull the details of the FNV-1a into helpers	Jason Ekstrand	2015-01-15	2	-13/+23
\| \| \| \| \| \| \|	This way the basics of the FNV-1a hash can be reused to easily create other hashing functions. Reviewed-by: Eric Anholt <[email protected]>
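The reusable core of 32-bit FNV-1a is small: start from the standard offset basis (2166136261) and, for each byte, XOR it in and multiply by the FNV prime (16777619). The helper below is a sketch with invented names, showing how a caller can chain several fields into one hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV offset basis; the name is hypothetical. */
static const uint32_t sketch_fnv32_offset_basis = 2166136261u;

/* Fold 'size' bytes into a running FNV-1a hash and return the result. */
static inline uint32_t
sketch_fnv32_1a_accumulate(uint32_t hash, const void *data, size_t size)
{
   const uint8_t *bytes = data;

   while (size-- != 0) {
      hash ^= *bytes++;       /* FNV-1a: XOR first ... */
      hash *= 16777619u;      /* ... then multiply by the FNV prime */
   }
   return hash;
}
```

Hashing "foo" and then "bar" through two chained calls gives the same result as hashing "foobar" in one call, which is what lets other hashing functions be built from the same helper.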
*	nir: Make intrinsic flags into an enum	Jason Ekstrand	2015-01-15	1	-14/+14
\| \| \| \| \| \| \| \|	This should be much better for debugging as GDB will pick up on the fact that it's an enum and actually tell you what you're looking at instead of giving you some arbitrary hex value you have to go look up. Reviewed-by: Connor Abbott <[email protected]>
*	nir: Use static inlines instead of macros for list getters	Jason Ekstrand	2015-01-15	1	-28/+81
\| \| \| \| \| \| \| \|	This should make debugging a lot easier as GDB handles static inlines much better than macros. Also, static inlines are typesafe. Reviewed-By: Glenn Kennard <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
*	nir/variable: Remove the constant_value field	Jason Ekstrand	2015-01-15	2	-16/+4
\| \| \| \| \| \| \| \| \|	This was a left-over relic of GLSL IR that we aren't using for anything. If we ever want that value again, we can add it back, but NIR constant folding should be just as good as GLSL IR's if not better pretty soon, so I'm not worried about it. Reviewed-by: Connor Abbott <[email protected]>
*	nir: Add some documentation	Jason Ekstrand	2015-01-15	1	-22/+69
\| \| \| \| \|	Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
*	nir/lower_variables: Follow the Cytron paper more closely	Jason Ekstrand	2015-01-15	1	-26/+69
\| \| \| \| \| \| \| \| \| \|	Previously, our variable renaming algorithm, while similar to the one in the Cytron paper, was not the same. While I'm pretty sure it was correct, it will be easier for readers of the code in the variable renaming pass if it follows more closely. This commit removes the automatic stack popping we were doing and replaces it with explicit popping like Cytron does. Reviewed-by: Connor Abbott <[email protected]>
*	nir/print: Various cleanups recommended by Eric	Jason Ekstrand	2015-01-15	1	-33/+12
\| \| \| \| \|	Cc: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
*	nir/lower_variables: Add a bunch of comments and re-arrange a few things	Jason Ekstrand	2015-01-15	1	-57/+170
\| \| \| \| \| \| \| \|	This commit seeks to make the lower_variables pass much more clear by adding a pile of comments and re-arranging a few things. There are no functional or algorithmic changes. Reviewed-by: Connor Abbott <[email protected]>
*	nir: Rename parallel_copy_copy to parallel_copy_entry and add a foreach macro	Jason Ekstrand	2015-01-15	4	-46/+55
\| \| \| \| \| \| \| \| \| \|	parallel_copy_copy was a silly name. Also, things were getting long and annoying, so I added a foreach macro. For historical reasons, several of the original iterations over parallel copy entries in from_ssa used the _safe variants of the loop. However, all of these no longer ever remove an entry so it's ok to make them all use the normal iterator. Reviewed-by: Connor Abbott <[email protected]>
*	nir/from_ssa: Clean up parallel copy handling and document it better	Jason Ekstrand	2015-01-15	3	-66/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we were doing a lazy creation of the parallel copy instructions. This is confusing, hard to get right, and involves some extra state tracking of the copies. This commit adds an extra walk over the basic blocks to add the block-end parallel copies up front. This should be much less confusing and, consequently, easier to get right. This commit also adds more comments about parallel copies to help explain what all is going on. As a consequence of these changes, we can now remove the at_end parameter from nir_parallel_copy_instr. Reviewed-by: Connor Abbott <[email protected]>
*	nir: Rename nir_block_following_if to nir_block_get_following_if	Jason Ekstrand	2015-01-15	5	-5/+5
\| \| \| \| \| \|	The new name is a little longer but less confusing. Reviewed-by: Connor Abbott <[email protected]>
*	i965/fs_nir: Handle sample ID, position, and mask better	Jason Ekstrand	2015-01-15	2	-12/+71
\| \| \| \| \| \| \| \| \| \|	Before, we were emitting the full pile of setup instructions for sample_id and sample_pos every time they were used. With this commit, we emit them in their own pass once at the beginning of the shader and simply emit uses later on. When it comes time for setting up VS, we can put setup for its special values in the same pass. Reviewed-by: Connor Abbott <[email protected]>
*	nir/opcodes: Remove the per_component info field	Jason Ekstrand	2015-01-15	3	-37/+33
\| \| \| \| \| \| \| \| \| \| \|	Originally, this field was intended for determining if the given instruction acted per-component or if it had mismatching source and destination sizes that would have to be interpreted specially. However, we can easily derive this from output_size == 0, so it's not really that useful. Also, the values we were setting in nir_opcodes.h for this field were completely bogus and it was never used. Reviewed-by: Connor Abbott <[email protected]>
*	nir/search: Use nir_op_infos to determine if an operation is commutative	Jason Ekstrand	2015-01-15	1	-33/+2
\| \| \| \| \| \| \|	Prior to this commit, we had a big switch statement for this. Now it's baked into the opcode metadata so we can just use that. Reviewed-by: Connor Abbott <[email protected]>