path: root/src/glsl
Commit message (Author, Age, Files, Lines)
...
* glsl: Add a foreach_in_list_reverse_safe macro. (Matt Turner, 2015-01-23, 1 file, -0/+6)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Expose nir_print_instr() for debug prints (Eric Anholt, 2015-01-23, 2 files, -2/+8)
| | | | | | | | | It's nice to have this present in your default cases so you can see what instruction is triggering an abort. v2: Just pass a NULL state, now that it won't crash when you do. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: When asked to print with a NULL state, just use bare variable names. (Eric Anholt, 2015-01-23, 1 file, -6/+16)
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add nir_lower_alu_to_scalar. (Eric Anholt, 2015-01-23, 3 files, -0/+188)
| | | | | | | | | | | | | | | | This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted for vc4. v2: Use the nir_src_for_ssa() helper, and another instance of nir_alu_src_copy(). v3: Drop the non-SSA support. All intended callers will have SSA-only ALU ops. v4: Use insert_before, drop stale bcsel/fcsel comment, drop now-unused unsupported() function, drop lower_context struct. v5: Completely rename the pass to nir_lower_alu_to_scalar(), add an assert about weird input_sizes[]. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Make some helpers for copying ALU src/dests. (Eric Anholt, 2015-01-23, 4 files, -9/+25)
| | | | | | | | | There aren't many users yet, but I wanted to do this from my scalarizing pass. v2: Constify the src arguments. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add algebraic optimizations for division and reciprocal. (Kenneth Graunke, 2015-01-23, 1 file, -0/+5)
| | | | | | | | | | | | | | | | | These also exist in opt_algebraic.cpp. total NIR instructions in shared programs: 2011430 -> 2011211 (-0.01%) NIR instructions in affected programs: 42221 -> 42002 (-0.52%) helped: 198 total i965 instructions in shared programs: 6020553 -> 6020116 (-0.01%) i965 instructions in affected programs: 84322 -> 83885 (-0.52%) helped: 394 HURT: 1 (by 1 instruction) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Add algebraic optimizations for exponential/logarithmic functions. (Kenneth Graunke, 2015-01-23, 1 file, -0/+10)
| | | | | | | | | | | | | | | | | | Most of these exist in the GLSL IR algebraic pass already. However, SSA allows us to find more instances of the patterns. total NIR instructions in shared programs: 2015593 -> 2011430 (-0.21%) NIR instructions in affected programs: 124189 -> 120026 (-3.35%) helped: 604 total i965 instructions in shared programs: 6025505 -> 6018717 (-0.11%) i965 instructions in affected programs: 261295 -> 254507 (-2.60%) helped: 1295 HURT: 3 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Add algebraic optimizations for simplifying comparisons. (Kenneth Graunke, 2015-01-23, 1 file, -0/+9)
| | | | | | | | | | | | | | | | | | | | | | | | | | The first batch removes bonus fnot/inot operations, possibly allowing other optimizations to better recognize patterns. The next batch replaces a fadd and constant 0.0 with an fneg - negation is usually free on GPUs, while addition is not. total NIR instructions in shared programs: 2020814 -> 2015593 (-0.26%) NIR instructions in affected programs: 411143 -> 405922 (-1.27%) helped: 2233 HURT: 214 A few shaders are hurt by a few instructions due to moving neg such that it has a constant operand, which is then folded, resulting in two distinct load_consts for x and -x. We can always clean that up later. total i965 instructions in shared programs: 6035392 -> 6025505 (-0.16%) i965 instructions in affected programs: 784980 -> 775093 (-1.26%) helped: 4508 HURT: 2 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Add algebraic optimizations for pointless shifts. (Kenneth Graunke, 2015-01-23, 1 file, -0/+7)
| | | | | | | | | | | | | | | | | The GLSL IR optimization pass contained these; we may as well include them too. v2: Fix a >> 0 and a << 0 optimizations (caught by Matt). No change in the number of NIR instructions on a shader-db run. total i965 instructions in shared programs: 6035397 -> 6035392 (-0.00%) i965 instructions in affected programs: 542 -> 537 (-0.92%) helped: 2 (in glamor) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Add a bunch of algebraic optimizations on logic/bit operations. (Kenneth Graunke, 2015-01-23, 1 file, -0/+13)
| | | | | | | | | | | | | | | | | | | | Matt and I noticed a bunch of "val <- ior a a" operations in a shader, so we decided to add an algebraic optimization for that. While there, I decided to add a bunch more of them. v2: Delete bogus fand/for optimizations (caught by Jason). total NIR instructions in shared programs: 2023511 -> 2020814 (-0.13%) NIR instructions in affected programs: 149634 -> 146937 (-1.80%) helped: 1032 total i965 instructions in shared programs: 6035392 -> 6035397 (0.00%) i965 instructions in affected programs: 537 -> 542 (0.93%) HURT: 2 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
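The "ior a a" rule mentioned above can be sketched in C as follows. This is only an illustration: the opcode enum, the struct, and the function name are hypothetical, and the real rules are written as patterns in the algebraic pass rather than as hand-coded checks like this.

```c
#include <assert.h>

/* Hypothetical opcodes and instruction layout, for illustration only. */
enum sketch_op { SKETCH_IOR, SKETCH_IADD };

struct sketch_alu {
   enum sketch_op op;
   int src0, src1;   /* SSA value ids of the two sources */
};

/* Returns the SSA id that can stand in for the instruction's result,
 * or -1 when no rule applies. */
int sketch_simplify(const struct sketch_alu *instr)
{
   assert(instr != 0);

   /* ior a a -> a: OR-ing a value with itself yields the value. */
   if (instr->op == SKETCH_IOR && instr->src0 == instr->src1)
      return instr->src0;

   return -1;
}
```

Comparing SSA ids is enough here because, in SSA form, equal ids always denote the same value.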
* nir: Implement CSE on intrinsics that can be eliminated and reordered. (Kenneth Graunke, 2015-01-23, 1 file, -2/+38)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Matt and I noticed that one of the shaders hurt by INTEL_USE_NIR=1 had load_input and load_uniform intrinsics repeated several times, with the same parameters, but each one generating a distinct SSA value. This made ALU operations on those values appear distinct as well. Generating distinct SSA values is silly - these are read only variables. CSE'ing them makes everything use a single SSA value, which then allows other operations to be CSE'd away as well. Generalizing a bit, it seems like we should be able to safely CSE any intrinsics that can be eliminated and reordered. I didn't implement support for variables for the time being. v2: Assert that info->num_variables == 0 (requested by Jason). total NIR instructions in shared programs: 2435936 -> 2023511 (-16.93%) NIR instructions in affected programs: 2413496 -> 2001071 (-17.09%) helped: 16872 total i965 instructions in shared programs: 6028987 -> 6008427 (-0.34%) i965 instructions in affected programs: 640654 -> 620094 (-3.21%) helped: 2071 HURT: 585 GAINED: 14 LOST: 25 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
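A minimal sketch of the idea in that message, assuming a flat array of intrinsics and hypothetical flag and field names (none of these identifiers come from the commit): an intrinsic is a CSE candidate only when it can be both eliminated and reordered, and a later duplicate simply reuses the earlier SSA value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags; the real ones live in the NIR intrinsic table. */
enum sketch_flags {
   SKETCH_CAN_ELIMINATE = 1 << 0,
   SKETCH_CAN_REORDER   = 1 << 1,
};

struct sketch_intrinsic {
   int op;            /* opcode, e.g. a load_uniform-like read */
   int const_index;   /* which input or uniform is read */
   unsigned flags;
   int def;           /* SSA value produced; rewritten by the pass */
};

bool sketch_can_cse(const struct sketch_intrinsic *instr)
{
   unsigned needed = SKETCH_CAN_ELIMINATE | SKETCH_CAN_REORDER;
   return (instr->flags & needed) == needed;
}

/* Point every duplicate at the first equivalent intrinsic's SSA value.
 * Returns the number of duplicates found. */
unsigned sketch_cse(struct sketch_intrinsic *instrs, size_t count)
{
   unsigned merged = 0;

   for (size_t i = 0; i < count; i++) {
      if (!sketch_can_cse(&instrs[i]))
         continue;
      for (size_t j = 0; j < i; j++) {
         if (sketch_can_cse(&instrs[j]) &&
             instrs[j].op == instrs[i].op &&
             instrs[j].const_index == instrs[i].const_index) {
            instrs[i].def = instrs[j].def;
            merged++;
            break;
         }
      }
   }
   return merged;
}
```

Once duplicates share one SSA value, ALU operations that consume them become textually identical too, which is what lets a later CSE round remove those as well.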
* nir: Pull nir_instr_can_cse()'s SSA checks out of the switch. (Kenneth Graunke, 2015-01-23, 1 file, -2/+6)
| | | | | | | | | | | This should not be a change in behavior, as all current cases that potentially answer "yes" require SSA. The next patch will introduce another case that requires SSA. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Build a libglsl_util library. (Matt Turner, 2015-01-23, 1 file, -16/+7)
| | | | | Rather than sourcing files with ../dir/file.c which leads to distclean wiping out ../dir's .deps directory.
* glsl: Build with subdir-objects. (Matt Turner, 2015-01-23, 3 files, -190/+188)
| | | | | | Apparently $(top_srcdir) is not expanded in a source list when using subdir-objects, so remove that. It's not clear to me why we were going to such lengths to prefix each source file anyway.
* nir: Add headers to distribution. (Matt Turner, 2015-01-23, 1 file, -0/+2)
|
* nir: Add nir_{opt_,}algebraic.py to distribution. (Matt Turner, 2015-01-23, 1 file, -0/+2)
|
* nir: add generated file to .gitignore (Connor Abbott, 2015-01-23, 1 file, -0/+1)
| | | | | Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: fix stale comment (Connor Abbott, 2015-01-23, 1 file, -5/+4)
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* nir: Fix setup of constant bool initializers. (Eric Anholt, 2015-01-22, 1 file, -1/+1)
| | | | | | | | | | brw_fs_nir has only seen scalar bools so far, thanks to vector splitting, and the ralloc in glsl_to_nir.cpp will *usually* get you a 0-filled chunk of memory, so reading too large of a value will usually get you the right bool value. But once we start doing vector bools in a few commits, we end up getting bad values. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Make an easier helper for setting up SSA defs. (Eric Anholt, 2015-01-22, 13 files, -66/+46)
| | | | | | | | Almost all instructions we nir_ssa_def_init() for are nir_dests, and you have to keep from forgetting to set is_ssa when you do. Just provide the simpler helper, instead. Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Link glsl_test with pthreads library. (Jonathan Gray, 2015-01-22, 1 file, -1/+3)
| | | | | | | | | | Otherwise pthread_mutex_lock will be an undefined reference on OpenBSD. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219 Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Cc: "10.4 10.3" <[email protected]>
* glsl: do not allow interface block to have name already taken (Tapani Pälli, 2015-01-22, 1 file, -1/+15)
| | | | | | | | | | Fixes currently failing Piglit case interface-blocks-name-reused-globally.vert v2: combine var declaration with assignment (Ian) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: Replace assert(0) with unreachable(). (Matt Turner, 2015-01-21, 4 files, -43/+22)
| | | | | | Fixes a couple of warnings in the process. Reviewed-by: Connor Abbott <[email protected]>
* nir: Stop using designated initializers (Jason Ekstrand, 2015-01-21, 10 files, -105/+47)
| | | | | | | | | Designated initializers with anonymous unions don't work in MSVC or GCC < 4.6. With a couple of constructor methods, we don't need them any more and the code is actually cleaner. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88467 Reviewed-by: Connor Abbott <[email protected]>
* nir: Add src and dest constructors (Jason Ekstrand, 2015-01-21, 1 file, -0/+37)
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a nir_foreach_phi_src helper macro (Jason Ekstrand, 2015-01-20, 9 files, -11/+14)
| | | | Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* mesa: Add ARB_shader_precision infrastructure (Micah Fedke, 2015-01-19, 3 files, -0/+6)
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* nir: s/malloc.h/stdlib.h/ (Vinson Lee, 2015-01-16, 1 file, -1/+1)
| | | | | | | | | | | Fix build error on Mac OS X. CC nir_to_ssa.lo nir_to_ssa.c:29:10: fatal error: 'malloc.h' file not found ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88478 Signed-off-by: Vinson Lee <[email protected]>
* glsl: Add unit tests for blob.c (Carl Worth, 2015-01-16, 3 files, -0/+328)
| | | | | | In addition to exercising all of the functions in blob.h, this includes a stress test that forces some reallocing, and also tests to verify the alignment and overrun-detection code in blob.c.
* glsl: Add blob_overwrite_bytes and blob_overwrite_uint32 (Tapani Pälli, 2015-01-16, 2 files, -0/+66)
| | | | | | | | | | | | | | | | | | These functions are useful when serializing an unknown number of items to a blob. The caller can first save the current offset, write a placeholder uint32, write out (and count) the items, then use blob_overwrite_uint32 with the saved offset to replace the placeholder value. Then, when deserializing, the reader will first read the count and know how many subsequent items to expect. (I wrote this code after reading a very similar patch written by Tapani when he wrote serialization code for IR. Since I re-used the idea of his code so directly, I've credited him as the author of this code. --Carl) Reviewed-by: Jason Ekstrand <[email protected]>
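The placeholder-then-overwrite pattern described in that message might look roughly like the sketch below. It uses a hypothetical fixed-size `mini_blob` and hypothetical `mini_blob_*` functions rather than the real blob.h API, so treat every name and signature as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a blob: a fixed buffer plus a write offset. */
struct mini_blob {
   uint8_t data[256];
   size_t size;
};

bool mini_blob_write_uint32(struct mini_blob *b, uint32_t value)
{
   if (b->size + sizeof(value) > sizeof(b->data))
      return false;
   memcpy(b->data + b->size, &value, sizeof(value));
   b->size += sizeof(value);
   return true;
}

/* Replace four already-written bytes at the given offset. */
bool mini_blob_overwrite_uint32(struct mini_blob *b, size_t offset,
                                uint32_t value)
{
   if (offset > b->size || b->size - offset < sizeof(value))
      return false;
   memcpy(b->data + offset, &value, sizeof(value));
   return true;
}

/* Serialize only the non-zero items, so the count is unknown up front. */
bool serialize_nonzero(struct mini_blob *b, const uint32_t *items, size_t n)
{
   size_t count_offset = b->size;        /* save the current offset */
   uint32_t count = 0;

   assert(items != NULL || n == 0);

   if (!mini_blob_write_uint32(b, 0))    /* write a placeholder count */
      return false;

   for (size_t i = 0; i < n; i++) {
      if (items[i] == 0)
         continue;
      if (!mini_blob_write_uint32(b, items[i]))
         return false;
      count++;                           /* count the items as they go out */
   }

   /* Patch the placeholder with the real count. */
   return mini_blob_overwrite_uint32(b, count_offset, count);
}
```

A reader of the resulting buffer first reads the count, then knows exactly how many items follow.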
* glsl: Add blob.c---a simple interface for serializing data (Carl Worth, 2015-01-16, 3 files, -0/+548)
| | | | | | | | | | | | | | | This new interface allows for writing a series of objects to a chunk of memory (a "blob"). The allocated memory is maintained within the blob itself (and re-allocated by doubling when necessary). There are also functions for reading objects from a blob. If code attempts to read beyond the available memory, the read functions return 0 values (or their moral equivalent) without reading past the allocated memory. Once the caller is done with the reads, it can check blob->overrun to determine whether any invalid values were previously returned due to attempts to read too far. Reviewed-by: Jason Ekstrand <[email protected]>
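The two behaviours described there, growth by doubling on write and a sticky overrun flag on read, can be sketched as below. The struct and function names are hypothetical and the starting allocation of 64 bytes is an arbitrary assumption; this is not the blob.h interface itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical writer: owns its memory and grows it by doubling. */
struct grow_blob {
   uint8_t *data;
   size_t allocated;   /* bytes reserved */
   size_t size;        /* bytes written so far */
};

bool grow_blob_write(struct grow_blob *b, const void *bytes, size_t n)
{
   assert(bytes != NULL);

   if (b->size + n > b->allocated) {
      size_t wanted = b->allocated ? b->allocated : 64;
      while (wanted < b->size + n)
         wanted *= 2;                    /* double until the write fits */
      uint8_t *grown = realloc(b->data, wanted);
      if (grown == NULL)
         return false;
      b->data = grown;
      b->allocated = wanted;
   }
   memcpy(b->data + b->size, bytes, n);
   b->size += n;
   return true;
}

/* Hypothetical reader: never touches memory past `size`. */
struct blob_reader_sketch {
   const uint8_t *data;
   size_t size;
   size_t offset;
   bool overrun;       /* set once any read ran past the end */
};

uint32_t reader_read_uint32(struct blob_reader_sketch *r)
{
   uint32_t value = 0;

   if (r->overrun || r->size - r->offset < sizeof(value)) {
      r->overrun = true;                 /* remember the failed read */
      return 0;
   }
   memcpy(&value, r->data + r->offset, sizeof(value));
   r->offset += sizeof(value);
   return value;
}
```

The point of the sticky flag is that a caller can issue a whole sequence of reads without checking each one, then test `overrun` once at the end.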
* glsl: Add convenience function get_sampler_instance (Carl Worth, 2015-01-16, 2 files, -0/+120)
| | | | | | | | | | This is similar to the existing functions get_instance, get_array_instance, etc. for getting a type singleton. The new get_sampler_instance() function will be used by the upcoming shader cache. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir/live_variables: Use a worklist (Jason Ekstrand, 2015-01-15, 1 file, -55/+75)
| | | | | | | | | | This is a rework of the liveness algorithm using a worklist as suggested by Connor. Doing so reduces the number of times we walk over the instructions because we don't have to do an entire pointless walk over the instructions just to figure out it's time to stop. Also, the stuff after the last loop in the function will only ever get visited once. Reviewed-by: Connor Abbott <[email protected]>
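As a rough illustration of worklist-driven liveness, the sketch below works per block on use/def bitmasks. That is a simplification and an assumption: the pass in the commit walks individual instructions, and every name here is hypothetical. A block is revisited only when a successor's live-in set changed, so no final "nothing changed" sweep is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_BLOCKS 8

struct sketch_block {
   uint32_t use, def;            /* variables read before write / written */
   uint32_t live_in, live_out;   /* results, one bit per variable */
   int succ[2];                  /* successor block indices, -1 if none */
};

void sketch_liveness(struct sketch_block *blocks, int n)
{
   int queue[SKETCH_MAX_BLOCKS];
   bool queued[SKETCH_MAX_BLOCKS];
   int head = 0, count = 0;

   assert(n <= SKETCH_MAX_BLOCKS);

   /* Seed with every block, last first, since liveness flows backwards. */
   for (int i = n - 1; i >= 0; i--) {
      blocks[i].live_in = blocks[i].live_out = 0;
      queue[(head + count++) % SKETCH_MAX_BLOCKS] = i;
      queued[i] = true;
   }

   while (count > 0) {
      int b = queue[head];
      head = (head + 1) % SKETCH_MAX_BLOCKS;
      count--;
      queued[b] = false;

      uint32_t out = 0;
      for (int s = 0; s < 2; s++) {
         if (blocks[b].succ[s] >= 0)
            out |= blocks[blocks[b].succ[s]].live_in;
      }
      uint32_t in = blocks[b].use | (out & ~blocks[b].def);
      blocks[b].live_out = out;

      if (in == blocks[b].live_in)
         continue;               /* nothing changed, nobody to notify */
      blocks[b].live_in = in;

      /* live_in changed: each predecessor must be looked at again. */
      for (int p = 0; p < n; p++) {
         bool is_pred = blocks[p].succ[0] == b || blocks[p].succ[1] == b;
         if (is_pred && !queued[p]) {
            queue[(head + count++) % SKETCH_MAX_BLOCKS] = p;
            queued[p] = true;
         }
      }
   }
}
```

The loop ends as soon as the queue drains, which is the property the commit message highlights.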
* nir: Add a worklist helper structure (Jason Ekstrand, 2015-01-15, 3 files, -0/+237)
| | | | | | | A worklist is a common concept in optimizations. This adds a structure that we can reuse for many different types of optimizations. Reviewed-by: Connor Abbott <[email protected]>
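One plausible shape for such a reusable worklist is a circular queue of block indices paired with a "present" flag per block, so that nothing is queued twice. The sketch below assumes a fixed capacity and uses hypothetical names; it is not the structure added by the commit.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_WORKLIST_MAX 64   /* assumed upper bound on block indices */

struct sketch_worklist {
   unsigned start, count;
   unsigned blocks[SKETCH_WORKLIST_MAX];   /* circular buffer of indices */
   bool present[SKETCH_WORKLIST_MAX];      /* present[i]: block i is queued */
};

void sketch_worklist_init(struct sketch_worklist *w)
{
   w->start = 0;
   w->count = 0;
   for (unsigned i = 0; i < SKETCH_WORKLIST_MAX; i++)
      w->present[i] = false;
}

bool sketch_worklist_is_empty(const struct sketch_worklist *w)
{
   return w->count == 0;
}

void sketch_worklist_push_tail(struct sketch_worklist *w, unsigned block)
{
   assert(block < SKETCH_WORKLIST_MAX);
   if (w->present[block])
      return;                               /* already queued */

   w->blocks[(w->start + w->count) % SKETCH_WORKLIST_MAX] = block;
   w->count++;
   w->present[block] = true;
}

unsigned sketch_worklist_pop_head(struct sketch_worklist *w)
{
   assert(w->count > 0);

   unsigned block = w->blocks[w->start];
   w->start = (w->start + 1) % SKETCH_WORKLIST_MAX;
   w->count--;
   w->present[block] = false;
   return block;
}
```

Because each index is queued at most once, the buffer never needs more slots than there are blocks.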
* nir: fix incorrect argument passed to validate_src() in validate_tex_instr() (Brian Paul, 2015-01-15, 1 file, -1/+1)
| | | | | | Silences a compiler warning. Reviewed-by: Connor Abbott <[email protected]>
* nir: silence compiler warning from visit_src() call (Brian Paul, 2015-01-15, 1 file, -1/+1)
| | | | | | v2: use proper argument Reviewed-by: Connor Abbott <[email protected]>
* util/hash_set: Rework the API to know about hashing (Jason Ekstrand, 2015-01-15, 9 files, -103/+76)
| | | | | | | | | | | | | | | | | | | | | | | | Previously, the set API required the user to do all of the hashing of keys as it passed them in. Since the hashing function is intrinsically tied to the comparison function, it makes sense for the hash set to know about it. Also, it makes for a somewhat clumsy API as the user is constantly calling hashing functions many of which have long names. This is especially bad when the standard call looks something like _mesa_set_add(ht, _mesa_pointer_hash(key), key); In the above case, there is no reason why the hash set shouldn't do the hashing for you. We leave the option for you to do your own hashing if it's more efficient, but it's no longer needed. Also, if you do do your own hashing, the hash set will assert that your hash matches what it expects out of the hashing function. This should make it harder to mess up your hashing. This is analogous to 94303a0750 where we did this for hash_table. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
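The API shape argued for above can be sketched as a set that stores its own hash and comparison callbacks. Everything below is hypothetical (a fixed-size, linear-probing pointer set named `ptr_set`), written only to show the two entry points: one that hashes for the caller, and one that takes a caller-supplied hash and asserts it matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTR_SET_SLOTS 64   /* fixed capacity keeps the sketch short */

struct ptr_set {
   uint32_t (*key_hash)(const void *key);
   bool (*key_equals)(const void *a, const void *b);
   const void *slots[PTR_SET_SLOTS];   /* NULL marks an empty slot */
   unsigned entries;
};

uint32_t ptr_set_pointer_hash(const void *key)
{
   return (uint32_t)((uintptr_t)key >> 2);
}

bool ptr_set_pointer_equals(const void *a, const void *b)
{
   return a == b;
}

void ptr_set_init(struct ptr_set *s,
                  uint32_t (*key_hash)(const void *key),
                  bool (*key_equals)(const void *a, const void *b))
{
   s->key_hash = key_hash;
   s->key_equals = key_equals;
   s->entries = 0;
   for (unsigned i = 0; i < PTR_SET_SLOTS; i++)
      s->slots[i] = NULL;
}

/* The caller supplies the hash; the set checks it against its own function. */
bool ptr_set_add_pre_hashed(struct ptr_set *s, uint32_t hash, const void *key)
{
   assert(key != NULL);
   assert(hash == s->key_hash(key));

   for (unsigned i = 0; i < PTR_SET_SLOTS; i++) {
      unsigned slot = (hash + i) % PTR_SET_SLOTS;
      if (s->slots[slot] == NULL) {
         s->slots[slot] = key;
         s->entries++;
         return true;
      }
      if (s->key_equals(s->slots[slot], key))
         return true;                   /* already present */
   }
   return false;                        /* table full */
}

/* The common case: the set does the hashing itself. */
bool ptr_set_add(struct ptr_set *s, const void *key)
{
   return ptr_set_add_pre_hashed(s, s->key_hash(key), key);
}
```

With this shape a typical call site shrinks to `ptr_set_add(&set, key)`, while the pre-hashed variant stays available for callers that already have the hash at hand.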
* util: Move main/set to util/hash_set (Jason Ekstrand, 2015-01-15, 1 file, -1/+1)
| | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* hash_table: Rename insert_with_hash to insert_pre_hashed (Jason Ekstrand, 2015-01-15, 1 file, -1/+1)
| | | | | | | We already have search_pre_hashed. This makes the APIs match better. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/algebraic: Only replace an instruction once (Jason Ekstrand, 2015-01-15, 1 file, -1/+3)
| | | | | | | | | Without the break, it was possible that an instruction would match multiple expressions. If this happened, you could end up trying to replace it multiple times and get a segfault. This makes it so that, after a successful replacement, it moves on to the next instruction. Reviewed-by: Connor Abbott <[email protected]>
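The control flow behind that fix can be shown with a toy model in which "instructions" are plain integers and the rules are hypothetical. The actual pass rewrites NIR instructions; the only point of the sketch is the `break` after the first successful replacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rewrite rule: a predicate plus a replacement function. */
struct sketch_rule {
   bool (*matches)(int value);
   int (*replace)(int value);
};

bool sketch_is_even(int v) { return v % 2 == 0; }
int sketch_halve(int v) { return v / 2; }
bool sketch_is_multiple_of_four(int v) { return v % 4 == 0; }
int sketch_quarter(int v) { return v / 4; }

/* Apply at most one rule per instruction; returns how many were replaced. */
unsigned sketch_apply_rules(int *instrs, size_t n,
                            const struct sketch_rule *rules, size_t num_rules)
{
   unsigned replaced = 0;

   assert(instrs != NULL && rules != NULL);

   for (size_t i = 0; i < n; i++) {
      for (size_t r = 0; r < num_rules; r++) {
         if (!rules[r].matches(instrs[i]))
            continue;
         instrs[i] = rules[r].replace(instrs[i]);
         replaced++;
         break;   /* the original is gone: move on to the next instruction */
      }
   }
   return replaced;
}
```

In the toy, a value such as 8 satisfies both rules, but only the first one fires. In the real pass the stakes are higher, since a second match would operate on an instruction that has already been replaced.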
* nir/vars_to_ssa: Use the copy lowering from lower_var_copies (Jason Ekstrand, 2015-01-15, 1 file, -152/+46)
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a pass for lowering copy instructions (Jason Ekstrand, 2015-01-15, 3 files, -0/+227)
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir/vars_to_ssa: Refactor get_deref_node (Jason Ekstrand, 2015-01-15, 1 file, -20/+25)
| | | | | | | | This refactor allows you to more easily get the deref node associated with a given variable. We then use that new functionality in the deref_may_be_aliased function instead of creating a 1-element deref chain. Reviewed-by: Connor Abbott <[email protected]>
* nir: Rename lower_variables to lower_vars_to_ssa (Jason Ekstrand, 2015-01-15, 3 files, -5/+5)
| | | | | | | | The original name wasn't particularly descriptive. This one indicates that it actually gives you SSA values as opposed to the old pass which lowered variables to registers. Reviewed-by: Connor Abbott <[email protected]>
* nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array (Jason Ekstrand, 2015-01-15, 6 files, -40/+48)
| | | | | | | | This solves a number of problems. First is the ability to change the number of sources that a texture instruction has. Second, it solves the dilemma that may occur if a texture instruction has more than 4 sources. Reviewed-by: Connor Abbott <[email protected]>
* nir/validate: Only build in debug mode (Jason Ekstrand, 2015-01-15, 2 files, -0/+11)
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_variables: Improve documentation (Jason Ekstrand, 2015-01-15, 1 file, -27/+79)
| | | | | | | | Additional description was added to a variety of places. Also, we no longer use the term "leaf" to describe fully-qualified direct derefs. Instead, we simply use the term "direct" or spell it out completely. Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_variables: Use a for loop for get_deref_node (Jason Ekstrand, 2015-01-15, 1 file, -58/+48)
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: Use the actual FNV-1a hash for hashing derefs (Jason Ekstrand, 2015-01-15, 2 files, -90/+79)
| | | | | | We also switch to using loops rather than recursion. Reviewed-by: Connor Abbott <[email protected]>
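For reference, 32-bit FNV-1a written as a loop looks like the sketch below. The function name is hypothetical and the commit's own code hashes deref chains rather than raw byte buffers; the offset basis and prime are the standard published 32-bit FNV parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper name; the constants are the standard 32-bit ones. */
uint32_t sketch_fnv1a_32(const void *data, size_t size)
{
   const uint8_t *bytes = data;
   uint32_t hash = 2166136261u;      /* FNV offset basis */

   assert(data != NULL || size == 0);

   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];              /* FNV-1a: XOR the byte in first... */
      hash *= 16777619u;             /* ...then multiply by the FNV prime */
   }
   return hash;
}
```

The XOR-then-multiply order is what distinguishes FNV-1a from FNV-1, which multiplies first.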
* nir: Make intrinsic flags into an enum (Jason Ekstrand, 2015-01-15, 1 file, -14/+14)
| | | | | | | | This should be much better for debugging as GDB will pick up on the fact that it's an enum and actually tell you what you're looking at instead of giving you some arbitrary hex value you have to go look up. Reviewed-by: Connor Abbott <[email protected]>