path: root/src/mesa/drivers/dri/i965/brw_eu.c
Commit message | Author | Age | Files | Lines
* i965: Rename brw_compile to brw_codegen | Jason Ekstrand | 2015-04-22 | 1 | -14/+14
    This name better matches what it's actually used for. The patch was generated with the following command:
        for file in *; do
           sed -i -e s/brw_compile/brw_codegen/g $file
        done
    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Remove the context field from brw_compiler | Jason Ekstrand | 2015-04-22 | 1 | -11/+7
    Reviewed-by: Matt Turner <[email protected]>
* i965: Make the disassembler take a device_info instead of a context | Jason Ekstrand | 2015-04-22 | 1 | -4/+4
    Reviewed-by: Matt Turner <[email protected]>
* i965: Make instruction compaction take a device_info instead of a context | Jason Ekstrand | 2015-04-22 | 1 | -1/+1
    Reviewed-by: Matt Turner <[email protected]>
* i965: Make the brw_inst helpers take a device_info instead of a context | Jason Ekstrand | 2015-04-22 | 1 | -14/+14
    Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Add a devinfo parameter to brw_compile | Jason Ekstrand | 2015-04-22 | 1 | -0/+1
    Reviewed-by: Matt Turner <[email protected]>
* i965: Replace guess_execution_size with something simpler. | Matt Turner | 2015-04-21 | 1 | -0/+7
    guess_execution_size() does two things:
        1. Cope with small destination registers.
        2. Cope with SIMD8 vs SIMD16 mode.
    This patch replaces the first with a simple if block in brw_set_dest: if the destination register width is less than 8, you probably want the execution size to match. (I didn't put this in the 3src block because it doesn't seem to matter.)
    Since only the FS compiler cares about SIMD16 mode, it's easy to just set the default execution size there.
    This pattern had already been proven in the Gen8+ generator, but we didn't port it back to the existing generator when we combined the two.
    This is based on a patch from Ken from about a year ago. I've rebased it and fixed a few bugs.
    Reviewed-by: Jason Ekstrand <[email protected]>
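The rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `exec_size`, `reg`, `inst`, and `set_dest` are hypothetical stand-ins, and the ordering of the enum is an assumption made so that "width less than 8" becomes a simple comparison.

```c
/* All names below are hypothetical stand-ins, not the driver's definitions. */
enum exec_size { EXEC_1, EXEC_2, EXEC_4, EXEC_8, EXEC_16 };

struct reg  { enum exec_size width; };      /* destination register width */
struct inst { enum exec_size exec_size; };  /* execution size of one instruction */

/* Start from the caller's default (for example EXEC_16 for a SIMD16
 * fragment shader), then shrink to the destination width when that
 * width is below 8. */
static void set_dest(struct inst *inst, struct reg dest,
                     enum exec_size default_exec_size)
{
    inst->exec_size = default_exec_size;
    if (dest.width < EXEC_8)
        inst->exec_size = dest.width;
}
```

With a default of `EXEC_16`, a destination of width 1 yields an execution size of `EXEC_1`, while a destination of width 8 keeps the default.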
* i965: Fix assertion in brw_reg_type_letters | Ben Widawsky | 2015-03-02 | 1 | -1/+1
    While using various debugging features (optimization debug, instruction dumping, etc) this function is called in order to get a readable letter for the type of unit. On GEN8, two new units were added, the Qword and the Unsigned Qword (Q, and UQ respectively). The existing assertion tries to determine that the argument passed in is within the correct boundary, however, it was using UQ as the upper limit instead of Q.
    To my knowledge you can only hit this case with the branch I am currently working on, so it doesn't fix any known issues.
    Signed-off-by: Ben Widawsky <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
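A small sketch of the off-by-one bound this message describes, assuming (as the message implies) that Q is the last entry of the type list. The enum, its shortened contents, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, shortened type list; Q is deliberately the last entry. */
enum reg_type { TYPE_UD, TYPE_D, TYPE_F, TYPE_UQ, TYPE_Q };

/* Old bound: stops at UQ, so the valid value Q is rejected. */
static bool in_range_old(unsigned type)   { return type <= TYPE_UQ; }

/* Fixed bound: the last enumerator is the upper limit. */
static bool in_range_fixed(unsigned type) { return type <= TYPE_Q; }

static const char *reg_type_letters(unsigned type)
{
    static const char *const names[] = {
        [TYPE_UD] = "UD", [TYPE_D] = "D", [TYPE_F] = "F",
        [TYPE_UQ] = "UQ", [TYPE_Q] = "Q",
    };
    assert(in_range_fixed(type));
    return names[type];
}
```

Under the old bound, asking for the letters of `TYPE_Q` would trip the assertion even though the table has an entry for it.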
* i965/fs: Introduce brw_negate_cmod(). | Kenneth Graunke | 2015-02-27 | 1 | -0/+22
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Return NONE from brw_swap_cmod on unknown input. | Matt Turner | 2014-08-12 | 1 | -1/+1
    Comparing ~0u with a packed enum (i.e., 1 byte) always evaluates to false. Shouldn't gcc warn about this?
    Reported-by: Connor Abbott <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
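Why the comparison can never succeed is easiest to see with a 1-byte type standing in for the packed enum. The sketch below uses `uint8_t` for that purpose; `cmod_t`, the `CMOD_*` values, and the function names are hypothetical and only model the behavior described in the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 1-byte (packed) enum. */
typedef uint8_t cmod_t;
enum { CMOD_NONE = 0, CMOD_G = 1, CMOD_L = 2 };

/* Old behavior: signal "unknown" with ~0u, which is truncated to 0xFF
 * when it is returned through the 1-byte type. */
static cmod_t swap_cmod_old(cmod_t cmod)
{
    switch (cmod) {
    case CMOD_G: return CMOD_L;
    case CMOD_L: return CMOD_G;
    default:     return (cmod_t)~0u;
    }
}

/* New behavior: return an in-range NONE value instead. */
static cmod_t swap_cmod_new(cmod_t cmod)
{
    switch (cmod) {
    case CMOD_G: return CMOD_L;
    case CMOD_L: return CMOD_G;
    default:     return CMOD_NONE;
    }
}

/* The 1-byte value widens to at most 255, so it can never equal a
 * wider all-ones unsigned sentinel. */
static bool equals_sentinel(cmod_t value, unsigned sentinel)
{
    return (unsigned)value == sentinel;
}
```

A caller checking `equals_sentinel(swap_cmod_old(x), ~0u)` therefore never detects the failure case, whereas comparing the result of `swap_cmod_new` against `CMOD_NONE` works as intended.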
* util: Move ralloc to a new src/util directory. | Kenneth Graunke | 2014-08-04 | 1 | -1/+1
    For a long time, we've wanted a place to put utility code which isn't directly tied to Mesa or Gallium internals. This patch creates a new src/util directory for exactly that purpose, and builds the contents as libmesautil.la.
    ralloc seemed like a good first candidate. These days, it's directly used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl didn't make much sense.
    Signed-off-by: Kenneth Graunke <[email protected]>
    v2 (Jason Ekstrand): More realloc uses and some scons fixes
    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* i965: Disable hex offset printing in disassembly. | Kenneth Graunke | 2014-07-21 | 1 | -1/+2
    Printing the hex offsets makes it basically impossible to diff assembly: if you add even a single instruction, the entire shader shows up as a difference. So, every time I want to compare assembly, I have to strip this out.
    The hex offsets might be useful when debugging compaction, or when inspecting the program cache buffer. Since it's occasionally useful, but uncommon, this patch disables it by default, but makes it easy to re-enable it temporarily when the need arises.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Make a brw_conditional_mod enum. | Matt Turner | 2014-07-05 | 1 | -1/+1
    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use unreachable() instead of unconditional assert(). | Matt Turner | 2014-07-01 | 1 | -3/+1
    Reviewed-by: Ian Romanick <[email protected]>
* i965: Replace struct brw_compact_instruction with brw_compact_inst. | Matt Turner | 2014-06-26 | 1 | -1/+1
    Signed-off-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Replace 'struct brw_instruction' with 'brw_inst'. | Matt Turner | 2014-06-26 | 1 | -4/+4
    Use this as an opportunity to clean up the formatting of some old code (brw_ADD, for instance).
    Signed-off-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Pass brw rather than gen to brw_disassemble_inst(). | Matt Turner | 2014-06-26 | 1 | -1/+1
    We will need it in order to use the new brw_inst API.
    Signed-off-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Convert brw_eu.[ch] to use the new brw_inst API. | Kenneth Graunke | 2014-06-26 | 1 | -15/+17
    v2: Don't set flag_reg_nr prior to Gen7 (as it doesn't exist).
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Use brw->gen in some generation checks. | Matt Turner | 2014-06-11 | 1 | -2/+6
    Will simplify the automated conversion if we want to allow compiling the driver for a single generation.
    Reviewed-by: Kristian Høgsberg <[email protected]>
* i965: Put '_default_' in the name of functions that set default state. | Kenneth Graunke | 2014-06-02 | 1 | -11/+11
    Eventually we're going to use functions to set bits on an instruction. Putting 'default' in the name of functions that alter default state will help distinguish them.
    This patch was generated entirely mechanically, by the following:
        for file in brw*.{cpp,c,h}; do
           sed -i \
           -e 's/brw_set_mask_control/brw_set_default_mask_control/g' \
           -e 's/brw_set_saturate/brw_set_default_saturate/g' \
           -e 's/brw_set_access_mode/brw_set_default_access_mode/g' \
           -e 's/brw_set_compression_control/brw_set_default_compression_control/g' \
           -e 's/brw_set_predicate_control/brw_set_default_predicate_control/g' \
           -e 's/brw_set_predicate_inverse/brw_set_default_predicate_inverse/g' \
           -e 's/brw_set_flag_reg/brw_set_default_flag_reg/g' \
           -e 's/brw_set_acc_write_control/brw_set_default_acc_write_control/g' \
           $file;
        done
    No manual changes were done after running that command.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Delete brw_set_conditionalmod. | Kenneth Graunke | 2014-06-02 | 1 | -5/+0
    This removes the ability to set the default conditional modifier on all future instructions. Nothing uses it, and it's not really a sensible thing to do anyway.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Move brw_set_predicate_control_flag_value to brw_sf_emit.c. | Kenneth Graunke | 2014-05-27 | 1 | -18/+0
    Only the Gen4-5 SF program compiler actually uses this function; move it there. Soon the fields will be moved out of brw_compile.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Drop useless push/pop state from flag register mashing code. | Kenneth Graunke | 2014-05-27 | 1 | -2/+0
    There's no point in pushing and popping the default state; the code between the two stack operations doesn't alter anything.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Reset flag_value to 0xff before emitting SF subroutines. | Kenneth Graunke | 2014-05-27 | 1 | -1/+0
    When compiling any of the SF program variants, flag_value starts off as 0xff and will be modified when generating code. brw_emit_anyprim_setup emits several subroutines, saving and restoring flag_value across each of them. Since it starts out as 0xff, this is equivalent to simply setting it to 0xff at the start of each subroutine.
    Resetting the value makes more logical sense; each subroutine doesn't know whether one of the others even executed, much less what it did to the flag register.
    This also lets us drop the brw_set_predicate_control_flag_value call from brw_init_compile: predicate is already initialized to BRW_PREDICATE_NONE by the memset, and the value of flag_value is irrelevant (as it's only used by the SF compiler).
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble. | Kenneth Graunke | 2014-05-18 | 1 | -1/+2
    "Disassemble" is an accurate description of what this function does.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Rename brw_disasm/gen8_disassemble to brw/gen8_disassemble_inst. | Kenneth Graunke | 2014-05-18 | 1 | -1/+1
    We're going to use "disassemble" for the function that disassembles the whole program.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Pass brw_context and assembly separately to brw_dump_compile. | Matt Turner | 2014-05-15 | 1 | -5/+3
    brw_dump_compile will be called indirectly by common code used by generations before and after the gen8 instruction format change.
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Pull brw_compact_instructions() out of brw_get_program(). | Matt Turner | 2014-05-15 | 1 | -2/+0
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/disasm: Disassemble the compaction control bit. | Matt Turner | 2014-05-15 | 1 | -1/+2
    brw_disasm doesn't disassemble compacted instructions, so we uncompact before disassembling them which would unset the compaction control bit. Instead pass it as a separate argument.
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix register types in dump_instructions(). | Kenneth Graunke | 2014-02-05 | 1 | -0/+29
    This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract type that doesn't match the hardware description. dump_instruction() was using reg_encoding[] from brw_disasm.c, which no longer matches (and was incorrect for Gen8+ anyway).
    This patch introduces a new function to convert the abstract enum values into the letter suffix we expect.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reported-by: Matt Turner <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Replace 8-wide and 16-wide with SIMD8 and SIMD16. | Eric Anholt | 2014-01-17 | 1 | -4/+4
    Those are the terms used in the docs, and I think "n-wide" was something I just happened to say. Note that shader-db needs updating for the INTEL_DEBUG=fs parsing.
    Reviewed-by: Ian Romanick <[email protected]>
* s/Tungsten Graphics/VMware/ | José Fonseca | 2014-01-17 | 1 | -2/+2
    Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change.
    This was the sed script I used:
        $ cat tg2vmw.sed
        # Run as:
        #
        #   git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
        #
        # Rename copyrights
        s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
        /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
        s/TUNGSTEN GRAPHICS/VMWARE/g
        # Rename emails
        s/[email protected]/[email protected]/
        s/[email protected]/[email protected]/g
        s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
        s/jrfonseca\[email protected]/[email protected]/g
        s/keithw\[email protected]/[email protected]/g
        s/[email protected]/[email protected]/g
        s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
        s/[email protected]/[email protected]/
        # Remove dead links
        s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g
        # C string src/gallium/state_trackers/vega/api_misc.c
        s/"Tungsten Graphics, Inc"/"VMware, Inc"/
    Reviewed-by: Brian Paul <[email protected]>
* i965: dump the disassembly to the given file | Topi Pohjolainen | 2013-12-27 | 1 | -10/+10
    instead of ignoring the argument and always dumping to standard output.
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Don't use GL types in files shared with intel-gpu-tools. | Kenneth Graunke | 2013-12-05 | 1 | -9/+9
        sed -i -e 's/GLuint/unsigned/g' -e 's/GLint/int/g' \
               -e 's/GLfloat/float/g' -e 's/GLubyte/uint8_t/g' \
               -e 's/GLshort/int16_t/g' \
               brw_eu* brw_disasm.c brw_structs.h
    Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Drop trailing whitespace from files shared with intel-gpu-tools. | Kenneth Graunke | 2013-12-05 | 1 | -8/+8
    Performed via s/ *$//g.
    Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Move intel_context::gen and gt fields to brw_context. | Kenneth Graunke | 2013-07-09 | 1 | -3/+3
    Most functions no longer use intel_context, so this patch additionally removes the local "intel" variables to avoid compiler warnings.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Acked-by: Chris Forbes <[email protected]>
    Acked-by: Paul Berry <[email protected]>
    Acked-by: Anuj Phogat <[email protected]>
* i965: Pass brw_context to functions rather than intel_context. | Kenneth Graunke | 2013-07-09 | 1 | -3/+2
    This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context.
    Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw."
    Signed-off-by: Kenneth Graunke <[email protected]>
    Acked-by: Chris Forbes <[email protected]>
    Acked-by: Paul Berry <[email protected]>
    Acked-by: Anuj Phogat <[email protected]>
* i965/fs: Add an instruction flag for choosing the flag subregister. | Eric Anholt | 2012-12-11 | 1 | -0/+6
    We're going to redo discard handling to track discards in the other flag subregister, saving instructions in the discard and allowing predicated jumps out to the end of the shader.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Let brw_flag_reg() choose the flag reg and subreg. | Eric Anholt | 2012-12-11 | 1 | -1/+1
    We're about to start using the f0.1 subregister.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Stop putting 8 NOPs after each program. | Eric Anholt | 2012-09-17 | 1 | -8/+0
    As far as I can see, the intention of the requirement that we do so is to prevent instruction prefetch from wandering out into either unmapped memory or memory with a different caching type, and hanging the chip. The kernel makes sure that the page after your BO has a valid page of the same caching type, which meets this requirement, so there's no need to waste space between our programs (and in instruction cache) on this.
    Saves another 9kb instructions in l4d2 shaders.
    Acked-by: Kenneth Graunke <[email protected]>
* i965: Add support for instruction compaction on Gen7. | Kenneth Graunke | 2012-09-17 | 1 | -0/+2
    Reduces l4d2 program size from 1195kb to 919kb. Improves performance by 0.22% +/- 0.11% (n=70).
    v2: Rebase on compaction v2, fix up flag reg handling (by anholt).
    v3: Fix uncompaction of the flag register number.
    Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add support for instruction compaction. | Eric Anholt | 2012-09-17 | 1 | -8/+30
    This reduces program size by using some smaller encodings for common bit patterns in the Gen ISA, with the hope of making programs fit in the instruction cache better.
    v2: Use larger bitshifts for the uncompressed field setups, in line with the way it's described in the spec. Consistently name a brw_compile "p" like all other code. Add a couple more tests. Consistently call things "compacted" not "compressed" (which is a different feature). Drop the explicit check for not compacting SENDs, which is unjustified and already implied by our lack of support for immediate values.
    Reviewed-by: Paul Berry <[email protected]>
* i965: Move program dump to a helper function in brw_eu.c. | Eric Anholt | 2012-09-17 | 1 | -1/+23
    It's going to get more complicated when we do instruction compaction. This also introduces putting the program offset in the output.
    v2: Use next_insn_offset in brw_get_program(), too.
    Reviewed-by: Paul Berry <[email protected]>
* i965: Clear brw_compile on setup. | Eric Anholt | 2012-09-17 | 1 | -0/+2
    I noticed in valgrind that p->single_program_flow was used while uninitialized. Everything else zeroed out brw_compile, but this is better API.
    Reviewed-by: Paul Berry <[email protected]>
* i965: Make brw_set_saturate() use stdbool. | Eric Anholt | 2012-08-08 | 1 | -2/+2
    There was a chance for brw_wm_emit.c to screw up and pass (1 << 4) instead of 1, which would get converted to 0 when stored. Instead, use stdbool which converts nonzero to true/1 like we want.
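The truncation hazard described here can be reproduced with a 1-bit field. This is a minimal sketch: `insn_header` and both helper functions are hypothetical, and the real header layout is not shown in this log.

```c
#include <stdbool.h>

/* Hypothetical instruction header with a 1-bit saturate field. */
struct insn_header {
    unsigned saturate : 1;
};

/* Taking a plain unsigned: only bit 0 survives the store, so a
 * "truthy" value such as (1 << 4) is stored as 0. */
static unsigned store_as_unsigned(unsigned value)
{
    struct insn_header h = {0};
    h.saturate = value;
    return h.saturate;
}

/* Taking a bool: any nonzero argument is converted to 1 at the call
 * boundary, before it ever reaches the bitfield. */
static unsigned store_as_bool(bool enable)
{
    struct insn_header h = {0};
    h.saturate = enable;
    return h.saturate;
}
```

Passing `1 << 4` to `store_as_unsigned` stores 0, while passing the same expression to `store_as_bool` stores 1, which is the behavior the commit relies on.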
* i965: Fix brw_swap_cmod() for LE/GE comparisons. | Kenneth Graunke | 2012-06-18 | 1 | -4/+4
    The idea here is to rewrite comparisons like 2 >= x with x <= 2; we want to simply exchange arguments, not negate the condition. If equality was part of the original comparison, it should remain part of the swapped version.
    This is the true cause of bug #50298. It didn't manifest itself on Sandybridge because we embed the conditional modifier in the IF instruction rather than emitting a CMP. All other platforms use CMP.
    It also didn't manifest itself on the master branch because commit be5f27a84d ("glsl: Refine the loop instruction counting.") papered over the problem.
    NOTE: This is a candidate for stable release branches.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50298
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
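The difference between swapping the operands of a comparison and negating it can be sketched as below. The `cmod` enum and the three functions are hypothetical illustrations (with Z and NZ read as "equal" and "not equal"), not the driver's definitions.

```c
#include <stdbool.h>

/* Hypothetical conditional modifiers. */
enum cmod { CMOD_NONE, CMOD_Z, CMOD_NZ, CMOD_G, CMOD_GE, CMOD_L, CMOD_LE };

/* Swap: "a OP b" becomes "b OP' a". Equality stays part of the
 * comparison, so GE maps to LE rather than to L. */
static enum cmod swap_cmod(enum cmod c)
{
    switch (c) {
    case CMOD_Z:
    case CMOD_NZ: return c;
    case CMOD_G:  return CMOD_L;
    case CMOD_GE: return CMOD_LE;
    case CMOD_L:  return CMOD_G;
    case CMOD_LE: return CMOD_GE;
    default:      return CMOD_NONE;
    }
}

/* Negate: "!(a OP b)" with the operands left in place. */
static enum cmod negate_cmod(enum cmod c)
{
    switch (c) {
    case CMOD_Z:  return CMOD_NZ;
    case CMOD_NZ: return CMOD_Z;
    case CMOD_G:  return CMOD_LE;
    case CMOD_GE: return CMOD_L;
    case CMOD_L:  return CMOD_GE;
    case CMOD_LE: return CMOD_G;
    default:      return CMOD_NONE;
    }
}

/* Reference evaluation, used to check the two mappings. */
static bool eval(enum cmod c, int a, int b)
{
    switch (c) {
    case CMOD_Z:  return a == b;
    case CMOD_NZ: return a != b;
    case CMOD_G:  return a > b;
    case CMOD_GE: return a >= b;
    case CMOD_L:  return a < b;
    case CMOD_LE: return a <= b;
    default:      return false;
    }
}
```

For `2 >= x` with `x == 2`, `eval(swap_cmod(CMOD_GE), x, 2)` is still true, whereas using the negated modifier `CMOD_L` with swapped operands would give false. That is the kind of mismatch the message describes.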
* i965: Remove vestiges of function call support from the old VS backend. | Kenneth Graunke | 2012-04-09 | 1 | -124/+0
    This never worked. brwProgramStringNotify also explicitly rejects programs that use CAL and RET. So there's no need for this to exist.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: increase the brw eu instruction store size dynamically | Yuanhan Liu | 2011-12-26 | 1 | -0/+7
    Here is the final patch to enable dynamic eu instruction store size: increase the brw eu instruction store size dynamically instead of just allocating it statically with a constant limit. This fixes the inconsistency where GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would limit it to 10000.
    v2: comments from ken, do not hardcode the eu limit to (1024 * 1024)
    Signed-off-by: Yuanhan Liu <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
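One common way to grow an instruction store on demand is to double its capacity when it fills up. The sketch below shows that idea with plain `realloc`; the struct layout, the initial capacity of 16, the doubling policy, and the name `next_insn` are all assumptions for illustration and are not taken from the patch.

```c
#include <stdlib.h>

struct insn { unsigned bits[4]; };   /* placeholder for one instruction */

struct codegen {
    struct insn *store;
    unsigned store_size;   /* capacity, in instructions */
    unsigned nr_insn;      /* instructions emitted so far */
};

/* Return a slot for the next instruction, growing the store when it
 * is full. Returns NULL if the allocation fails. */
static struct insn *next_insn(struct codegen *p)
{
    if (p->nr_insn == p->store_size) {
        unsigned new_size = p->store_size ? p->store_size * 2 : 16;
        struct insn *grown = realloc(p->store, new_size * sizeof *grown);
        if (!grown)
            return NULL;
        p->store = grown;        /* the base address may have changed */
        p->store_size = new_size;
    }
    return &p->store[p->nr_insn++];
}
```

Note that any pointer returned by an earlier `next_insn` call may be invalidated by a later one, since `realloc` is allowed to move the block.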
* i965: let the if_stack just store the instruction index | Yuanhan Liu | 2011-12-26 | 1 | -2/+1
    With a dynamically sized instruction store, the eu instruction store base address (p->store) may change after brw_IF()/brw_ELSE() and before brw_ENDIF(). So let if_stack store just the instruction index, which is more flexible and safer than storing the instruction's memory address.
    Signed-off-by: Yuanhan Liu <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
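Why an index survives a store reallocation while a pointer does not can be sketched as follows. Everything here is hypothetical (the opcodes, the `jump` field, the fixed stack depth of 16, and the function names); it only illustrates the index-based bookkeeping the message argues for.

```c
#include <assert.h>
#include <stdlib.h>

enum { OP_ADD, OP_IF, OP_ENDIF };

struct insn { int opcode; int jump; };

struct codegen {
    struct insn *store;      /* may move whenever the store grows */
    size_t store_size;
    size_t nr_insn;
    size_t if_stack[16];     /* indices into store, not pointers */
    size_t if_depth;
};

/* Append one instruction and return its index. */
static size_t emit(struct codegen *p, int opcode)
{
    if (p->nr_insn == p->store_size) {
        size_t new_size = p->store_size ? p->store_size * 2 : 4;
        struct insn *grown = realloc(p->store, new_size * sizeof *grown);
        if (!grown)
            abort();
        p->store = grown;
        p->store_size = new_size;
    }
    p->store[p->nr_insn].opcode = opcode;
    p->store[p->nr_insn].jump = 0;
    return p->nr_insn++;
}

static void emit_if(struct codegen *p)
{
    size_t idx = emit(p, OP_IF);
    assert(p->if_depth < 16);
    p->if_stack[p->if_depth++] = idx;
}

/* Resolve the saved index against the current base address, so the
 * patch-up stays valid even if the store moved in between. */
static void emit_endif(struct codegen *p)
{
    assert(p->if_depth > 0);
    size_t if_idx = p->if_stack[--p->if_depth];
    size_t endif_idx = emit(p, OP_ENDIF);
    p->store[if_idx].jump = (int)(endif_idx - if_idx);
}
```

Emitting an IF, then enough instructions to force several reallocations, then an ENDIF still patches the correct IF, because `p->store[if_idx]` is looked up only after the final reallocation.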
* i965: Don't make consumers of brw_CONT/brw_WHILE track if depth in loop. | Eric Anholt | 2011-12-21 | 1 | -0/+1
    The codegen backends all had this same tracking, so just do it at the EU level.
    Reviewed-by: Yuanhan Liu <[email protected]>