path: root/src/mesa/drivers/dri/i965/brw_eu.c
Commit message (author, date, files changed, lines -removed/+added)
* i965/fs: Add an instruction flag for choosing the flag subregister. (Eric Anholt, 2012-12-11, 1 file, -0/+6)
  We're going to redo discard handling to track discards in the other flag subregister, saving instructions in the discard and allowing predicated jumps out to the end of the shader.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Let brw_flag_reg() choose the flag reg and subreg. (Eric Anholt, 2012-12-11, 1 file, -1/+1)
  We're about to start using the f0.1 subregister.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Stop putting 8 NOPs after each program. (Eric Anholt, 2012-09-17, 1 file, -8/+0)
  As far as I can see, the intention of the requirement that we do so is to prevent instruction prefetch from wandering out into either unmapped memory or memory with a different caching type, and hanging the chip. The kernel makes sure that the page after your BO has a valid page of the same caching type, which meets this requirement, so there's no need to waste space between our programs (and in instruction cache) on this.
  Saves another 9kb of instructions in l4d2 shaders.
  Acked-by: Kenneth Graunke <[email protected]>
* i965: Add support for instruction compaction on Gen7. (Kenneth Graunke, 2012-09-17, 1 file, -0/+2)
  Reduces l4d2 program size from 1195kb to 919kb. Improves performance by 0.22% +/- 0.11% (n=70).
  v2: Rebase on compaction v2, fix up flag reg handling (by anholt).
  v3: Fix uncompaction of the flag register number.
  Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add support for instruction compaction. (Eric Anholt, 2012-09-17, 1 file, -8/+30)
  This reduces program size by using some smaller encodings for common bit patterns in the Gen ISA, with the hope of making programs fit in the instruction cache better.
  v2: Use larger bitshifts for the uncompressed field setups, in line with the way it's described in the spec. Consistently name a brw_compile "p" like all other code. Add a couple more tests. Consistently call things "compacted" not "compressed" (which is a different feature). Drop the explicit check for not compacting SENDs, which is unjustified and already implied by our lack of support for immediate values.
  Reviewed-by: Paul Berry <[email protected]>
* i965: Move program dump to a helper function in brw_eu.c. (Eric Anholt, 2012-09-17, 1 file, -1/+23)
  It's going to get more complicated when we do instruction compaction. This also introduces putting the program offset in the output.
  v2: Use next_insn_offset in brw_get_program(), too.
  Reviewed-by: Paul Berry <[email protected]>
* i965: Clear brw_compile on setup. (Eric Anholt, 2012-09-17, 1 file, -0/+2)
  I noticed in valgrind that p->single_program_flow was used while uninitialized. Everything else zeroed out brw_compile, but this is better API.
  Reviewed-by: Paul Berry <[email protected]>
* i965: Make brw_set_saturate() use stdbool. (Eric Anholt, 2012-08-08, 1 file, -2/+2)
  There was a chance for brw_wm_emit.c to screw up and pass (1 << 4) instead of 1, which would get converted to 0 when stored. Instead, use stdbool which converts nonzero to true/1 like we want.
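The truncation described in this commit can be reproduced in isolation. The following is a minimal sketch, not the driver's code: `struct demo_insn` and both helper functions are hypothetical stand-ins, and the only assumption is that the saturate setting ends up in a 1-bit field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an instruction header with a 1-bit saturate field. */
struct demo_insn {
   unsigned saturate:1;
};

/* Integer-typed parameter: the value is stored as-is, so only bit 0 survives. */
static unsigned demo_store_saturate_uint(unsigned value)
{
   struct demo_insn insn = { 0 };
   insn.saturate = value;
   return insn.saturate;
}

/* bool parameter: the argument is converted to 0 or 1 before it is stored. */
static unsigned demo_store_saturate_bool(bool value)
{
   struct demo_insn insn = { 0 };
   insn.saturate = value;
   return insn.saturate;
}
```

Passing `1 << 4` to the integer version stores 0, because bit 0 of 16 is clear, while the bool version stores 1 for any nonzero argument.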
* i965: Fix brw_swap_cmod() for LE/GE comparisons. (Kenneth Graunke, 2012-06-18, 1 file, -4/+4)
  The idea here is to rewrite comparisons like 2 >= x with x <= 2; we want to simply exchange arguments, not negate the condition. If equality was part of the original comparison, it should remain part of the swapped version.
  This is the true cause of bug #50298. It didn't manifest itself on Sandybridge because we embed the conditional modifier in the IF instruction rather than emitting a CMP. All other platforms use CMP.
  It also didn't manifest itself on the master branch because commit be5f27a84d ("glsl: Refine the loop instruction counting.") papered over the problem.
  NOTE: This is a candidate for stable release branches.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50298
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
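The difference between swapping and negating a comparison can be sketched as follows. This is an illustration only: the `demo_cmod` enum and both functions are hypothetical names, not the driver's condition codes or the actual body of brw_swap_cmod().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical condition codes; the driver's real enum is not shown in this log. */
enum demo_cmod { DEMO_Z, DEMO_NZ, DEMO_G, DEMO_GE, DEMO_L, DEMO_LE };

/* Condition to use after exchanging the two operands of a comparison.
 * Equality stays part of the comparison: GE becomes LE, not L. */
static enum demo_cmod demo_swap_cmod(enum demo_cmod cmod)
{
   switch (cmod) {
   case DEMO_Z:
   case DEMO_NZ:
      return cmod;            /* symmetric in their operands */
   case DEMO_G:
      return DEMO_L;
   case DEMO_GE:
      return DEMO_LE;
   case DEMO_L:
      return DEMO_G;
   case DEMO_LE:
      return DEMO_GE;
   }
   return cmod;
}

/* Reference evaluation of "a <cmod> b", used to check the swap. */
static bool demo_eval(enum demo_cmod cmod, int a, int b)
{
   switch (cmod) {
   case DEMO_Z:  return a == b;
   case DEMO_NZ: return a != b;
   case DEMO_G:  return a > b;
   case DEMO_GE: return a >= b;
   case DEMO_L:  return a < b;
   case DEMO_LE: return a <= b;
   }
   return false;
}
```

For every condition and operand pair, `demo_eval(c, a, b)` equals `demo_eval(demo_swap_cmod(c), b, a)`. Mapping GE to L instead would negate the condition and give the wrong answer when the operands are equal.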
* i965: Remove vestiges of function call support from the old VS backend. (Kenneth Graunke, 2012-04-09, 1 file, -124/+0)
  This never worked. brwProgramStringNotify also explicitly rejects programs that use CAL and RET. So there's no need for this to exist.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: increase the brw eu instruction store size dynamically (Yuanhan Liu, 2011-12-26, 1 file, -0/+7)
  Here is the final patch to enable a dynamic EU instruction store size: increase the brw EU instruction store size dynamically instead of just allocating it statically with a constant limit.
  This fixes cases such as "GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would limit it to 10000".
  v2: Comments from Ken: do not hardcode the EU limit to (1024 * 1024).
  Signed-off-by: Yuanhan Liu <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: let the if_stack just store the instruction index (Yuanhan Liu, 2011-12-26, 1 file, -2/+1)
  With a dynamically sized instruction store, the EU instruction store base address (p->store) may change after brw_IF()/brw_ELSE() and before brw_ENDIF(). So let if_stack store just the instruction index, which is more flexible and safer than storing the instruction's memory address.
  Signed-off-by: Yuanhan Liu <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
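Why an index survives a growing store where a pointer does not can be sketched as below. This is a simplified model under stated assumptions: `demo_compile`, `demo_emit()`, and `demo_patch_jump()` are hypothetical, the doubling growth policy is an assumption, and only the `store` field name echoes the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much simplified instruction and compile state. */
struct demo_insn {
   unsigned opcode;
   int jump_count;
};

struct demo_compile {
   struct demo_insn *store;   /* may move whenever the store grows */
   unsigned store_size;       /* allocated instruction slots */
   unsigned nr_insn;          /* slots in use */
};

/* Append one instruction, growing the store on demand, and return its
 * index. The index stays valid even if realloc() moves the store. */
static unsigned demo_emit(struct demo_compile *p, unsigned opcode)
{
   if (p->nr_insn == p->store_size) {
      unsigned new_size = p->store_size ? p->store_size * 2 : 4;
      struct demo_insn *grown = realloc(p->store, new_size * sizeof(*grown));
      if (!grown)
         abort();
      p->store = grown;
      p->store_size = new_size;
   }
   p->store[p->nr_insn].opcode = opcode;
   p->store[p->nr_insn].jump_count = 0;
   return p->nr_insn++;
}

/* Back-patch an earlier instruction through its saved index. The address
 * is recomputed from the current base, never cached across emits. */
static void demo_patch_jump(struct demo_compile *p, unsigned from, unsigned to)
{
   p->store[from].jump_count = (int)(to - from);
}
```

A pointer taken when the IF was emitted could dangle after any later `demo_emit()` call that triggers a reallocation; the saved index is resolved against whatever `p->store` is at patch time.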
* i965: Don't make consumers of brw_CONT/brw_WHILE track if depth in loop. (Eric Anholt, 2011-12-21, 1 file, -0/+1)
  The codegen backends all had this same tracking, so just do it at the EU level.
  Reviewed-by: Yuanhan Liu <[email protected]>
* i965: Don't make consumers of brw_DO()/brw_WHILE() track loop start. (Eric Anholt, 2011-12-21, 1 file, -0/+4)
  This is a similar cleanup to what we did for brw_IF(), brw_ELSE(), brw_ENDIF() handling.
  Reviewed-by: Yuanhan Liu <[email protected]>
* i965: Replace incorrect use of GLboolean with enum brw_compression. (Kenneth Graunke, 2011-10-11, 1 file, -1/+3)
  brw_set_compression_control took a GLboolean as an argument, then promptly used a switch statement to compare it with various enumeration values. Clearly it's not actually a boolean.
  Introduce a new enumeration type, enum brw_compression, and use that.
  Found by converting GLboolean to bool; clang then gave warnings about switching on a boolean and ultimately duplicated case errors.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* i965: Fix Android build by removing relative includes (Chad Versace, 2011-08-30, 1 file, -1/+1)
  Replace each occurrence of #include "../glsl/*.h" with #include "glsl/*.h"
  Reviewed-by: Ian Romanick <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* i965: Move IF stack handling into the EU abstraction layer/brw_compile. (Kenneth Graunke, 2011-05-17, 1 file, -0/+8)
  This hides the IF stack and back-patching of IF/ELSE instructions from each of the code generators, greatly simplifying the interface.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Get a ralloc context into brw_compile. (Kenneth Graunke, 2011-05-17, 1 file, -1/+4)
  This would be so much easier if we were using C++; we could simply use constructors and destructors. Instead, we have to update all the callers.
  While we're at it, ralloc various brw_wm_compile fields rather than explicitly calloc/free'ing them.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Pass brw_compile pointer to brw_set_src[01]. (Kenneth Graunke, 2011-05-16, 1 file, -1/+1)
  This makes it symmetric with brw_set_dest, which is convenient, and will also allow for assertions to be made based off of intel->gen.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Constant-fold immediates in src0 of SEL instructions. (Eric Anholt, 2011-04-13, 1 file, -0/+5)
  This is like what we do for add/mul, but we have to invert the predicate to choose the other source instead.
  This removes 5 extra moves of constants in nexuiz shaders. No statistically significant performance difference on my Sandybridge laptop (n=5).
  Reviewed-by: Ian Romanick <[email protected]>
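The predicate inversion mentioned here can be illustrated with a small model of a select. This is a sketch under assumptions: `struct demo_sel` and its functions are hypothetical, and the motivation (moving a constant out of src0 so it can be folded in as the other operand) is inferred from the commit subject rather than shown in this log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a predicated select: picks src0 when the
 * (possibly inverted) predicate holds, src1 otherwise. */
struct demo_sel {
   bool predicate_inverse;
   int src0;
   int src1;
};

static int demo_sel_eval(const struct demo_sel *sel, bool flag)
{
   bool take_src0 = sel->predicate_inverse ? !flag : flag;
   return take_src0 ? sel->src0 : sel->src1;
}

/* Exchange the two sources. Inverting the predicate at the same time
 * keeps the selected value unchanged for every flag value. */
static struct demo_sel demo_sel_swap_sources(struct demo_sel sel)
{
   struct demo_sel swapped = {
      .predicate_inverse = !sel.predicate_inverse,
      .src0 = sel.src1,
      .src1 = sel.src0,
   };
   return swapped;
}
```

Unlike add or mul, select is not commutative on its own, which is why the swap has to be paired with the predicate flip.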
* i965/fs: Constant-fold immediates in src0 of CMP instructions. (Eric Anholt, 2011-04-13, 1 file, -0/+22)
  This is like what we do with add/mul, but we also have to flip the conditional test.
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Add support for the instruction compression bits on gen6. (Eric Anholt, 2010-12-06, 1 file, -1/+34)
  Since the 8-wide first-quarter and 16-wide first-half have the same bit encoding, we now need to track "do you want instruction compression" in the compile state.
* i965: Add AccWrCtl support on Sandybridge. (Zhenyu Wang, 2010-08-20, 1 file, -0/+6)
  Whenever the accumulator results are needed, this bit must be set.
* Replace _mesa_malloc, _mesa_calloc and _mesa_free with plain libc versions (Kristian Høgsberg, 2010-02-19, 1 file, -2/+2)
* i965: Spell "conditional" correctly. (Eric Anholt, 2009-08-04, 1 file, -1/+1)
* i965: rewrite the code for handling shader subroutine calls (Brian Paul, 2009-02-13, 1 file, -0/+123)
  Previously, the prog_instruction::Data field was used to map original Mesa instructions to brw instructions in order to resolve subroutine calls. This was a rather tangled mess. Plus it's an obstacle to implementing dynamic allocation/growing of the instruction buffer (it's still a fixed size).
  Mesa's GLSL compiler emits a label for each subroutine and CAL instruction. Now we use those labels to patch the subroutine calls after code generation has been done. We just keep a list of all CAL instructions that need patching and a list of all subroutine labels. It's a simple matter to resolve them.
  This also consolidates some redundant post-emit code between brw_vs_emit.c and brw_wm_glsl.c and removes some loops that cleared the prog_instruction::Data fields at the end.
  Plus, a bunch of new comments.
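The two-list scheme described in this commit (record labels and CAL sites during code generation, resolve them afterwards) might look roughly like this. It is a hedged sketch: every name below is hypothetical, the fixed-size arrays are a simplification, and jump targets are modelled as plain instruction-index deltas rather than the hardware's encoding.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 16

/* Hypothetical bookkeeping: where each subroutine label landed, and
 * which call instructions still need a jump target. */
struct demo_label { const char *name;   int position;  };
struct demo_call  { const char *target; int call_insn; };

struct demo_patcher {
   struct demo_label labels[DEMO_MAX_ENTRIES];
   int nr_labels;
   struct demo_call calls[DEMO_MAX_ENTRIES];
   int nr_calls;
};

static void demo_save_label(struct demo_patcher *pt, const char *name, int position)
{
   assert(pt->nr_labels < DEMO_MAX_ENTRIES);
   pt->labels[pt->nr_labels].name = name;
   pt->labels[pt->nr_labels].position = position;
   pt->nr_labels++;
}

static void demo_save_call(struct demo_patcher *pt, const char *target, int call_insn)
{
   assert(pt->nr_calls < DEMO_MAX_ENTRIES);
   pt->calls[pt->nr_calls].target = target;
   pt->calls[pt->nr_calls].call_insn = call_insn;
   pt->nr_calls++;
}

/* After code generation: write a relative offset for every recorded call
 * into jump_offsets[call_insn]. Returns the number of calls whose label
 * was never recorded. */
static int demo_resolve_calls(const struct demo_patcher *pt, int *jump_offsets)
{
   int unresolved = 0;
   for (int i = 0; i < pt->nr_calls; i++) {
      const struct demo_call *call = &pt->calls[i];
      int found = 0;
      for (int j = 0; j < pt->nr_labels; j++) {
         if (strcmp(call->target, pt->labels[j].name) == 0) {
            jump_offsets[call->call_insn] = pt->labels[j].position - call->call_insn;
            found = 1;
            break;
         }
      }
      if (!found)
         unresolved++;
   }
   return unresolved;
}
```

Because resolution runs once at the end, a call may be emitted before the subroutine it targets; the order in which labels and calls are recorded does not matter.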
* i965: new integrated graphics chipset support (Xiang, Haihao, 2008-01-29, 1 file, -1/+2)
* Add Intel i965G/Q DRI driver. (Eric Anholt, 2006-08-09, 1 file, -0/+130)
  This driver comes from Tungsten Graphics, with a few further modifications by Intel.