path: root/src/mesa/drivers/dri/i965
Commit message [Author, Age, Files, Lines]
* i965: Use intel->gen >= 6 instead of IS_GEN6. [Eric Anholt, 2010-08-22, 3 files, -5/+5]
|
* i965: Rename nr_depth_regs to nr_payload_regs. [Eric Anholt, 2010-08-20, 5 files, -8/+8]
|   Only 8 out of the up to 13 regs are for source/dest depth, so the name wasn't particularly appropriate. Note that this doesn't count the constant or URB payload regs. Also, don't pre-divide by 2, so it's actually a number of registers.
* i965: Also use the SIMD8 FB writes for SIMD8 mode on non-SNB. [Eric Anholt, 2010-08-20, 3 files, -17/+18]
|
* i965: Add support for FB writes on Sandybridge. [Zhenyu Wang, 2010-08-20, 2 files, -12/+74]
|
* i965: Set the destination horiz stride even for da16, as SNB seems to need it. [Zhenyu Wang, 2010-08-20, 2 files, -2/+6]
|
* i965: Set the maximum number of threads on Sandybridge. [Zhenyu Wang, 2010-08-20, 1 file, -1/+5]
|
* i965: Add AccWrCtl support on Sandybridge. [Zhenyu Wang, 2010-08-20, 5 files, -2/+20]
|   Whenever the accumulator results are needed, this bit must be set.
* i965: Mention the mlen and rlen for URB reads. [Zhenyu Wang, 2010-08-20, 1 file, -0/+5]
|
* i965: Sandybridge doesn't have Compr4 mode, since it's not needed any more. [Zhenyu Wang, 2010-08-20, 1 file, -1/+2]
|
* i965: Adjust disasm of subreg numbers to be in units of the register type. [Zhenyu Wang, 2010-08-20, 1 file, -6/+20]
|   This makes reading the code easier when matching up to the specs, which also use this format.
* i965: Fix DP write channel ordering on Sandybridge. [Eric Anholt, 2010-08-20, 1 file, -2/+25]
|   The SIMD16 message no longer has the goofy interleaved format that made Compr4 compression necessary before.
* i965: Fix compile warnings on 64-bit Linux. [Kenneth Graunke, 2010-08-20, 1 file, -4/+4]
|   format ‘%d’ expects type ‘int’, but argument 2 has type ‘long int’
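The warning quoted above is the usual mismatch between a `%d` conversion and a `long` argument, which differ in size on 64-bit Linux. A minimal sketch of the two common fixes follows; the helper names are hypothetical and are not taken from the driver, and the log does not say which fix the commit actually used.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not driver code): "%ld" matches the long argument,
 * so the format/argument mismatch warning no longer applies. */
int format_count_long(char *buf, size_t len, long count)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%ld", count);
}

/* Alternative: keep "%d" and cast, for values known to fit in an int. */
int format_count_int(char *buf, size_t len, long count)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%d", (int) count);
}
```

Matching the conversion to the argument type is the safer choice when the value may exceed the range of `int`; the cast only suits values known to be small.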
* i965: Set the if stack pop count when breaking out of a loop inside an if. [Eric Anholt, 2010-08-18, 1 file, -5/+11]
|   Otherwise, we might end up with the if stack pointing at the wrong place. Fixes GPU hang with glsl-vs-if-loop.
* i965: Don't set the swizzle on an immediate value in the VS. [Eric Anholt, 2010-08-18, 2 files, -4/+11]
|   Fixes glsl-vs-if-nested (70.0 is not <= 70.000648 thanks to the swizzle bits getting set). Some safety checks are added to make sure this doesn't happen again as we increase the usage of immediate values in program generation.
* i965: Throw a link error when we see a "return" in main(). [Eric Anholt, 2010-08-17, 1 file, -0/+8]
|   We'll need to use the HALT instruction to do this right, like returns from other functions.
* i965: Add support for DP2 in the VS. [Eric Anholt, 2010-08-17, 1 file, -0/+4]
|   Fixes glsl-vs-dot-vec2.
* i965: Use the implied move available in most brw_wm_emit brw_math() calls. [Eric Anholt, 2010-08-16, 1 file, -16/+4]
|   This saves an extra message reg move in the program, though I'm not clear on whether it will have any performance impact other than cache footprint. It will also fix those math calls on Sandybridge, where the brw_eu_emit.c brw_math() support relies on the implied move being used.
* i965: Add disasm for Compr4 instruction compression. [Eric Anholt, 2010-08-16, 1 file, -1/+16]
|
* Merge branch 'master' into glsl2 [Ian Romanick, 2010-08-13, 3 files, -63/+65]
|\
| * mesa: Remove inclusion of compiler.h from mtypes.h. [Vinson Lee, 2010-07-31, 1 file, -0/+2]
| |   mtypes.h does not use any symbols from compiler.h. Also add the required headers for files that depended on symbols from compiler.h but were indirectly including compiler.h through mtypes.h.
| * intel: Declare the various tracked state variables using "extern" [Kristian Høgsberg, 2010-07-29, 1 file, -62/+62]
| |
| * intel: Remove unused intel/server files [Kristian Høgsberg, 2010-07-27, 1 file, -1/+1]
| |
* | intel: Remove include of texmem.h, since we haven't used it in ages. [Eric Anholt, 2010-08-13, 1 file, -1/+1]
| |
* | i965: More s/stderr/stdout/ for program debug. [Eric Anholt, 2010-08-09, 3 files, -3/+3]
| |
* | i965: Settle on printing our program debug to stdout. [Eric Anholt, 2010-08-04, 3 files, -10/+11]
| |   Mixing stderr (_mesa_print_program, _mesa_print_instruction, _mesa_print_alu) with stdout means that when writing both to a file, there isn't a consistent ordering between the two.
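The ordering problem described above typically arises because stdout is fully buffered when redirected to a file while stderr is not, so lines written to the two streams reach the file at different times. A minimal sketch of the remedy, assuming a hypothetical `dump_program_debug` routine (not a Mesa function) that sends every line to one caller-chosen stream:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical dump routine: all debug lines go to the single stream that
 * is passed in, so they appear in the output in call order.
 * Returns the number of lines written, or -1 on a write error. */
int dump_program_debug(FILE *out, const char *const *lines, int count)
{
    int i;

    assert(out != NULL);
    for (i = 0; i < count; i++) {
        if (fprintf(out, "%s\n", lines[i]) < 0)
            return -1;
    }
    return count;
}
```

Passing `stdout` for every caller gives the single-stream behaviour the commit settles on; the stream is a parameter here only to keep the sketch easy to exercise.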
* | Initialize a couple of HasIndex2 fields on Mesa IR src regs. [Eric Anholt, 2010-08-02, 1 file, -0/+1]
| |
* | ir_to_mesa: Respect the driver if it rejects a shader. [Eric Anholt, 2010-07-28, 1 file, -4/+2]
| |
* | Merge remote branch 'origin/master' into glsl2 [Eric Anholt, 2010-07-26, 34 files, -256/+1092]
|\|
| |   This pulls in multiple i965 driver fixes which will help ensure better testing coverage during development, and also gets past the conflicts of the src/mesa/shader -> src/mesa/program move.
| |
| |   Conflicts:
| |     src/mesa/Makefile
| |     src/mesa/main/shaderapi.c
| |     src/mesa/main/shaderobj.h
| * i965: Fix reversed naming of the operations in compute-to-mrf optimization. [Eric Anholt, 2010-07-26, 3 files, -6/+11]
| |   Also fix up comments, so that the difference between the two passes is clarified.
| * i965: Clean up a few magic numbers to use brw_defines.h defs. [Eric Anholt, 2010-07-26, 3 files, -18/+20]
| |
| * i965: Use MIN2, MAX2 instead of rolling our own. [Eric Anholt, 2010-07-26, 1 file, -15/+12]
| |
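As an illustration of the MIN2/MAX2 cleanup named in the commit above, here is a minimal sketch of replacing hand-written comparisons with the two macros. The macro definitions are local stand-ins written for this example (Mesa supplies its own), and `clamp_reg_index` is a hypothetical helper, not driver code.

```c
#include <assert.h>

/* Local stand-ins for this example only; Mesa provides its own MIN2/MAX2. */
#ifndef MIN2
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#endif
#ifndef MAX2
#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#endif

/* Hypothetical helper: clamp value into [lo, hi] without open-coded
 * if/else chains. */
int clamp_reg_index(int value, int lo, int hi)
{
    assert(lo <= hi);
    return MIN2(MAX2(value, lo), hi);
}
```

Because these are macros, arguments with side effects are evaluated more than once, so they are best used with plain variables and constants.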
| * i965: Fold the "is arithmetic" bit of 965 opcodes into the opcode list. [Eric Anholt, 2010-07-26, 1 file, -50/+26]
| |
| * i965: Remove some duped register size/count definitions [Eric Anholt, 2010-07-26, 2 files, -34/+26]
| |
| * i965: Move the GRF-to-MRF optimizations to brw_optimize.c. [Eric Anholt, 2010-07-26, 3 files, -619/+618]
| |
| * i965: Improve (i.e. remove) some grf-to-mrf unnecessary moves [Benjamin Segovia, 2010-07-26, 1 file, -2/+626]
| |   Several routines directly analyze the grf-to-mrf moves from the Gen binary code. When it is possible, the mov is removed and the message register is directly written in the arithmetic instruction.
| |
| |   Also, redundant mrf-to-grf moves are removed (frequently, for example, when sampling many textures with the same uv).
| |
| |   Code was tested with piglit, warsow and nexuiz on an Ironlake machine. No regression was found there.
| |
| |   Note that the optimizations are *deactivated* on Gen4 and Gen6 since I have not tested them properly yet. There is no reason to expect bugs, but who knows.
| |
| |   The optimizations are currently done in branch-free programs *only*. Considering branches is more complicated, and there are actually two paths: one for branch-free programs and one for programs with branches.
| |
| |   Also, some other optimizations should be done during the emission itself, but considering that some code is shared between vertex shaders (AOS) and pixel shaders (SOA) and that we may have branches or not, it is pretty hard to both factorize the code and have one good set of strategies.
| * i965: Allow VS MOVs to use immediate constants. [Eric Anholt, 2010-07-26, 1 file, -0/+1]
| |   Clarifies program assembly, and with a little tweak to always use constant_map, we could cut down on constant buffer payload.
| * i965: Cleanly fail programs with unsupported array access. [Eric Anholt, 2010-07-23, 1 file, -1/+28]
| |   This should be more useful for developers and for bug triaging than just generating wrong code.
| * i965: Add support for VS relative addressing of temporary arrays. [Eric Anholt, 2010-07-23, 1 file, -2/+49]
| |   Fixes glsl-vs-arrays. Bug #27388.
| * i965: Respect VS/VP point size result when enabled. [Eric Anholt, 2010-07-22, 1 file, -3/+4]
| |   Fixes glsl-vs-point-size.
| * i965: Fix the disasm output for da16 src widths. [Eric Anholt, 2010-07-22, 1 file, -1/+1]
| |   This has confused me twice now. It's a fixed width of 4 (usually a region description of <4,4,1>), not 1. If it was 1, we'd have been skipping all over register space.
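The point about width 4 versus width 1 can be checked by computing which element each channel reads under a region description <vstride,width,hstride>. A minimal sketch, assuming the usual reading of a region (channels fill a row of `width` elements spaced `hstride` apart, and successive rows start `vstride` elements apart); `region_element_offset` is a hypothetical name, not a driver function.

```c
#include <assert.h>

/* Hypothetical helper: element offset read by a given channel under a
 * region <vstride,width,hstride>. Offsets are in units of the register's
 * element type, not bytes. */
int region_element_offset(int channel, int vstride, int width, int hstride)
{
    int row, col;

    assert(channel >= 0 && width > 0);
    row = channel / width;   /* which row of the region */
    col = channel % width;   /* position within that row */
    return row * vstride + col * hstride;
}
```

Under <4,4,1> each channel i reads element i, so access is contiguous. With a width of 1 and the same vertical stride, channel i would read element 4*i, which is the skipping through register space that the commit message describes.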
| * i965: Avoid extra MOV in VS indirect register reads. [Eric Anholt, 2010-07-22, 1 file, -15/+16]
| |
| * i965: Fix up VS temporary array access for fixed index offset != 0. [Eric Anholt, 2010-07-22, 1 file, -1/+1]
| |
| * i965: In the VS, multiply the address reg by the appropriate register size. [Eric Anholt, 2010-07-21, 1 file, -27/+14]
| |   The ARL value is increments of vec4 in the register file. But PROGRAM_TEMPORARY or PROGRAM_INPUT are stored as vec4s interleaved between the two verts being executed (thus a vec8 each), compared to PROGRAM_STATE_VAR being packed vec4s.
| |
| |   Fixes: glsl-vs-arrays-2, glsl-vs-mov-after-deref (without regressing glsl-vs-arrays-3)
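The scaling described above amounts to choosing a per-index stride by register file. A minimal sketch, assuming 16 bytes per packed vec4 of 32-bit floats and 32 bytes per interleaved vec8 slot; the enum and function names are hypothetical and do not mirror the driver's code.

```c
#include <assert.h>

/* Hypothetical stand-ins for the register files named in the commit. */
enum example_file {
    EXAMPLE_TEMPORARY,   /* vec4s interleaved across two vertices: vec8 slots */
    EXAMPLE_INPUT,       /* same interleaved layout */
    EXAMPLE_STATE_VAR    /* packed vec4s */
};

/* Convert an ARL index (counted in vec4s) to a byte offset, using the
 * slot size of the file being indexed. Sizes are assumptions, see above. */
int address_byte_offset(enum example_file file, int arl_index)
{
    switch (file) {
    case EXAMPLE_TEMPORARY:
    case EXAMPLE_INPUT:
        return arl_index * 32;
    case EXAMPLE_STATE_VAR:
        return arl_index * 16;
    }
    assert(!"unknown register file");
    return 0;
}
```

The same index therefore lands twice as far into the interleaved files as into the packed one, which is why a single fixed multiplier could not serve both cases.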
| * i965: Clean up brw_dp_READ_4_vs() now that it has fewer options to support. [Eric Anholt, 2010-07-21, 3 files, -52/+31]
| |
| * i965: Support relative addressed VS constant reads using the appropriate msg. [Eric Anholt, 2010-07-21, 3 files, -31/+66]
| |   The previous support was overly complicated by trying to use the same 1-OWORD message for both offsets.
| * i965: Fix the DP read msg_control definitions other than plain OWORD. [Eric Anholt, 2010-07-21, 1 file, -6/+16]
| |
| * i965: Clean up dead code from the VS get_constant/get_reladdr_constant split. [Eric Anholt, 2010-07-21, 1 file, -3/+1]
| |
| * i956: Set the execution size correctly for scratch space writes. [Eric Anholt, 2010-07-21, 1 file, -2/+2]
| |   Otherwise, the second half isn't written, and we end up reading back black. Fixes the remaining junk drawn in glsl-max-varyings, and will likely help with a number of large real-world shaders.
| * i965: Set the GEM domain flags for the scratch space. [Eric Anholt, 2010-07-21, 1 file, -1/+1]
| |   They go into the render cache, so while we don't care about their contents after execution, failing to note them could cause the writes to be flushed over important buffer contents later.
| * i965: Use the pretty define for 4-oword DP reads. [Eric Anholt, 2010-07-21, 1 file, -1/+1]
| |