path: root/src/mesa/drivers/dri/i965/brw_shader.cpp
Commit message | Author | Age | Files | Lines
* i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell. | Kenneth Graunke | 2014-06-23 | 1 | -2/+7
  As far as I can tell, Broadwell doesn't need any of the SURFACE_STATE workarounds for textureGather() bugs, so there's no need to emit a second set of identical copies. To keep things simple, just point the gather surface index base to the same place as the texture surface index base.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
  Cc: "10.2" <[email protected]>
* i965/fs: Add SHADER_OPCODE_LOAD_PAYLOAD. | Matt Turner | 2014-06-17 | 1 | -0/+3
  Will be used to simplify the handling of large virtual GRFs in SSA form.

  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add SHADER_OPCODE_SHADER_TIME_ADD to dump_instructions() decode. | Kenneth Graunke | 2014-06-15 | 1 | -0/+2
  "shader_time_add" is a lot more informative than "op152".

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Use brw->gen in some generation checks. | Matt Turner | 2014-06-11 | 1 | -2/+2
  Will simplify the automated conversion if we want to allow compiling the driver for a single generation.

  Reviewed-by: Kristian Høgsberg <[email protected]>
* i965: Skip IR annotations with INTEL_DEBUG=noann. | Matt Turner | 2014-06-01 | 1 | -2/+4
  Running shader-db with INTEL_DEBUG=noann reduces the runtime from ~90 to ~80 seconds on my machine. It also reduces the disk space consumed by the .out files from 660 MB (676 on disk) to 343 MB (358 on disk).

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Give dump_instructions() a filename argument. | Matt Turner | 2014-06-01 | 1 | -2/+20
  This will allow debugging code to dump the IR after an optimization pass makes progress (the next patch). Only let it open and write to a file if the effective user isn't root.

  Reviewed-by: Kenneth Graunke <[email protected]>
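  The rule described in this message (write to a named file only when the effective user is not root) can be sketched roughly as follows. This is a minimal illustration, not the driver's code: open_dump_target() and dump_lines() are hypothetical names, and falling back to stderr is an assumption based on the later "Move compiler debugging output to stderr" change in this log.

```cpp
#include <cstdio>
#include <unistd.h>   // geteuid() (POSIX)

// Hypothetical helper: pick the stream that a dump should go to.
// Opens `name` for writing only when a name was given and the effective
// user is not root; in every other case it falls back to stderr.
FILE *open_dump_target(const char *name, bool *needs_close)
{
   *needs_close = false;
   if (name != nullptr && geteuid() != 0) {
      FILE *file = std::fopen(name, "w");
      if (file != nullptr) {
         *needs_close = true;
         return file;
      }
   }
   return stderr;
}

// Hypothetical caller: write each line to the chosen target and close the
// stream only if this code opened it.
void dump_lines(const char *name, const char *const *lines, int count)
{
   bool needs_close = false;
   FILE *file = open_dump_target(name, &needs_close);
   for (int i = 0; i < count; i++)
      std::fprintf(file, "%s\n", lines[i]);
   if (needs_close)
      std::fclose(file);
}
```

  Passing a null name keeps the old behaviour of dumping to the default stream, so existing callers would not need to change.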
* i965: Print disassembly after compaction. | Matt Turner | 2014-05-24 | 1 | -0/+54
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Relax accumulator dependency scheduling on Gen < 6 | Iago Toral Quiroga | 2014-05-13 | 1 | -0/+10
  Many instructions implicitly update the accumulator on Gen < 6. The instruction scheduling code just calls add_barrier_deps() for each accumulator access on these platforms, but a large class of operations don't actually update the accumulator -- mostly move and logical instructions. Teaching the scheduling code about this would allow more flexibility to schedule instructions.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740
  Reviewed-by: Matt Turner <[email protected]>
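  The idea in this message can be sketched as a predicate that the scheduler consults before adding barrier dependencies. Everything below is hypothetical: the instruction classes, the struct, and the function name are stand-ins, and the real set of opcodes that leave the accumulator alone is not listed in this log.

```cpp
// Hypothetical coarse instruction classes; the message only says that
// "mostly move and logical instructions" do not update the accumulator.
enum op_class_sketch { OP_CLASS_MOVE, OP_CLASS_LOGIC, OP_CLASS_ARITHMETIC };

struct sched_inst_sketch {
   op_class_sketch op_class;
   bool writes_accumulator;   // instruction names the accumulator explicitly
};

// True when the scheduler has to assume the accumulator is written.
// Assumption: on Gen < 6 only the arithmetic class updates it implicitly.
bool writes_accumulator_implicitly(int gen, const sched_inst_sketch &inst)
{
   if (inst.writes_accumulator)
      return true;
   return gen < 6 && inst.op_class == OP_CLASS_ARITHMETIC;
}
```

  With such a predicate, a move on Gen 5 would no longer act as a scheduling barrier, while an add on the same hardware still would.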
* i965: Add reads_accumulator_implicitly() function. | Matt Turner | 2014-04-16 | 1 | -0/+13
  Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Drop do_common_optimization's max_unroll_iterations parameter. | Kenneth Graunke | 2014-04-11 | 1 | -1/+1
  Now that we pass in gl_shader_compiler_options, it makes sense to just use options->MaxUnrollIterations, rather than passing a separate parameter. Half of the invocations already passed options->MaxUnrollIterations, while the other half passed in a hardcoded value of 32.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Use EmitNoIndirect flags in lower_variable_index_to_cond_assign. | Kenneth Graunke | 2014-04-11 | 1 | -8/+7
  This will prevent the two from getting out of sync again.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove stale comment. | Eric Anholt | 2014-04-08 | 1 | -1/+0
  We stopped doing variable index lowering for uniforms in a64c1eb9b110f29b8abf803a8256306702629bdc, 5 months after the comment was added.

  Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Pass ctx->Const.NativeIntegers to do_common_optimization(). | Kenneth Graunke | 2014-04-08 | 1 | -1/+2
  The next few patches will introduce an optimization that only works when integers are not represented as floating point values.

  v2: Re-word-wrap a line, as requested by Ian Romanick.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Mark is_tex() and friends as const. | Matt Turner | 2014-04-05 | 1 | -5/+5
  Reviewed-by: Anuj Phogat <[email protected]>
* mesa/sso: rename Shader to the pointer _Shader | Gregory Hainaut | 2014-03-25 | 1 | -2/+2
  Basically a sed, except for shaderapi.c and get.c:
  get.c => GL_CURRENT_PROGRAM always refers to the "old" UseProgram behavior
  shaderapi.c => the old API still updates the Shader object directly

  V2: formatting improvement

  V3 (idr):
  * Rebase fixes after a block of code was moved from ir_to_mesa.cpp to shaderapi.c.
  * Trivial reformatting.

  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Merge resolving of shader program source | Topi Pohjolainen | 2014-03-05 | 1 | -1/+4
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Merge initialisation of backend_visitor | Topi Pohjolainen | 2014-03-05 | 1 | -0/+12
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Don't try to dump shader source for fixed-function FS programs. | Kenneth Graunke | 2014-02-26 | 1 | -1/+1
  sh->Source is NULL and this will segfault.

  Fixes MESA_GLSL=dump with "The Swapper".

  Cc: [email protected]
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Stop lowering ir_triop_lrp. | Kenneth Graunke | 2014-02-26 | 1 | -2/+0
  Both the vector and scalar backends now support it natively, so there's no point in lowering it.

  Cc: "10.1" <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
* glsl/i965: move lower_offset_array up to GLSL compiler level. | Dave Airlie | 2014-02-25 | 1 | -1/+1
  This lowering pass will be useful for gallium drivers as well, in order to support the GL TG4 oddity that is textureGatherOffsets.

  Reviewed-by: Chris Forbes <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* i965: Move compiler debugging output to stderr. | Eric Anholt | 2014-02-22 | 1 | -13/+12
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* glsl: Add a file argument to the IR printer. | Eric Anholt | 2014-02-22 | 1 | -1/+1
  While we want to be able to print to stdout for glsl_compiler, for debugging drivers we want to be able to dump to stderr because that's where other driver debug (like LIBGL_DEBUG) tends to go, and because some apps actually close stdout to shut up their own messages (such as the X Server, or NWN).

  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* Revert "i965/fs: Make fs_reg's type an enum for better debugging." | Matt Turner | 2014-02-21 | 1 | -1/+1
  This reverts commit 5ceadd29b0af835d741bcf09b9622c628e549ae6.

  I rebased and apparently failed to build test.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75355
* i965/fs: Make fs_reg's type an enum for better debugging. | Matt Turner | 2014-02-21 | 1 | -1/+1
  Since the enum is marked as packed, it'll still take only one byte.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: support gl_InvocationID for gen7 | Jordan Justen | 2014-02-20 | 1 | -0/+2
  v2:
  * Make gl_InvocationID a system value

  v3:
  * Properly shift from R0.1 into DST.4 by adding GS_OPCODE_GET_INSTANCE_ID

  Signed-off-by: Jordan Justen <[email protected]>
  Acked-by: Paul Berry <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965: Unify fs_generator:: and vec4_generator::mark_surface_used as a free function. | Francisco Jerez | 2014-02-19 | 1 | -1/+1
  This way it can be used anywhere. I need it from the visitor.

  Reviewed-by: Paul Berry <[email protected]>
* glsl: Add image type to the GLSL IR. | Francisco Jerez | 2014-02-12 | 1 | -0/+2
  v2: Reuse the glsl_sampler_dim enum for images. Reuse the glsl_type::sampler_* fields instead of creating new ones specific to image types. Reuse the same constructor as for samplers adding a new 'base_type' argument.

  Reviewed-by: Paul Berry <[email protected]>
* i965: Add can_do_saturate() method to backend_instruction. | Matt Turner | 2014-01-28 | 1 | -0/+44
  Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: introduce blorp specific rt-write for fs_generator | Topi Pohjolainen | 2014-01-23 | 1 | -0/+2
  The compiler for blorp programs likes to emit instructions for the message construction itself meaning that the generator needs to skip any such when blorp programs are translated for the hw.

  In addition, the binding table control is special for blorp programs and the generator does not need to update the binding tables associated with the compiler bookkeeping (this in fact gets thrown away as the blorp compiler sets the program data in its own way).

  v2 (Paul): do not hardcode the binding table index but use fs_inst::target instead.

  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965/fs: introduce non-compressed equivalent of tex_cms | Topi Pohjolainen | 2014-01-23 | 1 | -0/+3
  v2: introduces 'SHADER_OPCODE_TXF_UMS' also for gen8

  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: rename tex_ms to tex_cms | Topi Pohjolainen | 2014-01-23 | 1 | -3/+3
  Prepares for the introduction of non-compressed multi-sampled lookup used in the blorp programs.

  v2: now also taking into account gen8

  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program. | Paul Berry | 2014-01-21 | 1 | -1/+1
  In my recent zeal to refactor Mesa's handling of the gl_shader_stage enum, I accidentally wound up with two functions that do the same thing: _mesa_program_index_to_target(), and _mesa_shader_stage_to_program().

  This patch keeps _mesa_shader_stage_to_program(), since its name is more consistent with other related functions. However, it changes the signature so that it accepts an unsigned integer instead of a gl_shader_stage--this avoids awkward casts when the function is called from C++ code.

  Reviewed-by: Chris Forbes <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* i965: Stop doing our optimization on a copy of the GLSL IR. | Eric Anholt | 2014-01-17 | 1 | -32/+23
  The original intent was that we'd keep a driver-private copy, and there would be the normal copy for swrast to make use of without the tuning (or anything more invasive we might do) specific to i965. Only, we don't generate swrast code any more, because swrast can't render current shaders anyway. Thus, our private copy is rather a waste, and we can just do our backend-specific operations on the linked shader.

  Reviewed-by: Ian Romanick <[email protected]>
* glsl: Make more use of gl_shader_stage enum in ir_set_program_inouts.cpp. | Paul Berry | 2014-01-08 | 1 | -1/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* mesa: Use gl_shader::Stage instead of gl_shader::Type where possible. | Paul Berry | 2014-01-08 | 1 | -2/+3
  This reduces confusion since gl_shader::Type is sometimes GL_SHADER_PROGRAM_MESA but is more frequently GL_SHADER_{VERTEX,GEOMETRY,FRAGMENT}. It also has the advantage that when switching on gl_shader::Stage, the compiler will alert if one of the possible enum types is unhandled. Finally, many functions in src/glsl (especially those dealing with linking) already use gl_shader_stage to represent pipeline stages; using gl_shader::Stage in those functions avoids the need for a conversion.

  Note: in the process I changed _mesa_write_shader_to_file() so that if it encounters an unexpected shader stage, it will use a file suffix of "????" rather than "geom".

  Reviewed-by: Brian Paul <[email protected]>

  v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects."

  Reviewed-by: Kenneth Graunke <[email protected]>
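  The two points made in this message (a switch over the stage enum lets the compiler flag unhandled stages, and an unexpected stage maps to a "????" suffix) can be illustrated with a small sketch. The enum and function below are hypothetical stand-ins for gl_shader_stage and the suffix logic; only "geom" and "????" come from the message, while "vert", "frag", and the fragment value 2 are assumptions.

```cpp
#include <string>

// Hypothetical stand-in for gl_shader_stage. The fixed underlying type keeps
// out-of-range values well defined, which the fallback path relies on.
enum stage_sketch : int {
   STAGE_VERTEX = 0,
   STAGE_GEOMETRY = 1,
   STAGE_FRAGMENT = 2   // value assumed
};

std::string stage_file_suffix(stage_sketch stage)
{
   // There is deliberately no default label: with -Wswitch, GCC and Clang
   // can warn here when a new enumerator is added but not handled.
   switch (stage) {
   case STAGE_VERTEX:   return "vert";   // suffix assumed
   case STAGE_GEOMETRY: return "geom";
   case STAGE_FRAGMENT: return "frag";   // suffix assumed
   }
   // Reached only for a value that is not a named enumerator.
   return "????";
}
```

  Switching on a plain GLenum instead would give the compiler no list of enumerators to check, which is the advantage the message attributes to gl_shader::Stage.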
* mesa: Store gl_shader_stage enum in gl_shader objects. | Paul Berry | 2014-01-08 | 1 | -0/+1
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Clean up nomenclature for pipeline stages. | Paul Berry | 2014-01-08 | 1 | -5/+5
  Previously, we had an enum called gl_shader_type which represented pipeline stages in the order they occur in the pipeline (i.e. MESA_SHADER_VERTEX=0, MESA_SHADER_GEOMETRY=1, etc), and several inconsistently named functions for converting between it and other representations:

  - _mesa_shader_type_to_string: gl_shader_type -> string
  - _mesa_shader_type_to_index: GLenum (GL_*_SHADER) -> gl_shader_type
  - _mesa_program_target_to_index: GLenum (GL_*_PROGRAM) -> gl_shader_type
  - _mesa_shader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string

  This patch tries to clean things up so that we use more consistent terminology: the enum is now called gl_shader_stage (to emphasize that it is in the order of pipeline stages), and the conversion functions are:

  - _mesa_shader_stage_to_string: gl_shader_stage -> string
  - _mesa_shader_enum_to_shader_stage: GLenum (GL_*_SHADER) -> gl_shader_stage
  - _mesa_program_enum_to_shader_stage: GLenum (GL_*_PROGRAM) -> gl_shader_stage
  - _mesa_progshader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string

  In addition, MESA_SHADER_TYPES has been renamed to MESA_SHADER_STAGES, for consistency with the new name for the enum.

  Reviewed-by: Kenneth Graunke <[email protected]>

  v2: Also rename the "target" field of _mesa_glsl_parse_state and the "target" parameter of _mesa_shader_stage_to_string to "stage".

  Reviewed-by: Brian Paul <[email protected]>
* Rename overloads of _mesa_glsl_shader_target_name(). | Paul Berry | 2013-12-30 | 1 | -2/+2
  Previously, _mesa_glsl_shader_target_name() had an overload for GLenum and an overload for the gl_shader_type enum, each of which behaved differently. However, since GLenum is a synonym for unsigned int, and unsigned ints are often used in place of gl_shader_type (e.g. in loop indices), there was a big risk of calling the wrong overload by mistake. This patch gives the two overloads different names so that it's always clear which one we mean to call.

  Reviewed-by: Brian Paul <[email protected]>
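  The hazard this message describes follows directly from C++ overload resolution and can be reproduced in a few lines. The names below (shader_type_sketch, target_name, name_from_loop_index) are hypothetical; only the fact that GLenum is an unsigned int comes from the message.

```cpp
#include <string>

typedef unsigned int GLenum;   // GLenum is a synonym for unsigned int

// Hypothetical stand-in for the old gl_shader_type enum.
enum shader_type_sketch { SHADER_VERTEX = 0, SHADER_GEOMETRY = 1, SHADER_FRAGMENT = 2 };

// Two overloads with different meanings, as in the old API.
std::string target_name(GLenum)             { return "GLenum overload"; }
std::string target_name(shader_type_sketch) { return "enum overload"; }

// A loop index is usually a plain unsigned. It is an exact match for the
// GLenum overload, and C++ has no implicit unsigned-to-enum conversion, so
// the enum overload is never considered even when a stage was meant.
std::string name_from_loop_index()
{
   unsigned i = SHADER_FRAGMENT;
   return target_name(i);
}
```

  Giving the two functions distinct names, as the patch does, turns this silent mis-dispatch into something visible at the call site.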
* glsl: move variables in to ir_variable::data, part I | Tapani Pälli | 2013-12-12 | 1 | -1/+1
  This patch moves the following bitfields into the data structure:

  used, assigned, how_declared, mode, interpolation, origin_upper_left, pixel_center_integer

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound. | Paul Berry | 2013-12-09 | 1 | -2/+0
  Now that loop_controls no longer creates normatively bound loops, there is no need for ir_loop::normative_bound or the lower_bounded_loops pass.

  Reviewed-by: Ian Romanick <[email protected]>
* glsl/loops: consolidate bounded loop handling into a lowering pass. | Paul Berry | 2013-12-09 | 1 | -0/+2
  Previously, all of the back-ends (ir_to_mesa, st_glsl_to_tgsi, and the i965 fs and vec4 visitors) had nearly identical logic for handling bounded loops. This replaces the duplicate logic with an equivalent lowering pass that is used by all the back-ends.

  Note: on i965, there is a slight increase in instruction count. For example, a loop like this:

      for (int i = 0; i < 100; i++) {
        total += i;
      }

  would previously compile down to this (vec4) native code:

      mov(8) g4<1>.xD 0D
      mov(8) g8<1>.xD 0D
      loop:
      cmp.ge.f0(8) null g8<4;4,1>.xD 100D
      (+f0) break(8)
      add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
      add(8) g8<1>.xD g8<4;4,1>.xD 1D
      add(8) g4<1>.xD g4<4;4,1>.xD 1D
      while(8) loop

  After this patch, the "(+f0) break(8)" turns into:

      (+f0) if(8)
      break(8)
      endif(8)

  because the back-end isn't smart enough to recognize that "if (condition) break;" can be done using a conditional break instruction. However, it should be relatively easy for a future peephole optimization to properly optimize this.

  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Add shader opcode for sampling MCS surface | Chris Forbes | 2013-12-07 | 1 | -0/+3
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add a 'has_side_effects' back-end instruction predicate. | Francisco Jerez | 2013-11-04 | 1 | -0/+11
  This patch fixes the three dead code elimination passes and the VEC4/FS instruction scheduling passes so they leave instructions with side effects alone.

  At some point it might be interesting to have the instruction scheduler calculate the exact memory dependencies between atomic ops, but they're rare enough that it seems unlikely that it will make any practical difference.

  Reviewed-by: Paul Berry <[email protected]>
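  A dead code elimination pass that respects such a predicate might look roughly like the sketch below. The opcode enum, the instruction struct, and the choice of an atomic add as the only side-effecting opcode are all hypothetical; the message only says that instructions with side effects (it mentions atomic ops) are to be left alone.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical opcodes; OP_ATOMIC_ADD stands in for any op that modifies
// memory and therefore matters even when its result is never read.
enum opcode_sketch { OP_MOV, OP_ADD, OP_ATOMIC_ADD };

struct inst_sketch {
   opcode_sketch opcode;
   bool result_is_read;   // liveness of the destination, computed elsewhere
};

bool has_side_effects(const inst_sketch &inst)
{
   return inst.opcode == OP_ATOMIC_ADD;
}

// Removes instructions whose result is unused, except those with side
// effects. Returns how many instructions were removed.
std::size_t eliminate_dead_code(std::vector<inst_sketch> &insts)
{
   const std::size_t before = insts.size();
   insts.erase(std::remove_if(insts.begin(), insts.end(),
                              [](const inst_sketch &inst) {
                                 return !inst.result_is_read &&
                                        !has_side_effects(inst);
                              }),
               insts.end());
   return before - insts.size();
}
```

  Without the has_side_effects() check, an atomic increment whose return value is ignored would be deleted even though the memory update it performs is the whole point of the instruction.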
* i965/fs: Use the gen7 scratch read opcode when possible. | Eric Anholt | 2013-10-30 | 1 | -0/+2
  This avoids a lot of message setup we had to do otherwise. Improves GLB2.7 performance with register spilling force enabled by 1.6442% +/- 0.553218% (n=4).

  v2: Use BRW_PREDICATE_NONE, improve a comment (by Paul).

  Reviewed-by: Paul Berry <[email protected]>
* i965: Merge together opcodes for SHADER_OPCODE_GEN4_SCRATCH_READ/WRITE | Eric Anholt | 2013-10-30 | 1 | -9/+5
  I'm going to be introducing gen7 variants, and the previous naming was going to get confusing.

  Reviewed-by: Paul Berry <[email protected]>
* i965: Implement ABO surface state emission. | Francisco Jerez | 2013-10-29 | 1 | -0/+7
  The maximum number of atomic buffer objects is somewhat arbitrary, we can change it in the future easily if it turns out it's not enough...

  v2: Add comments with the relevant mesa dirty bits. Fix usage of BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom.
  v3: Update binding table layout diagrams.
  v4: Resolve conflicts with the recent dynamic surface index assignment changes.

  Reviewed-by: Paul Berry <[email protected]>
* glsl: Add new atomic_uint built-in GLSL type. | Francisco Jerez | 2013-10-29 | 1 | -0/+1
  v2: Fix GLSL version in which the type became available. Add contains_atomic() convenience method. Split off atomic counter comparison error checking to a separate patch that will handle all opaque types. Include new ir_variable fields for atomic types.

  Reviewed-by: Ian Romanick <[email protected]>
* i965: Add lowering pass to fold offset into unnormalized coords | Chris Forbes | 2013-10-26 | 1 | -0/+1
  It turns out that nonzero offsets with gsampler2DRect don't work -- they just return garbage. Work around this by folding the offset into the coord.

  Done as an IR pass rather than yet another hack in the visitors because it's clear what's going on this way. Can possibly reuse this to replace the existing txf coord+offset hacks.

  V2: Use ir_builder

  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
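  The workaround relies on rectangle textures using unnormalized coordinates, which are measured in texels, so an integer texel offset can simply be added to the coordinate. A minimal sketch of that rewrite on a hypothetical lookup record (rect_lookup_sketch and fold_offset_into_coord are illustrative names, not the pass's IR types):

```cpp
// Hypothetical description of a texture lookup on a rectangle sampler:
// an unnormalized (texel-space) coordinate plus an integer texel offset.
struct rect_lookup_sketch {
   float s, t;
   int offset_x, offset_y;
};

// Folds the offset into the coordinate and clears it, so the lookup that
// reaches the hardware always carries a zero offset.
rect_lookup_sketch fold_offset_into_coord(rect_lookup_sketch lookup)
{
   lookup.s += static_cast<float>(lookup.offset_x);
   lookup.t += static_cast<float>(lookup.offset_y);
   lookup.offset_x = 0;
   lookup.offset_y = 0;
   return lookup;
}
```

  The same addition would be wrong for normalized coordinates, where one texel corresponds to 1/width or 1/height rather than 1.0.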
* i965: Add lowering pass for splitting textureGatherOffsets | Chris Forbes | 2013-10-26 | 1 | -0/+1
  Rewrites textureGatherOffsets(s, p, offsets) into

      gvec4(
         textureGatherOffset(s, p, offsets[0]).w,
         textureGatherOffset(s, p, offsets[1]).w,
         textureGatherOffset(s, p, offsets[2]).w,
         textureGatherOffset(s, p, offsets[3]).w
      )

  V2: Use ir_builder to be slightly clearer.

  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
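  The rewrite above can be mimicked on the CPU to show the data flow: four single-offset gathers are issued and the .w component of each becomes one component of the result. In this sketch the types and names (ivec2_sketch, vec4_sketch, gather_offset_fn, gather_offsets_lowered) are hypothetical, and the single-offset gather is passed in as a callable rather than being a real texture fetch.

```cpp
#include <array>
#include <cstddef>
#include <functional>

struct ivec2_sketch { int x, y; };
typedef std::array<float, 4> vec4_sketch;   // components x, y, z, w

// Stand-in for textureGatherOffset(s, p, offset) with s and p already bound.
typedef std::function<vec4_sketch(ivec2_sketch)> gather_offset_fn;

// Builds the textureGatherOffsets result from four single-offset gathers,
// taking the .w component (index 3) of each one.
vec4_sketch gather_offsets_lowered(const gather_offset_fn &gather_offset,
                                   const std::array<ivec2_sketch, 4> &offsets)
{
   vec4_sketch result = {};
   for (std::size_t i = 0; i < offsets.size(); i++)
      result[i] = gather_offset(offsets[i])[3];
   return result;
}
```

  The cost of the lowering is visible here: one multi-offset call becomes four separate gathers, three components of each being discarded.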
* i965: relax brw_texture_offset assert | Chris Forbes | 2013-10-26 | 1 | -2/+7
  Some texturing ops are about to have nonconstant offset support; the offset in the header in these cases should be zero.

  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>