path: root/src/mesa/drivers/dri/i965/brw_fs.cpp
Commit message (Author, Age, Files, Lines)
* i965/gen6+: Add support for noperspective interpolation. (Paul Berry, 2011-10-27, 1 file, -3/+6)
| This required the following changes:
|
| - WM setup now makes the appropriate set of barycentric coordinates (perspective vs. noperspective) available to the fragment shader, based on whether the shader requires perspective interpolation, noperspective interpolation, both, or neither.
|
| - The fragment shader backend now uses the appropriate set of barycentric coordinates when interpolating, based on the interpolation mode returned by ir_variable::determine_interpolation_mode().
|
| - SF setup now uses gl_fragment_program::InterpQualifier to determine which attributes are to be flat shaded (as opposed to the old logic, which only flat shaded colors).
|
| - CLIP setup now ensures that the clipper outputs non-perspective barycentric coordinates when they are needed by the fragment shader.
|
| Fixes the remaining piglit tests of interpolation qualifiers that were failing:
| - interpolation-flat-*-smooth-none
| - interpolation-flat-other-flat-none
| - interpolation-noperspective-*
| - interpolation-smooth-gl_*Color-flat-*
|
| Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: use determine_interpolation_mode(). (Paul Berry, 2011-10-27, 1 file, -4/+4)
| This patch changes how fs_visitor::emit_general_interpolation() decides what kind of interpolation to do. Previously, it used the shade model to determine how to interpolate colors, and used smooth interpolation on everything else. Now it uses ir_variable::determine_interpolation_mode(), so that it respects GLSL 1.30 interpolation qualifiers.
|
| Fixes piglit tests interpolation-flat-*-smooth-{distance,fixed,vertex} and interpolation-flat-other-flat-{distance,fixed,vertex}.
|
| Reviewed-by: Eric Anholt <[email protected]>
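The decision described in this commit can be sketched roughly as follows. This is a simplified model, not the Mesa source: the `interp` enum and `pick_interpolation()` are hypothetical names, and the rule shown (an explicit qualifier wins, otherwise colors follow the shade model and everything else is smooth) is inferred from the commit message.

```cpp
// Hypothetical stand-ins for the GLSL 1.30 interpolation qualifiers.
enum class interp { none, smooth, flat, noperspective };

// Sketch of the selection logic: an explicit qualifier always wins; without
// one, colors follow the shade model and everything else is smooth.
interp pick_interpolation(interp declared, bool is_color, bool flat_shade)
{
    if (declared != interp::none)
        return declared;
    if (is_color && flat_shade)
        return interp::flat;
    return interp::smooth;
}
```

Under this model, a color declared `noperspective` keeps that qualifier even when the shade model is flat, which is the behavior the old shade-model-only logic could not express.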
* i965/gen6+: Parameterize barycentric interpolation modes. (Paul Berry, 2011-10-27, 1 file, -7/+18)
| This patch modifies the fragment shader back-end so that instead of using a single delta_x/delta_y register pair to store barycentric coordinates, it uses an array of such register pairs, one for each possible interpolation mode.
|
| When setting up the WM, we instruct it to only provide the barycentric coordinates that are actually needed by the fragment shader--that is computed by brw_compute_barycentric_interp_modes(). Currently this function returns just BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC, because this is the only interpolation mode we support. However, that will change in a later patch.
|
| Reviewed-by: Eric Anholt <[email protected]>
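One way to picture "an array of register pairs, one per interpolation mode" driven by a bitmask of needed modes is the sketch below. Everything here is an assumption for illustration: the two-entry `interp_mode` enum, the `delta_regs` struct, the use of -1 for "not provided", and the consecutive register numbering are hypothetical and do not reflect the driver's actual payload layout.

```cpp
#include <array>

// Hypothetical, reduced set of modes; the driver's real enum is larger.
enum interp_mode { PERSPECTIVE_PIXEL = 0, NONPERSPECTIVE_PIXEL = 1, INTERP_MODE_COUNT = 2 };

// One delta_x/delta_y register pair; -1 means the mode was not requested.
struct delta_regs { int delta_x; int delta_y; };

// Hand out a contiguous register pair for each mode set in mode_mask,
// starting at first_reg. Modes that are not needed get no registers.
std::array<delta_regs, INTERP_MODE_COUNT> assign_delta_regs(unsigned mode_mask, int first_reg)
{
    std::array<delta_regs, INTERP_MODE_COUNT> regs;
    int next = first_reg;
    for (int m = 0; m < INTERP_MODE_COUNT; m++) {
        if (mode_mask & (1u << m)) {
            regs[m] = {next, next + 1};
            next += 2;
        } else {
            regs[m] = {-1, -1};
        }
    }
    return regs;
}
```

Indexing by mode keeps the interpolation code uniform: it looks up the pair for whichever mode a variable needs instead of assuming a single global pair.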
* i965/fs: Fix split_virtual_grfs() when delta_xy not in a virtual register. (Paul Berry, 2011-10-27, 1 file, -1/+1)
| This patch modifies the special case in fs_visitor::split_virtual_grfs() that prevents splitting from being applied to the delta_x/delta_y register pair (this register pair needs to remain contiguous so that it can be used by the PLN instruction). When gen>=6, this register pair is in a fixed location, not a virtual register, so it was in no danger of being split. And split_virtual_grfs' attempt not to split it was preventing some other unrelated register from being split.
|
| Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Fix comparisons with uint negation. (Eric Anholt, 2011-10-20, 1 file, -0/+13)
| The condmod instruction ends up generating garbage condition codes, because apparently the comparison happens on the accumulator value (33 bits for UD), not the truncated value that would be written.
|
| Fixes fs-op-neg-*
|
| Reviewed-by: Ian Romanick <[email protected]>
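The mismatch between the wide intermediate and the truncated result can be modeled on the CPU. This is only an illustration of the arithmetic, assuming the comparison sees a signed value one bit wider than the 32-bit destination, as the commit message suggests; the function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical model: the negated UD source held with one extra bit of
// range (a 33-bit signed intermediate fits easily in int64_t).
int64_t negate_wide(uint32_t x)
{
    return -static_cast<int64_t>(x);
}

// What would actually be written to a 32-bit UD destination.
uint32_t negate_truncated(uint32_t x)
{
    return static_cast<uint32_t>(negate_wide(x));
}

// "Is -x greater than zero?" judged on the wide intermediate value.
bool gt_zero_wide(uint32_t x)
{
    return negate_wide(x) > 0;
}

// The same question judged on the truncated value the shader expects.
bool gt_zero_truncated(uint32_t x)
{
    return negate_truncated(x) > 0u;
}
```

For x = 1 the truncated result is 0xFFFFFFFF, which is greater than zero as an unsigned value, while the wide intermediate is -1, which is not. The two views disagree, which matches the "garbage condition codes" described above.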
* intel: Convert from GLboolean to 'bool' from stdbool.h. (Kenneth Graunke, 2011-10-18, 1 file, -1/+1)
| I initially produced the patch using this bash command:
|
| for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done
|
| Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings.
|
| Finally, I cleaned up some whitespace issues introduced by the change.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Acked-by: Chad Versace <[email protected]>
| Acked-by: Paul Berry <[email protected]>
* i965: Fix computation of abs(-x) in FS (Paul Berry, 2011-10-11, 1 file, -1/+4)
| When updating a register reference to reflect the fact that we were taking its absolute value, the fragment shader back-end failed to clear the negate flag, resulting in abs(-x) getting computed as -abs(x).
|
| I also found (and fixed) a similar problem in brw_eu.h, but I'm not aware of an actual manifestation of that problem.
|
| Fixes piglit test glsl-fs-abs-neg-with-intermediate.
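The bug is easy to reproduce in a toy model of source modifiers. The sketch below assumes a register reference carries independent `negate` and `abs` flags and that evaluation applies abs first and negate second, which is what produces -abs(x) when both are set; `reg_ref` and the helper names are hypothetical, not the driver's types.

```cpp
#include <cmath>

// Hypothetical, simplified register reference with source modifiers.
struct reg_ref {
    bool negate;
    bool abs;
};

// Buggy update: sets abs but leaves a pending negate in place.
reg_ref take_abs_buggy(reg_ref r)
{
    r.abs = true;
    return r;
}

// Fixed update: taking the absolute value discards any earlier negation.
reg_ref take_abs_fixed(reg_ref r)
{
    r.abs = true;
    r.negate = false;
    return r;
}

// Evaluate the modifiers on a value: abs is applied first, then negate.
float eval(reg_ref r, float x)
{
    if (r.abs)
        x = std::fabs(x);
    if (r.negate)
        x = -x;
    return x;
}
```

Starting from a reference that already means -x, the buggy path evaluates to -3 for x = 3, while the fixed path gives 3.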
* mesa: Use gl_shader_program::_LinkedShaders instead of FragmentProgram (Ian Romanick, 2011-10-07, 1 file, -3/+5)
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: silence unused var warnings in non-debug builds (Brian Paul, 2011-10-07, 1 file, -0/+1)
| Reviewed-by: Chad Versace <[email protected]>
* i965: Reverse the operands for INT DIV prior to Gen6. (Kenneth Graunke, 2011-10-02, 1 file, -2/+15)
| Apparently on Gen4 and 5, the denominator comes first.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Tested-by: Ian Romanick <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
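A generation-dependent operand order like this is usually handled by a small swap at emit time. The following is only a sketch of that idea: `math_operands` and `int_div_operand_order()` are hypothetical names, and the only fact taken from the commit is that the denominator comes first before Gen6.

```cpp
// Hypothetical pair of operands in the order the math instruction expects.
struct math_operands {
    int first;
    int second;
};

// Return numerator and denominator in hardware order for the given
// generation: denominator first on Gen4/5, numerator first on Gen6+.
math_operands int_div_operand_order(int gen, int numerator, int denominator)
{
    if (gen < 6)
        return {denominator, numerator};
    return {numerator, denominator};
}
```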
* i965/fs: Implement integer quotient and remainder math operations. (Kenneth Graunke, 2011-10-02, 1 file, -2/+14)
| Signed-off-by: Kenneth Graunke <[email protected]>
| Tested-by: Ian Romanick <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Implement texelFetch() on Ironlake and Sandybridge. (Kenneth Graunke, 2011-09-19, 1 file, -0/+1)
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* i965: add casts to silence int/enum conversion warnings (Brian Paul, 2011-09-06, 1 file, -2/+2)
|
* mesa: put _mesa_ prefix on vert_result_to_frag_attrib() (Brian Paul, 2011-09-06, 1 file, -2/+2)
|
* Refactor code that converts between gl_vert_result and gl_frag_attrib. (Paul Berry, 2011-09-06, 1 file, -14/+2)
| Previously, this conversion was duplicated in several places in the i965 driver. This patch moves it to a common location in mtypes.h, near the declaration of gl_vert_result and gl_frag_attrib.
|
| I've also added comments to remind us that we may need to revisit the conversion code when adding elements to gl_vert_result and gl_frag_attrib.
|
| Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix Android build by removing relative includes (Chad Versace, 2011-08-30, 1 file, -2/+2)
| Replace each occurrence of
|     #include "../glsl/*.h"
| with
|     #include "glsl/*.h"
|
| Reviewed-by: Ian Romanick <[email protected]>
| Signed-off-by: Chad Versace <[email protected]>
* i965/fs: Implement textureSize (TXS) on Gen5+. (Kenneth Graunke, 2011-08-23, 1 file, -0/+1)
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Don't double-convert integer/boolean uniforms. (Kenneth Graunke, 2011-08-19, 1 file, -16/+20)
| When ctx->Const.NativeIntegers is set, Core Mesa loads integer/boolean uniforms directly, rather than loading the floating point equivalent. So, when that's set, we don't need to perform any conversions.
|
| Unfortunately, we can't properly support native integers with the old vertex shader backend, so this patch leaves them disabled for now.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* i965/vs: Run the shader backend at link time and return compile failures. (Eric Anholt, 2011-08-16, 1 file, -1/+1)
| Link failure is something that shouldn't happen, but we sometimes want it during development. The precompile also allows analysis of shader codegen with shader-db.
* i965: Rename math FS_OPCODE_* to SHADER_OPCODE_*. (Eric Anholt, 2011-08-16, 1 file, -17/+17)
| I want to just use the same enums in the VS.
* i965: Create a shared enum for hardware and compiler-internal opcodes. (Eric Anholt, 2011-08-16, 1 file, -2/+9)
| This should make gdbing more pleasant, and it might be used in sharing part of the codegen between the VS and FS backends.
* i965: Drop the reg/hw_reg distinction. (Eric Anholt, 2011-08-10, 1 file, -17/+17)
| "reg" was set in only one case, virtual GRFs pre register allocation, and would be unset and have hw_reg set after allocation. Since we never bothered with looking at virtual GRF number after allocation anyway, just use the same storage and avoid confusion.
* i965/fs: Eliminate the magic nature of virtual GRF 0. (Eric Anholt, 2011-08-10, 1 file, -6/+3)
| This was a debugging aid at one point -- virtual grf 0 should never be allocated, and it would be used if undefined register access occurred in codegen. However, it made the confusing register allocation code even more confusing by indexing things off of 1 all over.
* i965/fs: Don't upload unused uniform components. (Eric Anholt, 2011-08-05, 1 file, -3/+86)
| This saves both register space and upload bandwidth for unused values. Note that previously we were relying on the visitor not initially generating references to different sets of uniforms between the 8-wide and 16-wide code generation, and now we're relying on them dead-code eliminating the same stuff, too.
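Dropping unused components amounts to compacting the uniform list and remapping references. The sketch below shows one plausible shape for that remap; `build_uniform_remap()` and the use of -1 for "dropped" are hypothetical and are not taken from the driver.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: map each original uniform component to its slot in
// the compacted upload, or -1 if nothing in the program references it.
std::vector<int> build_uniform_remap(const std::vector<bool> &used)
{
    std::vector<int> remap(used.size(), -1);
    int next_slot = 0;
    for (std::size_t i = 0; i < used.size(); i++) {
        if (used[i])
            remap[i] = next_slot++;
    }
    return remap;
}
```

The caveat in the commit message follows directly from this scheme: if the 8-wide and 16-wide compiles computed different `used` sets, they would produce different remaps for what must be a single uploaded layout.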
* Merge branch 'glsl-to-tgsi' (Bryan Cain, 2011-08-04, 1 file, -1/+1)
|\
| | Conflicts:
| |     src/mesa/state_tracker/st_atom_pixeltransfer.c
| |     src/mesa/state_tracker/st_program.c
| * r200, r600c, i965: fix build (Bryan Cain, 2011-08-01, 1 file, -1/+1)
| |
* | i965/fs: Allow register coalescing where the source is a uniform. (Eric Anholt, 2011-07-29, 1 file, -10/+14)
| | Removes 0.8% of the fragment shader instructions on Unigine Tropics.
* | i965/fs: Optimize a * 1.0 -> a. (Eric Anholt, 2011-07-29, 1 file, -0/+43)
| | This appears in our instruction stream as a result of the brw_vs_constval.c handling.
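An algebraic simplification of this kind can be sketched as a peephole rewrite that turns a multiply by the immediate 1.0 into a plain move. The instruction and operand types below are hypothetical stand-ins, and checking only the second source for the immediate is an assumption made to keep the example short.

```cpp
// Hypothetical, minimal instruction representation.
enum class opcode { MOV, MUL };

struct operand {
    bool is_imm;   // true if this operand is a float immediate
    float imm;     // immediate value, meaningful only when is_imm is true
    int reg;       // register number, meaningful only when is_imm is false
};

struct inst {
    opcode op;
    operand src0;
    operand src1;
};

operand make_reg(int reg) { return operand{false, 0.0f, reg}; }
operand make_imm(float v) { return operand{true, v, -1}; }

// Rewrite "MUL dst, a, 1.0" into "MOV dst, a". Returns true if the
// instruction was changed so a caller can track optimization progress.
bool opt_mul_by_one(inst &i)
{
    if (i.op != opcode::MUL)
        return false;
    if (!i.src1.is_imm || i.src1.imm != 1.0f)
        return false;
    i.op = opcode::MOV;
    i.src1 = make_reg(-1);  // second source is no longer used
    return true;
}
```

Turning the multiply into a move rather than deleting it leaves the destination write in place, so later passes can decide whether the move itself can go away.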
* | i965/fs: If we see a RCP of a constant, try to constant fold it. (Eric Anholt, 2011-07-29, 1 file, -0/+14)
| |
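Constant folding a reciprocal means replacing the RCP of a known immediate with the precomputed value. The sketch below is a guess at the general shape, not the committed code: `source` and `try_fold_rcp()` are hypothetical, and refusing to fold a zero immediate is an assumption made here to avoid baking a division by zero into the program.

```cpp
// Hypothetical source operand: either a float immediate or something else.
struct source {
    bool is_imm;
    float imm;
};

// If src is a nonzero immediate, store 1.0 / src in folded and return true.
// Otherwise leave folded untouched and return false.
bool try_fold_rcp(const source &src, float &folded)
{
    if (!src.is_imm || src.imm == 0.0f)
        return false;
    folded = 1.0f / src.imm;
    return true;
}
```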
* | i965/fs: Port texture projection avoidance optimization from the old backend. (Eric Anholt, 2011-07-29, 1 file, -3/+15)
| | This is part of fixing a ~1% performance regression in OpenArena when changing the fixed function fragment shader to using the new backend. Right now this just avoids the LINTERP of the projector, not the math using it.
* | i965/fs: Stop using the exec_list iterator. (Eric Anholt, 2011-07-29, 1 file, -37/+33)
| | The old style has gone out of favor in the project, but I kept copy and pasting from existing iterator code.
* | i965/fs: Add support for TXD with shadow comparisons. (Kenneth Graunke, 2011-06-18, 1 file, -0/+3)
| | Our hardware doesn't have a sample_d_c message, so we have to do a regular sample_d and emit instructions to manually perform the comparison.
| |
| | This requires a state dependent recompile whenever the sampler's compare mode or function change. This adds the per-sampler comparison functions to brw_wm_prog_key, but only sets them when the sampler's compare mode is GL_COMPARE_R_TO_TEXTURE (i.e. only for shadow sampling).
| |
| | Signed-off-by: Kenneth Graunke <[email protected]>
| | Reviewed-by: Eric Anholt <[email protected]>
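The manual comparison that follows the sample can be pictured as below. This is a CPU-side sketch with hypothetical names (`compare_func`, `shadow_compare()`); it assumes the usual OpenGL convention that the reference value is the left-hand operand and the sampled depth the right-hand one, with 1.0 for pass and 0.0 for fail.

```cpp
// Hypothetical mirror of the GL depth comparison functions.
enum class compare_func { never, less, equal, lequal, greater, notequal, gequal, always };

// Compare the reference value against the sampled depth texel and return
// 1.0 if the test passes, 0.0 otherwise.
float shadow_compare(compare_func func, float ref, float texel)
{
    bool pass = false;
    switch (func) {
    case compare_func::never:    pass = false;        break;
    case compare_func::less:     pass = ref <  texel; break;
    case compare_func::equal:    pass = ref == texel; break;
    case compare_func::lequal:   pass = ref <= texel; break;
    case compare_func::greater:  pass = ref >  texel; break;
    case compare_func::notequal: pass = ref != texel; break;
    case compare_func::gequal:   pass = ref >= texel; break;
    case compare_func::always:   pass = true;         break;
    }
    return pass ? 1.0f : 0.0f;
}
```

Because the comparison function is baked into the generated instructions, a change of function means different code, which is why the commit adds it to the program key.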
* | i965/fs: Check for compilation failure and bail before optimizing. (Kenneth Graunke, 2011-06-18, 1 file, -0/+2)
| | Prior to this patch, it would attempt to optimize and allocate registers for the program even if it failed to compile. This seems wasteful.
| |
| | More importantly, the "message length > 11" failure seems to choke the instruction scheduler, making it somehow use an undefined value and segmentation fault.
| |
| | Signed-off-by: Kenneth Graunke <[email protected]>
| | Reviewed-by: Eric Anholt <[email protected]>
* | i965: Use state streaming on programs, and state base address on gen5+. (Eric Anholt, 2011-06-18, 1 file, -4/+2)
|/
| There will be a little bit of thrashing of the program cache BO as the cache warms up, but once the application is in steady state, this reduces relocations on gen5 and later.
|
| On my T420 laptop, cairogl firefox-talos-gfx performance improves 2.6% +/- 1.3% (n=6). No statistically significant performance difference on nexuiz (n=5).
* Fix format not a string literal error with -Werror=format-security (Eugeni Dodonov, 2011-06-10, 1 file, -1/+1)
| A trivial fix for the error "format not a string literal and no format arguments" when compiling with the -Werror=format-security flag.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
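The class of fix involved here is generic: a message that is not a string literal must be passed as an argument to a constant "%s" format rather than used as the format itself. The sketch below illustrates that pattern with a hypothetical `format_message()` helper; it is not the changed line from the driver.

```cpp
#include <cstdio>
#include <string>

// Flagged by -Werror=format-security, because msg is used as the format:
//     std::printf(msg);
// Accepted, because the format is a literal and msg is only an argument:
//     std::printf("%s", msg);

// Hypothetical helper showing the safe form: the text of msg is copied
// verbatim, so any '%' characters in it are never interpreted.
std::string format_message(const char *msg)
{
    char buf[256];
    std::snprintf(buf, sizeof(buf), "%s", msg);
    return std::string(buf);
}
```

With the safe form, a message such as "100%d" is reproduced as-is instead of making the formatter read an argument that was never supplied.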
* i965/fs: Use the embedded compare in SEL on gen6+. (Eric Anholt, 2011-05-31, 1 file, -4/+8)
| This avoids the extra CMP and the predication on SEL, so in addition to one less instruction, it makes scheduling less constrained.
|
| Improves glbenchmark Egypt performance 0.6% +/- 0.2% (n=3). Reduces FS instruction count across affected shaders in shader-db by 1.3% without regressing any.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Do a FS compile up front at link time to produce link errors. (Eric Anholt, 2011-05-27, 1 file, -17/+93)
| At glLinkShaders time, a fail() call in FS compile in 8-wide (the one that's required to succeed, though we may relax that at some point for pre-Ironlake performance) will now report out as a link error.
* i965/fs: Split the GLSL IR -> FS LIR visitor to brw_fs_visitor.cpp. (Eric Anholt, 2011-05-27, 1 file, -1679/+9)
| We now have:
| - brw_fs.cpp handles calling out to everything and optimization.
| - brw_fs_visitor.cpp handles translating to our LIR.
| - brw_fs_emit.cpp handles emitting from our LIR to native code.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Split the BRW native code emit to brw_fs_emit.cpp (Eric Anholt, 2011-05-27, 1 file, -839/+0)
| This is all separate from the visitor and the optimization passes which feed into it.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move a couple of GLSL IR -> BRW helper functions to brw_shader.cpp. (Eric Anholt, 2011-05-27, 1 file, -49/+1)
| These will be used by the VS backend as well.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move non-FS-specific shader support to brw_shader.cpp. (Eric Anholt, 2011-05-27, 1 file, -99/+0)
| These only existed in brw_fs.cpp because it was the only .cpp file in the area when I wrote them.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Avoid generating MOVs for assignments of expressions. (Eric Anholt, 2011-05-27, 1 file, -11/+71)
| No statistically significant difference measured in 3dbenchmark egypt/pro. It does reduce fragment shader instructions across shader-db by 0.3%.
* i965/fs: Move the computation of register block count from unit to compile. (Eric Anholt, 2011-05-27, 1 file, -2/+2)
| No net code size change, but unit update is down 0.8% code size pre-gen6.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Track fixed GRF regs separate from allocated GRF file in scheduling. (Eric Anholt, 2011-05-27, 1 file, -1/+1)
| There's an assumption here that fixed GRFs will never intersect with the allocated GRFs. That's true today, though it might change some day if we decide to register-allocate the regs containing push constants once they're dead.
|
| This fixes a regression in 0f7325b89038937bd428f7c89ed9859189a0ab0b in Lightsmark from the texture instructions now containing g0 references instead of having that be implied. Performance is improved 15.2% +/- 3.6% (n=3).
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34968
* i965: Remove linear_color for GL_PERSPECTIVE_CORRECTION_HINT. (Eric Anholt, 2011-05-26, 1 file, -1/+1)
| From the GL 2.1 spec:
|
| "Required perspective-correct interpolation for all fragment attributes except depth in sections 3.4.1 and 3.5.1, effectively making GL PERSPECTIVE CORRECT HINT a no-op."
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix assertion failures in unused brw_reg setup by deleting it. (Eric Anholt, 2011-05-25, 1 file, -1/+0)
| I was using undefined values to create an unused value. Go me.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37366
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix sampling on Ivybridge after headerless change. (Kenneth Graunke, 2011-05-20, 1 file, -2/+13)
| Fixes a regression since 90e922267a89fa9bef254bb257405531ceff7356.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Remove "TXD" from justification of sampler message headers. (Kenneth Graunke, 2011-05-20, 1 file, -1/+1)
| The coordinate offsets set in the m1 header are for textureOffset; they have nothing to do with textureGrad (TXD).
|
| Signed-off-by: Kenneth Graunke <[email protected]>
* i965/fs: Don't emit a header on gen5+ sample messages unless required. (Eric Anholt, 2011-05-18, 1 file, -7/+19)
| Improves glbenchmark egypt performance 0.6% +/- 0.4% (n=6).
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix GPU hang on texture2d-bias on pre-Ironlake. (Eric Anholt, 2011-05-18, 1 file, -4/+7)
| In the 16-wide rework, I missed that we were setting some things to be SIMD16 mode (corresponding to their setup in emit_texture_gen4()).
|
| Reviewed-by: Kenneth Graunke <[email protected]>