aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_wm.h
Commit message (Collapse)AuthorAgeFilesLines
* i965: Move up duplicated fields from stage-specific prog_data to ↵Francisco Jerez2014-02-191-1/+0
| | | | | | | | | | | | | brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and the simplification of the stage-specific brw_*_prog_data_compare. Reviewed-by: Paul Berry <[email protected]>
* i965: Use sample barycentric coordinates with per sample shadingAnuj Phogat2014-01-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current implementation of arb_sample_shading doesn't set 'Barycentric Interpolation Mode' correctly. We use pixel barycentric coordinates for per sample shading. Instead we should select perspective sample or non-perspective sample barycentric coordinates. It also enables using sample barycentric coordinates in case of a fragment shader variable declared with 'sample' qualifier. e.g. sample in vec4 pos; A piglit test to verify the implementation has been posted on piglit mailing list for review. V2: Do not interpolate all the 'in' variables at sample position if fragment shader uses 'sample' qualifier with one of them. For example we have a fragment shader: #version 330 #extension ARB_gpu_shader5: require sample in vec4 a; in vec4 b; main() { ... } Only 'a' should be sampled at sample location, not 'b'. Cc: [email protected] Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* s/Tungsten Graphics/VMware/José Fonseca2014-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/[email protected]/[email protected]/ s/[email protected]/[email protected]/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\[email protected]/[email protected]/g s/keithw\[email protected]/[email protected]/g s/[email protected]/[email protected]/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/[email protected]/[email protected]/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <[email protected]>
* i965/fs: add support for gl_SampleMaskIn[]Chris Forbes2013-12-141-0/+1
| | | | | | | | v2: - add assert so we don't run into trouble on Gen6. - adjust for Tapani's rearrangement of ir_variable Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop trailing whitespace from the rest of the driver.Kenneth Graunke2013-12-051-5/+5
| | | | | | | Performed via: $ for file in *; do sed -i 's/ *//g'; done Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Gen4-5: Include alpha func/ref in program keyChris Forbes2013-11-061-0/+2
| | | | | | | V2: Better explanation of the rationale for doing this. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Add FS backend for builtin gl_SampleIDAnuj Phogat2013-11-011-0/+1
| | | | | | | | | | | | | V2: - Update comments - Add compute_sample_id variables in brw_wm_prog_key - Add a special backend instruction to compute sample_id. V3: - Make changes to support simd16 mode. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Add FS backend for builtin gl_SamplePositionAnuj Phogat2013-11-011-0/+2
| | | | | | | | | | | | | V2: - Update comments. - Add compute_pos_offset variable in brw_wm_prog_key. - Add variable uses_pos_offset in brw_wm_prog_data. V3: - Make changes to support simd16 mode. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Remove dead arguments from prog_data_compare.Eric Anholt2013-10-151-2/+1
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: compute DDX in a subspan based only on top rowChia-I Wu2013-10-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Consider only the top-left and top-right pixels to approximate DDX in a 2x2 subspan, unless the application requests a more accurate approximation via GL_FRAGMENT_SHADER_DERIVATIVE_HINT or this optimization is disabled from the new driconf option disable_derivative_optimization. This results in a less accurate approximation. However, it improves the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at 95.0% confidence) on Haswell. No noticeable image quality difference observed. The improvement comes from faster sample_d. It seems, on Haswell, some optimizations are introduced to allow faster sample_d when all pixels in a subspan have the same derivative. I considered SAMPLE_STATE too, which allows one to control the quality of sample_d on Haswell. But it gave much worse image quality without giving better performance comparing to this change. No piglit quick.tests regression on Haswell (tested with v1). v2: better guess for precompile program key Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: fix alpha test for MRTChris Forbes2013-07-061-1/+1
| | | | | | | | | | | | | | | | | Include src0 alpha in the RT write message when using MRT, so it is used for the alpha test instead of the normal per-RT alpha value. Fixes broken rendering in Dota2 under Wine [FDO #62647]. No Piglit regressions on Ivybridge. V2: reuse (and simplify) existing sample_alpha_to_coverage flag in the FS key, rather than adding another redundant one. Signed-off-by: Chris Forbes <[email protected]> Reviewd-by: Paul Berry <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62647 NOTE: This is a candidate for the stable branches.
* i965: Remove render_target_supported from the vtable.Kenneth Graunke2013-07-031-2/+0
| | | | | | | | | | | | brw_render_target_supported() is the only implementation of this function, so it makes sense to just call it directly. Rather than adding an #include of brw_wm.h, this patch moves the prototype to brw_context.h. Prototypes seem to be in rather arbitrary places at the moment, and either place seems as good as the other. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Remove now dead brw_wm_prog_key::proj_attrib_mask field.Kenneth Graunke2013-04-041-2/+0
| | | | | | | | | The previous commit removed the last user of this field, so there's no longer any point in setting it. Removing this should eliminate state-dependent recompiles, and make the precompile more reliable. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Rename vp_outputs_written to input_slots_valid.Paul Berry2013-03-241-1/+1
| | | | | | | | | | | | | | With the introduction of geometry shaders, fragment inputs will no longer come exclusively from the vertex shader; sometimes they come from the geometry shader. So the name "vp_outputs_written" will become a misnomer. This patch renames vp_outputs_written to input_slots_valid, to reflect the true meaning of the bitfield from the fragment shader's point of view: it indicates which of the possible input slots contain valid data that was written by the previous shader stage. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Change fragment input related bitfields to 64-bit.Paul Berry2013-03-151-1/+1
| | | | | | | | | | | | | | This patch updates the bitfields brw_context::wm.input_size_masks, tracker::size_masks, and brw_wm_prog_key::proj_attrib_mask, all of which are indexed by gl_frag_attrib, from 32-bit to 64-bit. This paves the way for supporting geometry shaders, and for merging the gl_frag_attrib and gl_vert_result enums. The combination of these two will require at least 55 bits in the bitfields. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Tested-by: Brian Paul <[email protected]>
* i965/fs: Move struct brw_compile (p) entirely inside fs_generator.Kenneth Graunke2012-11-261-1/+0
| | | | | | | | | | | | | | | | The brw_compile structure contains the brw_instruction store and the brw_eu_emit.c state tracking fields. These are only useful for the final assembly generation pass; the earlier compilation stages doesn't need them. This also means that the code generator for future hardware won't have access to the brw_compile structure, which is extremely desirable because it prevents accidental generation of Gen4-7 code. v2: rzalloc p, as suggested by Eric. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move uses of brw_compile from do_wm_prog to brw_wm_fs_emit.Kenneth Graunke2012-11-261-3/+10
| | | | | | | | | | | | The brw_compile structure is closely tied to the Gen4-7 hardware encoding. However, do_wm_prog is very generic: it just calls out to get a compiled program and then uploads it. This isn't ultimately where we want it, but it's a step in the right direction: it's now closer to the code generator. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move brw_wm_compile::fp to fs_visitor.Kenneth Graunke2012-11-261-2/+1
| | | | | | | | Also change it from a brw_fragment_program to a gl_fragment_program, since that seems to be what everything wants anyway. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move brw_wm_compile::dispatch_width into fs_visitor.Kenneth Graunke2012-11-261-2/+0
| | | | | | | | | | | | | | Also, rather than having brw_wm_fs_emit poke at it directly, make it a parameter to the fs_visitor constructor. All other changes generated by search and replace (with occasional whitespace fixup). v2: Make dispatch_width const (as suggested by Paul); fix doxygen mistake (pointed out by Eric); update for rebase. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move brw_wm_lookup_iz() to fs_visitor::setup_payload_gen4().Kenneth Graunke2012-11-261-3/+0
| | | | | | | This necessitates compiling brw_wm_iz.c as C++. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Move brw_wm_payload_setup() to fs_visitor::setup_payload_gen6()Kenneth Graunke2012-11-261-2/+0
| | | | | | | | Now that we only have the one backend, there's no real point in keeping this separate. Moving it should allow some future simplifications. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Remove brw_wm_compile::computes_depth field.Kenneth Graunke2012-11-261-1/+0
| | | | | | | | | | Everybody determines this by checking if fp's OutputsWritten field contains the FRAG_RESULT_DEPTH bit. Rather than having payload setup check this and set the computes_depth flag, we can just do the check in the only place that actually used it: emit_fb_writes(). Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Remove the old ARB_fragment_program backend.Eric Anholt2012-10-081-364/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the param pointer arrays for the WM dynamically sized.Eric Anholt2012-09-071-0/+1
| | | | | | | | | Saves 26.5MB of wasted memory allocation in the l4d2 demo. v2: Rebase on compare func change, fix comments. Reviewed-by: Ian Romanick <[email protected]> (v1) Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add functions for comparing two brw_wm/vs_prog_data structs.Eric Anholt2012-09-071-0/+2
| | | | | | | | | | | | Currently, this just avoids comparing all unused parts of param[] and pull_param[], but it's a step toward getting rid of those giant statically sized arrays. v2: Actually use the new function instead of just looking at its address. This required changing the args to const pointers. (review by Kenneth) Reviewed-by: Kenneth Graunke <[email protected]>
* i965/msaa: flag _NEW_MULTISAMPLE in the brw_tracked_stateAnuj Phogat2012-08-301-1/+1
| | | | | | | This is required to get the program recompiled when SampleAlphaToCoverage is enabled. Reviewed-by: Paul Berry <[email protected]>
* i965/msaa: Add sample-alpha-to-coverage support for multiple render targetsAnuj Phogat2012-08-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | Render Target Write message should include source zero alpha value when sample-alpha-to-coverage is enabled for an FBO with multiple render targets. Source zero alpha value is used as fragment coverage for all the render targets. This patch makes piglit tests draw-buffers-alpha-to-coverage and alpha-to-coverage-no-draw-buffer-zero to pass on Sandybridge. No regressions are observed with piglit all.tests. V2: Revert all the changes made in emit_color_write() function to include src0 alpha for targets > 0. Now handling this case in a if block. V3: Correctly calculate the instruction length for buffer zero. Properly handle the case of dual_src_blend when alpha-to-coverage is enabled. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Add performance debug for shader recompiles.Eric Anholt2012-08-121-0/+3
| | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Always emit alpha when nr_color_buffers == 0.Kenneth Graunke2012-07-121-1/+0
| | | | | | | | | | | | | | | If alpha-testing is enabled, we need to send alpha down the pipeline even if nr_color_buffers == 0. However, tracking whether alpha-testing is enabled in the WM program key is expensive: it causes us to compile multiple specializations of the same shader, using program cache space. This patch removes the check for alpha-testing, and simply emits alpha whenever nr_color_buffers == 0. We believe this will also be necessary for alpha-to-coverage, and it should add minimal overhead to an uncommon case. Saving the recompiles should more than make up the difference. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Compute dFdy() correctly for FBOs.Paul Berry2012-06-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On i965, dFdx() and dFdy() are computed by taking advantage of the fact that each consecutive set of 4 pixels dispatched to the fragment shader always constitutes a contiguous 2x2 block of pixels in a fixed arrangement known as a "sub-span". So we calculate dFdx() by taking the difference between the values computed for the left and right halves of the sub-span, and we calculate dFdy() by taking the difference between the values computed for the top and bottom halves of the sub-span. However, there's a subtlety when FBOs are in use: since FBOs use a coordinate system where the origin is at the upper left, and window system framebuffers use a coordinate system where the origin is at the lower left, the computation of dFdy() needs to be negated for FBOs. This patch modifies the fragment shader back-ends to negate the value of dFdy() when an FBO is in use. It also modifies the code that populates the program key (brw_wm_populate_key() and brw_fs_precompile()) so that they always record in the program key whether we are rendering to an FBO or to a window system framebuffer; this ensures that the fragment shader will get recompiled when switching between FBO and non-FBO use. This will result in unnecessary recompiles of fragment shaders that don't use dFdy(). To fix that, we will need to adapt the GLSL and NV_fragment_program front-ends to record whether or not a given shader uses dFdy(). I plan to implement this in a future patch series; I've left FIXME comments in the code as a reminder. Fixes Piglit test "fbo-deriv". NOTE: This is a candidate for stable release branches. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Pass the gl_renderbuffer to render_target_supported() vtable method.Eric Anholt2012-01-271-1/+2
| | | | | | I'm going to want to go looking at it for an integer texture fix. NOTE: This is a candidate for the 8.0 branch.
* i965/fs: Factor out texturing related data from brw_wm_prog_key.Kenneth Graunke2011-12-191-16/+3
| | | | | | | | | | The idea is to reuse this for the VS and (in the future) GS as well. v2: Include yuvtex data since we're not dropping GL_MESA_ycbycr. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1] Reviewed-by: Ian Romanick <[email protected]>
* intel: Add the context to the render_target_supported() vtbl method.Eric Anholt2011-11-221-1/+1
| | | | | | | We're going to want to provide different answers per chipset generation. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa, i965: prepare for more than 8 texture targetsChia-I Wu2011-11-031-1/+1
| | | | | | | | | 3-bit fields are used store texture target in several places. That will fail when TEXTURE_EXTERNAL_INDEX, which happends to be the 9th texture target, is added. Make them 4-bit fields. Reviewed-by: Brian Paul <[email protected]> Acked-by: Jakob Bornecrantz <[email protected]>
* i965/gen6+: Parameterize barycentric interpolation modes.Paul Berry2011-10-271-0/+1
| | | | | | | | | | | | | | | | | This patch modifies the fragment shader back-end so that instead of using a single delta_x/delta_y register pair to store barycentric coordinates, it uses an array of such register pairs, one for each possible intepolation mode. When setting up the WM, we intstruct it to only provide the barycentric coordinates that are actually needed by the fragment shader--that is computed by brw_compute_barycentric_interp_modes(). Currently this function returns just BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC, because this is the only interpolation mode we support. However, that will change in a later patch. Reviewed-by: Eric Anholt <[email protected]>
* intel: Convert from GLboolean to 'bool' from stdbool.h.Kenneth Graunke2011-10-181-6/+6
| | | | | | | | | | | | | | | | | I initially produced the patch using this bash command: for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings. Finally, I cleaned up some whitespace issues introduced by the change. Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chad Versace <[email protected]> Acked-by: Paul Berry <[email protected]>
* i965/fs: Add support for TXD with shadow comparisons.Kenneth Graunke2011-06-181-0/+12
| | | | | | | | | | | | | | Our hardware doesn't have a sample_d_c message, so we have to do a regular sample_d and emit instructions to manually perform the comparison. This requires a state dependent recompile whenever the sampler's compare mode or function change. This adds the per-sampler comparison functions to brw_wm_prog_key, but only sets them when the sampler's compare mode is GL_COMPARE_R_TO_TEXTURE (i.e. only for shadow sampling). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Do a FS compile up front at link time to produce link errors.Eric Anholt2011-05-271-1/+6
| | | | | | At glLinkShaders time, a fail() call in FS compile in 8-wide (the one that's required to succeed, though we may relax that at some point for pre-Ironlake performance) will now report out as a link error.
* i965: Pack the lookup and line_aa bits into the first dword of the key.Eric Anholt2011-05-261-2/+2
| | | | | | | They were occupying whole 32-bit words, despite being only 10 or so bits. Reduces code size slightly (80/3300 bytes). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove dead shadowtex_mask entry in the WM key.Eric Anholt2011-05-261-1/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove linear_color for GL_PERSPECTIVE_CORRECTION_HINT.Eric Anholt2011-05-261-1/+0
| | | | | | | | | | From the GL 2.1 spec: "Required perspective-correct interpolation for all fragment attributes except depth in sections 3.4.1 and 3.5.1, effectively making GL PERSPECTIVE CORRECT HINT a no-op." Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add support for correct GL_CLAMP behavior by clamping coordinates.Eric Anholt2011-05-181-1/+1
| | | | | | | | This removes the stupid strict-conformance fallback code I broke when adding ARB_sampler_objects. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36572 Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* i965: Fix fragcoord_w on gen6 with 16-wide.Eric Anholt2011-04-291-5/+5
| | | | | | | | | | | | The payload regs can go all the way up to register 60+, so just give them 8 bits to be addressed by instead of 3-4 (which made source_w_reg of 8 end up 0). There's no reason to aggressively pack these fields, as they are just used as compiler information, where being easier to access is probably more important than shaving a byte or two off of the structure. Fixes piglit fragcoord_w. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36649
* i965/fs: Add initial support for 16-wide dispatch on gen6.Eric Anholt2011-04-261-1/+3
| | | | | | | | | At this point it doesn't do uniforms, which have to be laid out the same between 8 and 16. Other than that, it supports everything but flow control, which was the thing that forced us to choose 8-wide for general GLSL support. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't double-emit fragment.color writes for MRT with ARB_fp.Eric Anholt2011-04-231-1/+0
|
* intel: Add support for ARB_color_buffer_float.Eric Anholt2011-04-201-0/+1
| | | | Reviewed-by: Brian Paul <[email protected]>
* mesa: Remove the CompileShader driver hook; it's just a no-op.Kenneth Graunke2011-03-171-2/+0
|
* i965: Fix alpha testing when there is no color buffer in the FBO.Eric Anholt2011-03-151-0/+1
| | | | | We were alpha testing against an unwritten value, resulting in garbage. (part of) Bug #35073.
* intel: Add a vtbl hook for determining if a format is renderable.Eric Anholt2011-01-071-0/+1
| | | | | | | By relying on just intel_span_supports_format, some formats that aren't supported pre-gen4 were not reporting FBO incomplete. And we also complained in stderr when it happened on i915 because draw_region gets called before framebuffer completeness validation.
* i965: Drop KIL_NV from the ff/ARB_fp path since it was only used for GLSL.Eric Anholt2010-12-081-1/+0
|