aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_clip.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: Put '_default_' in the name of functions that set default state.Kenneth Graunke2014-06-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Eventually we're going to use functions to set bits on an instruction. Putting 'default' in the name of functions that alter default state will help distinguins them. This patch was generated entirely mechanically, by the following: for file in brw*.{cpp,c,h}; do sed -i \ -e 's/brw_set_mask_control/brw_set_default_mask_control/g' \ -e 's/brw_set_saturate/brw_set_default_saturate/g' \ -e 's/brw_set_access_mode/brw_set_default_access_mode/g' \ -e 's/brw_set_compression_control/brw_set_default_compression_control/g' \ -e 's/brw_set_predicate_control/brw_set_default_predicate_control/g' \ -e 's/brw_set_predicate_inverse/brw_set_default_predicate_inverse/g' \ -e 's/brw_set_flag_reg/brw_set_default_flag_reg/g' \ -e 's/brw_set_acc_write_control/brw_set_default_acc_write_control/g' \ $file; done No manual changes were done after running that command. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add annotation data structure and support code.Matt Turner2014-05-241-1/+1
| | | | | | | | | | | | | | | | Will be used to print disassembly after jump targets are set and instructions are compacted, while still retaining higher-level IR annotations and basic block information. An array of 'struct annotation' will live along side the generated assembly. The generators will populate the array with their IR annotations, and basic block pointers if the instructions began or ended a basic block pointer. We'll then update the instruction offset when we compact instructions and then using the annotations print the disassembly. Reviewed-by: Eric Anholt <[email protected]>
* i965: Pass in start_offset to brw_compact_instructions().Matt Turner2014-05-241-1/+1
| | | | | | | Let's us avoid recompacting the SIMD8 instructions when we compact the SIMD16 program. Reviewed-by: Eric Anholt <[email protected]>
* i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble.Kenneth Graunke2014-05-181-1/+1
| | | | | | | "Disassemble" is an accurate description of what this function does. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use brw_dump_compile for clip, SF, and old GS programs.Kenneth Graunke2014-05-181-4/+1
| | | | | | | | | | | | Looping over the instructions and calling brw_disasm doesn't handle compacted instructions. In most cases, this hasn't been a problem since we don't compact prior to Sandybridge. However, Sandybridge's transform feedback GS program should already be compacted, and so this ought to fix decoding of that. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Pull brw_compact_instructions() out of brw_get_program().Matt Turner2014-05-151-1/+1
| | | | | Acked-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/disasm: Disassemble the compaction control bit.Matt Turner2014-05-151-1/+1
| | | | | | | | | brw_disasm doesn't disassemble compacted instructions, so we uncompact before disassembling them which would unset the compaction control bit. Instead pass it as a separate argument. Acked-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move compiler debugging output to stderr.Eric Anholt2014-02-221-3/+3
| | | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* s/Tungsten Graphics/VMware/José Fonseca2014-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/[email protected]/[email protected]/ s/[email protected]/[email protected]/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\[email protected]/[email protected]/g s/keithw\[email protected]/[email protected]/g s/[email protected]/[email protected]/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/[email protected]/[email protected]/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <[email protected]>
* i965: Drop trailing whitespace from the rest of the driver.Kenneth Graunke2013-12-051-16/+16
| | | | | | | Performed via: $ for file in *; do sed -i 's/ *//g'; done Signed-off-by: Kenneth Graunke <[email protected]>
* i965: get rid of clip plane compactionChris Forbes2013-08-161-1/+2
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965 Gen4/5: clip: Add support for noperspective varyingsChris Forbes2013-08-011-0/+2
| | | | | | | | | | | | | | | | | | | Adds support for interpolating noperspective varyings linearly in screen space when clipping. Based on Olivier Galibert's patch from last year: http://lists.freedesktop.org/archives/mesa-dev/2012-July/024341.html At this point all -fixed and -vertex interpolation tests work. V5: Add brw_clip_compile.has_noperspective_shading rather than another key flag. V6: Real bools. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> Signed-off-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965 Gen4/5: clip: correctly handle flat varyingsChris Forbes2013-08-011-1/+4
| | | | | | | | | | | | | | | | Previously we only gave special treatment to the builtin color varyings. This patch adds support for arbitrary flat-shaded varyings, which is required for GLSL 1.30. Based on Olivier Galibert's patch from last year: http://lists.freedesktop.org/archives/mesa-dev/2012-July/024340.html V5: Move key.do_flat_shading to brw_clip_compile.has_flat_shading V6: Real bools. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965 Gen4/5: Introduce 'interpolation map' alongside the VUE mapChris Forbes2013-08-011-1/+7
| | | | | | | | | | | | | | | | | | | | | | The interpolation map (in brw->interpolation_mode) is a new auxiliary structure alongside the post-GS VUE map, which describes the interpolation modes for each VUE slot, for use by the clip and SF stages. This patch introduces a new state atom to compute the interpolation map, and adjusts the program keys for the clip and SF stages, but it is not actually used yet. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> V3: Updated for vue_map changes, intel -> brw merge, etc. (Chris Forbes) V4: Compute interpolation map as a new state atom rather than tacking it on the front of the clip setup V5: Rework commit message, make interpolation_mode_map a struct. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Delete intel_context entirely.Kenneth Graunke2013-07-091-2/+1
| | | | | | | | | | This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context. Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* i965: Move intel_context::gen and gt fields to brw_context.Kenneth Graunke2013-07-091-3/+2
| | | | | | | | | | Most functions no longer use intel_context, so this patch additionally removes the local "intel" variables to avoid compiler warnings. Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* i965: Move intel_context::reduced_primitive to brw_context.Kenneth Graunke2013-07-091-1/+1
| | | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* i965/gen4-5: Stop using bogus polygon_offset_scale field.Eric Anholt2013-06-261-1/+1
| | | | | | | | | | | | | | | | | | | | The polygon offset math used for triangles by the WM is "OffsetUnits * 2 * MRD + OffsetFactor * m" where 'MRD' is the minimum resolvable difference for the depth buffer (~1/(1<<16) or ~1/(1<<24)), 'm' is the approximated slope from the GL spec, and '2' is this magic number from the original i965 code dump that we deviate from the GL spec by because "it makes glean work" (except that it doesn't, because of some hilarity with 0.5 * approximately 2.0 != 1.0. go glean!). This clipper code for unfilled polygons, on the other hand, was doing "OffsetUnits * garbage + OffsetFactor * m", where garbage was MRD in the case of 16-bit depth visual (regardless the FBO's depth resolution), or 128 * MRD for 24-bit depth visual. This change just makes the unfilled polygons behavior match the WM's filled polygons behavior. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use brw.vue_map_geom_out instead of VS output VUE map where appropriate.Paul Berry2013-03-241-5/+4
| | | | | | | | | | | | | This patch modifies post-GS pipeline stages (transform feedback, clip, sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather than brw->vs.prog_data->vue_map. This ensures that when geometry shader support is added, these pipeline stages will consult the geometry shader output VUE map when appropriate, rather than the vertex shader output VUE map. v2: Fixed some stale "CACHE_NEW_VS_PROG" comments. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move brw_vs_prog_data::outputs_written into VUE map.Paul Berry2013-03-241-1/+1
| | | | | | | | | | | | | | Future patches will allow for there to be separate VUE maps when both a geometry shader and a vertex shader are in use. When this happens, we will want to have correspondingly separate outputs_written bitfields. Moving outputs_written into the VUE map will make this easy. For consistency with the terminology used in the VUE map, the bitfield is renamed to "slots_valid" in the process. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move VUE map computation to once at VS compile time.Eric Anholt2012-02-211-1/+1
| | | | | | | | | | With this and the previous patch, 640x480 nexuiz is running 0.169118% +/- 0.0863696% faster (n=121). On a VS state change microbenchmark, performance is increased 8.28645% +/- 0.460478% (n=52). v2: Fix CACHE_NEW_VS comment. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the userclip flag for the VUE map come from VS prog data.Eric Anholt2012-02-211-3/+3
| | | | | | | | This reduces recomputation of state based on non-clipping-related transform changes, and is a step toward removing VUE map recomputation. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move program compile to emit() time.Eric Anholt2011-10-291-2/+3
| | | | | | | Only 4 other prepare() functions are left, which don't rely on this. Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* i965: Make brw_compute_vue_map's userclip dependency a boolean.Paul Berry2011-10-061-1/+1
| | | | | | | | | | | | | Previously, brw_compute_vue_map required an argument indicating the number of clip planes in use, but all it did with it was check if it was nonzero. This patch changes brw_compute_vue_map to take a boolean instead. This allows us to avoid some unnecessary recompilation of the Gen4/5 GS and SF threads. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: Create _mesa_bitcount_64() to replace i965's brw_count_bits()Paul Berry2011-10-061-1/+1
| | | | | | | | | | | | | | | | The i965 driver already had a function to count bits in a 64-bit uint (brw_count_bits()), but it was buggy (it only counted the bottom 32 bits) and it was clumsy (it had a strange and broken fallback for non-GCC-like compilers, which fortunately was never used). Since Mesa already has a _mesa_bitcount() function, it seems better to just create a _mesa_bitcount_64() function rather than special-case this in the i965 driver. This patch creates the new _mesa_bitcount_64() function and rewrites all of the old brw_count_bits() calls to refer to it. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove two_side_color from brw_compute_vue_map().Paul Berry2011-09-061-3/+1
| | | | | | | | | | | Since we now lay out the VUE the same way regardless of whether two-sided color is enabled, brw_compute_vue_map() no longer needs to know whether two-sided color is enabled. This allows the two-sided color flag to be removed from the clip, GS, and VS keys, so that fewer GPU programs need to be recompiled when turning two-sided color on and off. Reviewed-by: Eric Anholt <[email protected]>
* i965: clip: Remove no-longer-needed variables.Paul Berry2011-09-061-23/+0
| | | | | | | | The variables offset[], idx_to_attr[], nr_bytes, nr_attrs, and header_regs were all serving purposes which are now served by the VUE map. Reviewed-by: Eric Anholt <[email protected]>
* i965: clip: Change computation of nr_regs to use VUE map.Paul Berry2011-09-061-5/+5
| | | | Reviewed-by: Eric Anholt <[email protected]>
* i965: clip: Move header_regs into brw_clip_compile.Paul Berry2011-09-061-5/+4
| | | | | | This makes header_regs available for computing VUE offsets within clip code. Reviewed-by: Eric Anholt <[email protected]>
* i965: clip: Move hpos_offest and ndc_offset into local functions.Paul Berry2011-09-061-2/+0
| | | | | | | | | The offsets within the VUE of HPOS and NDC are needed only in a few auxiliary clipping functions. This patch moves computation of those offsets into the functions that need them, and does the computation using the VUE map. Reviewed-by: Eric Anholt <[email protected]>
* i965: clip: rename header_position_offset to the more correct ndc_offset.Paul Berry2011-09-061-1/+1
| | | | Reviewed-by: Eric Anholt <[email protected]>
* i965: clip: Add VUE map computation to clip stage for Gen4-5.Paul Berry2011-09-061-0/+3
| | | | Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix Android build by removing relative includesChad Versace2011-08-301-1/+1
| | | | | | | | | | Replace each occurence of #include "../glsl/*.h" with #include "glsl/*.h" Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Use state streaming on programs, and state base address on gen5+.Eric Anholt2011-06-181-14/+10
| | | | | | | | | | There will be a little bit of thrashing of the program cache BO as the cache warms up, but once the application is in steady state, this reduces relocations on gen5 and later. On my T420 laptop, cairogl firefox-talos-gfx performance improves 2.6% +/- 1.3% (n=6). No statistically significant performance difference on nexuiz (n=5).
* i965: Get a ralloc context into brw_compile.Kenneth Graunke2011-05-171-1/+7
| | | | | | | | | | | | This would be so much easier if we were using C++; we could simply use constructors and destructors. Instead, we have to update all the callers. While we're at it, ralloc various brw_wm_compile fields rather than explicitly calloc/free'ing them. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove dead entrypoints to state cache, rename the one that's left.Eric Anholt2011-04-291-9/+6
| | | | | | | | | | As we expanded the usage of the state cache, it grew extra functionality. However, with the recent state streaming rework, we're back to the state cache being used only for shader kernels, which is the piece of GPU state that's actually expensive to compute again from scratch, since it involves compiling. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Annotate debug printout checks with unlikely().Eric Anholt2010-11-031-2/+2
| | | | | | | This provides the optimizer with hints about code hotness, which we're quite certain about for debug printouts (or, rather, while we developers often hit the checks for debug printouts, we don't care about performance while doing so).
* Drop GLcontext typedef and use struct gl_context insteadKristian Høgsberg2010-10-131-1/+1
|
* i965: Reduce repeated calculation of the attribute-offset-in-VUE.Eric Anholt2010-07-191-9/+11
| | | | | | This cleans up some chipset dependency sprinkled around, and fixes a potential overflow of the attribute offset array for many vertex results.
* i965: Clarify the nr_regs calculation in brw_clip.cEric Anholt2010-07-191-3/+8
|
* intel: Change dri_bo_* to drm_intel_bo* to consistently use new API.Eric Anholt2010-06-081-2/+2
| | | | | The slightly less mechanical change of converting the emit_reloc calls will follow.
* i965: Dump out the correct shared function for SEND on Ironlake.Eric Anholt2010-05-141-1/+2
|
* i965: Support INTEL_DEBUG=clip to dump the clip program.Eric Anholt2010-05-141-1/+7
|
* intel: Clean up chipset name and gen num for IronlakeZhenyu Wang2010-04-211-3/+3
| | | | | | | | | Rename old IGDNG to Ironlake, and set 'gen' number for Ironlake as 5, so tracking the features with generation num instead of special is_ironlake flag. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
* i965: Allow for variable-sized auxdata in the state cache.Eric Anholt2010-01-191-7/+8
| | | | | | Everything has been constant-sized until now, but constant buffer handling changes will make us want some additional variable sized array.
* intel: Replace IS_IGDNG checks with intel->is_ironlake or needs_ff_sync.Eric Anholt2009-12-221-5/+6
| | | | Saves ~480 bytes of code.
* Merge branch 'outputswritten64'Ian Romanick2009-11-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Add a GLbitfield64 type and several macros to operate on 64-bit fields. The OutputsWritten field of gl_program is changed to use that type. This results in a fair amount of fallout in drivers that use programs. No changes are strictly necessary at this point as all bits used are below the 32-bit boundary. Fairly soon several bits will be added for clip distances written by a vertex shader. This will cause several bits used for varyings to be pushed above the 32-bit boundary. This will affect any drivers that support GLSL. At this point, only the i965 driver has been modified to support this eventuality. I did this as a "squash" merge. There were several places through the outputswritten64 branch where things were broken. I foresee this causing difficulties later for bisecting. The history is still available in the branch. Conflicts: src/mesa/drivers/dri/i965/brw_wm.h
* i965: fix EXT_provoking_vertex supportRoland Scheidegger2009-11-111-0/+1
| | | | | | | | This didn't work for quad/quadstrips at all, and for all other primitive types it only worked when they were unclipped. Fix up the former in gs stage (could probably do without these changes and instead set QuadsFollowProvokingVertexConvention to false), and the rest in clip stage.
* i965: Don't clip everything if FRONT_AND_BACK culling while culling disabled.Eric Anholt2009-07-201-1/+2
| | | | | | Fixes everything-black with meta_clear_tris on quake4-mpdemo and doom3-demo. Bug #18844, 22077.
* i965: add support for new chipsetsXiang, Haihao2009-07-131-4/+18
| | | | | | | | | | 1. new PCI ids 2. fix some 3D commands on new chipset 3. fix send instruction on new chipset 4. new VUE vertex header 5. ff_sync message (added by Zou Nan Hai <[email protected]>) 6. the offset in JMPI is in unit of 64bits on new chipset 7. new cube map layout