path: root/src/mesa/drivers/dri/i965/brw_vs.h
Commit message | Author | Age | Files | Lines
* i965/vs: Fix unit mismatch in scratch base_offset parameter. | Kenneth Graunke | 2012-10-10 | 1 | -1/+1
|   move_grf_array_access_to_scratch() calculates scratch buffer offsets in bytes. However, emit_scratch_read/write() expects the base_offset parameter to be measured in OWords. As a result, a shader using a scratch read/write offset greater than zero (in practice, a shader containing more than one variable in scratch) would use too large an offset, frequently exceeding the available scratch space.
|   This patch corrects the mismatch by removing spurious conversion from OWords to bytes in move_grf_array_access_to_scratch().
|   This is based on a patch by Paul Berry.
|   NOTE: This is a candidate for stable release branches.
|   Signed-off-by: Kenneth Graunke <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
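The unit mismatch described above can be illustrated with a minimal sketch. It assumes only that one OWord is 16 bytes; the function names below are hypothetical stand-ins and are not the driver's move_grf_array_access_to_scratch() or emit_scratch_read/write().

```c
#include <stdint.h>

/* An OWord (octword) is 16 bytes. */
enum { OWORD_BYTES = 16 };

/* Hypothetical stand-in for the consumer: takes base_offset in OWords
 * and returns the byte address it would touch in scratch space. */
uint32_t scratch_byte_address(uint32_t base_offset_owords)
{
    return base_offset_owords * OWORD_BYTES;
}

/* Hypothetical pre-fix caller: scales the OWord offset to bytes before
 * handing it on, so the consumer ends up scaling it a second time. */
uint32_t base_offset_before_fix(uint32_t slot_owords)
{
    return slot_owords * OWORD_BYTES;
}

/* Hypothetical post-fix caller: passes the OWord offset through unchanged. */
uint32_t base_offset_after_fix(uint32_t slot_owords)
{
    return slot_owords;
}
```

With a variable at OWord 2, the post-fix path addresses byte 32 while the pre-fix path addresses byte 512. Offset zero is unaffected either way, which is consistent with the message's remark that only shaders with more than one variable in scratch hit the problem.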
* i965/vp: Remove support for reading destination registers. | Eric Anholt | 2012-09-27 | 1 | -1/+0
|   It's prohibited by ARB_vp and NV_vp, and not used by fixed function t&l.
|   Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the param pointer arrays for the VS dynamically sized. | Eric Anholt | 2012-09-07 | 1 | -0/+1
|   Saves 96MB of wasted memory in the l4d2 demo.
|   v2: Rebase on compare func change, change brace style.
|   Reviewed-by: Kenneth Graunke <[email protected]>
|   Reviewed-by: Ian Romanick <[email protected]>
* i965: Add functions for comparing two brw_wm/vs_prog_data structs. | Eric Anholt | 2012-09-07 | 1 | -0/+2
|   Currently, this just avoids comparing all unused parts of param[] and pull_param[], but it's a step toward getting rid of those giant statically sized arrays.
|   v2: Actually use the new function instead of just looking at its address. This required changing the args to const pointers. (review by Kenneth)
|   Reviewed-by: Kenneth Graunke <[email protected]>
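A comparison that skips the unused tail of a statically sized array might look roughly like the sketch below. The struct, its field names, and the array size are hypothetical and much simpler than the real brw_vs_prog_data; only the idea of comparing the used prefix comes from the message above.

```c
#include <stdbool.h>
#include <string.h>

enum { MAX_PARAMS_SKETCH = 8 };  /* illustrative size, not the driver's */

/* Hypothetical, heavily reduced stand-in for a prog_data struct. */
struct prog_data_sketch {
    unsigned nr_params;                     /* entries of param[] in use */
    const float *param[MAX_PARAMS_SKETCH];  /* tail beyond nr_params is unused */
};

/* Compare two structs, looking only at the used prefix of param[].
 * Takes const pointers, as the v2 note in the message describes. */
bool prog_data_sketch_equal(const struct prog_data_sketch *a,
                            const struct prog_data_sketch *b)
{
    if (a->nr_params != b->nr_params)
        return false;
    return memcmp(a->param, b->param,
                  a->nr_params * sizeof(a->param[0])) == 0;
}
```

Two structs that differ only in entries past nr_params compare equal here, whereas a memcmp over the whole struct would report a difference.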
* i965/vs: Add VS program key dumping to INTEL_DEBUG=perf. | Kenneth Graunke | 2012-08-27 | 1 | -0/+3
|   Eric added support for WM key debugging. This adds it for the VS.
|   Signed-off-by: Kenneth Graunke <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove vestiges of function call support from the old VS backend. | Kenneth Graunke | 2012-04-09 | 1 | -3/+0
|   This never worked. brwProgramStringNotify also explicitly rejects programs that use CAL and RET. So there's no need for this to exist.
|   Signed-off-by: Kenneth Graunke <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* i965: Move VUE map computation to once at VS compile time. | Eric Anholt | 2012-02-21 | 1 | -1/+0
|   With this and the previous patch, 640x480 nexuiz is running 0.169118% +/- 0.0863696% faster (n=121). On a VS state change microbenchmark, performance is increased 8.28645% +/- 0.460478% (n=52).
|   v2: Fix CACHE_NEW_VS comment.
|   Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: Add texture related data to brw_vs_prog_key. | Kenneth Graunke | 2011-12-19 | 1 | -0/+3
|   Now that this is all factored out, it's trivial to do.
|   Signed-off-by: Kenneth Graunke <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
|   Reviewed-by: Ian Romanick <[email protected]>
* intel: Convert from GLboolean to 'bool' from stdbool.h. | Kenneth Graunke | 2011-10-18 | 1 | -2/+2
|   I initially produced the patch using this bash command:
|       for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done
|   Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings.
|   Finally, I cleaned up some whitespace issues introduced by the change.
|   Signed-off-by: Kenneth Graunke <[email protected]>
|   Acked-by: Chad Versace <[email protected]>
|   Acked-by: Paul Berry <[email protected]>
* i965 Gen6+: De-compact clip planes. | Paul Berry | 2011-10-06 | 1 | -6/+9
|   Previously, if the user enabled a non-consecutive set of clip planes (e.g. 0, 1, and 3), the driver would compact them down to a consecutive set starting at 0. This optimization was of dubious value, and complicated the implementation of gl_ClipDistance.
|   This patch changes the driver so that with Gen6 and later chipsets, we no longer compact the clip planes. However, we still discard any clip planes beyond the highest number that is in use, so performance should not be affected for applications that use clip planes consecutively from 0.
|   With chipsets previous to Gen6, we still compact the clip planes, since the pre-Gen6 clipper thread relies on this behavior.
|   Reviewed-by: Ian Romanick <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
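The difference between the two layouts can be sketched by counting how many clip-plane slots each one needs for a given enable mask. This is only an illustration of the behaviour the message describes; the function names are hypothetical and the driver's actual code is not shown here.

```c
/* Slots needed when enabled planes are packed together from slot 0
 * (the compacting behaviour): one slot per enabled plane. */
unsigned compacted_plane_count(unsigned enabled_mask)
{
    unsigned count = 0;
    for (; enabled_mask != 0; enabled_mask >>= 1)
        count += enabled_mask & 1u;
    return count;
}

/* Slots needed when every plane keeps its own index (the de-compacted
 * behaviour): highest enabled index plus one, so planes beyond the
 * highest one in use are still discarded. */
unsigned decompacted_plane_count(unsigned enabled_mask)
{
    unsigned count = 0;
    for (; enabled_mask != 0; enabled_mask >>= 1)
        count++;
    return count;
}
```

For the message's example of planes 0, 1, and 3 (mask 0xB), the compacted layout needs 3 slots and the de-compacted layout needs 4. For planes used consecutively from 0 (for instance mask 0x7) both give 3, which matches the claim that such applications should see no performance change.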
* i965 VS: Change nr_userclip to nr_userclip_planes. | Paul Berry | 2011-10-06 | 1 | -2/+3
|   The only remaining uses of brw_vs_prog_key::nr_userclip only occurred when using clip planes (as opposed to gl_ClipDistance). This patch renames the value to nr_userclip_planes and sets it to zero when gl_ClipDistance is in use. This avoids unnecessary VS recompiles.
|   Reviewed-by: Ian Romanick <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* i965: Make brw_compute_vue_map's userclip dependency a boolean. | Paul Berry | 2011-10-06 | 1 | -0/+6
|   Previously, brw_compute_vue_map required an argument indicating the number of clip planes in use, but all it did with it was check if it was nonzero. This patch changes brw_compute_vue_map to take a boolean instead. This allows us to avoid some unnecessary recompilation of the Gen4/5 GS and SF threads.
|   Reviewed-by: Ian Romanick <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* i965: Move ClipPlanesEnabled state to VS cache key. | Paul Berry | 2011-10-06 | 1 | -0/+6
|   Previous to this patch, setup_uniform_clipplane_values() was setting up clip plane uniforms based on ctx->Transform.ClipPlanesEnabled, a piece of state not stored in the vertex shader cache key. As a result, a change to this piece of state might not trigger a necessary vertex shader recompile.
|   The patch adds a field to the vertex shader cache key, userclip_planes_enabled, to store the current value of ctx->Transform.ClipPlanesEnabled. Also, it changes setup_uniform_clipplane_values() to read from this new field, so that it's manifestly clear that the vertex shader isn't depending on state not stored in the cache key.
|   Note: when the vertex shader uses gl_ClipDistance, the VS backend doesn't need to know which clip planes are in use, so we leave the field as zero in that case to avoid unnecessary recompiles.
|   Fixes Piglit test vs-clip-vertex-enables.
|   Reviewed-by: Ian Romanick <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
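The cache-key idea in this message can be sketched as follows. The struct is a hypothetical one-field reduction of brw_vs_prog_key, and the populate and compare helpers are assumptions for illustration; only the field name userclip_planes_enabled and the "leave it zero when gl_ClipDistance is used" rule come from the message.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical one-field reduction of the VS cache key. */
struct vs_key_sketch {
    unsigned userclip_planes_enabled;  /* bitmask of enabled user clip planes */
};

/* Fill the key from current state. When the shader writes
 * gl_ClipDistance the field stays zero, so toggling clip planes
 * does not change the key. */
void populate_vs_key_sketch(struct vs_key_sketch *key,
                            unsigned clip_planes_enabled,
                            bool uses_clip_distance)
{
    memset(key, 0, sizeof(*key));
    if (!uses_clip_distance)
        key->userclip_planes_enabled = clip_planes_enabled;
}

/* Keys that compare equal can share one compiled program. */
bool vs_key_sketch_equal(const struct vs_key_sketch *a,
                         const struct vs_key_sketch *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

In this sketch, changing the enable mask from 0x3 to 0x5 produces a different key (and so a recompile) only when the shader does not use gl_ClipDistance.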
* i965: Rearrange VS cache key struct. | Paul Berry | 2011-10-06 | 1 | -1/+11
|   No functional change. This patch rearranges the struct brw_vs_prog_key so that the two fields related to clipping are together, and documents those fields. This should make the patches that follow easier to comprehend, since they add additional clipping-related fields to this structure.
|   Reviewed-by: Ian Romanick <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix a hardcoded user clip plane count. | Paul Berry | 2011-09-28 | 1 | -1/+1
|   Now that i965 supports 8 clip planes instead of 6, the size of the brw_vs_compile::userplane array needs to be increased to 8. Changed the array size to MAX_CLIP_PLANES so that if the number changes again in the future, this array size won't be missed.
|   Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't upload clip planes when gl_ClipDistance is in use. | Paul Berry | 2011-09-23 | 1 | -0/+1
|   When the vertex shader writes to gl_ClipDistance, we do clipping based on clip distances rather than user clip planes, so don't waste push constant space storing user clip planes that won't be used.
|   Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove two_side_color from brw_compute_vue_map(). | Paul Berry | 2011-09-06 | 1 | -1/+0
|   Since we now lay out the VUE the same way regardless of whether two-sided color is enabled, brw_compute_vue_map() no longer needs to know whether two-sided color is enabled. This allows the two-sided color flag to be removed from the clip, GS, and VS keys, so that fewer GPU programs need to be recompiled when turning two-sided color on and off.
|   Reviewed-by: Eric Anholt <[email protected]>
* i965: old VS: use the VUE map to compute the URB entry size. | Paul Berry | 2011-09-06 | 1 | -1/+0
|   Previously, the old VS backend computed the URB entry size by adding the number of vertex shader outputs to the size of the URB header. This often produced a larger result than necessary, because some vertex shader outputs are stored in the header, so they were being double counted. This patch changes the old VS backend to compute the URB entry size directly from the number of slots in the VUE map.
|   Note: there's a subtle change in that we no longer count header registers towards the size of the VF input. I believe this is correct, because the header is only emitted in the output of the VS stage--it is not present in the input. (As evidence for this, note that brw_vs_state.c sets urb_entry_read_offset to 0--it does not include space for the header as part of the VS input).
|   Reviewed-by: Eric Anholt <[email protected]>
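The double counting described in this message can be shown with a small sketch. Everything here is an assumption made for illustration: sizes are counted in abstract slots, outputs are modelled as bits in a mask, and the function names are hypothetical. The real brw_vue_map and the header layout are not reproduced.

```c
/* Count set bits in a mask of shader outputs. */
unsigned count_outputs(unsigned mask)
{
    unsigned count = 0;
    for (; mask != 0; mask >>= 1)
        count += mask & 1u;
    return count;
}

/* Old-style estimate: header size plus every written output, even
 * outputs that already live inside the header. */
unsigned entry_slots_old(unsigned header_slots, unsigned outputs_written)
{
    return header_slots + count_outputs(outputs_written);
}

/* Map-style count: header slots plus only those outputs that need a
 * slot of their own outside the header. */
unsigned entry_slots_from_map(unsigned header_slots,
                              unsigned outputs_written,
                              unsigned outputs_in_header)
{
    return header_slots + count_outputs(outputs_written & ~outputs_in_header);
}
```

With a hypothetical 2-slot header, three written outputs, and two of those stored in the header, the old estimate is 5 slots while the map-based count is 3.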
* i965: old VS: Use brw_vue_map instead of implicit assumptions about VUE structure. | Paul Berry | 2011-09-06 | 1 | -1/+1
|   Reviewed-by: Eric Anholt <[email protected]>
* i965/vs: Run the shader backend at link time and return compile failures. | Eric Anholt | 2011-08-16 | 1 | -1/+2
|   Link failure is something that shouldn't happen, but we sometimes want it during development. The precompile also allows analysis of shader codegen with shader-db.
* i965: Set up allocation of a VS scratch space if required. | Eric Anholt | 2011-08-16 | 1 | -0/+1
|
* i965: Start adding the VS visitor and codegen. | Eric Anholt | 2011-08-16 | 1 | -1/+2
|   The low-level IR is a mashup of brw_fs.cpp and ir_to_mesa.cpp. It's currently controlled by the INTEL_NEW_VS=1 environment variable, and only tested for the trivial "gl_Position = gl_Vertex;" shader so far.
* i965: Add support for GL_FIXED vertex attributes. | Eric Anholt | 2011-06-10 | 1 | -0/+4
|   This sadly requires work in the VS to rescale them, because the hardware doesn't support this format natively.
|   Fixes arb_es2_compatibility-fixed-type and gtf/fixed_data_type.
|   Reviewed-by: Ian Romanick <[email protected]>
* intel: Add support for ARB_color_buffer_float. | Eric Anholt | 2011-04-20 | 1 | -0/+1
|   Reviewed-by: Brian Paul <[email protected]>
* i965: support for two-sided lighting on Sandybridge | Xiang, Haihao | 2010-12-10 | 1 | -0/+1
|   VS places color attributes together so that SF unit can fetch the right attribute according to object orientation. This fixes light issue in mesa demo geartrain, projtex.
* mesa: rename src/mesa/shader/ to src/mesa/program/ | Brian Paul | 2010-06-10 | 1 | -1/+1
|
* i965: Fix point coordinate replacement after airlied's ffvertex changes. | Eric Anholt | 2010-05-17 | 1 | -1/+1
|   This basically restores the previous state, where a vertex result slot is set up for the texcoord to be replaced with point coord. Fixes piglit point-sprite test.
|   Bug #27625
* i965: Upload as many VS constants as possible through the push constants. | Eric Anholt | 2010-01-19 | 1 | -0/+1
|   The pull constants require sending out to an overworked shared unit and waiting for a response, while push constants are nicely loaded in for us at thread dispatch time. By putting things we access in every VS invocation there, ETQW performance improved by 2.5% +/- 1.6% (n=6).
* i965: Only set up the stack register if it's going to get used. | Eric Anholt | 2010-01-18 | 1 | -0/+2
|
* i965: first attempt at handling URB overflow when there's too many vs outputs | Brian Paul | 2009-06-30 | 1 | -0/+1
|   If we can't fit all the VS outputs into the MRF, we need to overflow into temporary GRF registers, then use some MOVs and a second brw_urb_WRITE() instruction to place the overflow vertex results into the URB.
|   This is hit when a vertex/fragment shader pair has a large number of varying variables (12 or more).
|   There's still something broken here, but it seems close...
* i965: only upload constant buffer data when we actually need the const buffer | Brian Paul | 2009-04-27 | 1 | -2/+0
|   Make the use_const_buffer field per-program and only call the code which updates the constant buffer's data if the flag is set.
|   This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052
* i965: checkpoint commit: VS constant buffers | Brian Paul | 2009-04-14 | 1 | -0/+7
|   Hook up a constant buffer, binding table, etc for the VS unit. This will allow using large constant buffers with vertex shaders. The new code is disabled at this time (use_const_buffer=FALSE).
* i965: Delete old metaops code now that there are no remaining consumers. | Eric Anholt | 2009-02-02 | 1 | -1/+0
|
* Remove TNL-to-VP tracking from i965 | Ian Romanick | 2008-09-28 | 1 | -4/+0
|   The i965 driver previously had its own set of code to convert fixed-function TNL state to a vertex program. Core Mesa has code to do this, so there is no reason to duplicate that effort in the driver. In fact, this duplication leads to bugs when other aspects of the Mesa infrastructure change.
* Merge branch '965-glsl' | Zou Nan hai | 2007-10-26 | 1 | -0/+6
|\
| |   Conflicts:
| |     src/mesa/drivers/dri/i965/brw_sf.h
| |     src/mesa/drivers/dri/i965/intel_context.c
| * support branch and loop in pixel shader | Zou Nan hai | 2007-06-21 | 1 | -0/+5
| |   most of the sample working with some small modification
| * support nested function call | Zou Nan hai | 2007-04-30 | 1 | -1/+1
| |   else instruction fix.
| * Initial 965 GLSL support | Zou Nan hai | 2007-04-12 | 1 | -0/+1
| |
* | Fix-up #includes to remove some -I options. | Brian | 2007-09-11 | 1 | -1/+1
|/
|   eg: #include "shader/program.h" and remove -I$(TOP)/src/mesa/program
* eliminate rhw divide under some circumstances | Keith Whitwell | 2006-10-05 | 1 | -1/+2
|
* Add Intel i965G/Q DRI driver. | Eric Anholt | 2006-08-09 | 1 | -0/+80
This driver comes from Tungsten Graphics, with a few further modifications by Intel.