mesa.git

	Commit message	Author	Age	Files	Lines
*	st/mesa/r200/i915/i965: eliminate gl_fragment_program	Timothy Arceri	2016-10-26	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Here we move OriginUpperLeft and PixelCenterInteger into gl_program; all other fields have been replaced by shader_info. V2: Don't use anonymous union/structs to hold vertex/fragment fields, as suggested by Ian. Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/mesa/st/swrast: set fs shader_info directly and switch to using it	Timothy Arceri	2016-10-26	1	-2/+1
\| \| \| \| \| \| \|	Note we access shader_info from the program struct rather than the nir_shader pointer because shader cache won't create a nir_shader. Reviewed-by: Jason Ekstrand <[email protected]>
*	mesa/i965/i915/r200: eliminate gl_vertex_program	Timothy Arceri	2016-10-26	1	-1/+1
\| \| \| \| \| \| \|	Here we move the only field in gl_vertex_program to the ARB program fields in gl_program. Reviewed-by: Jason Ekstrand <[email protected]>
*	nir/i965/anv/radv/gallium: make shader info a pointer	Timothy Arceri	2016-10-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	When restoring something from shader cache we won't have, and don't want to create, a nir_shader; this change detaches the two. There are other advantages, such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: get inputs read from nir info	Timothy Arceri	2016-10-06	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisation passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Eliminate brw->wm.prog_data pointer.	Kenneth Graunke	2016-10-05	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Just say no to: - brw->wm.base.prog_data = &brw->wm.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_wm_prog_data as needed. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Eliminate brw->vs.prog_data pointer.	Kenneth Graunke	2016-10-05	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Just say no to: - brw->vs.base.prog_data = &brw->vs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_vs_prog_data as needed. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Use bitmask/ffs to iterate enabled clip planes.	Mathias Fröhlich	2016-06-16	1	-10/+11
\| \| \| \| \| \| \| \| \| \| \|	Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
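The change described above replaces a "test every bit" loop with a loop that visits only the set bits. A minimal sketch of the two styles follows; `bit_scan()` is a hypothetical stand-in for the `u_bit_scan()` helper named in the message, and the function names and plane counts are illustrative assumptions rather than driver code.

```c
#include <assert.h>
#include <strings.h>  /* ffs() (POSIX) */

/* Hypothetical stand-in for u_bit_scan(): returns the index of the lowest
 * set bit in *mask and clears that bit. *mask must be non-zero. */
static int bit_scan(unsigned *mask)
{
   assert(*mask != 0);
   const int i = ffs((int)*mask) - 1;
   *mask &= ~(1u << i);
   return i;
}

/* Old style: walk every plane index and test its bit. */
static int count_enabled_test_each(unsigned enabled, int max_planes)
{
   int n = 0;
   for (int i = 0; i < max_planes; i++) {
      if (enabled & (1u << i))
         n++;
   }
   return n;
}

/* New style: visit only the set bits, lowest first. Writes the plane
 * indices to out[] and returns how many were written. */
static int collect_enabled(unsigned enabled, int *out)
{
   int n = 0;
   while (enabled)
      out[n++] = bit_scan(&enabled);
   return n;
}
```

The second loop runs once per enabled plane instead of once per possible plane, which is the saving the message refers to.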
*	i965: Make all atoms to track BRW_NEW_BLORP by default	Kenneth Graunke	2016-04-23	1	-0/+2
\| \| \| \|	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Drop #include of main/glheader.h.	Matt Turner	2015-11-24	1	-1/+0
\| \| \| \| \| \|	It's never used. Reviewed-by: Ian Romanick <[email protected]>
*	i965: Mark constant static data as const.	Matt Turner	2015-07-14	1	-1/+1
\| \| \| \|	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Implement proper workaround for Gen4 GPU CONSTANT_BUFFER hangs.	Kenneth Graunke	2015-04-14	1	-13/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I finally managed to dig up some information on our mysterious GPU hangs. A wiki page from the Crestline validation team mentions that they found a GPU hang in "Serious Sam 2" (on Windows) with remarkably similar conditions to the ones we've seen in Google Chrome and glmark2. Apparently, if WM_STATE has "PS Use Source Depth" enabled, CC_STATE has most depth state disabled, and you issue a CONSTANT_BUFFER command and immediately draw, the depth interpolator makes a small mistake that leads to hangs. Most of the traces I looked at contained a CONSTANT_BUFFER packet immediately followed by 3DPRIMITIVE, or at least very few packets. It appears they also have "PS Use Source Depth" enabled - either at the hang, or a little before it. So I think this is our bug. The workaround is to emit a non-pipelined state packet after issuing a CONSTANT_BUFFER packet. This is really similar to the workaround I developed in commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b. v2: Fix word-wrapping issues. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/state: Don't use brw->state.dirty.brw	Jordan Justen	2015-03-31	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now, we only use ctx->NewDriverState. I used this bash & sed command in the i965 directory: for file in *.[ch] *.[ch]pp; do sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file; done Followed by manual changes to brw_state_upload.c. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Work around mysterious Gen4 GPU hangs with minimal state changes.	Kenneth Graunke	2015-01-19	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Gen4 hardware appears to GPU hang frequently when using Chromium, and also when running 'glmark2 -b ideas'. Most of the error states contain 3DPRIMITIVE commands in quick succession, with very few state packets between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER. I trimmed an apitrace of the glmark2 hang down to two draw calls with a glUniformMatrix4fv call between the two. Either draw by itself works fine, but together, they hang the GPU. Removing the glUniform call makes the hangs disappear. In the hardware state, this translates to removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets. Flushing before emitting CONSTANT_BUFFER packets also appears to make the hangs disappear. I observed a slowdown in glxgears by doing it all the time, so I've chosen to only do it when BRW_NEW_BATCH and BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or already flushed the whole pipeline). I'd much rather understand the problem, but at this point, I don't see how we'd ever be able to track it down further. We have no real tools, and the hardware people moved on years ago. I've analyzed 20+ error states and read every scrap of documentation I could find. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367 Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Matt Turner <[email protected]> Cc: "10.4 10.3" <[email protected]>
*	i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).	Kenneth Graunke	2014-12-02	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I put the BRW_NEW_*_PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id. I also added a comment explaining the difference between BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time to remember it. Non-mechanical changes: - brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache. - brw_state_upload.c - INTEL_DEBUG=state changes. - brw_context.h - bit definition merging. v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA.	Kenneth Graunke	2014-12-02	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits. State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up. CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name. Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via: $ for file in *.[ch]; do sed -i -e 's/CACHE_NEW_\([A-Z_\]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file; done Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Acked-by: Matt Turner <[email protected]>
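The per-atom cost argument above can be sketched as follows: a three-field dirty check next to a two-field one, plus a helper that folds the cache bits into the low end of the brw field. All struct, function, and parameter names here are hypothetical, and the field widths are assumptions; this is not the driver's actual check_state().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag sets before and after consolidation. */
struct flags3 { uint32_t mesa; uint64_t brw; uint32_t cache; };
struct flags2 { uint32_t mesa; uint64_t brw; };

/* Three ANDs per atom: mesa, brw, and cache are compared separately. */
static bool check_state3(const struct flags3 *dirty, const struct flags3 *atom)
{
   return ((dirty->mesa & atom->mesa) |
           (dirty->brw & atom->brw) |
           (dirty->cache & atom->cache)) != 0;
}

/* Two ANDs per atom once the cache bits live inside brw. */
static bool check_state2(const struct flags2 *dirty, const struct flags2 *atom)
{
   return ((dirty->mesa & atom->mesa) | (dirty->brw & atom->brw)) != 0;
}

/* Fold the cache bits into the bottom of brw and shift the old brw bits up
 * by n_cache_bits. Assumes the shifted brw bits still fit in 64 bits. */
static struct flags2 fold(struct flags3 f, unsigned n_cache_bits)
{
   assert(n_cache_bits < 32 && (f.cache >> n_cache_bits) == 0);
   struct flags2 out = { f.mesa, (f.brw << n_cache_bits) | f.cache };
   return out;
}
```

Keeping the former cache bits at the low end means code that computes `1 << id` for a cache entry keeps producing the same bit after the merge.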
*	i965: Alphabetize brw_tracked_state flags and use a consistent style.	Kenneth Graunke	2014-11-29	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most of the dirty flags were listed in some arbitrary order. Some used bonus parentheses. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines. This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses. Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Make Gen4-5 push constants call _mesa_load_state_parameters too.	Kenneth Graunke	2014-11-21	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit 5e37a2a4a8a, I made the pull constant code stop calling _mesa_load_state_parameters() when there were no pull parameters. This worked fine on Gen6+ because the push constant code also called it if there were any push constants. However, the Gen4-5 push constant code wasn't doing this. This patch makes it do so, like the Gen6+ code. A better long term solution would be to make core Mesa just handle this for us when necessary. Fixes around 8766 Piglit tests on Ironlake, and probably Gen4 as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Tested-by: Mark Janes <[email protected]>
*	Revert 5 i965 patches: 8e27a4d2, 373143ed, c5bdf9be, 6f56e142, 88e3d404	Jordan Justen	2014-09-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reverts * "i965: Modify state upload to allow 2 different sets of state atoms." 8e27a4d2b3e4e74e9a77446bce49607433d86be3 * "i965: Modify dirty bit handling to support 2 pipelines." 373143ed9187c4d4ce1e3c486b5dd0880d18ec8b * "i965: Create a macro for checking a dirty bit." c5bdf9be1eca190417998d548fd140c1eca37a54 Conflicts: src/mesa/drivers/dri/i965/brw_context.h * "i965: Create a macro for setting all dirty bits." 6f56e1424d923fd80c84090fbf4506c9eaaffea1 Conflicts: src/mesa/drivers/dri/i965/brw_blorp.cpp src/mesa/drivers/dri/i965/brw_state_cache.c src/mesa/drivers/dri/i965/brw_state_upload.c * "i965: Create a macro for setting a dirty bit." 88e3d404dad009d8cff5124cf8acee7daeaceb64 Signed-off-by: Jordan Justen <[email protected]>
*	i965: Create a macro for setting a dirty bit.	Paul Berry	2014-09-01	1	-1/+1
\| \| \| \| \| \| \|	This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <[email protected]>
*	i965: Store uniform constant values in a gl_constant_value instead of float	Neil Roberts	2014-08-14	1	-10/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The brw_stage_prog_data struct previously contained an array of float pointers to the values of parameters. These were then copied into a batch buffer to upload the values using a regular assignment. However the float values were also being overloaded to store integer values for integer uniforms. This can break if x87 floating-point registers are used to do the assignment because the fst instruction tries to fix up invalid float values. If an integer constant happened to look like an invalid float value then it would get altered when it was copied into the batch buffer. This patch changes the pointers to be gl_constant_value instead so that the assignment should end up copying without any alteration. This also makes it more obvious that the values being stored here are overloaded for multiple types. There are some static asserts where the values are uploaded to ensure that the size of gl_constant_value is the same as a float. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81150 Reviewed-by: Kenneth Graunke <[email protected]>
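The idea in the message above, copying uniform values through a union type so that integer bit patterns are never treated as floats, can be sketched like this. `union constant_value` and `upload_constants()` are hypothetical names modelled on the description, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analogue of gl_constant_value: one 32-bit slot that may hold
 * a float, a signed integer, or an unsigned integer. */
union constant_value {
   float f;
   int32_t i;
   uint32_t u;
};

/* The upload path relies on each slot being exactly as large as a float. */
_Static_assert(sizeof(union constant_value) == sizeof(float),
               "constant slot must be the size of a float");

/* Copy n values, each reached through a pointer, into a destination buffer.
 * Assigning the whole union copies the stored bits instead of performing a
 * float load/store that could rewrite an invalid float encoding. */
static void upload_constants(union constant_value *dst,
                             const union constant_value *const *src,
                             size_t n)
{
   assert(n == 0 || (dst != NULL && src != NULL));
   for (size_t k = 0; k < n; k++)
      dst[k] = *src[k];
}
```

An integer uniform whose bits happen to match a signalling NaN, such as 0x7fa00000, is the kind of value the message says could be altered by a float assignment on x87.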
*	i965: Update a ton of comments about constant buffers.	Eric Anholt	2014-07-02	1	-32/+54
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Fix state flags for gen4/5 CURBE.	Eric Anholt	2014-07-02	1	-8/+8
\| \| \| \| \| \| \| \|	If we had some NOS affecting VS compilation that resulted in optimization changing the set of constants to be uploaded, we might not have reuploaded the constants. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Drop the memcmp for finding duplicated CURBE uploads.	Eric Anholt	2014-07-02	1	-23/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	At this point, the extra copy of the data and memcmp are as expensive as just re-uploading. Note: now that we'll always upload, and brw_constant_buffer watches BRW_NEW_BATCH anyway, we don't need to explicitly unref the old curbe_bo at batch reset time. No significant performance difference on glamor copywinwin10 (n=55), despite that test having a 98% hit rate on the cache. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Reuse intel_upload.c for gen4/5 constant buffers.	Eric Anholt	2014-07-02	1	-28/+3
\| \| \| \| \| \|	No performance difference on glamor with copywinwin10 (n=40) on my gm45. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Delete the intel_regions.c code.	Eric Anholt	2014-05-01	1	-1/+0
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Move the remaining driver debug over to stderr.	Eric Anholt	2014-02-22	1	-13/+13
\| \| \| \| \| \|	Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Move up duplicated fields from stage-specific prog_data to brw_stage_prog_data.	Francisco Jerez	2014-02-19	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and the simplification of the stage-specific brw_*_prog_data_compare. Reviewed-by: Paul Berry <[email protected]>
*	i965: Remove CACHED_BATCH support altogether.	Kenneth Graunke	2014-01-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using an unoptimized variant of glamor spending 50% of its CPU time in brw_draw_prims() (and hitting the cache very frequently): N Min Max Median Avg Stddev x 200 29200 40500 34900 34750 958.43256 + 200 31000 40300 34700 34622 916.35941 No difference proven at 95.0% confidence Similarly, no difference on GLB2.7: N Min Max Median Avg Stddev x 63 64.1 71.36 70.69 70.113175 1.6782026 + 63 63.6 71.18 70.75 70.223651 1.6044186 No difference proven at 95.0% confidence v2: Rebase on master (by anholt) v3: Add a missing BEGIN_BATCH(3) to aa_line_parameters -- CACHED_BATCH didn't have the asserts about batchbuffer usage that ADVANCE_BATCH does, so we started assertion failing. Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Eric Anholt <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	s/Tungsten Graphics/VMware/	José Fonseca	2014-01-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/[email protected]/[email protected]/ s/[email protected]/[email protected]/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\[email protected]/[email protected]/g s/keithw\[email protected]/[email protected]/g s/[email protected]/[email protected]/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/[email protected]/[email protected]/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <[email protected]>
*	i965: Drop trailing whitespace from the rest of the driver.	Kenneth Graunke	2013-12-05	1	-9/+9
\| \| \| \| \| \| \|	Performed via: $ for file in ; do sed -i 's/ //g'; done Signed-off-by: Kenneth Graunke <[email protected]>
*	i965: Delete intel_context entirely.	Kenneth Graunke	2013-07-09	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context. Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Acked-by: Anuj Phogat <[email protected]>
*	i965: Move intel_context::bufmgr to brw_context.	Kenneth Graunke	2013-07-09	1	-1/+1
\| \| \| \| \| \| \|	Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Acked-by: Anuj Phogat <[email protected]>
*	i965: Pass brw_context to functions rather than intel_context.	Kenneth Graunke	2013-07-09	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context. Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw." Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Acked-by: Anuj Phogat <[email protected]>
*	i965/vs: split brw_vs_prog_data into generic and VS-specific parts.	Paul Berry	2013-04-11	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen <[email protected]> v2: Put urb_read_length and urb_entry_size in the generic struct. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Remove some stale comments about the brw_constant_buffer atom.	Eric Anholt	2013-02-11	1	-6/+0
\| \| \| \| \| \| \|	These have been wrong since f428255bde93a452a7cdd48fba21839c99beb6cb back in 2009! Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Update comment about clipper constants.	Kenneth Graunke	2012-11-01	1	-9/+1
\| \| \| \| \| \| \|	The old VS backend doesn't exist, but I believe these still need to be delivered to the clipper thread. Reviewed-by: Eric Anholt <[email protected]>
*	i965/vs: Remove support for the old parameter layout.	Kenneth Graunke	2012-11-01	1	-19/+2
\| \| \| \| \| \|	Only the old backend used it. Reviewed-by: Eric Anholt <[email protected]>
*	i965: Remove unused param conversion code.	Eric Anholt	2012-07-25	1	-2/+1
\| \| \| \| \| \| \| \|	Ever since ctx->NativeIntegers was set, the conversion flag has been PARAM_NO_CONVERT. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/gen4: Move CURBE offset calculation to emit() time.	Eric Anholt	2011-10-29	1	-1/+1
\| \| \| \| \| \| \|	This is consumed by the unit state. Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	i965/gen4: Fold push constant prepare()/emit() together.	Eric Anholt	2011-10-29	1	-13/+9
\| \| \| \| \| \| \| \|	While other units need to know about our constant buffer offsets, nothing else cared about which particular BO other than the emit() half. Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	i965: Remove the validated BO list, now that it's unused.	Eric Anholt	2011-10-29	1	-2/+0
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	mesa: Create _mesa_bitcount_64() to replace i965's brw_count_bits()	Paul Berry	2011-10-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The i965 driver already had a function to count bits in a 64-bit uint (brw_count_bits()), but it was buggy (it only counted the bottom 32 bits) and it was clumsy (it had a strange and broken fallback for non-GCC-like compilers, which fortunately was never used). Since Mesa already has a _mesa_bitcount() function, it seems better to just create a _mesa_bitcount_64() function rather than special-case this in the i965 driver. This patch creates the new _mesa_bitcount_64() function and rewrites all of the old brw_count_bits() calls to refer to it. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
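To illustrate the class of bug described above, here is a small sketch contrasting a full 64-bit population count with one that only sees the bottom 32 bits. Both functions are hypothetical; the truncation shown is just one way such a bug can arise and is not taken from the removed brw_count_bits() code.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 64-bit population count: each iteration clears the lowest set
 * bit, so the loop runs once per set bit. */
static unsigned bitcount_64(uint64_t n)
{
   unsigned bits = 0;
   while (n) {
      n &= n - 1;
      bits++;
   }
   assert(bits <= 64);
   return bits;
}

/* Illustrative buggy variant: the argument is truncated to 32 bits before
 * counting, so bits 32..63 are silently ignored. */
static unsigned bitcount_low32_only(uint64_t n)
{
   return bitcount_64((uint32_t)n);
}
```

For a mask with most of its set bits above bit 31, the truncating variant undercounts, which matters for any 64-bit bitfield.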
*	i965 Gen6: Implement gl_ClipVertex.	Paul Berry	2011-10-05	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements proper support for gl_ClipVertex by causing the new VS backend to populate the clip distance VUE slots using VERT_RESULT_CLIP_VERTEX when appropriate, and by using the untransformed clip planes in ctx->Transform.EyeUserPlane rather than the transformed clip planes in ctx->Transform._ClipUserPlane when a GLSL-based vertex shader is in use. When not using a GLSL-based vertex shader, we use ctx->Transform._ClipUserPlane (which is what we used prior to this patch). This ensures that clipping is still performed correctly for fixed function and ARB vertex programs. A new function, brw_select_clip_planes() is used to determine whether to use _ClipUserPlane or EyeUserPlane, so that the logic for making this decision is shared between the new and old vertex shaders. Fixes the following Piglit tests on i965 Gen6: - vs-clip-vertex-const-accept - vs-clip-vertex-const-reject - vs-clip-vertex-different-from-position - vs-clip-vertex-equal-to-position - vs-clip-vertex-homogeneity - vs-clip-based-on-position - vs-clip-based-on-position-homogeneity - clip-plane-transformation clipvert_pos - clip-plane-transformation pos_clipvert - clip-plane-transformation pos Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965 new VS: don't share clip plane constants in pre-GEN6	Paul Berry	2011-09-28	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In pre-GEN6, when using clip planes, both the vertex shader and the clipper need access to the client-supplied clip planes, since the vertex shader needs them to set the clip flags, and the clipper needs them to determine where to insert new vertices. With the old VS backend, we used a clever optimization to avoid placing duplicate copies of these planes in the CURBE: we used the same block of memory for both the clipper and vertex shader constants, with the clip planes at the front of it, and then we instructed the clipper to read just the initial part of this block containing the clip planes. This optimization was tricky, of dubious value, and not completely working in the new VS backend, so I've removed it. Now, when using the new VS backend, separate parts of the CURBE are used for the clipper and the vertex shader. Note that this doesn't affect the number of push constants available to the vertex shader, it simply causes the CURBE to occupy a few more bytes of URB memory. The old VS backend is unaffected. GEN6+, which does clipping entirely in hardware, is also unaffected. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Remove bogus assertion on MAX_CLIP_PLANES.	Paul Berry	2011-09-20	1	-1/+0
\| \| \| \| \| \| \| \| \|	This patch removes the assertion "MAX_CLIP_PLANES == 6" from the i965 driver. This assertion is unnecessary; nothing in the driver requires MAX_CLIP_PLANES to be 6. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Use native integer uniforms when the new VS backend is in use.	Eric Anholt	2011-08-30	1	-2/+1
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/vs: Start adding support for uniforms	Eric Anholt	2011-08-16	1	-10/+17
\| \| \| \|	There's no clever packing here, no pull constants, and no array support.
*	i965: Move repeat-instruction-suppression to batchbuffer core	Chris Wilson	2011-02-21	1	-9/+11
\| \| \| \| \| \| \| \|	Move the tracking of the last emitted instructions into the core batchbuffer routines and take advantage of the shadow batch copy to avoid extra memory allocations and copies. Signed-off-by: Chris Wilson <[email protected]>
*	i965: Drop push-mode reladdr constant loading and always use constant_map.	Eric Anholt	2010-12-08	1	-15/+7
\| \| \| \| \| \| \| \|	This eases the gen6 implementation, which can only handle up to 32 registers of constants, while likely not penalizing real apps using reladdr since all of those I've seen also end up hitting the pull constant buffer. On gen6, the constant map means that simple NV VPs fit under the 32-reg limit and now succeed. Fixes around 10 testcases.