mesa.git

	Commit message	Author	Age	Files	Lines
*	intel/common: Return the block size from get_urb_config	Jason Ekstrand	2020-01-30	1	-1/+2
	Cc: "20.0" [email protected]
	Reviewed-by: Kenneth Graunke <[email protected]>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
*	intel: Take a gen_l3_config in gen_get_urb_config	Jason Ekstrand	2020-01-30	1	-3/+1
	Instead of making each driver pass in the same push constant size and do its own L3$ config URB size calculation, just make them pass in their L3$ configuration.
	Cc: "20.0" [email protected]
	Reviewed-by: Kenneth Graunke <[email protected]>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
*	i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9	Nanley Chery	2018-08-31	1	-0/+28
	According to internal docs, some gen9 platforms have a pixel shader push constant synchronization issue. Although not listed among said platforms, this issue seems to be present on the GeminiLake 2x6's we've tested.
	We consider the available workarounds to be too detrimental on performance. Instead, we mitigate the issue by applying part of one of the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch (as suggested by Ken).
	Fixes ext_framebuffer_multisample-accuracy piglit test failures with the following options:
	* 6 depth_draw small depthstencil
	* 8 stencil_draw small depthstencil
	* 6 stencil_draw small depthstencil
	* 8 depth_resolve small
	* 6 stencil_resolve small depthstencil
	* 4 stencil_draw small depthstencil
	* 16 stencil_draw small depthstencil
	* 16 depth_draw small depthstencil
	* 2 stencil_resolve small depthstencil
	* 6 stencil_draw small
	* all_samples stencil_draw small
	* 2 depth_draw small depthstencil
	* all_samples depth_draw small depthstencil
	* all_samples stencil_resolve small
	* 4 depth_draw small depthstencil
	* all_samples depth_draw small
	* all_samples stencil_draw small depthstencil
	* 4 stencil_resolve small depthstencil
	* 4 depth_resolve small depthstencil
	* all_samples stencil_resolve small depthstencil
	v2: Include more platforms in WA (Ken).
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93355
	Cc: <[email protected]>
	Tested-by: Mark Janes <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Convert brw->*_program into a brw->programs[i] array.	Kenneth Graunke	2017-09-26	1	-2/+2
	This makes it easier to loop over programs.
	Reviewed-by: Alejandro Piñeiro <[email protected]>
*	i965: drop brw->is_haswell in favor of devinfo->is_haswell	Lionel Landwerlin	2017-08-30	1	-4/+4
	Signed-off-by: Lionel Landwerlin <[email protected]>
	Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
	Reviewed-by: Emil Velikov <[email protected]>
*	i965: drop brw->is_baytrail in favor of devinfo->is_baytrail	Lionel Landwerlin	2017-08-30	1	-2/+2
	Signed-off-by: Lionel Landwerlin <[email protected]>
	Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
	Reviewed-by: Emil Velikov <[email protected]>
*	i965: drop brw->gt in favor of devinfo->gt	Lionel Landwerlin	2017-08-30	1	-2/+2
	Signed-off-by: Lionel Landwerlin <[email protected]>
	Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
	Reviewed-by: Emil Velikov <[email protected]>
*	i965: drop brw->gen in favor of devinfo->gen	Lionel Landwerlin	2017-08-30	1	-5/+8
	Signed-off-by: Lionel Landwerlin <[email protected]>
	Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
	Reviewed-by: Emil Velikov <[email protected]>
*	i965: Stop looking at NewDriverState when emitting 3DSTATE_URB	Jason Ekstrand	2017-08-18	1	-3/+1
	Looking at NewDriverState is not safe in general. The state atom system is set up to ensure that new bits that get added to NewDriverState get accumulated into the set of bits used when emitting atoms but it doesn't go the other way. If we read NewDriverState, we may not get the full picture because the per-pipeline state (3D or compute) does not get added to NewDriverState before state emit is done. It's especially dangerous to do this from BLORP (either explicitly or implicitly when BLORP calls gen7_upload_urb) because that does not happen during one of the normal state upload paths.
	This commit solves the problem by whacking all of the per-shader-stage URB sizes to zero whenever we change the total URB size. We still have to flag BRW_NEW_URB_SIZE to ensure that the gen7_urb atom triggers but the actual decision in gen7_upload_urb can now be based entirely on URB sizes rather than on state atoms. This also makes BLORP correct because it just asks for a new URB config whenever the vsize is too small and so any change to the total URB size will trigger blorp to re-emit as well because 0 < vs_entry_size.
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
	Bugzilla: https://bugs.freedesktop.org/102289
	Cc: [email protected]
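The zeroing trick described in the commit above can be sketched in a few lines of C. This is a minimal illustration only: `struct urb_state`, `set_total_urb_size` and `needs_urb_reconfig` are hypothetical names invented for this sketch and are not the driver's actual data structures or functions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the driver's URB bookkeeping. */
struct urb_state {
   unsigned total_size;      /* total URB size currently programmed */
   unsigned entry_size[4];   /* last programmed entry size per stage */
};

/* When the total size changes, forget every per-stage size. */
static void
set_total_urb_size(struct urb_state *urb, unsigned new_size)
{
   assert(urb != NULL);
   if (urb->total_size == new_size)
      return;

   urb->total_size = new_size;
   memset(urb->entry_size, 0, sizeof(urb->entry_size));
}

/* A caller re-emits whenever the recorded size is too small. Because any
 * real requirement is at least 1, a zeroed size always forces a re-emit,
 * without the caller having to look at dirty bits. */
static int
needs_urb_reconfig(const struct urb_state *urb, unsigned stage,
                   unsigned required_size)
{
   return urb->entry_size[stage] < required_size;
}
```

The point of the design is that the decision depends only on recorded sizes, so a caller outside the normal state upload path reaches the same answer as the regular atom.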
*	i965: Stop re-uploading push constants after URB reconfiguration.	Kenneth Graunke	2017-07-13	1	-1/+5
	Previously we would re-upload the constant data to the batchbuffer, then re-emit the packets. We only need to do the last step (causing the existing data in the batchbuffer to be re-uploaded to the push constant staging area in the L3). Now that we've separated the two, it's pretty easy to accomplish.
	Reviewed-by: Matt Turner <[email protected]>
*	i965/urb: Trigger upload_urb on NEW_BLORP	Jason Ekstrand	2017-07-13	1	-1/+2
	It's a bit rare, but blorp can trigger a urb reconfiguration. When that happens, we need to re-upload the URB config. Previously blorp would set BRW_NEW_URB_SIZE, but this is a pretty big hammer as it would cause back-to-back blorp operations to reconfigure both times. Using BRW_NEW_BLORP is a smaller, more accurate hammer.
	v2 (idr): Sort BRW_NEW_ tokens to match brw_recalculate_urb_fence and gen6_urb.
	v3 (idr): Don't whack BRW_NEW_URB_SIZE in blorp. Suggested by Jason.
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/cnl: Make URB {VS, GS, HS, DS} sizes non multiple of 3	Anuj Phogat	2017-06-09	1	-0/+1
	v1: By Ben Widawsky <[email protected]>
	v2: v1 had an assert only for VS. Add the restriction for GS, HS and DS as well and make sure the allocated sizes are not multiple of 3.
	v3: Move the entry_size checks in to compiler code (Ken)
	Signed-off-by: Anuj Phogat <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Delete unused variable.	Kenneth Graunke	2016-11-19	1	-2/+0
	I forgot to delete this in 9ef2b9277d3bead6dbfa47e95794ca61e8be4e84.
	Signed-off-by: Kenneth Graunke <[email protected]>
*	intel: Share URB configuration code between GL and Vulkan.	Kenneth Graunke	2016-11-19	1	-138/+4
	This code is far too complicated to cut and paste.
	v2: Update the newly added genX_gpu_memcpy.c; const a few things.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Use arrays in Gen7+ URB code.	Kenneth Graunke	2016-11-19	1	-202/+134
	So much of this code was cut and pasted per stage. We can accomplish much of it by looping over shader stages.
	Improves performance of OglBatch7 (version 6) by 1.50783% +/- 0.287049% (n = 71) at 1024x768 on Cherryview.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Drop brw->urb.{nr_*_entries,*_start} assignments from gen7_urb.c.	Kenneth Graunke	2016-11-19	1	-17/+8
	The context fields are for Gen4-5; setting them has always been useless. There's no point in spending the cost in the hottest path in the driver.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Switch to roundf in HS/DS URB code.	Kenneth Graunke	2016-11-19	1	-2/+2
	Matt intentionally switched the VS calculation to be float-based in commit c1da15709a0c0c2775bd9e534f67c60f7dc95ce8. Tessellation support was written before this and rebased forward, and missed the change. Now it's consistent.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Make URB code use prog_data for GS/tessellation enable checks.	Kenneth Graunke	2016-11-19	1	-6/+4
	If geometry/tessellation shaders are disabled, prog_data will be NULL (see brw_state_upload.c). This consolidates dirty bits a little.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	intel: Convert devinfo->urb.min_*_entries into an array.	Kenneth Graunke	2016-11-19	1	-4/+5
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	intel: Convert devinfo->urb.max_*_entries into an array.	Kenneth Graunke	2016-11-19	1	-10/+16
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Eliminate brw->gs.prog_data pointer.	Kenneth Graunke	2016-10-05	1	-1/+3
	Just say no to:
	- brw->gs.base.prog_data = &brw->gs.prog_data->base.base;
	We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_gs_prog_data as needed.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Eliminate brw->tes.prog_data pointer.	Kenneth Graunke	2016-10-05	1	-1/+3
	Just say no to:
	- brw->tes.base.prog_data = &brw->tes.prog_data->base.base;
	We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_tes_prog_data as needed.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Eliminate brw->tcs.prog_data pointer.	Kenneth Graunke	2016-10-05	1	-1/+3
	Just say no to:
	- brw->tcs.base.prog_data = &brw->tcs.prog_data->base.base;
	We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_tcs_prog_data as needed.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Eliminate brw->vs.prog_data pointer.	Kenneth Graunke	2016-10-05	1	-1/+3
	Just say no to:
	- brw->vs.base.prog_data = &brw->vs.prog_data->base.base;
	We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_vs_prog_data as needed.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Timothy Arceri <[email protected]>
*	i965: rename max_ds_* variable to max_tes_*	Timothy Arceri	2016-10-03	1	-2/+2
	Using consistent naming allows us to create macros more easily.
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: rename max_hs_* variables to max_tcs_*	Timothy Arceri	2016-10-03	1	-2/+2
	Using consistent naming allows us to create macros more easily.
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: get rid of duplicated values from gen_device_info	Lionel Landwerlin	2016-09-23	1	-8/+8
	Now that we have gen_device_info mutable, we can update its values and drop all copies we had in brw_context.
	Signed-off-by: Lionel Landwerlin <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/i965: make gen_device_info mutable	Lionel Landwerlin	2016-09-23	1	-1/+1
	Make gen_device_info a mutable structure so we can update the fields that can be refined by querying the kernel (like subslices and EU numbers).
	This patch does not make any functional change, it just makes gen_get_device_info() fill a structure rather than returning a const pointer.
	Signed-off-by: Lionel Landwerlin <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Rename intelScreen to screen.	Kenneth Graunke	2016-09-20	1	-1/+1
	"intelScreen" is wordy and also doesn't fit our style guidelines. "screen" is shorter, which is nice, because we use it fairly often.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Anuj Phogat <[email protected]>
*	intel: s/brw_device_info/gen_device_info/	Jason Ekstrand	2016-09-03	1	-1/+1
	Generated by:
	    sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.c
	    sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.h
	    sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.c
	    sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.cpp
	    sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.h
	Signed-off-by: Jason Ekstrand <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
*	i965/urb: Allow blorp to record current settings	Topi Pohjolainen	2016-07-04	1	-40/+50
	This makes it possible to skip urb re-configuration if the subsequent renders agree with the settings. Also allows blorp to allocate the maximum amount of vs entries available. Core upload logic already knows how to calculate this.
	Helps one synthetic benchmark.
	Signed-off-by: Topi Pohjolainen <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp/gen7+: Do not trigger push constant space reconfig	Topi Pohjolainen	2016-07-04	1	-2/+1
	Signed-off-by: Topi Pohjolainen <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Avoid division by zero.	Ardinartsev Nikita	2016-06-23	1	-11/+15
	Fixes regression introduced by af5ca43f2676bff7499f93277f908b681cb821d0
	Cc: "12.0 11.2" <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419
*	i965: Fix issues with number of VS URB entries on Cherryview/Broxton.	Kenneth Graunke	2016-06-13	1	-0/+2
	Cherryview/Broxton annoyingly have a minimum number of VS URB entries of 34, which is not a multiple of 8. When the VS size is less than 9, the number of VS entries has to be a multiple of 8.
	Notably, BLORP programmed the minimum number of VS URB entries (34), with a size of 1 (less than 9), which is invalid.
	It seemed like this could be a problem in the regular URB code as well, so I went ahead and updated that to be safe.
	Cc: "12.0" <[email protected]>
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
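The multiple-of-8 constraint from the Cherryview/Broxton commit above can be illustrated with a small helper. This is a sketch under assumptions: `adjust_vs_entries` is a hypothetical name, and rounding up (rather than down) is chosen here only so the result never drops below the hardware minimum; the commit message does not say which direction the driver rounds.

```c
#include <assert.h>

/* Hypothetical helper, not the driver's code. */
static unsigned
adjust_vs_entries(unsigned nr_entries, unsigned entry_size)
{
   assert(entry_size > 0);

   /* Per the commit message: when the VS size is less than 9, the number
    * of VS entries has to be a multiple of 8. Round up to the next one. */
   if (entry_size < 9)
      nr_entries = (nr_entries + 7u) & ~7u;

   return nr_entries;
}
```

With the numbers from the commit message, a minimum of 34 entries at size 1 becomes 40, while the same 34 entries at size 9 are left alone.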
*	Revert "i965/urb: fixes division by zero"	Matt Turner	2016-05-18	1	-5/+19
	This reverts commit 2a8aa1e3deb99a1ae16d942318da648c1327ece5.
*	i965/urb: fixes division by zero	Ardinartsev Nikita	2016-05-18	1	-19/+5
	Fixes regression introduced by af5ca43f2676bff7499f93277f908b681cb821d0
	Reviewed-by: Matt Turner <[email protected]>
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419
*	i965/blorp: Use BRW_NEW_BLORP instead of trashing all state bits	Topi Pohjolainen	2016-04-23	1	-2/+1
	Signed-off-by: Topi Pohjolainen <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Make all atoms to track BRW_NEW_BLORP by default	Kenneth Graunke	2016-04-23	1	-2/+4
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Remove unnecessary brw->tess_ctrl_program assertions.	Kenneth Graunke	2015-12-22	1	-1/+0
	This is trying to enforce the fact that the hardware requires HS, TE, and DS to be enabled or disabled together. But it's kind of an ad-hoc attempt, and not too useful.
	More importantly, we aren't going to have a gl_shader_program for the TCS which is automatically generated when none is present. (We'll just handle it in the driver backend.) So, these will trip for no reason.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
*	i965: Consolidate BRW_NEW_TESS_{CTRL,EVAL}_PROGRAM flags.	Kenneth Graunke	2015-12-22	1	-6/+4
	For several reasons, I don't think it's particularly useful to have separate flags:
	1. Most of the time, tessellation shaders are paired, so both will be replaced at the same time.
	2. The data layout is tightly coupled. Both need to agree on the number of per-patch slots in the VUE map. Even adding extra TCS outputs that aren't read by the TES will trigger the need for recompiles.
	3. The TCS is optional from an API perspective, but required by the hardware whenever tessellation is enabled. So, atoms that deal with the TCS must check brw->tess_eval_program (BRW_NEW_TESS_EVAL_PROGRAM?) rather than brw->tess_ctrl_program to tell whether tessellation is enabled.
	So, not only is it unlikely to be useful, it's a bit confusing to get right. Simply using one flag for both simplifies this.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
*	i965: Allocate URB space for HS and DS stages when required.	Chris Forbes	2015-12-15	1	-34/+143
	v2: (by Ken, incorporating feedback from Matt Turner):
	- Rewrite the push constant allocation code to be clearer.
	- Only apply the minimum VS entries workaround on Gen 8.
	v3: (by Ken)
	- Fix a bug in v2 where we failed to allocate the full push constant space when the number of enabled stages didn't divide the available push constant space evenly. (Any left over space is now allocated to the PS, as it was in v1.)
	- Fix an off-by-one error in v2's number of enabled stages calculation.
	- Use DIV_ROUND_UP for nicer formatting.
	- Line wrapping fixes.
	Signed-off-by: Chris Forbes <[email protected]>
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
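The v3 note above (split the push constant space between enabled stages, and give any remainder to the PS) can be sketched as follows. Everything here is an assumption made for illustration: the `split_push_constants` function, the stage enum, the KB unit, and the rule that VS and PS are always enabled while tessellation adds HS and DS together. It is not the driver's allocation code.

```c
#include <assert.h>

/* Hypothetical stage indices for this sketch only. */
enum stage { STAGE_VS, STAGE_HS, STAGE_DS, STAGE_GS, STAGE_PS, STAGE_COUNT };

static void
split_push_constants(unsigned total_kb, int tess_enabled, int gs_enabled,
                     unsigned size_kb[STAGE_COUNT])
{
   /* Assumption: VS and PS are always on; tessellation adds HS and DS. */
   const unsigned stages = 2 + (tess_enabled ? 2 : 0) + (gs_enabled ? 1 : 0);
   const unsigned per_stage = total_kb / stages;

   assert(total_kb >= stages);

   size_kb[STAGE_VS] = per_stage;
   size_kb[STAGE_HS] = tess_enabled ? per_stage : 0;
   size_kb[STAGE_DS] = tess_enabled ? per_stage : 0;
   size_kb[STAGE_GS] = gs_enabled ? per_stage : 0;

   /* The PS takes its share plus whatever the integer division left over,
    * so the full space is always handed out. */
   size_kb[STAGE_PS] = total_kb - per_stage * (stages - 1);
}
```

For example, 32 units across five enabled stages gives 6 to each of VS, HS, DS and GS, and the remaining 8 to the PS.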
*	i965: Use DIV_ROUND_UP() in gen7_urb.c code.	Kenneth Graunke	2015-12-14	1	-9/+8
	This is a newer convention, which we prefer over ALIGN(x, n) / n.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
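The two conventions mentioned above compute the same ceiling division when n is a power of two. A minimal sketch, assuming macro definitions similar in spirit to the tree's; the definitions below (including the name `ALIGN_POT`) are written for illustration and are not copied from Mesa.

```c
#include <assert.h>

/* Illustrative definitions; the real macros may differ in detail. */
#define ALIGN_POT(x, n)    (((x) + (n) - 1) & ~((n) - 1)) /* n: power of two */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Older convention: round up to a multiple of n, then divide. */
static unsigned
chunks_old(unsigned size, unsigned n)
{
   assert(n != 0 && (n & (n - 1)) == 0);
   return ALIGN_POT(size, n) / n;
}

/* Newer convention: a single ceiling division, valid for any non-zero n. */
static unsigned
chunks_new(unsigned size, unsigned n)
{
   assert(n != 0);
   return DIV_ROUND_UP(size, n);
}
```

Besides being shorter, the single-macro form states the intent (how many n-sized units are needed) directly, and does not depend on n being a power of two.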
*	i965: Define state flag to signal that the URB size has been altered.	Francisco Jerez	2015-12-09	1	-0/+3
	This will make sure that we recalculate the URB layout anytime the URB size is modified by the L3 partitioning code.
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Fix PIPE_CONTOL typo.	Kenneth Graunke	2015-11-17	1	-1/+1
	PIPE_CONTOL!!!
*	i965: Use float calculations when double is unnecessary.	Matt Turner	2015-07-29	1	-1/+1
	Literals without an f/F suffix are of type double, and implicit conversion rules specify that the float in (float op double) be converted to a double before the operation is performed. I believe float execution was intended (in nearly all cases) or is sufficient (in the case of gen7_urb.c).
	Removes a lot of float <-> double conversion instructions and replaces many double instructions with float instructions which are cheaper.
	   text    data     bss     dec     hex filename
	4928659  195160   26192 5150011  4e953b i965_dri.so before
	4928315  195152   26192 5149659  4e93db i965_dri.so after
	Reviewed-by: Iago Toral Quiroga <[email protected]>
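The promotion rule described above is easy to observe with `sizeof`, which reports the size of an expression's result type without evaluating it. The two function names below are hypothetical and exist only for this sketch.

```c
#include <assert.h>

/* An unsuffixed literal is a double, so (float op double) yields double. */
static int
unsuffixed_literal_promotes(void)
{
   float x = 3.0f;
   return sizeof(x * 0.5) == sizeof(double);
}

/* With the f suffix both operands are float and so is the result. */
static int
suffixed_literal_stays_float(void)
{
   float x = 3.0f;
   return sizeof(x * 0.5f) == sizeof(float);
}
```

Both functions return non-zero on a conforming C compiler, which is why adding the suffix lets the compiler drop the float-to-double conversions the commit message mentions.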
*	i965/state: Don't use brw->state.dirty.brw	Jordan Justen	2015-03-31	1	-2/+2
	Now, we only use ctx->NewDriverState.
	I used this bash & sed command in the i965 directory:
	    for file in *.[ch] *.[ch]pp; do
	      sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file
	    done
	Followed by manual changes to brw_state_upload.c.
	Signed-off-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Delete brw_state_flags::cache and related code.	Kenneth Graunke	2014-12-02	1	-1/+0
	It's been merged into brw_state_flags::brw for simplicity and efficiency.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).	Kenneth Graunke	2014-12-02	1	-3/+3
	I put the BRW_NEW_*_PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id.
	I also added a comment explaining the difference between BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time to remember it.
	Non-mechanical changes:
	- brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache.
	- brw_state_upload.c - INTEL_DEBUG=state changes.
	- brw_context.h - bit definition merging.
	v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg).
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA.	Kenneth Graunke	2014-12-02	1	-4/+4
	Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits.
	State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up.
	CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name.
	Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via:
	    $ for file in *.[ch]; do
	        sed -i -e 's/CACHE_NEW_\([A-Z_\]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \
	               -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file
	      done
	Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
	Acked-by: Matt Turner <[email protected]>
*	i965: Alphabetize brw_tracked_state flags and use a consistent style.	Kenneth Graunke	2014-11-29	1	-2/+4
	Most of the dirty flags were listed in some arbitrary order. Some used bonus parenthesis. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines.
	This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses.
	Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>