path: root/src/mesa/drivers/dri/i965/gen6_urb.c
Commit message | Author | Age | Files | Lines
* i965: Make all atoms to track BRW_NEW_BLORP by default | Kenneth Graunke | 2016-04-23 | 1 | -1/+2
  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rename intel_emit* to reflect their new location in brw_pipe_control | Chris Wilson | 2015-06-24 | 1 | -1/+1
  Signed-off-by: Chris Wilson <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache). | Kenneth Graunke | 2014-12-02 | 1 | -4/+4
  I put the BRW_NEW_*_PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id.

  I also added a comment explaining the difference between BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time to remember it.

  Non-mechanical changes:
  - brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache.
  - brw_state_upload.c - INTEL_DEBUG=state changes.
  - brw_context.h - bit definition merging.

  v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg).

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA. | Kenneth Graunke | 2014-12-02 | 1 | -4/+4
  Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits.

  State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up.

  CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name.

  Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via:

      $ for file in *.[ch]; do
          sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \
                 -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file
        done

  Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Acked-by: Matt Turner <[email protected]>
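  The per-atom dirty check that this message describes can be sketched roughly as below. This is an illustration only, not the driver's code: the struct names (`dirty3`, `dirty2`), the function names (`check_state3`, `check_state2`) and the field widths are assumptions made for the example. It shows why folding the cache bits into brw removes one AND/OR pair per atom.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout before consolidation: three separate dirty bitfields. */
struct dirty3 {
    uint32_t mesa;
    uint32_t brw;
    uint32_t cache;
};

/* Hypothetical layout after consolidation: cache bits folded into brw.
 * The 64-bit width is an assumption made for this sketch. */
struct dirty2 {
    uint32_t mesa;
    uint64_t brw;
};

/* An atom must be emitted if any bit it tracks is currently dirty. */
bool check_state3(const struct dirty3 *state, const struct dirty3 *atom)
{
    return ((state->mesa & atom->mesa) |
            (state->brw & atom->brw) |
            (state->cache & atom->cache)) != 0;
}

/* Same test with one fewer bitfield to compare. */
bool check_state2(const struct dirty2 *state, const struct dirty2 *atom)
{
    return ((state->mesa & atom->mesa) |
            (state->brw & atom->brw)) != 0;
}
```

  With several dozen atoms checked on every state upload, the shorter test runs once per atom, which is where the small saving comes from.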
* i965: Alphabetize brw_tracked_state flags and use a consistent style. | Kenneth Graunke | 2014-11-29 | 1 | -2/+5
  Most of the dirty flags were listed in some arbitrary order. Some used bonus parenthesis. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines.

  This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses.

  Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Skip recalculating URB allocations if the entry size didn't change. | Eric Anholt | 2014-10-24 | 1 | -2/+2
  We only get here if the VS/GS compiled programs change, but we can even skip it if the VS/GS size didn't change.

  Affects cairo runtime on glamor by -1.26471% +/- 0.674335% (n=234)

  Reviewed-by: Kenneth Graunke <[email protected]>
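  The early-out described here might look something like the sketch below. The names `urb_cache` and `urb_needs_recalc` are hypothetical and chosen only for this example; the actual change in the driver is a two-line edit and may be structured differently.

```c
#include <stdbool.h>

/* Hypothetical record of the entry sizes used for the last URB allocation. */
struct urb_cache {
    bool valid;        /* false until the first allocation has been made */
    unsigned vs_size;  /* last VS URB entry size */
    unsigned gs_size;  /* last GS URB entry size */
};

/* Returns true when the allocation must be recomputed, and remembers the
 * new sizes. Returns false when both sizes match the previous call. */
bool urb_needs_recalc(struct urb_cache *cache, unsigned vs_size, unsigned gs_size)
{
    if (cache->valid && cache->vs_size == vs_size && cache->gs_size == gs_size)
        return false;

    cache->valid = true;
    cache->vs_size = vs_size;
    cache->gs_size = gs_size;
    return true;
}
```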
* i965/gen6/gs: Enable URB space for user-provided geometry shaders. | Iago Toral Quiroga | 2014-09-19 | 1 | -10/+20
  Acked-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* i965/gen6: Fix assertions on VS/GS URB size. | Paul Berry | 2013-09-16 | 1 | -2/+2
  The "{VS,GS} URB Entry Allocation Size" fields of 3DSTATE_URB allow values in the range 0-4, but they are U8-1 fields, so the range of possible allocation sizes is 1-5. We were erroneously prohibiting a size of 5.

  Reviewed-by: Kenneth Graunke <[email protected]>
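  The "U8-1" convention mentioned in this message stores a value minus one, so a field range of 0-4 corresponds to sizes 1-5. A minimal sketch of that encoding follows; the helper name `encode_urb_entry_size` is hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Encode an allocation size of 1..5 into a "minus one" field value of 0..4.
 * The assertion accepts 5, which is the case the commit says was wrongly
 * rejected before the fix. */
uint8_t encode_urb_entry_size(unsigned size)
{
    assert(size >= 1 && size <= 5);
    return (uint8_t)(size - 1);
}
```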
* i965/gen7.5: Fix lower bound on number of VS URB entries. | Paul Berry | 2013-09-05 | 1 | -1/+1
  Haswell GT2 and GT3 require the number of vertex shader URB entries to be at least 64, not 32. At the moment, we always meet this requirement automatically, because in the absence of a geometry shader, we assign all available URB space to the vertex shader. But when we turn on support for geometry shaders, this lower limit will become important.

  Reviewed-by: Chad Versace <[email protected]>
* i965: rename legacy gs structs and functions to ff_gs. | Paul Berry | 2013-08-31 | 1 | -4/+4
  "ff" is for "fixed function". This frees up the name "gs" to refer to user-defined geometry shaders.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* i965: Pass brw_context to functions rather than intel_context. | Kenneth Graunke | 2013-07-09 | 1 | -2/+1
  This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context.

  Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw."

  Signed-off-by: Kenneth Graunke <[email protected]>
  Acked-by: Chris Forbes <[email protected]>
  Acked-by: Paul Berry <[email protected]>
  Acked-by: Anuj Phogat <[email protected]>
* i965/vs: split brw_vs_prog_data into generic and VS-specific parts. | Paul Berry | 2013-04-11 | 1 | -1/+1
  This will allow the generic parts to be re-used for geometry shaders.

  Reviewed-by: Jordan Justen <[email protected]>

  v2: Put urb_read_length and urb_entry_size in the generic struct.

  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Turn brw->urb.vs_size and gs_size into local variables. | Kenneth Graunke | 2013-04-04 | 1 | -9/+9
  These variables are only used within a single function, so we may as well make them local variables.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965 gen6: Allocate URB space for GS | Paul Berry | 2011-12-07 | 1 | -12/+57
  When the GS is not in use, the entire URB space is available for the VS. When the GS is in use, we split the URB space 50/50.

  The 50/50 split is probably not optimal--we'll probably want tune this for performance in a future patch. For example, in most situations, it's probably worth allocating more than 50% of the space to the VS, since VS space is used for vertex caching. But for now this is good enough.

  Based on previous work by: Kenneth Graunke <[email protected]>

  Reviewed-by: Eric Anholt <[email protected]>
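  The allocation policy stated in this message (everything to the VS when no GS is active, otherwise an even split) can be illustrated with the sketch below. The type `urb_split` and the function `split_urb` are hypothetical names for this example; the real code works in URB entries and hardware-specific limits that are not modelled here.

```c
#include <stdbool.h>

/* Hypothetical result type: how many bytes of URB each stage receives. */
struct urb_split {
    unsigned vs_bytes;
    unsigned gs_bytes;
};

/* Without a GS the VS gets the whole URB; with a GS the space is halved.
 * Any odd remainder goes to the GS so the two parts always sum to the total. */
struct urb_split split_urb(unsigned total_bytes, bool gs_present)
{
    struct urb_split split;

    if (gs_present) {
        split.vs_bytes = total_bytes / 2;
        split.gs_bytes = total_bytes - split.vs_bytes;
    } else {
        split.vs_bytes = total_bytes;
        split.gs_bytes = 0;
    }
    return split;
}
```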
* i965: Fold the gen6/7 URB state prepare()/emit() together. | Eric Anholt | 2011-10-29 | 1 | -9/+3
  No other unit cares about the prepare state, unlike gen4-5.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Acked-by: Paul Berry <[email protected]>
* i965: Use state streaming on programs, and state base address on gen5+. | Eric Anholt | 2011-06-18 | 1 | -1/+1
  There will be a little bit of thrashing of the program cache BO as the cache warms up, but once the application is in steady state, this reduces relocations on gen5 and later.

  On my T420 laptop, cairogl firefox-talos-gfx performance improves 2.6% +/- 1.3% (n=6). No statistically significant performance difference on nexuiz (n=5).
* i965: Rename max_vs_handles to max_vs_entries for consistency. | Kenneth Graunke | 2011-05-17 | 1 | -2/+2
  The documentation uses the term "vertex URB entries", the code talks about "entry size", and so on. Also, handles are just "pointers" to entries (actually small integers).

  Also rename max_gs_handles to max_gs_entries.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Allocate the whole URB to the VS and fix calculations for Gen6. | Kenneth Graunke | 2011-04-18 | 1 | -17/+17
  Since we never enable the GS on Sandybridge, there's no need to allocate it any URB space.

  Furthermore, the previous calculation was incorrect: it neglected to multiply by nr_vs_entries, instead comparing whether twice the size of a single VS URB entry was bigger than the entire URB space. It also neglected to take into account that vs_size is in units of 128 byte blocks, while urb_size is in bytes.

  Despite the above problems, the calculations resulted in an acceptable programming of the URB in most cases, at least on GT2.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
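  The two mistakes called out in this message, a missing multiplication by the entry count and a unit mismatch between 128-byte blocks and bytes, are easier to see in a small sketch. The function names below are hypothetical; only `nr_vs_entries`, `vs_size` and `urb_size` come from the commit message, and the example ignores any per-chip entry limits.

```c
#include <assert.h>
#include <stdbool.h>

#define URB_BLOCK_BYTES 128u  /* vs_size is counted in blocks of this size */

/* Total bytes used by nr_vs_entries entries of vs_size blocks each. */
unsigned vs_urb_bytes(unsigned nr_vs_entries, unsigned vs_size)
{
    return nr_vs_entries * vs_size * URB_BLOCK_BYTES;
}

/* Does the requested VS allocation fit in a URB of urb_size bytes? */
bool vs_urb_fits(unsigned nr_vs_entries, unsigned vs_size, unsigned urb_size)
{
    return vs_urb_bytes(nr_vs_entries, vs_size) <= urb_size;
}

/* Largest number of entries of vs_size blocks that fit in urb_size bytes. */
unsigned max_vs_entries_that_fit(unsigned urb_size, unsigned vs_size)
{
    assert(vs_size > 0);
    return urb_size / (vs_size * URB_BLOCK_BYTES);
}
```

  For instance, with a 64k URB and two-block entries, 256 entries use exactly the whole space, while a check that compares only one entry's block count against the byte size would accept almost any request.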
* i965: SNB GT1 has only 32k urb and max 128 urb entries. | Zou Nan hai | 2011-03-03 | 1 | -4/+15
  Signed-off-by: Zou Nan hai <[email protected]>
* i965: Maxinum the usage of urb space on SNB. | Zou Nan hai | 2011-03-02 | 1 | -10/+6
  SNB has 64k urb space, we only use piece of them. The more urb space we alloc, the more concurrent vs threads we can run. push the urb space usage to the limit.

  Signed-off-by: Zou Nan hai <[email protected]>
* i965: Rename various gen6 #defines to match the documentation. | Kenneth Graunke | 2011-01-06 | 1 | -1/+1
  This should make it easier to cross-reference the code and hardware documentation, as well as clear up any confusion on whether constants like CMD_3D_WM_STATE mean WM_STATE (pre-gen6) or 3DSTATE_WM (gen6+).

  This does not rename any pre-gen6 defines.
* i965: Fix GS state uploading on Sandybridge | Zhenyu Wang | 2010-12-06 | 1 | -1/+1
  Need to check the required primitive type for GS on Sandybridge, and when GS is disabled, the new state has to be issued too, instead of only updating URB state with no GS entry, that caused hang on Sandybridge.

  This fixes hang issue during conformance suite testing.
* i965: Fix VS URB entry sizing. | Eric Anholt | 2010-10-26 | 1 | -1/+1
  I'm trying to clamp to a minimum of 1 URB row, not a maximum of 1.

  Fixes:
  glsl-kwin-blur
  glsl-max-varying
  glsl-routing
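  The mix-up described here, clamping to a maximum where a minimum was intended, comes down to picking the wrong one of two comparisons. The sketch below shows both; the function names are hypothetical and the one-line change in the driver may be written differently.

```c
/* Intended behaviour: never fewer than 1 URB row. */
unsigned clamp_rows_min1(unsigned rows)
{
    return rows > 1u ? rows : 1u;
}

/* The mistaken variant: never more than 1 URB row, which truncates
 * every larger entry size down to a single row. */
unsigned clamp_rows_max1(unsigned rows)
{
    return rows < 1u ? rows : 1u;
}
```

  A shader needing three rows gets three from the first function but only one from the second, which is consistent with the listed fixes being tests that use many varyings.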
* i965: Remove the gen6 emit_mi_flushes I sprinkled around the driver. | Eric Anholt | 2010-10-19 | 1 | -4/+0
  These were for debugging in bringup. Now that relatively complicated apps are working, they haven't helped debug anything in quite a while.
* i965: Fix the SNB URB entry count setup. | Eric Anholt | 2010-02-25 | 1 | -2/+2
* i965: Giant pile of flushing to track down SNB bringup issues. | Eric Anholt | 2010-02-25 | 1 | -0/+2
  This should go away before we push the code.
* i965: Set up the SNB URB. | Eric Anholt | 2010-02-25 | 1 | -0/+81
  even with vs disabled, still doesn't work.