mesa.git

	Commit message	Author	Age	Files	Lines
*	i965: add missing ir_unop_/ir_binop_ in visit_leave()	Samuel Pitoiset	2017-04-13	1	-0/+3
	Fixes the following Clang warning. brw_fs_channel_expressions.cpp:219:12: warning: enumeration values 'ir_unop_ballot', 'ir_unop_read_first_invocation', and 'ir_binop_read_invocation' not handled in switch [-Wswitch] switch (expr->operation) { ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	st/mesa: fix wrong comparison in update_framebuffer_state()	Samuel Pitoiset	2017-04-13	1	-4/+4
	state_tracker/st_atom_framebuffer.c:208:27: warning: comparison of constant 4294967295 with expression of type 'uint16_t' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare] if (framebuffer->width == UINT_MAX) ~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~ state_tracker/st_atom_framebuffer.c:210:28: warning: comparison of constant 4294967295 with expression of type 'uint16_t' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare] if (framebuffer->height == UINT_MAX) ~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~ 2 warnings generated. Fixes: eb0fd0e5f86 ("gallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]>
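The warning above arises because a 16-bit field can never equal UINT_MAX. A minimal sketch of a comparison that can actually be true, assuming the "unset" sentinel is the largest 16-bit value; `fb_dim_is_unset` is a hypothetical helper for illustration, not the code from the patch:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a 16-bit framebuffer dimension still
 * holds its "unset" sentinel (assumed here to be the largest uint16_t). */
static int
fb_dim_is_unset(uint16_t dim)
{
   /* dim is promoted to int before the comparison, so it can only reach
    * 65535. Comparing against UINT_MAX (4294967295 where int is 32 bits)
    * is always false; comparing against UINT16_MAX is not. */
   return dim == UINT16_MAX;
}

/* Truncating UINT_MAX to 16 bits yields the sentinel that fits the field. */
static uint16_t
truncated_uint_max(void)
{
   return (uint16_t)UINT_MAX;
}
```

A caller would use `fb_dim_is_unset(framebuffer->width)` in place of the always-false test; whether the real code keeps such a sentinel at all is not stated in the commit message.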
*	radeon: fix duplicate 'const' specifier	Samuel Pitoiset	2017-04-13	2	-2/+2
	Fixes the following Clang warning. In file included from radeon_debug.c:32: ./radeon_common_context.h:500:19: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier] extern const char const *radeonVendorString; v2: - do not remove the duplicate 'const' qualifier, fix it Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
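For readers unfamiliar with the warning: in `const char const *p` both qualifiers apply to the pointed-to characters, so the second one is redundant. A sketch of what "fix it" plausibly means, assuming the intent was a constant pointer to constant characters; the variable and string below are hypothetical, not the driver's:

```c
#include <assert.h>
#include <string.h>

/* "const char const *" qualifies the pointee twice. Moving the second
 * const after the '*' makes the pointer itself constant as well. */
static const char *const example_vendor_string = "Example Vendor";

/* Hypothetical accessor, only here so the value can be checked. */
static const char *
get_example_vendor_string(void)
{
   return example_vendor_string;
}
```

With this declaration, an assignment such as `example_vendor_string = "other";` is rejected at compile time, which the original spelling did not guarantee.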
*	mesa: remove some unused functions in the perf monitor area	Samuel Pitoiset	2017-04-13	1	-27/+0
	Fixes the following Clang warnings. main/performance_monitor.c:157:1: warning: unused function 'index_to_queryid' [-Wunused-function] index_to_queryid(GLuint index) ^ main/performance_monitor.c:163:1: warning: unused function 'queryid_valid' [-Wunused-function] queryid_valid(const struct gl_context *ctx, GLuint queryid) ^ main/performance_monitor.c:169:1: warning: unused function 'counterid_to_index' [-Wunused-function] counterid_to_index(GLuint counterid) ^ 3 warnings generated. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	mesa: remove unused clamp_float_to_uint() and clamp_half_to_uint()	Samuel Pitoiset	2017-04-13	1	-15/+0
	Fixes the following Clang warnings. main/pack.c:470:1: warning: unused function 'clamp_float_to_uint' [-Wunused-function] clamp_float_to_uint(GLfloat f) ^ main/pack.c:477:1: warning: unused function 'clamp_half_to_uint' [-Wunused-function] clamp_half_to_uint(GLhalfARB h) ^ 2 warnings generated. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	mesa: remove unused _mesa_unmarshal_BindBufferBase()	Samuel Pitoiset	2017-04-13	1	-8/+0
	Fixes the following Clang warning. main/marshal.c:209:1: warning: unused function '_mesa_unmarshal_BindBufferBase' [-Wunused-function] _mesa_unmarshal_BindBufferBase(struct gl_context *ctx, const struct marshal_cmd_BindBufferBase *cmd) ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	st/mesa: add some _mesa_is_winsys_fbo() assertions	Brian Paul	2017-04-12	2	-2/+9
	A few functions related to FBOs/renderbuffers should only be used with window-system buffers, not user-created FBOs. Assert for that. Add additional comments. No piglit regressions. Reviewed-by: Marek Olšák <[email protected]>
*	st/mesa: minor optimization in st_DrawBuffers()	Brian Paul	2017-04-12	1	-8/+16
	We only do on-demand renderbuffer allocation for window-system FBOs, not user-created FBOs. So put the loop inside a conditional. Plus, add some comments. No piglit regressions. Reviewed-by: Marek Olšák <[email protected]>
*	mesa/st: only update samplers for stages that have changed	Timothy Arceri	2017-04-13	4	-28/+94
	Might help reduce CPU overhead for some apps that use SSO. Reviewed-by: Marek Olšák <[email protected]>
*	st/mesa: Fix missing-braces warning.	Vinson Lee	2017-04-12	1	-1/+1
	CXX state_tracker/st_glsl_to_nir.lo state_tracker/st_glsl_to_nir.cpp:250:57: warning: suggest braces around initialization of subobject [-Wmissing-braces] nir_lower_wpos_ytransform_options wpos_options = {0}; ^ {} Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	mesa: fix memory leak in arb_fragment_program	Bartosz Tomczyk	2017-04-12	1	-0/+1
	Reviewed-by: Timothy Arceri <[email protected]>
*	mesa: avoid NULL ptr in prog parameter name	Gregory Hainaut	2017-04-12	1	-1/+1
	Context: _mesa_add_parameter is sometimes[0] called with a NULL name as a means of passing an unnamed parameter. Allowing a NULL pointer as a name means that it must be NULL-checked on each access. So far that isn't always[1] the case. The parameter name is only used for debugging purposes (printf) and to look up the index/location of the program by the application. In conclusion, there is no valid reason to use a NULL pointer instead of an empty string. So it was decided to use an empty string, which avoids all issues related to NULL pointers. [0]: texture gather offsets glsl opcode and st_init_atifs_prog [1]: at least shader cache, st_nir_lookup_parameter_index and some printfs Issue found by piglit 'texturegatheroffsets' tests on Nouveau. v4: new patch based on Nicolai/Timothy/ilia discussion Signed-off-by: Gregory Hainaut <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
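The idea of storing an empty string instead of NULL can be sketched as follows. This is an illustration only: `dup_param_name` is a hypothetical helper, not a Mesa function, and the actual one-line change may look different.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a parameter name, substituting "" for NULL so
 * that later readers (printf, strcmp, hashing) never see a NULL name.
 * Returns NULL only if the allocation itself fails. */
static char *
dup_param_name(const char *name)
{
   const char *src = name ? name : "";
   size_t len = strlen(src) + 1;   /* include the terminating NUL */
   char *copy = malloc(len);

   if (copy)
      memcpy(copy, src, len);
   return copy;
}
```

The design trade-off is one small allocation per unnamed parameter in exchange for dropping every NULL check at the use sites.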
*	i965/drm: Use bools for a few flags.	Kenneth Graunke	2017-04-11	1	-2/+2
	These one-bit values are booleans. Reviewed-by: Chris Wilson <[email protected]>
*	i965/drm: Make brw_bo_alloc_tiled flags parameter 32-bit.	Kenneth Graunke	2017-04-11	3	-4/+4
	unsigned long is a terrible type for a bitfield - if you need fewer than 32 bits, it wastes 4 bytes. If you need more, things break on 32-bit builds. Just use unsigned. Even that's a bit ridiculous as we only have one flag today. Still, it's at least somewhat better. Reviewed-by: Chris Wilson <[email protected]>
*	i965/drm: Make BO size a uint64_t rather than unsigned long.	Kenneth Graunke	2017-04-11	2	-11/+11
	The drm_i915_gem_create ioctl structure uses a __u64 for the size, so we should probably use uint64_t to match. In theory, we could probably have a BO larger than 4GB, using a 48-bit PPGTT - it just wouldn't be mappable in the CPU's 32-bit address space. Reviewed-by: Chris Wilson <[email protected]>
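To illustrate why a fixed 64-bit type matters here, a small sketch of size arithmetic that stays correct past 4GB regardless of how wide `unsigned long` is on the build. Both helpers are hypothetical and not part of the bufmgr:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a BO size up to a power-of-two alignment
 * using 64-bit math throughout, so nothing is truncated on 32-bit builds. */
static uint64_t
align_u64(uint64_t size, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (size + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: would this size survive being stored in 32 bits? */
static int
fits_in_32_bits(uint64_t size)
{
   return size <= UINT32_MAX;
}
```

A 5GB size passes through `align_u64` intact, while `fits_in_32_bits` reports that a 32-bit `unsigned long` could not have held it.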
*	i965/drm: Make alignment parameter a uint64_t.	Kenneth Graunke	2017-04-11	2	-4/+4
	Theoretically, with a 48-bit address space, we could have buffers with an alignment of >= 4GB. It's a bit silly, but the exec_object structs (drm_i915_gem_exec_object2) use a __u64 for this, so we may as well use the same type as the kernel API. Reviewed-by: Chris Wilson <[email protected]>
*	i965/drm: Make stride/pitch a uint32_t.	Kenneth Graunke	2017-04-11	4	-31/+18
	struct drm_i915_gem_set_tiling's stride field is a __u32. intel_mipmap_tree::stride is a uint32_t. Using unsigned long just doesn't make sense. Switching also lets us drop many pointless locals that only existed to deal with the type mismatch. Reviewed-by: Chris Wilson <[email protected]>
*	i965/drm: Fix types for pwrite/pread fields.	Kenneth Graunke	2017-04-11	2	-14/+14
	The ioctl structs contain __u64 offset and size fields, so make them uint64_t rather than unsigned long. Reviewed-by: Chris Wilson <[email protected]>
*	i965/drm: Make brw_bo_alloc_tiled take tiling by value, not pointer.	Kenneth Graunke	2017-04-11	4	-62/+41
	For some reason we passed tiling by pointer, through several layers, even though the functions only read the initial value, and never actually change it. We even had a do-while loop that executed until the tiling mode matched - except it always did, so it only ran once. We then had bogus error handling in case it changed the tiling mode to something nonsensical...which it never did. Drop all this nonsense. Reviewed-by: Chris Wilson <[email protected]>
*	mesa/st: remove _mesa_get_fallback_texture() calls	Timothy Arceri	2017-04-12	2	-10/+3
	These calls look like leftovers from fallback texture support first being added to the st in 8f6d9e12be0be and then later being added to core mesa in 00e203fe17cbf21. The piglit test fp-incomplete-tex continues to work with this change. Reviewed-by: Brian Paul <[email protected]>
*	mesa: use pre_hashed version of search for the mesa hash table	Timothy Arceri	2017-04-12	1	-2/+6
	The key is just an unsigned int so there is never any real hashing done. Reviewed-by: Eric Anholt <[email protected]>
*	i965: Set kernel features before computing max GL version.	Kenneth Graunke	2017-04-11	1	-24/+24
	We check these bitfields when computing the Haswell max GL version. We need to set them ahead of time, or they won't exist, and all our checks will fail. That sets the max core profile GL version to 4.2. This introduces the bizarre situation where asking for a GL context with version 4.3+ fails, but asking for a GL core profile context with version <= 4.2 actually promotes you to a 4.5 context. GLX_MESA_query_renderer also reported the bogus 4.2 value. Now it shows 4.5. Cc: "17.0" <[email protected]> Reported-and-tested-by: Rafael Ristovski <[email protected]>
*	i965: Fix wonky indentation left by brw_bo_alloc_tiled rename.	Kenneth Graunke	2017-04-10	2	-18/+17
*	mesa: fix typo and add assert() to _mesa_attach_renderbuffer_without_ref()	Timothy Arceri	2017-04-11	1	-1/+3
	This function should only be used with a "freshly created" renderbuffer so assert RefCount is 1.
*	i965/drm: Add stall warnings when mapping or waiting on BOs.	Kenneth Graunke	2017-04-10	17	-55/+68
	This restores the performance warnings removed in "i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings." but adds them for nearly all BO mapping, and also for wait_rendering. Because we add this to the core bufmgr, we automatically get stall warnings in all callers, unlike before where only a few callsites used the wrappers that gave stall warnings. We also do it a bit differently: we simply measure how long set_domain takes (the part that stalls), and complain if it's more than 0.01 ms. We don't bother calling brw_bo_busy(), and we don't measure the mmap time (which doesn't stall). This should be more accurate. Reviewed-by: Daniel Vetter <[email protected]>
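The measure-and-complain approach described above can be sketched in isolation. This is an assumption-laden illustration: `timed_call`, `elapsed_ms` and `is_stall` are hypothetical names, the callback stands in for the set_domain ioctl, and only the 0.01 ms threshold comes from the commit message.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define STALL_THRESHOLD_MS 0.01   /* threshold quoted in the commit message */

/* Milliseconds between two CLOCK_MONOTONIC samples. */
static double
elapsed_ms(const struct timespec *start, const struct timespec *end)
{
   return (end->tv_sec - start->tv_sec) * 1000.0 +
          (end->tv_nsec - start->tv_nsec) / 1e6;
}

static int
is_stall(double ms)
{
   return ms > STALL_THRESHOLD_MS;
}

/* Hypothetical wrapper: time only the call that may block and warn if it
 * took longer than the threshold. Returns 1 if a warning was printed. */
static int
timed_call(void (*fn)(void *), void *data, const char *what)
{
   struct timespec start, end;

   clock_gettime(CLOCK_MONOTONIC, &start);
   fn(data);
   clock_gettime(CLOCK_MONOTONIC, &end);

   double ms = elapsed_ms(&start, &end);
   if (is_stall(ms)) {
      fprintf(stderr, "%s stalled for %.3f ms\n", what, ms);
      return 1;
   }
   return 0;
}
```

Timing only the blocking call, rather than polling a busy flag first, means the reported duration is the stall itself and nothing else.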
*	i965/drm: Make a set_domain() helper function.	Kenneth Graunke	2017-04-10	1	-37/+20
	Less boilerplate. Reviewed-by: Daniel Vetter <[email protected]>
*	i965/batch: Ensure we use a consistent offset in relocs	Daniel Vetter	2017-04-10	1	-2/+6
	In theory gcc is free to re-load them, and if a concurrent execbuf races and updates bo->offset64 then we have a problem: the execbuffer API requires that the ->presumed_offset and the one we used for the reloc match. It does not require that the value is sensible, which means no locks are needed, just a consistent load. Ken said his next series will nuke this, so just hand-roll the kernel's READ_ONCE idea inline. FIXME: Most callers of brw_emit_reloc recompute the relocation themselves, which means this doesn't really fix the race. But the long term plan is to move to per-context relocation handling, which will fix this all properly. So leave this for now as just a reminder. Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
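A rough sketch of the "single consistent load" idea, using a volatile access in the spirit of the kernel's READ_ONCE. The struct and function names are hypothetical stand-ins, not the driver's types, and a volatile load only stops the compiler from re-reading the field; it does not by itself make a 64-bit load atomic on every 32-bit target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BO and the kernel relocation entry. */
struct toy_bo {
   uint64_t offset64;
};

struct toy_reloc {
   uint64_t presumed_offset;   /* what we tell the kernel we assumed */
   uint64_t written_value;     /* what we wrote into the batch */
};

/* Read the field exactly once; the volatile cast forbids the compiler
 * from re-loading it later in the caller. */
static uint64_t
read_once_u64(const uint64_t *p)
{
   return *(const volatile uint64_t *)p;
}

static struct toy_reloc
toy_emit_reloc(const struct toy_bo *target, uint32_t delta)
{
   /* One load feeds both values, so they agree even if another thread
    * updates target->offset64 in between. */
   const uint64_t offset = read_once_u64(&target->offset64);
   struct toy_reloc reloc = {
      .presumed_offset = offset,
      .written_value = offset + delta,
   };
   return reloc;
}
```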
*	i965/bufmgr: Garbage-collect vma cache/pruning	Daniel Vetter	2017-04-10	2	-129/+5
	This was done because the kernel has 1 global address space, shared with all render clients, for gtt mmap offsets, and that address space was only 32bit on 32bit kernels. This was fixed in commit 440fd5283a87345cdd4237bdf45fb01130ea0056 Author: Thierry Reding <[email protected]> Date: Fri Jan 23 09:05:06 2015 +0100 drm/mm: Support 4 GiB and larger ranges which shipped in 4.0. Of course you still want to limit the bo cache to a reasonable size on 32bit apps to avoid ENOMEM, but that's better solved by tuning the cache a bit. On 64bit, this was never an issue. On top, mesa never set this, so it's all dead code. Collect and trash it. Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/bufmgr: Remove some reuse functions	Daniel Vetter	2017-04-10	2	-33/+0
	is_reusable was needed by uxa because it couldn't keep track of its scanout buffers and used this as a proxy. Disabling reuse is a silly idea; we set this once at start. Remove both. Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/bufmgr: remove start_gtt_access	Daniel Vetter	2017-04-10	2	-29/+14
	IIRC this was used by uxa for persistent mmaps of the frontbuffer. For mesa all the set_domain stuff needed before a synchronized mmap is handled within the bufmgr, so there is no reason ever to call this. Inline the implementation into its only internal user. Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/bufmgr: Delete set_tiling	Daniel Vetter	2017-04-10	2	-25/+0
	Entirely unused, and really shouldn't be used. The alloc functions already take care of this. And even in a future where we're not going to h/v-align tiled buffers in the bufmgr, but only in isl, I think we still want to adjust the tiling mode in the bufmgr, since that ties in closely to mmaps and stuff like that. get_tiling is still needed for the import paths (until we have modifiers everywhere). Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/bufmgr: Delete alloc_for_render	Daniel Vetter	2017-04-10	2	-19/+0
	Entirely unused; mesa instead used the BO_ALLOC_FOR_RENDER flag. Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/drm: Use list_for_each_entry_safe in a couple of cases.	Kenneth Graunke	2017-04-10	1	-11/+3
	Suggested by Chris Wilson. A tiny bit simpler. Reviewed-by: Daniel Vetter <[email protected]>
*	i965/drm: Rename intel_bufmgr_gem.c to brw_bufmgr.c.	Kenneth Graunke	2017-04-10	2	-1/+1
	Matches the class name and the header file name. Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Reindent intel_bufmgr_gem.c and brw_bufmgr.h.	Kenneth Graunke	2017-04-10	2	-1215/+1161
	indent -i3 -nut -br -brs -npcs -ce --no-tabs -Tuint32_t -Tuint64_t plus some manual fixes because those aren't quite the right settings. Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Rename drm_bacon_bo to brw_bo.	Kenneth Graunke	2017-04-10	48	-477/+475
	The bacon is all gone. This renames both the class and the related functions. We're about to run indent on the bufmgr code, so no need to worry about fixing bad indentation. Acked-by: Jason Ekstrand <[email protected]>
*	i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings.	Kenneth Graunke	2017-04-10	7	-57/+10
	The stupid reason for eliminating these functions is that I'm about to rename drm_bacon_bo_map() to brw_bo_map(), which makes the real function have the short name, rather than the wrapper. I'm also planning on reworking our mapping code soon, so we use WC mappings and proper unsynchronized mappings on non-LLC platforms. It will be easier to do that without thinking about the stall warnings and wrappers. My eventual hope is to put the performance warnings in the BO map function itself, so all callers gain the warning. Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Rename drm_bacon_reg_read() to brw_reg_read().	Kenneth Graunke	2017-04-10	4	-12/+8
	Less bacon. Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Rename drm_bacon_bufmgr to struct brw_bufmgr.	Kenneth Graunke	2017-04-10	8	-72/+69
	Also stop using typedefs, per Mesa coding style. Acked-by: Jason Ekstrand <[email protected]>
*	i965: Just use a uint32_t context handle rather than a malloc'd wrapper.	Kenneth Graunke	2017-04-10	7	-70/+21
	drm_bacon_context is a malloc'd struct containing a uint32_t context ID and a pointer back to the bufmgr. The bufmgr pointer is pretty useless, as everybody already has brw->bufmgr. At that point...we may as well just use the ctx_id handle directly. A number of places already had to call drm_bacon_gem_context_get_id() to extract the ID anyway. Now they just have it. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Fold drm_bacon_gem_reset_stats into the callers.	Kenneth Graunke	2017-04-10	3	-56/+17
	We're going to get rid of drm_bacon_context shortly, so we'd have to change the interface slightly. It's basically just an ioctl wrapper that isn't terribly bufmgr-related, so we may as well just combine it with the code in brw_reset.c that actually uses it. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Rename drm_bacon_gem_bo_bucket to bo_cache_bucket.	Kenneth Graunke	2017-04-10	1	-9/+9
	No need for a prefix as this struct is local to the .c file. Less bacon. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Drop drm_bacon_* from static functions.	Kenneth Graunke	2017-04-10	1	-81/+69
	Mesa style is to not use lengthy prefixes for static functions. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Drop drm_bacon_gem_bo_madvise_internal().	Kenneth Graunke	2017-04-10	1	-16/+6
	The only difference is that it takes an explicit bufmgr rather than using bo->bufmgr, but there is only one bufmgr per screen so they should be identical anyway. Chris says this was added primarily to avoid bo/bo_gem casting, which was inconvenient. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Merge drm_bacon_bo_gem into drm_bacon_bo.	Kenneth Graunke	2017-04-10	2	-321/+272
	The separate class gives us a bit of extra encapsulation, but I don't know that it's really worth the boilerplate. I think we can reasonably expect the rest of the driver to be responsible. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Merge bo->handle and bo_gem->gem_handle.	Kenneth Graunke	2017-04-10	4	-66/+55
	These fields are the same value. In the bad old days, bo->handle could have been an identifier from the pre-GEM fake bufmgr, but that's long gone. Keep the "gem_handle" name for clarity. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965/drm: Rewrite relocation handling.	Kenneth Graunke	2017-04-10	9	-810/+225
	The execbuf2 kernel API requires us to construct two kinds of lists. First is a "validation list" (struct drm_i915_gem_exec_object2[]) containing each BO referenced by the batch. (The batch buffer itself must be the last entry in this list.) Each validation list entry contains a pointer to the second kind of list: a relocation list. The relocation list contains information about pointers to BOs that the kernel may need to patch up if it relocates objects within the VMA. This is a very general mechanism, allowing every BO to contain pointers to other BOs. libdrm_intel models this by giving each drm_intel_bo a list of relocations to other BOs. Together, these form "reloc trees". Processing relocations involves a depth-first-search of the relocation trees, starting from the batch buffer. Care has to be taken not to double-visit buffers. Creating the validation list has to be deferred until the last minute, after all relocations are emitted, so we have the full tree present. Calculating the amount of aperture space required to pin those BOs also involves tree walking, which is expensive, so libdrm has hacks to try and perform less expensive estimates. For some reason, it also stored the validation list in the global (per-screen) bufmgr structure, rather than as a local variable in the execbuffer function, requiring locking for no good reason. It also assumed that the batch would probably contain a relocation every 2 DWords - which is absurdly high - and simply aborted if there were more relocations than the max. This meant the first relocation from a BO would allocate 180kB of data structures! This is way too complicated for our needs.
i965 only emits relocations from the batchbuffer - all GPU commands and state such as SURFACE_STATE live in the batch BO. No other buffer uses relocations. This means we can have a single relocation list for the batchbuffer. We can add a BO to the validation list (set) the first time we emit a relocation to it. We can easily keep a running tally of the aperture space required for that list by adding the BO size when we add it to the validation list. This patch overhauls the relocation system to do exactly that. There are many nice benefits: - We have a flat relocation list instead of trees. - We can produce the validation list up front. - We can allocate smaller arrays and dynamically grow them. - Aperture space checks are now (a + b <= c) instead of a tree walk. - brw_batch_references() is a trivial validation list walk. It should be straightforward to make it O(1) in the future. - We don't need to bloat each drm_bacon_bo with 32B of reloc data. - We don't need to lock in execbuffer, as the data structures are context-local, and not per-screen. - Significantly less code and a better match for what we're doing. - The simpler system should make it easier to take advantage of I915_EXEC_NO_RELOC in a future patch. Improves performance in Synmark 7.0's OglBatch7: - Skylake GT4e: 12.1499% +/- 2.29531% (n=130) - Apollolake: 3.89245% +/- 0.598945% (n=35) Improves performance in GFXBench4's gl_driver2 test: - Skylake GT4e: 3.18616% +/- 0.867791% (n=229) - Apollolake: 4.1776% +/- 0.240847% (n=120) v2: Feedback from Chris Wilson: - Omit explicit zero initializers for garbage execbuf fields. - Use .rsvd1 = ctx_id rather than i915_execbuffer2_set_context_id - Drop unnecessary fencing assertions. - Only use _WR variant of execbuf ioctl when necessary. - Shrink the arrays to be smaller by default. Reviewed-by: Jason Ekstrand <[email protected]>
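The flat scheme described in this message (a validation set that grows on demand, a running aperture tally, and an `a + b <= c` space check) can be sketched with toy types. Everything below is hypothetical illustration: the `toy_*` names, the initial capacity of 4, and the linear membership walk are assumptions, not the driver's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-ins; real BOs and batches carry far more state. */
struct toy_bo {
   uint32_t handle;
   uint64_t size;
};

struct toy_batch {
   struct toy_bo **validation;   /* flat set of BOs the batch references */
   int count;
   int capacity;
   uint64_t aperture;            /* running sum of sizes in the set */
};

/* Linear walk of the validation list. */
static bool
toy_batch_references(const struct toy_batch *batch, const struct toy_bo *bo)
{
   for (int i = 0; i < batch->count; i++) {
      if (batch->validation[i] == bo)
         return true;
   }
   return false;
}

/* Add a BO the first time a relocation points at it. */
static bool
toy_batch_add_bo(struct toy_batch *batch, struct toy_bo *bo)
{
   if (toy_batch_references(batch, bo))
      return true;

   if (batch->count == batch->capacity) {
      /* Start small and grow dynamically instead of preallocating. */
      int new_capacity = batch->capacity ? batch->capacity * 2 : 4;
      struct toy_bo **grown =
         realloc(batch->validation, new_capacity * sizeof(*grown));
      if (!grown)
         return false;
      batch->validation = grown;
      batch->capacity = new_capacity;
   }

   batch->validation[batch->count++] = bo;
   batch->aperture += bo->size;   /* keep the tally up to date */
   return true;
}

/* Aperture check as plain arithmetic: a + b <= c. */
static bool
toy_batch_has_space(const struct toy_batch *batch, uint64_t extra,
                    uint64_t threshold)
{
   return batch->aperture + extra <= threshold;
}
```

Because the tally is updated at insertion time, the space check needs no traversal at all, which is the contrast with the tree walk the message describes.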
*	i965/drm: Make register write check handle execbuffer directly.	Kenneth Graunke	2017-04-10	1	-7/+34
	I'm about to rewrite how relocation handling works, at which point drm_bacon_bo_emit_reloc() and drm_bacon_bo_mrb_exec() won't exist anymore. This code is already largely not using the batchbuffer infrastructure, so just go all the way and handle relocations, the validation list, and execbuffer ourselves. That way, we don't have to think about the weird case where we only have a screen, and no context, when redesigning the relocation handling. v2: Write reloc.presumed_offset + reloc.delta into the batch, rather than duplicating the comment, so it's obvious that they match (suggested by Chris). Also add a comment about why we don't do any error checking. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965: Make a screen::aperture_threshold field.	Kenneth Graunke	2017-04-10	2	-3/+6
	This is the threshold after which drm_intel_bufmgr_check_aperture_space returns -ENOSPC, signalling that it thinks an execbuf is likely to fail and we need to roll back and flush the batch. We'll need this when we rewrite aperture space checking, shortly. In the meantime, we can also use it in GLX_MESA_query_renderer. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965: Make/use a brw_batch_references() wrapper.	Kenneth Graunke	2017-04-10	11	-14/+21
	We'll want to change the implementation of this shortly. Reviewed-by: Chris Wilson <[email protected]> Acked-by: Jason Ekstrand <[email protected]>