mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list	Ilia Mirkin	2014-03-19	1	-0/+1
\| \| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Cc: "10.0 10.1" <[email protected]>
*	i965: Drop some more dead code from the old CACHED_BATCH feature.	Eric Anholt	2014-03-18	4	-38/+0
\| \| \| \| \|	Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Drop special case for edgeflag thanks to Marek's change to core.	Eric Anholt	2014-03-18	1	-9/+0
\| \| \| \| \| \| \|	As of 780ce576bb1781f027797039693b98253ee4813e, we end up with R8_SSCALED anyway. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Enable EWA anisotropic filtering algorithm	Ian Romanick	2014-03-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	Volume 4, part 1 of the Ivybridge PRM says, "Generally, the EWA approximation algorithm results in higher image quality than the legacy algorithm." Using a classic anisotropic filtering "tunnel" demo, it appears that there is no anisotropic filtering on IVB without this bit set. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Actually initialize simd16_unsupported and no16_msg.	Kenneth Graunke	2014-03-18	1	-0/+2
\| \| \| \| \| \| \| \|	I meant to include this fixes in v3 of commit de7ad2c88f4ec243c95eaed22c41d0e537912e01, but accidentally pushed a previous version. Signed-off-by: Kenneth Graunke <[email protected]>
*	i965/upload: Refactor open-coded ALIGN-like computations.	Kenneth Graunke	2014-03-18	1	-3/+9
\| \| \| \| \| \| \| \|	Sadly, we can't use actual ALIGN(), since that only supports power-of-two values for the alignment parameter. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Fix indentation in brw_upload_indices().	Kenneth Graunke	2014-03-18	1	-19/+19
\| \| \| \| \|	Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Consolidate code for setting brw->ib.start_vertex_offset.	Kenneth Graunke	2014-03-18	1	-9/+6
\| \| \| \| \| \| \|	This was set identically in three places. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Allocate register sets at screen creation, not context creation.	Kenneth Graunke	2014-03-18	6	-88/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Register sets depend on the particular hardware generation, but don't depend on anything in the actual OpenGL context. Computing them is fairly expensive, and they take up a large amount of memory. Putting them in the screen allows us to compute/allocate them once for all contexts, saving both time and space. Improves the performance of a context creation/destruction microbenchmark by about 3x on my Haswell i7-4750HQ. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Allocate the screen using ralloc rather than calloc.	Kenneth Graunke	2014-03-18	1	-2/+3
\| \| \| \| \| \| \|	This will allow us to use the screen as a memory context. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Accurately bail on SIMD16 compiles.	Kenneth Graunke	2014-03-18	3	-34/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ideally, we'd like to never even attempt the SIMD16 compile if we could know ahead of time that it won't succeed---it's purely a waste of time. This is especially important for state-based recompiles, which happen at draw time. The fragment shader compiler has a number of checks like: if (dispatch_width == 16) fail("...some reason..."); This patch introduces a new no16() function which replaces the above pattern. In the SIMD8 compile, it sets a "SIMD16 will never work" flag. Then, brw_wm_fs_emit can check that flag, skip the SIMD16 compile, and issue a helpful performance warning if INTEL_DEBUG=perf is set. (In SIMD16 mode, no16() calls fail(), for safety's sake.) The great part is that this is not a heuristic---if the flag is set, we know with 100% certainty that the SIMD16 compile would fail. (It might fail anyway if we run out of registers, but it's always worth trying.) v2: Fix missing va_end in early-return case (caught by Ilia Mirkin). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> [v1] Reviewed-by: Ian Romanick <[email protected]> [v1] Reviewed-by: Eric Anholt <[email protected]>
*	i965/fs: Support pull parameters in SIMD16 mode.	Kenneth Graunke	2014-03-18	2	-11/+13
\| \| \| \| \| \| \| \| \| \| \|	This is just a matter of reusing the pull/push constant information set up by the SIMD8 compile. This gains us 78 SIMD16 programs in shader-db. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965/fs: Use a single instance of the pull_constant_loc[] array.	Kenneth Graunke	2014-03-18	2	-28/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we don't renumber uniform registers, assign_constant_locations and move_uniform_array_access_to_pull_constants use the same names. So, they can share a single copy of the pull_constant_loc[] array. This simplifies the code considerably. assign_constant_locations() doesn't need to walk through pull_params[] to rediscover reladdr demotions; it just has that information in pull_constant_loc[]. We also only need to rewrite the instruction stream once, instead of twice. Even better, we now have a single array describing the layout of all pull parameters, which we can pass to the SIMD16 program. This actually hurts a few shaders in Serious Sam 3, and one in KWin: total instructions in shared programs: 1841957 -> 1842035 (0.00%) instructions in affected programs: 1165 -> 1243 (6.70%) Comparing dump_instructions() before and after the pull constant transformations with and without this patch, it appears that there is a uniform array with variable indexing (reladdr) and constant indexing (of array element 0). Previously, we uploaded array element 0 as both a pull constant (for reladdr) /and/ a push constant. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965/fs: Don't renumber UNIFORM registers.	Kenneth Graunke	2014-03-18	3	-118/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, remove_dead_constants() would renumber the UNIFORM registers to be sequential starting from zero, and the resulting register number would be used directly as an index into the params[] array. This renumbering made it difficult to collect and save information about pull constant locations, since setup_pull_constants() and move_uniform_array_access_to_pull_constants() used different names. This patch generalizes setup_pull_constants() to decide whether each uniform register should be a pull constant, push constant, or neither (because it's unused). Then, it stores mappings from UNIFORM register numbers to params[] or pull_params[] indices in the push_constant_loc and pull_constant_loc arrays. (We already did this for pull constants.) Then, assign_curb_setup() just needs to consult the push_constant_loc array to get the real index into the params[] array. This effectively folds all the remove_dead_constants() functionality into assign_constant_locations(), while being less irritable to work with. v2: Add assert(remapped <= i), requested by Topi. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965/fs: Split pull parameter decision making from mechanical demoting.	Kenneth Graunke	2014-03-18	2	-33/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	move_uniform_array_access_to_pull_constants() and setup_pull_constants() both have two parts: 1. Decide which UNIFORM registers to demote to pull constants, and assign locations. 2. Mechanically rewrite the instruction stream to pull the uniform value into a temporary VGRF and use that, eliminating the UNIFORM file access. In order to support pull constants in SIMD16 mode, we will need to make decisions exactly once, but rewrite both instruction streams. Separating these two tasks will make this easier. This patch introduces a new helper, demote_pull_constants(), which takes care of rewriting the instruction stream, in both cases. For the moment, a single invocation of demote_pull_constants can't safely handle both reladdr and non-reladdr tasks, since the two callers still use different names for uniforms due to remove_dead_constants() remapping of things. So, we get an ugly boolean parameter saying which to do. This will go away. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965/fs: Record pull constant locations for all array elements.	Kenneth Graunke	2014-03-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	When demoting a variably indexed uniform array to pull constants, we only recorded the location for the base of the array (element 0). Recording locations for all array elements is a trivial amount of code and will make subsequent refactoring easier. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965/fs: Save push constant location information.	Kenneth Graunke	2014-03-18	3	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, both move_uniform_array_access_to_pull_constants() and setup_pull_constants() maintained stack-local arrays with this information. Storing this information will allow it to be used from multiple functions, allowing us to split and move code around. We'll also eventually want to pass pull constant location information to the SIMD16 compile. Saving this information will help us do that. Unfortunately, the two functions cannot share the contents of the array just yet. remove_dead_constants() renumbers all the UNIFORM registers to be contiguous starting at zero, so the two functions talk about uniforms using different names. We can't even remap them, since move_uniform_array_access_to_pull_constants() deletes UNIFORM registers that are only accessed with reladdr, so remove_dead_constants can't even see them. This situation will improve in the next few patches. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965/fs: Delete dead code to fail compiles with SIMD16 pull parameters.	Kenneth Graunke	2014-03-18	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \|	The SIMD8 compile will determine whether pull parameters are necessary. If so, it will set prog_data->nr_pull_params to a value greater than 0. brw_wm_fs_emit checks if nr_pull_params > 0 and skips the SIMD16 compile altogether. So, this code should never occur. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965/fs: Invalidate live intervals when demoting uniforms to pull params.	Kenneth Graunke	2014-03-14	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normally, nothing uses live intervals at this point, so this isn't necessary. However, dump_instructions() calculates them and uses them to show register pressure. So, calling dump_instructions() in this area of the code would segfault due to the arrays being the wrong size. This is not a candidate for stable branches because it only serves to fix internal debugging code that you manually have to invoke by altering the source code or using gdb. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/fs: Print "+reladdr" on variably-indexed uniform arrays.	Kenneth Graunke	2014-03-14	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Previously, dump_instruction() would print output such as: { 2} 3: mov vgrf1:F, u0:F { 3} 4: mov vgrf7:F, u0:F { 4} 5: mov vgrf8:F, u0:F which looked like either a scalar access or perhaps a constant-indexed access of element 0, when it was really a variable index. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Fix register types in dump_instructions(), again.	Kenneth Graunke	2014-03-14	4	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \|	In commit e57d77280efcbfd6579a88f071426653287ef833, I fixed this for destinations in the Vec4 backend, and sources in the scalar backend. But not both types in both backends. To prevent this mess from continuing, make the reg_encoding table static, so only the disassembler can use it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/fs: Fix register comparisons in saturate propagation.	Kenneth Graunke	2014-03-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	opt_saturate_propagation_local compares scan_inst->dst.reg/reg_offset with inst->src[0].reg/reg_offset, and ensures that scan_inst->dst.file is GRF. But nothing ensured that inst->src[0].file was GRF. In the following program, this resulted in u1:F matching vgrf1:UW, and a saturate being incorrectly propagated from instruction 8 to instruction 1. { 1} 0: add vgrf0:UW, hw_reg1+8:UW, hw_reg0:V { 1} 1: add vgrf1:UW, hw_reg1+10:UW, hw_reg0:V { 1} 2: linterp vgrf6:F, hw_reg2:F, hw_reg3:F, hw_reg0:F { 2} 3: linterp vgrf27:F, hw_reg2:F, hw_reg3:F, hw_reg0+16:F { 4} 4: mov vgrf10+0.0:F, vgrf6:F { 3} 5: mov vgrf10+1.0:F, vgrf27:F { 6} 6: tex vgrf8+0.0:F, vgrf10+0.0:F { 5} 7: mov vgrf32:F, u1:F { 5} 8: mov.sat vgrf12:F, u1:F From shader-db: total instructions in shared programs: 1841932 -> 1841957 (0.00%) instructions in affected programs: 5823 -> 5848 (0.43%) I inspected two of the 25 hurt shaders, and concluded that they were both hitting this bug, and not legitimately optimized. This fixes bugs in Left 4 Dead 2 and Team Fortress 2, possibly among others. The optimization pass didn't exist in 10.0, so this is only a candidate for 10.1. Cc: "10.1" <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
*	i965: Add support for GL_ARB_buffer_storage.	Eric Anholt	2014-03-14	2	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \|	It turns out we can allow COHERENT storage/mappings all the time, regardless of LLC vs non-LLC. It just means never using temporary mappings to avoid GPU stalls, and on non-LLC we have to use the GTT intead of CPU mappings. If we were to use CPU maps on non-LLC (which might be useful if apps end up using buffer_storage on PBO reads, to avoid WC read slowness), those would be PERSISTENT but not COHERENT, but doing that would require us driving the clflushes from userspace somehow. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Always use CPU mappings for BOs on LLC platforms.	Eric Anholt	2014-03-14	1	-1/+1
\| \| \| \| \| \| \| \|	It looks like there's no big difference for write-only workloads, but using a CPU map means that if they happen to read without having set the MAP_READ_BIT, they get 100x the performance for those reads. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Drop the system-memory temporary allocations for flush explicit.	Eric Anholt	2014-03-14	2	-52/+58
\| \| \| \| \| \| \|	While in expected usage patterns nobody will ever hit this path, doubling our bandwidth used seems like a waste, and it cost us extra code too. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Switch mapping modes for non-explicit-flush blit-temporary maps.	Eric Anholt	2014-03-14	1	-3/+3
\| \| \| \| \| \| \| \| \|	On LLC, it should always be better to use a cached mapping than the GTT. On non-LLC, it seems pretty silly to try to optimize read performance for the INVALIDATE_RANGE_BIT case. This will make the buffer_storage logic easier. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Fix build warning of unused variable	Anuj Phogat	2014-03-14	1	-2/+0
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Tested-by: Kenneth Graunke <[email protected]>
*	Add the EGL_MESA_configless_context extension	Neil Roberts	2014-03-12	2	-12/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extension provides a way for an application to render to multiple surfaces with different buffer formats without having to use multiple contexts. An EGLContext can be created without an EGLConfig by passing EGL_NO_CONFIG_MESA. In that case there are no restrictions on the surfaces that can be used with the context apart from that they must be using the same EGLDisplay. _mesa_initialze_context can now take a NULL gl_config which will mark the context as ‘configless’. It will memset the visual to zero in that case. Previously the i965 and i915 drivers were explicitly creating a zeroed visual whenever 0 is passed for the EGLConfig. Mesa needs to be aware that the context is configless because it affects the initial value to use for glDrawBuffer. The first time the context is bound it will set the initial value for configless contexts depending on whether the framebuffer used is double-buffered. Reviewed-by: Kristian Høgsberg <[email protected]>
*	meta: Always restore the framebuffers and current renderbuffer.	Eric Anholt	2014-03-11	3	-21/+17
\| \| \| \| \| \| \| \|	The few paths that were playing with framebuffers and renderbuffer were saving and restoring them. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Drop intel_check_front_buffer_rendering().	Eric Anholt	2014-03-11	6	-27/+0
\| \| \| \| \| \| \| \| \|	This was being applied in a subset of the places that intel_prepare_render() was called, to set the same flag that intel_prepare_render() was setting. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Drop broken front_buffer_reading/drawing optimization.	Eric Anholt	2014-03-11	5	-42/+44
\| \| \| \| \| \| \| \| \| \| \| \|	The flag wasn't getting updated correctly when the ctx->DrawBuffer or ctx->ReadBuffer changed. It usually ended up working out because most apps only have one window system framebuffer, or if they have more than one and they have any front read/drawing, they will have called glReadBuffer()/glDrawBuffer() on it when they get started on the new buffer. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	intel: When checking for updating front buffer reading, use the right fb.	Eric Anholt	2014-03-11	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	It's the ctx->ReadBuffer that gets read from, not the ctx->DrawBuffer. So, if you happened to have a ctx->ReadBuffer that was the winsys buffer, and it had previously been intel_prepare_render()ed but not invalidated since then, and you called glReadBuffer() to switch to front buffer instead of back buffer reading on the winsys fbo while your drawbuffer was a user FBO, you'd never get the front buffer's miptree fetched, and segfault. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	automake: allow only shared builds	Emil Velikov	2014-03-11	2	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Static and shared builds were possible in the good old days of static makefiles. Currently the build system does not distinguish nor does anything special when one requests a static build. Print a warning message for the packager that static builds are not supported and continue building shared libs. Currently only Debian and derivatives use static build, and they use it for building a Xlib powered libGL. This patch will only change the warning message they are seeing but the binaries produced will be identical. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
*	automake: create compat symlinks only for linux systems	Emil Velikov	2014-03-11	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The primary users of these are linux developers, although it can be extended for BSD and others if needed. Fixes make install for Cygwin and OpenBSD at least. v2: - Wrap vdpau targets as well. v3: - Fold HAVE_COMPAT_SYMLINKS conditional within installlinks.mk Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63269 Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]> (v1) Reviewed-by: Christian König <[email protected]>
*	configure: use LIB_EXT rather than hardcoded .so	Emil Velikov	2014-03-11	1	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some platforms different library extension - dll, dylib, a. Honor that when we are creating the required links. Rename LIB_EXTENSION to LIB_EXT while we're here. With libglapi linking aside, building classic drivers on non-linux platforms should be possible now. v2: Resolve conflicts. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
*	automake: do not use symbols names for static glapi.la	Emil Velikov	2014-03-11	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	In the cases where one links against the static glapi.la there is no need to create temporary variables only to explicitly link agaist it. Instead use SHARED_GLAPI_LIB to explicitly indicate when one is building and linking with the shared glapi provider. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
*	automake: use install-lib-links.mk across all classic mesa	Emil Velikov	2014-03-11	2	-13/+2
\| \| \| \| \| \| \|	Use the handy script and minimise the boilerplate in the makefiles. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
*	automake: silence folder creation	Emil Velikov	2014-03-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	There is little gain in printing whenever a folder is created. v2: - Use $(AM_V_at) over @ to have control in verbose builds. Suggested by Erik Faye-Lund. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
*	automake: use MKDIR_P when possible	Emil Velikov	2014-03-11	1	-1/+1
\| \| \| \| \| \| \|	Use the automake predefined macro over hardcoding mkdir -p everywhere. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
*	meta: use non-ARB shader/program create/delete functions	Brian Paul	2014-03-10	2	-30/+30
\| \| \| \| \| \| \|	The non-ARB versions take GLuint ids, not GLhandleARB. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	mesa: rename MESA_FORMAT_X8Z24_UNORM -> MESA_FORMAT_X8_UINT_Z24_UNORM	Brian Paul	2014-03-10	2	-2/+2
\| \| \| \| \| \| \|	To follow the example of MESA_FORMAT_Z24_UNORM_X8_UINT. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965/vec4: Don't fix-up scalar uniforms for 3 src instructions.	Matt Turner	2014-03-10	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	Removes unnecessary MOV instructions in L4D2, TF2, Dota2, and many other Steam games. total instructions in shared programs: 1668126 -> 1657509 (-0.64%) instructions in affected programs: 242235 -> 231618 (-4.38%) Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Disassemble 3 src instructions' rep_ctrl field.	Matt Turner	2014-03-10	2	-6/+24
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Disassemble 3-src operands widths' correctly.	Matt Turner	2014-03-10	4	-38/+38
\| \| \| \| \| \| \|	<4,1,1> isn't a real thing. We meant <4,4,1>, i.e., each component of the whole register. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Move binding table update packets to binding table setup time.	Eric Anholt	2014-03-10	7	-39/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This keeps us from needing to reemit all the other stage state just because a surface changed. Improves unoptimized glamor x11perf -f8text by 1.10201% +/- 0.489869% (n=296). [v1] v2: - Drop binding table packets from Gen8 unit state as well. - Pass _3DSTATE_BINDING_TABLE_POINTERS_XS to brw_upload_binding_table, cutting even more code. v3: Don't forget to drop them from 3DSTATE_GS (botched refactor in v2). Signed-off-by: Eric Anholt <[email protected]> [v1] Reviewed-by: Kenneth Graunke <[email protected]> [v1] Signed-off-by: Kenneth Graunke <[email protected]> [v2, v3] Reviewed-by: Eric Anholt <[email protected]> [v3]
*	i965: Reorganize the code in brw_upload_binding_tables.	Kenneth Graunke	2014-03-10	1	-17/+18
\| \| \| \| \| \| \| \|	This makes both the empty and non-empty binding table paths exit through the bottom of the function, which gives us a place to share code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	meta: Support GenerateMipmaps on 1DArray textures.	Kenneth Graunke	2014-03-07	1	-9/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I don't know how many people care about this case, but it's easy enough to do, so we may as well. The tricky part is that for some reason Mesa stores the number of array slices in Height, not Depth. I thought the easiest way to handle that here was to make Height = 1 (the actual height), and srcDepth = srcImage->Height. This requires some munging when calling _mesa_prepare_mipmap_level, so I created a wrapper that sorts it out for us. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	meta: Use srcWidth/Height/Depth rather than srcImage->Width and such.	Kenneth Graunke	2014-03-07	1	-3/+3
\| \| \| \| \| \| \| \| \|	This is equivalent for now, and will differ once we add 1DArray support. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	meta: Support GenerateMipmaps on 2DArray textures.	Kenneth Graunke	2014-03-07	1	-35/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is largely a matter of looping over the number of slices/layers, and not minifying depth (presumably that code exists for the unfinished 3D texture support). Normally, I would have made the loop over array slices the outermost loop. I suspect that would make it trickier to support 3D textures someday, though, so I didn't. The advantage is that we would only have one BufferData call per slice, rather than one per miplevel and slice. However, a GenerateMipmaps microbenchmark indicates that either way is basically just as fast. So I'm not sure it's worth bothering. Improves performance in a GenerateMipmaps microbenchmark by nearly 5x. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	meta: Add a 'layer' argument to bind_fbo_image().	Kenneth Graunke	2014-03-07	1	-9/+11
\| \| \| \| \| \| \| \| \| \|	For array textures and 3D textures, this represents the layer to use. Just pass 0 for now. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>