mesa.git

	Commit message	Author	Age	Files	Lines
*	i965: Factor out virtual GRF allocation to a separate object.	Francisco Jerez	2015-02-10	18	-201/+235
	Right now virtual GRF book-keeping and allocation are performed separately in each visitor class (among a hundred other things). This duplicates logic in each visitor and prevents layering, because any code that manipulates i965 IR and needs to allocate virtual registers must depend on the specific visitor that happens to be used to translate from GLSL IR.
	v2: Use realloc()/free() to allocate VGRF book-keeping arrays (Connor).
	Reviewed-by: Matt Turner <[email protected]>
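As a rough illustration of the idea (not Mesa's actual class), a standalone allocator could own the size and offset book-keeping so that any pass can request a virtual GRF without going through a visitor. The name `SimpleVgrfAllocator` and its fields are hypothetical:

```python
class SimpleVgrfAllocator:
    """Hypothetical stand-alone virtual GRF allocator (illustration only)."""

    def __init__(self):
        self.sizes = []       # size of each virtual GRF, in registers
        self.offsets = []     # running offset of each virtual GRF
        self.total_size = 0   # sum of all sizes allocated so far

    def allocate(self, size):
        """Reserve `size` registers and return the new VGRF number."""
        if size <= 0:
            raise ValueError("size must be positive")
        self.sizes.append(size)
        self.offsets.append(self.total_size)
        self.total_size += size
        return len(self.sizes) - 1
```

Because the allocator holds no reference to a visitor, code that only manipulates IR can depend on it alone.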
*	i965: Fix integer border color on Haswell.	Kenneth Graunke	2015-02-09	3	-0/+66
	+82 Piglits - 100% of border color tests now pass on Haswell. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Cc: [email protected]
*	i965: Use a gl_color_union for sampler border color.	Kenneth Graunke	2015-02-09	1	-53/+52
	This should have no effect, but will make it easier to implement other bug fixes.
	v2: Eliminate "unsigned one" local; just use the value where necessary.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Ian Romanick <[email protected]>
	Cc: [email protected]
*	i965: Override swizzles for integer luminance formats.	Kenneth Graunke	2015-02-09	1	-0/+8
	The hardware's integer luminance formats are completely unusable; currently we fall back to RGBA. This means we need to override the texture swizzle to obtain the XXX1 values expected for luminance formats.
	Fixes spec/EXT_texture_integer/texwrap formats bordercolor [swizzled] on Broadwell - 100% of border color tests now pass on Broadwell.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Ian Romanick <[email protected]>
	Reviewed-by: Chris Forbes <[email protected]>
	Cc: [email protected]
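To make the XXX1 swizzle concrete, here is a minimal sketch of applying a four-component swizzle to an RGBA texel. The helper `apply_swizzle` and the tuple encoding of the swizzle are assumptions made for illustration, not the driver's representation:

```python
def apply_swizzle(texel, swizzle):
    """Apply a four-component swizzle to an (r, g, b, a) texel.

    Each swizzle entry is either a source channel index (0-3) or one of
    the string constants "0" / "1".
    """
    constants = {"0": 0, "1": 1}
    return tuple(constants[s] if s in constants else texel[s] for s in swizzle)

# XXX1: replicate the first channel into RGB and force alpha to one.
LUMINANCE_SWIZZLE = (0, 0, 0, "1")
```

With the RGBA fallback, only the first channel carries the luminance value, so the swizzle replicates it and substitutes a constant one for alpha.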
*	i965: Add more stringent blitter assertions	Ben Widawsky	2015-02-07	1	-0/+3
	Blits to or from a y-tiled surface must always be a multiple of the tile size. From page 16 of the HSW PRM (https://01.org/linuxgraphics/sites/default/files/documentation/intel-gfx-prm-osrc-hsw-memory-views.pdf#16):
	    "The pitch of a tiled enclosing region must be an integral number of tile widths"
	Signed-off-by: Ben Widawsky <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
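The quoted rule amounts to a divisibility check on the pitch. A minimal sketch follows; the tile widths in bytes (512 for X-tiling, 128 for Y-tiling) and the helper name are assumptions for illustration and are not taken from the patch:

```python
# Assumed tile widths in bytes; check the PRM for the generation in question.
TILE_WIDTH_BYTES = {"X": 512, "Y": 128}

def pitch_is_tile_aligned(pitch, tiling):
    """Return True if `pitch` (in bytes) is a whole number of tile widths."""
    return pitch % TILE_WIDTH_BYTES[tiling] == 0
```

An assertion in the blit path would then reject, say, a pitch of 1000 bytes on a Y-tiled surface before the command is emitted.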
*	i965: Consolidate some of the intel_blit logic	Ben Widawsky	2015-02-07	1	-20/+8
	An upcoming patch is going to introduce some code here, and having this code organized as the patch does makes it a bit easier to read later. There should be no functional change here. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/vec4: Correct MUL destination hazard	Ben Widawsky	2015-02-06	1	-4/+4
	As it turns out, we were over-thinking the cause of the hang on Cherryview. It's simply errata for Cherryview.
	    commit 88fea85f09e2252035bec66ab26c375b45b000f5
	    Author: Ben Widawsky <[email protected]>
	    Date: Fri Nov 21 10:47:41 2014 -0800
	    i965/vec4/gen8: Handle the MUL dest hazard exception
	This is an explanation of why we never saw the hang on BDW.
	NOTE: The problem the original patch was trying to fix does still exist. It will have to be fixed at some point.
	v2: Modify commit message, s/CHV/BDW
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212
	Signed-off-by: Ben Widawsky <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	i965: Fix INTEL_DEBUG=shader_time for SIMD8 VS (and GS).	Kenneth Graunke	2015-02-05	1	-9/+25
	We were incorrectly attributing VS time to FS8 on Gen8+, which now use fs_visitor for vertex shaders. We don't hit this for geometry shaders yet, but we may as well add support now - the fix is obvious, and we'll just forget later. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965/fs: Use inst->eot rather than opcodes in register allocation.	Kenneth Graunke	2015-02-05	1	-11/+10
	Previously, we special cased FB writes and URB writes in the register allocation code. What we really wanted was to handle any message with EOT set. This saves us from extending the list with new opcodes in the future. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
*	i965/fs: Delete is_last_send(); just check inst->eot.	Kenneth Graunke	2015-02-05	1	-14/+1
	This helper function basically just checks inst->eot, but also asserts that only opcodes we expect to terminate threads have EOT set. As far as I'm aware, we've never had such a bug. Removing it means that we don't have to extend the list for new opcodes. Cherryview and Skylake introduce an optimization where sampler messages can have EOT set; scalar GS/HS/DS will likely introduce new opcodes as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
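The difference between the two approaches can be sketched as follows. The `Inst` class, the opcode strings, and both helper names are hypothetical stand-ins for the real IR, chosen only to show why keying on the flag avoids maintaining an opcode list:

```python
from dataclasses import dataclass

@dataclass
class Inst:
    """Hypothetical IR instruction: an opcode name plus the EOT flag."""
    opcode: str
    eot: bool = False

# The kind of list that has to grow whenever a new opcode may end a thread.
KNOWN_EOT_OPCODES = {"FB_WRITE", "URB_WRITE"}

def terminates_thread_by_opcode(inst):
    """Old style: only recognises opcodes that were listed in advance."""
    return inst.opcode in KNOWN_EOT_OPCODES and inst.eot

def terminates_thread(inst):
    """New style: any message with EOT set ends the thread."""
    return inst.eot
```

A sampler message with EOT set is handled by the second check without any change, whereas the first would need its list extended.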
*	i965: Remove now unnecessary Gen8 CMP destination type override.	Matt Turner	2015-02-04	1	-8/+0
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Set CMP's destination type to src0's type.	Matt Turner	2015-02-04	2	-18/+18
	Allows CMP instructions with float sources to be compacted and coissued. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Implement the WaCMPInstFlagDepClearedEarly work-around.	Matt Turner	2015-02-04	1	-1/+36
	Prevents piglit regressions from the next patch. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Fix saturate on MAD and LRP with the NIR backend.	Kenneth Graunke	2015-02-04	1	-2/+4
	Fixes misrendering in "Witcher 2" with INTEL_USE_NIR=1, and probably many other programs. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/nir: use redundant phi optimization	Connor Abbott	2015-02-03	1	-0/+2
	Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Jason Ekstrand <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
*	i965/fs_nir: Get rid of get_alu_src	Jason Ekstrand	2015-02-03	2	-59/+75
	Originally, get_alu_src was supposed to handle resolving swizzles and things like that. However, now that basically every instruction we have only takes scalar sources, we don't really need it anymore. The only case where it's still marginally useful is for the mov and vecN operations that are left over from SSA form. We can handle those cases as a special case easily enough. As a side-effect, we don't need the vec_to_movs pass anymore.
	v2 Jason Ekstrand <[email protected]>:
	 - Rework the way we detect if we need an extra copy for swizzling. The old code involved a pile of confusing switch fall-throughs; we now use a loop.
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Use NIR's scalarizing abilities and stop handling vectors	Jason Ekstrand	2015-02-03	2	-349/+161
	Now that we can scalarize with NIR, there's no need for all this code anymore. Let's get rid of it and just do scalar operations.
	v2: run copy prop before lowering phi nodes
	v3: Get rid of the "emit(...)->saturate = foo" pattern
	v4: Run alu_to_scalar as an optimization pass
	total instructions in shared programs: 5998321 -> 5974070 (-0.40%)
	instructions in affected programs: 732075 -> 707824 (-3.31%)
	helped: 3137
	HURT: 191
	GAINED: 18
	LOST: 0
	Reviewed-by: Kenneth Graunke <[email protected]>
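The percentages in these statistics are relative changes in instruction count. A small sketch of the arithmetic, using a helper name (`percent_change`) chosen here for illustration:

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Figures quoted in the commit message above.
shared = percent_change(5998321, 5974070)
affected = percent_change(732075, 707824)
```

Both lines reflect the same absolute saving (5998321 - 5974070 = 732075 - 707824 = 24251 instructions); the percentage is larger for the affected programs only because the baseline is smaller.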
*	i965/fs: Add support for constant propagating into sources with modifiers.	Matt Turner	2015-02-03	1	-6/+12
	All but 16 of the programs helped were ARB fp programs.
	total instructions in shared programs: 5949286 -> 5945470 (-0.06%)
	instructions in affected programs: 275162 -> 271346 (-1.39%)
	helped: 1197
	GAINED: 1
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/vec4: Use abs/negate functions in const propagation.	Matt Turner	2015-02-03	1	-13/+5
	No changes in shader-db. Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Add function to take the abs of immediates.	Matt Turner	2015-02-03	2	-0/+40
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Add function to negate immediates.	Matt Turner	2015-02-03	2	-0/+40
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Mark UB/B immediates as unreachable.	Matt Turner	2015-02-03	1	-4/+1
	Reviewed-by: Jason Ekstrand <[email protected]>
*	glsl: Improve precision of mod(x,y)	Iago Toral Quiroga	2015-02-03	3	-3/+3
	Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a downside though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets, the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965:
	    Operation                          Precision error
	    -----------------------------------------------------
	    mod(-1.951171875, 1.9980468750)    0.0000000447
	    mod(121.57, 13.29)                 0.0000023842
	    mod(3769.12, 321.99)               0.0000762939
	    mod(3769.12, 1321.99)              0.0001220703
	    mod(-987654.125, 123456.984375)    0.0160663128
	    mod( 987654.125, 123456.984375)    0.0312500000
	This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages:
	    mod(x,y) = x - y * floor(x/y)
	This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation, this new implementation would come at no additional cost.
	v2 (Ian Romanick)
	 - Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code.
	Fixes the following 16 dEQP tests:
	    dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_*
	    dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*
	Reviewed-by: Ian Romanick <[email protected]>
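The two lowerings can be written side by side as below. This is only a sketch of the formulas: the function names are made up for illustration, and Python floats are double precision, so the differences it shows are far smaller than the 32-bit GPU errors listed in the table above.

```python
import math

def mod_to_fract(x, y):
    """Old lowering: y * fract(x / y), with fract(q) = q - floor(q)."""
    q = x / y
    return y * (q - math.floor(q))

def mod_to_floor(x, y):
    """New lowering: x - y * floor(x / y)."""
    return x - y * math.floor(x / y)

# Operand pairs taken from the table in the commit message.
SAMPLES = [
    (-1.951171875, 1.9980468750),
    (121.57, 13.29),
    (3769.12, 321.99),
    (3769.12, 1321.99),
    (-987654.125, 123456.984375),
    (987654.125, 123456.984375),
]
```

In the fract form, any rounding error in the quotient is scaled by y in the final multiply, which is why the error grows with y. In the floor form, floor() yields an exact integer, so the quotient's rounding only matters if it changes that integer.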
*	i965: Fix negate with unsigned integers	Iago Toral Quiroga	2015-02-03	2	-8/+10
	For code such as:
	    uint tmp1 = uint(in0);
	    uint tmp2 = -tmp1;
	    float out0 = float(tmp2);
	We produce code like:
	    mov(8) g5<1>.xF -g9<4,4,1>.xUD
	which does not produce correct results. This code produces the results we would expect if tmp1 and tmp2 were signed integers instead.
	It seems that a similar problem was detected and addressed when using negations with unsigned integers as part of conditionals, but it looks like the problem has a wider impact than that.
	This patch fixes the problem by preventing copy-propagation of negated UD registers in all scenarios, not only in conditionals.
	Fixes the following 24 dEQP tests:
	    dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uint_
	    dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec2_
	    dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec3_
	    dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec4_
	Reviewed-by: Anuj Phogat <[email protected]>
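The expected and the incorrect results can be reproduced with plain integer arithmetic. In this sketch the helper names are illustrative; `negate_u32` models 32-bit unsigned wrap-around, while `negate_as_signed` models what the folded negate effectively computed:

```python
def negate_u32(value):
    """Negate a 32-bit unsigned integer with wrap-around (modulo 2**32)."""
    return (-value) & 0xFFFFFFFF

def negate_as_signed(value):
    """Negate as if the operand were a signed integer."""
    return -value

tmp1 = 5
expected = float(negate_u32(tmp1))      # float(uint(-5))
wrong = float(negate_as_signed(tmp1))   # what the bad code sequence yields
```

Converting after the unsigned negate gives a large positive float, whereas folding the negate into the conversion gives a small negative one.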
*	i965/gen6+: enable EXT_polygon_offset_clamp	Ilia Mirkin	2015-02-02	4	-3/+4
	Replace the hard-coded 0's with the context clamp value. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa: add support for GL_EXT_polygon_offset_clamp	Ilia Mirkin	2015-02-02	3	-3/+3
	Nothing enables the extension yet, but the values are now available. The spec calls for it to only be exposed for GL 3.3+, which is core-only in mesa. Instead we allow any driver to enable it, including in a compat context for any GL version. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Glenn Kennard <[email protected]>
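For readers unfamiliar with the extension, the clamp limits the computed depth offset. The sketch below reflects one reading of the EXT_polygon_offset_clamp specification and is not code from this patch; the function and parameter names are illustrative (`m` is the maximum depth slope, `r` the minimum resolvable depth difference):

```python
def polygon_offset(m, r, factor, units, clamp):
    """Depth offset with an optional clamp (zero clamp means unclamped)."""
    offset = m * factor + r * units
    if clamp > 0:
        return min(offset, clamp)   # positive clamp caps the offset
    if clamp < 0:
        return max(offset, clamp)   # negative clamp bounds it from below
    return offset
```

A clamp of zero reproduces the classic glPolygonOffset behaviour, which is consistent with the hard-coded zeros that the i965 patch above replaces.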
*	i965: Add a better PRM citation for the IMS dimension mangling.	Kenneth Graunke	2015-02-02	1	-1/+22
	Paul originally had to reverse engineer these formulas based on the description about how the sampler works. The description here is not the easiest to follow - especially given that it's from the Sandybridge era, when the hardware only did 4x multisampling. Jordan and I recently found another part of the documentation where they simply state that IMS dimensions must be adjusted by a set of formulas. Quoting this section provides an easy to follow explanation for the code, including 2x/4x/8x/16x. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	DD: Refactor BlitFramebuffer.	Laura Ekstrand	2015-02-02	12	-46/+78
	In preparation for glBlitNamedFramebuffer, the DD table function BlitFramebuffer needs to accept two arbitrary framebuffer objects rather than assuming ctx->ReadBuffer and ctx->DrawBuffer. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	i965: Don't use tiled_memcpy to download from RGBX or BGRX surfaces	Jason Ekstrand	2015-02-02	2	-0/+14
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841 Reviewed-by: Anuj Phogat <[email protected]>
*	dir-locals.el: Don't set variables for non-programming modes	Neil Roberts	2015-02-02	1	-1/+1
	This limits the style changes to modes inherited from prog-mode. The main reason to do this is to avoid setting fill-column for people using Emacs to edit commit messages because 78 characters is too many to make it wrap properly in git log. Note that makefile-mode also inherits from prog-mode so the fill column should continue to apply there.
	v2: Apply to all the .dir-locals.el files, not just the one in the root directory.
	Acked-by: Michel Dänzer <[email protected]>
*	i965: Fix intel_miptree_copy_teximage for GL_TEXTURE_1D_ARRAY	Iago Toral Quiroga	2015-02-02	1	-1/+6
	For GL_TEXTURE_1D_ARRAY targets we store the depth of the array in the Height field and leave Depth=1 in the underlying texture object. When we call intel_miptree_copy_teximage in the process of re-creating a miptree (possibly because the number of miplevels has changed) we didn't account for this, so we were only copying texture images for the first slice. Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/pixel_read: Don't try to do a tiled_memcpy from a multisampled buffer	Jason Ekstrand	2015-01-31	1	-0/+7
	The GL spec guarantees that glGetTexImage will never get a multisampled texture, but this is not true for glReadPixels. If we get a multisampled buffer, we have to do a multisample resolve on it before we can pull the data down for the user. Since this isn't practical to handle in tiled_memcpy, we just fall back to the other paths that can handle this. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Enable L3 caching of buffer surfaces.	Francisco Jerez	2015-01-31	4	-9/+3
	And remove the mocs argument of the emit_buffer_surface_state vtbl hook. Its semantics vary greatly from one generation to another, so it kind of encourages the caller to pass 0 which is the only valid setting across generations. After this commit the hardware-specific code decides what the best cacheability settings are for buffer surfaces, just like we do for textures. This together with some additional changes coming is expected to improve performance of pull constants, buffer textures, atomic counters and image objects on Gen7 and up. Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/pixel_read: Properly flip the results for window system buffers	Jason Ekstrand	2015-01-30	1	-0/+15
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841 Reviewed-by: Chad Versace <[email protected]>
*	i965/tiled_memcpy: Support a signed linear pitch	Jason Ekstrand	2015-01-30	2	-17/+17
	Reviewed-by: Chad Versace <[email protected]>
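One reason a copy routine might accept a signed pitch is vertical flipping: start at the last destination row and step backwards. The following is a minimal sketch of that idea with a hypothetical helper, not the driver's implementation:

```python
def copy_rows(src, width, height, dst, dst_start, dst_pitch):
    """Copy `height` rows of `width` bytes from tightly packed `src` to `dst`.

    Row i is written at byte offset dst_start + i * dst_pitch, so a
    negative `dst_pitch` walks the destination from the bottom row up.
    """
    for row in range(height):
        s = row * width
        d = dst_start + row * dst_pitch
        dst[d:d + width] = src[s:s + width]

# Flip a 2-byte-wide, 3-row image while copying.
flipped = bytearray(6)
copy_rows(b"aabbcc", 2, 3, flipped, dst_start=4, dst_pitch=-2)
```

This matches the shape of the window-system readback fix above, where the rows have to come out in the opposite vertical order.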
*	i965/skl: Force a BINDING_TABLE_POINTER_* after push constant command	Neil Roberts	2015-01-30	1	-0/+7
	According to the SKL bspec the 3DSTATE_CONSTANT_* commands only take effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_* command. This patch just makes it set the BRW_NEW_SURFACES state when uploading the push constants to ensure the binding tables will be updated. This fixes the fbo-blending-formats Piglit test and possibly others. Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	meta: Don't write depth when decompressing tex-images	Topi Pohjolainen	2015-01-30	1	-1/+1
	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	meta: Don't write depth when generating miptrees	Topi Pohjolainen	2015-01-30	1	-1/+1
	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	meta/blit: Compile programs with and without depth	Topi Pohjolainen	2015-01-30	2	-5/+11
	When color buffers alone are concerned the depth is not needed. No regression on BDW where meta blit is used instead of blorp. I also disabled blorp temporarily for fbo-blits on IVB and saw no regressions there either. I also compared several graphics benchmarks on BDW and saw neither regressions nor improvements. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	meta/blit: Write depth only when asked for	Topi Pohjolainen	2015-01-30	1	-2/+3
	Implementing an idea from Ken, on i965 the shader program for 2D blits becomes significantly simpler.
	Before:
	    pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
	    pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
	    send(8) g2<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q };
	    mov(8) g123<1>F g2<8,8,1>F { align1 1Q compacted };
	    mov(8) g124<1>F g3<8,8,1>F { align1 1Q compacted };
	    mov(8) g125<1>F g4<8,8,1>F { align1 1Q compacted };
	    mov(8) g126<1>F g5<8,8,1>F { align1 1Q compacted };
	    mov(8) g127<1>F g2<8,8,1>F { align1 1Q compacted };
	    nop ;
	    sendc(8) null g123<8,8,1>F render RT write SIMD8 LastRT Surface = 0 mlen 5 rlen 0 { align1 1Q EOT };
	After:
	    pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
	    pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
	    send(8) g124<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q };
	    sendc(8) null g124<8,8,1>F render RT write SIMD8 LastRT Surface = 0 mlen 4 rlen 0 { align1 1Q EOT };
	v2 (Matt): Removed unintended white-space change
	Signed-off-by: Topi Pohjolainen <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	meta/blit: Add plumbing for shaders without depth	Topi Pohjolainen	2015-01-30	4	-3/+5
	Currently all blit programs are unconditionally compiled with gl_FragDepth. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	Mesa: Advertise GL_OES_texture_float extensions support with i965.	Kalyan Kondapally	2015-01-29	1	-0/+5
	This patch advertises support for GL_OES_texture_float extensions when using i965 drivers. Signed-off-by: Kevin Rogovin <[email protected]> Signed-off-by: Kalyan Kondapally <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
*	mesa: Move simple_list.h to src/util.	Eric Anholt	2015-01-28	18	-18/+18
	We have two copies of it in the tree; I'm going to delete one. Reviewed-by: Marek Olšák <[email protected]>
*	drirc: set allow_glsl_extension_directive_midshader for Dead Island.	Sven Arvidsson	2015-01-28	1	-0/+4
	Signed-off-by: Sven Arvidsson <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87076 Signed-off-by: Marek Olšák <[email protected]>
*	i965/tex: Don't create read-write textures with non-renderable formats	Jason Ekstrand	2015-01-28	1	-0/+5
	I haven't actually seen this bug in the wild, but it's possible that someone could ask to do an S3TC PBO download or something. This protects us from accidentally creating a render target with a compressed or otherwise non-renderable format. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/gen8: Include the buffer offset when emitting renderbuffer relocs	Jason Ekstrand	2015-01-28	1	-1/+1
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88792 Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Implement a tiled fast-path for glReadPixels and glGetTexImage	Sisinty Sasmita Patra	2015-01-26	3	-1/+271
	Added intel_readpixels_tiled_mempcpy and intel_gettexsubimage_tiled_mempcpy functions. These are the fast paths for glReadPixels and glGetTexImage.
	On Chrome, using the RoboHornet 2D Canvas toDataURL test, this patch cuts the amount of time spent in glReadPixels by more than half and reduces the time of the entire test by 10%.
	v2: Jason Ekstrand <[email protected]>
	 - Refactor to make the functions look more like the old intel_tex_subimage_tiled_memcpy
	 - Don't export the readpixels_tiled_memcpy function
	 - Fix some pointer arithmetic bugs in partial image downloads (using ReadPixels with a non-zero x or y offset)
	 - Fix a bug when ReadPixels is performed on an FBO wrapping a texture miplevel other than zero.
	v3: Jason Ekstrand <[email protected]>
	 - Better documentation for the *_tiled_memcpy functions
	 - Add target restrictions for renderbuffers wrapping textures
	v4: Jason Ekstrand <[email protected]>
	 - Only check the return value of brw_bo_map for error and not bo->virtual
	v5: Jason Ekstrand <[email protected]>
	 - Don't unnecessarily repeat a comment
	Signed-off-by: Jason Ekstrand <[email protected]>
	Reviewed-by: Chad Versace <[email protected]>
*	i965/tiled_memcpy: Add tiled-to-linear paths	Sisinty Sasmita Patra	2015-01-26	2	-0/+281
	This commit adds tiled copy functions for copying from tiled memory to linear memory. These are very similar to the existing linear-to-tiled paths.
	v2: Jason Ekstrand <[email protected]>
	 - New commit message
	 - Various whitespace fixes
	 - Added ptrdiff_t casts as done in commit 225a09790
	v3: Jason Ekstrand <[email protected]>
	 - Fixed a comment
	Signed-off-by: Jason Ekstrand <[email protected]>
	Reviewed-by: Chad Versace <[email protected]>
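To give a feel for what a tiled-to-linear copy involves, here is a deliberately simplified byte-at-a-time sketch for X-tiling. It assumes 512-byte by 8-row tiles stored row-major and back to back, and it ignores address swizzling and every fast path; all names are hypothetical and none of this is taken from the patch:

```python
XTILE_WIDTH = 512                      # bytes per tile row (assumed)
XTILE_HEIGHT = 8                       # rows per tile (assumed)
XTILE_SIZE = XTILE_WIDTH * XTILE_HEIGHT

def xtiled_offset(x, y, pitch):
    """Offset in the tiled buffer of the byte at linear position (x, y).

    `pitch` is the surface pitch in bytes, a multiple of XTILE_WIDTH.
    """
    tiles_per_row = pitch // XTILE_WIDTH
    tile_index = (y // XTILE_HEIGHT) * tiles_per_row + (x // XTILE_WIDTH)
    within_tile = (y % XTILE_HEIGHT) * XTILE_WIDTH + (x % XTILE_WIDTH)
    return tile_index * XTILE_SIZE + within_tile

def xtiled_to_linear(tiled, pitch, height):
    """Untile a whole surface; `height` must be a multiple of XTILE_HEIGHT."""
    linear = bytearray(pitch * height)
    for y in range(height):
        for x in range(pitch):
            linear[y * pitch + x] = tiled[xtiled_offset(x, y, pitch)]
    return bytes(linear)
```

A practical implementation would copy whole spans per tile row rather than single bytes; the per-byte loop here only keeps the address calculation visible.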
*	i965: Refactor tiled memcpy functions and move them into their own file	Sisinty Sasmita Patra	2015-01-26	4	-392/+506
	This commit refactors the tiled_memcpy code in intel_tex_subimage.c and moves it into its own intel_tiled_memcpy files. Also, xtile_copy and ytile_copy are renamed to linear_to_xtiled and linear_to_ytiled respectively. The *_faster functions are similarly renamed.
	There was also a bit of logic to select between the libc-provided memcpy function and our custom memcpy that does an RGBA -> BGRA swizzle. This was moved into an intel_get_memcpy function so that rgba8_copy can live (and be inlined) in intel_tiled_memcpy.c.
	v2: Jason Ekstrand <[email protected]>
	 - Better commit message
	 - Fix up the copyright on the intel_tiled_memcpy files
	 - Various whitespace fixes
	 - Moved a bunch of stuff that did not need to be exposed from intel_tiled_memcpy.h to intel_tiled_memcpy.c
	 - Added proper documentation for intel_get_memcpy
	 - Incorporated the ptrdiff_t tweaks from commit 225a09790
	v3: Jason Ekstrand <[email protected]>
	 - Fixed a comment
	 - Move the tile size constants into the .c file
	Signed-off-by: Jason Ekstrand <[email protected]>
	Reviewed-by: Chad Versace <[email protected]>
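The selection logic described above can be pictured as choosing between a plain copy and a channel-swapping copy. The sketch below borrows the name `rgba8_copy` from the commit message, but its signature, the `get_copy_function` helper, and the boolean parameter are illustrative assumptions rather than the driver's API:

```python
def rgba8_copy(src):
    """Copy packed 4-byte pixels, swapping bytes 0 and 2 (RGBA -> BGRA)."""
    if len(src) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    dst = bytearray(src)
    dst[0::4] = src[2::4]   # blue moves to the first byte of each pixel
    dst[2::4] = src[0::4]   # red moves to the third byte of each pixel
    return bytes(dst)

def get_copy_function(swap_red_blue):
    """Pick the copy routine once, outside the per-row copy loop."""
    return rgba8_copy if swap_red_blue else bytes
```

Choosing the function once up front keeps the inner copy loops free of per-pixel format checks.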
*	i965/tex_subimage: Use the fast tiled path for rectangle textures	Jason Ekstrand	2015-01-26	1	-1/+2
	There's no reason why we should be doing this for 2D textures and not rectangles. Just a matter of adding another hunk to the condition. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>