mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	st/mesa/glsl: set early_fragment_tests directly in shader_info	Timothy Arceri	2017-01-19	3	-8/+7
\| \| \| \| \| \| \|	We also move EarlyFragmentTests out of the gl_shader_info struct as it is now only used by gl_shader. Reviewed-by: Lionel Landwerlin <[email protected]>
*	mesa/glsl/i965: set and use tcs vertices_out directly	Timothy Arceri	2017-01-19	2	-9/+3
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965: get outputs_written from gl_program	Timothy Arceri	2017-01-19	1	-2/+2
\| \| \| \| \| \| \|	There is no need to go via the pointer in nir_shader. This change is required for the shader cache as we don't create a nir_shader. Reviewed-by: Lionel Landwerlin <[email protected]>
*	mesa: don't always set _NEW_PROGRAM when linking	Timothy Arceri	2017-01-19	1	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \|	We only need to set it when linking was successful and the program being linked is currently active. The programs_in_use mask is just used as a flag for now but in a future change we will use it to update the CurrentProgram array. V2: make sure to flush vertices before linking (suggested by Marek) Reviewed-by: Marek Olšák <[email protected]>
*	mesa: change init subroutine defaults helper to work per gl_program	Timothy Arceri	2017-01-19	3	-24/+20
\| \| \| \| \| \| \|	A later patch will result in SSO programs calling this helper per gl_program rather than per gl_shader_program. Reviewed-by: Lionel Landwerlin <[email protected]>
*	mesa/glsl: move ProgramResourceList to gl_shader_program_data	Timothy Arceri	2017-01-19	4	-46/+53
\| \| \| \| \| \| \| \| \| \|	We also move NumProgramResourceList at the same time. GLES does interface validation on SSO at runtime so we need to move this to be able to switch to storing gl_program pointers in CurrentProgram. Reviewed-by: Lionel Landwerlin <[email protected]>
*	glsl: store number of explicit uniform loactions in gl_shader_program	Timothy Arceri	2017-01-19	1	-0/+5
\| \| \| \| \| \| \| \| \|	This allows us to cleanup the functions that pass this count around, but more importantly we will be able to call the uniform linking functions from that backends linker without having to pass this information to the backend directly via Driver.LinkShader(). Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Make post draw flush more explicit	Topi Pohjolainen	2017-01-18	2	-5/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	Blits do not need any special treatment as the target buffer object is added to render cache just as one does for normal draw. Color clears and resolves in turn require explicit "end of pipe synchronization". It is not clear what this means exactly but the assumption is that render cache flush with command stream stall should be sufficient. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/gen6: Issue direct depth stall and flush after depth clear	Topi Pohjolainen	2017-01-18	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instead of calling unconditionally brw_emit_mi_flush() which does: brw_emit_pipe_control_flush(brw, PIPE_CONTROL_DEPTH_CACHE_FLUSH \| PIPE_CONTROL_RENDER_TARGET_FLUSH \| PIPE_CONTROL_CS_STALL); brw_emit_pipe_control_flush(brw, PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE \| PIPE_CONTROL_CONST_CACHE_INVALIDATE); Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Make depth clear flushing more explicit	Topi Pohjolainen	2017-01-18	2	-8/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current blorp logic issues unconditional "flush everything" (see brw_emit_mi_flush()) after each render. For example, all blits issue this unconditionally which shouldn't be needed if they set render cache properly so that subsequent renders do necessary flushing before drawing. In case of piglit: ext_framebuffer_multisample-accuracy all_samples depth_draw small intel_hiz_exec() is always preceded by blorb blit and the unconditional flush looks to hide the lack of stall and flushes in depth clears. By removing the brw_emit_mi_flush() I get gpu hangs. This patch adds the stalls and flushes mandated by the spec and gets rid of those hangs. v2 (Jason, Ken): Document the rational for separating depth cache flush and stall on Gen7. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/blorp: Use the render cache mechanism instead of explicit flushing	Topi Pohjolainen	2017-01-18	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	by replacing brw_emit_mi_flush() with brw_render_cache_set_check_flush(). The latter splits the flush in two: brw_emit_pipe_control_flush(brw, PIPE_CONTROL_DEPTH_CACHE_FLUSH \| PIPE_CONTROL_RENDER_TARGET_FLUSH \| PIPE_CONTROL_CS_STALL); brw_emit_pipe_control_flush(brw, PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE \| PIPE_CONTROL_CONST_CACHE_INVALIDATE); instead of int flags = PIPE_CONTROL_NO_WRITE \| PIPE_CONTROL_RENDER_TARGET_FLUSH; if (brw->gen >= 6) { flags \|= PIPE_CONTROL_INSTRUCTION_INVALIDATE \| PIPE_CONTROL_CONST_CACHE_INVALIDATE \| PIPE_CONTROL_DEPTH_CACHE_FLUSH \| PIPE_CONTROL_VF_CACHE_INVALIDATE \| PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE \| PIPE_CONTROL_CS_STALL; } brw_emit_pipe_control_flush(brw, flags); v2 (Jason): Check that destination exists before trying to add to render cache. Depth clears and resolves don't have it. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	utils: build sha1/disk cache only with Android/Autoconf	Emil Velikov	2017-01-18	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	Earlier commit imported a SHA1 implementation and relaxed the SHA1 and disk cache handling, broking the Windows builds. Restrict things for now until we get to a proper fix. Fixes: d1efa09d342 "util: import sha1 implementation from OpenBSD" Signed-off-by: Emil Velikov <[email protected]>
*	util: import sha1 implementation from OpenBSD17.0-branchpoint	Emil Velikov	2017-01-18	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At the moment we support 5+ different implementations each with varying amount of bugs - from thread safely problems [1], to outright broken implementation(s) [2] In order to accommodate these we have 150+ lines of configure script and extra two configure toggles. Whist an actual implementation being ~200loc and our current compat wrapping ~250. Let's not forget that different people use different code paths, thus effectively makes it harder to test and debug since the default implementation is automatically detected. To minimise all these lovely experiences, import the "100% Public Domain" OpenBSD sha1 implementation. Clearly document any changes needed to get building correctly, since many/most of those can be upstreamed making future syncs easier. As an added bonus this will avoid all the 'fun' experiences trying to integrate it with the Android and SCons builds. v2: Manually expand __BEGIN_DECLS/__END_DECLS and document (Tapani). Furthermore it seems that some games (or surrounding runtime) static link against OpenSSL resulting in conflicts. For more information see the discussion thread [3] Bugzilla [1]: https://bugs.freedesktop.org/show_bug.cgi?id=94904 Bugzilla [2]: https://bugs.freedesktop.org/show_bug.cgi?id=97967 [3] https://lists.freedesktop.org/archives/mesa-dev/2017-January/140748.html Cc: Mark Janes <[email protected]> Cc: Vinson Lee <[email protected]> Cc: Tapani Pälli <[email protected]> Cc: Jonathan Gray <[email protected]> Tested-by: Jonathan Gray <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Acked-by: Tapani Pälli <[email protected]> (v1) Acked-by: Jason Ekstrand <[email protected]> (v1)
*	i965: Make brw_cache_item structure private to brw_program_cache.c.	Kenneth Graunke	2017-01-18	2	-19/+21
\| \| \| \| \| \| \| \|	struct brw_cache_item is an implementation detail of the program cache. We don't need to make those internals available to the entire driver. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
*	i965: Fix SURFACE_STATE to handle non-zero aux offsets	Ben Widawsky	2017-01-18	1	-2/+1
\| \| \| \| \| \| \|	Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Acked-by: Daniel Stone <[email protected]>
*	i965: Don't map/unmap in brw_print_program_cache on LLC platforms.	Kenneth Graunke	2017-01-17	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \|	We have a persistent mapping. Don't map it a second time or try to unmap it. Just use the pointer. This most likely would wreak havoc except that this code is unused (it's only called from an if (0) debug block). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
*	i965: Move program cache printing to brw_program_cache.c.	Kenneth Graunke	2017-01-17	3	-57/+49
\| \| \| \| \| \| \| \|	It makes sense to put a function which prints out the entire contents of the program cache in the file that implements the program cache. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
*	i965: Make a helper for finding an existing shader variant.	Kenneth Graunke	2017-01-17	7	-85/+68
\| \| \| \| \| \| \| \| \|	We had five copies of the same "walk the cache and look for an existing shader variant for this program" code. Now we have one helper function that returns the key. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
*	i965: Make DCE set null destinations on messages with side effects.	Kenneth Graunke	2017-01-17	1	-13/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(Co-authored by Matt Turner.) Image atomics, for example, return a value - but the shader may not want to use it. We assigned a useless VGRF destination. This seemed harmless, but it can actually be quite harmful. The register allocator has to assign that VGRF to a real register. It may assign the same actual GRF to the destination of an instruction that follows soon after. This results in a write-after-write (WAW) dependency, and stall. A number of "Deus Ex: Mankind Divided" shaders use image atomics, but don't use the return value. Several of these were hitting WAW stalls for nearly 14,000 (poorly estimated) cycles a pop. Making dead code elimination null out the destination avoids this issue. This patch cuts one shader's estimated cycles by -98.39%! Removing the message response should also help with data cluster bandwidth. On Skylake: (instruction counts remain identical) total cycles in shared programs: 255413890 -> 248081010 (-2.87%) cycles in affected programs: 12019948 -> 4687068 (-61.01%) helped: 24 HURT: 10 v2: Make can_omit_write independent of can_eliminate (Curro). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Combine some dead code elimination NOP'ing code.	Kenneth Graunke	2017-01-17	1	-8/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In theory we might have incorrectly NOP'd instructions that write the flag, but where that flag value isn't used, and yet the instruction either writes the accumulator or has side effects. I don't believe any such instructions exist, so this is mostly a code cleanup. Curro pointed out that FS_OPCODE_FB_WRITE has a null destination and actually writes the flag on Gen4-5 to dynamically decide whether to write some payload data. The hunk removed in this patch might have NOP'd it, except that we don't actually mark flags_written() in the IR, so it doesn't think the flag is touched at all. That's sketchy, but it means it wouldn't hit this today (though there are likely other problems!). v2: Properly replace the inst->regs_written() check in the second hunk with the flag being live (mistake caught by Curro). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Make DCE explicitly not eliminate any control flow instructions.	Kenneth Graunke	2017-01-17	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to Matt, the dead code pass explicitly avoided IF and WHILE because on Sandybridge, these could have conditional modifiers and null destination registers. Normally, those instructions use BAD_FILE for the destination register. Nowadays, we don't do that anymore, so we could technically drop these checks. However, it's clearer to explicitly leave control flow instructions alone, so change it to the more generic !inst->is_control_flow(). This should have no actual change. [This patch implements review feedback from Curro and Matt.] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	st/mesa: add support for advanced blend when fb can be fetched from	Ilia Mirkin	2017-01-16	4	-8/+37
\| \| \| \| \| \| \| \| \|	This implements support for emitting FBFETCH ops, using the existing lowering pass for advanced blend logic, and disabling hw blend when advanced blending is enabled. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium: add flags parameter to texture barrier	Ilia Mirkin	2017-01-16	1	-1/+1
\| \| \| \| \| \| \| \|	This is so that we can differentiate between flushing any framebuffer reading caches from regular sampler caches. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	mesa: allow BlendBarrier to be used without support for full fb fetch	Ilia Mirkin	2017-01-16	1	-1/+2
\| \| \| \| \| \| \| \|	The extension spec is not currently published, so it's a bit premature to require it for BlendBarrier usage. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	i965: Make BLORP disable the NP Z PMA stall fix.	Kenneth Graunke	2017-01-16	1	-0/+4
\| \| \| \| \| \| \| \|	This may fix GPU hangs on Gen8. I don't know if it does though. Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Enable OpenGL 4.5 on Haswell.	Kenneth Graunke	2017-01-16	2	-2/+2
\| \| \| \| \| \| \| \|	Everything is in place and the test results look solid. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	i965: Use align1 mode for barrier messages.	Kenneth Graunke	2017-01-15	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit 7428e6f86ab5 we switched the barrier SEND message's destination type to UW to avoid problems in SIMD16 compute shaders. Tessellation control shaders also use barriers, and in vec4 mode, we were emitting them in align16 mode. The simulator warns that only UD, D, F, and DF are valid destination types - UW is technically illegal. So, switch to align1 mode. Either mode should work fine. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: Move Gen4-5 interpolation stuff to brw_wm_prog_data.	Kenneth Graunke	2017-01-13	11	-70/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes glxgears rendering, which had surprisingly been broken since late October! Specifically, commit 91d61fbf7cb61a44adcaae51ee08ad0dd6b. glxgears uses glShadeModel(GL_FLAT) when drawing the main portion of the gears, then uses glShadeModel(GL_SMOOTH) for drawing the Gouraud-shaded inner portion of the gears. This results in the same fragment program having two different state-dependent interpolation maps: one where gl_Color is flat, and another where it's smooth. The problem is that there's only one gen4_fragment_program, so it can't store both. Each FS compile would trash the last one. But, the FS compiles are cached, so the first one would store FLAT, and the second would see a matching program in the cache and never bother to compile one with SMOOTH. (Clearing the program cache on every draw made it render correctly.) Instead, move it to brw_wm_prog_data, where we can keep a copy for every specialization of the program. The only downside is bloating the structure a bit, but we can tighten that up a bit if we need to. This also lets us kill gen4_fragment_program entirely! Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965/vec4: Fix mapping attributes	Juan A. Suarez Romero	2017-01-13	2	-23/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reverts 57bab6708f2bbc1ab8a3d202e9a467963596d462, which was causing issues with ILK and earlier VS programs. 1. brw_nir.c: Revert "i965/vec4/nir: vec4 also needs to remap vs attributes" Do not perform a remap in vec4 backend. Rather, do it later when setup attributes 2. brw_vec4.cpp: This fixes mapping ATTRx to proper GRFn. Suggested-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99391 [[email protected]: merge Juan's two patches from bugzilla] Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Fix textureGather with RG32I/UI on Gen7.	Kenneth Graunke	2017-01-13	2	-8/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to the "Gather4 R32G32_FLOAT Bug" internal documentation page, the R32G32_UINT and R32G32_SINT formats are affected by the same bug as R32G32_FLOAT. Applying the same workarounds should be viable - apparently the R32G32_FLOAT_LD format shouldn't corrupt integer data which is NaN or other sketchy floating point values. One irritating caveat is that, because it's a FLOAT format, the alpha channel or any set to SCS_ONE return 0x3f8 (1.0) rather than integer 1. So we need shader code to whack those channels to 1. Fixes GL45-CTS.texture_gather.plain-gather-int-cube-rg on Haswell. v2: Fix swizzle component zeroing (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	mesa/get: Remove unused extra_ARB_viewport_array	Boyan Ding	2017-01-13	1	-1/+0
\| \| \| \| \| \| \| \|	Unused since 0a7691ee (mesa: Enable enums for OES_viewport_array). Silence a warning of unused variable. Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
*	xlib: Unify the style of function pointer calls in structs	Boyan Ding	2017-01-13	1	-74/+74
\| \| \| \| \| \|	Signed-off-by: Boyan Ding <[email protected]> [Emil Velikov: handle the final case in glXCreateContextAttribsARB] Signed-off-by: Emil Velikov <[email protected]>
*	radeon: Unify the style of function pointer calls in structs	Boyan Ding	2017-01-13	3	-17/+17
\| \| \| \| \| \|	Signed-off-by: Boyan Ding <[email protected]> [Emil Velikov: handle the all cases] Signed-off-by: Emil Velikov <[email protected]>
*	nouveau: Unify the style of function pointer calls in structs	Boyan Ding	2017-01-13	1	-3/+3
\| \| \| \|	Signed-off-by: Boyan Ding <[email protected]>
*	i915: Add XRGB8888 format to intel_screen_make_configs	Derek Foreman	2017-01-13	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a copy of commit 536003c11e4cb1172c540932ce3cce06f03bf44e except for i915. Original log for the i965 commit follows: Some application, such as drm backend of weston, uses XRGB8888 config as default. i965 doesn't provide this format, but before commit 65c8965d, the drm platform of EGL takes ARGB8888 as XRGB8888. Now that commit 65c8965d makes EGL recognize format correctly so weston won't start because it can't find XRGB8888. Add XRGB8888 format to i965 just as other drivers do. Signed-off-by: Derek Foreman <[email protected]> Acked-by: Boyan Ding <[email protected]> Tested-by: Mark Janes <[email protected]>
*	main/fbobject: throw invalid operation when get_attachment fails if needed	Alejandro Piñeiro	2017-01-13	1	-7/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In most cases, if a call to get_attachment fails is because attachment is a INVALID_ENUM. But for some specific cases, if COLOR_ATTACHMENTm (where m >= MAX_COLOR_ATTACHMENTS) is used, it should raise an INVALID_OPERATION exception instead. Fixes: GL45-CTS.direct_state_access.framebuffers_get_attachment_parameter_errors GL45-CTS.direct_state_access.framebuffers_renderbuffer_attachment_errors v2: extra new line before quote block. Include "color attachment" on both new message errors (Nicolai). Reviewed-by: Nicolai Hähnle <[email protected]>
*	main/fboject: return if it is color_attachment on get_attachment	Alejandro Piñeiro	2017-01-13	1	-11/+19
\| \| \| \| \| \| \| \| \|	Some callers would need that info to know if they should raise INVALID_ENUM or INVALID_OPERATION. An alternative would be the caller to check if the attachment is a GL_COLOR_ATTACHMENTm, but that seems redundant as get_attachment is already doing that. Reviewed-by: Nicolai Hähnle <[email protected]>
*	mesa/main: fix version/extension checks in _mesa_ClampColor	Nicolai Hähnle	2017-01-13	1	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add a proper check for feature support, and raise an invalid enum for GL_CLAMP_VERTEX/FRAGMENT_COLOR unconditionally in core profiles, since those enums were explicitly removed after the extension was promoted to core functionality (not in the profile sense) with OpenGL 3.0. This matches the behavior of the AMD closed source driver and fixes GL45-CTS.gtf30.GL3Tests.half_float.half_float_textures. Cc: "12.0 13.0" <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nir/i965: assert first is always less than 64	Juan A. Suarez Romero	2017-01-12	1	-0/+1
\| \| \| \| \| \|	This fixes a defect detected by Coverity Scan. Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965/gen7: expose OpenGL 4.2 on Haswell when supported	Juan A. Suarez Romero	2017-01-12	2	-2/+2
\| \| \| \| \| \| \| \| \|	GL_ARB_vertex_attrib_64bit was the last piece missing. v2: update docs (Jordan) Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: enable ARB_shader_precision to HSW+	Samuel Iglesias Gonsálvez	2017-01-12	1	-1/+1
\| \| \| \| \| \| \| \|	v2: update docs (Jordan) Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: unify the code to enable of ARB_gpu_shader_fp64 and ↵	Samuel Iglesias Gonsálvez	2017-01-12	1	-7/+2
\| \| \| \| \| \| \| \| \|	ARB_vertex_attrib_64bit for HSW+ Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: Enable ARB_vertex_attrib_64bit for Haswell	Alejandro Piñeiro	2017-01-12	1	-1/+3
\| \| \| \| \| \| \| \|	v2: update docs (Jordan) Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: check for dual slot attributes on any gen	Juan A. Suarez Romero	2017-01-12	1	-2/+1
\| \| \| \| \| \| \|	Those not supporting 64 bit input vertex attributes will have the dual_slot value as false. Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4: emit correctly load_inputs for 64bit data	Juan A. Suarez Romero	2017-01-12	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	For dvec3 and dvec4 types, a single GRF do not have enough space to allocate two inputs from two different vertices (SIMD4x2). So the GRF only contains first two components for the two vertices, and the next GRF has the remaining components. We want to put all the components for the same vertex in the same register. Thus, we do a shuffle to reorder the data. Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4: take into account doubles when creating attribute mapping	Alejandro Piñeiro	2017-01-12	1	-4/+9
\| \| \| \| \| \| \| \| \| \|	Doubles needs more that one slot per attribute. So when filling the attribute_map we check if it is a double in order to allocate one extra register. Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4/nir: vec4 also needs to remap vs attributes	Alejandro Piñeiro	2017-01-12	1	-10/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	Doubles need extra space, so we would need to do a remapping for vec4 too in order to take that into account. We reuse the already existing remap_vs_attrs, but passing is_scalar, so they could remap accordingly. v2: code-format remap_vs_attrs_params initialization (Matt) Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4: use attribute slots for first non payload GRF	Alejandro Piñeiro	2017-01-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As part of the payload setup, setup_attributes is called with the first GRF that can be used for the attributes (first ones are used for uniforms for example) and returns the first GRF that is not part of the payload. Before this patch, it adds directly the number of attributes. But as with 64-bit attributes can consume more than one slot, that is not valid anymore. This patch change the addition to use the number of slots consumed. gen >= 8 would not be affected, as they use the scalar mode. For that case, the vs configuration is done at fs_visitor::assign_vs_urb_setup. v2: add explanation in commit log (Jordan) Reviewed-by: Jordan Justen <[email protected]>
*	i965: downsize 64PASSTHRU formats to equivalent 32FLOAT formats on gen < 8	Alejandro Piñeiro	2017-01-12	1	-30/+139
\| \| \| \| \| \| \| \| \| \| \| \|	gen < 8 doesn't support 64PASSTHRU formats when emitting vertices. So in order to provide the equivalent functionality, we need to downsize the format to equivalent 32FLOAT, and in some cases (R64G64B64 and R64G64B64A64) submit two 3DSTATE_VERTEX_ELEMENTS for each vertex element. Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: return PASSTHRU surface types also on gen7	Alejandro Piñeiro	2017-01-12	1	-2/+6
\| \| \| \| \| \| \| \|	Although gen7 doesn't include surface types as a valid conversion format, we return it, as it reflects what we want to achieve, even if we need to workaround it on gen < 8. Reviewed-by: Jordan Justen <[email protected]>