path: root/src/mesa
Commit message | Author | Age | Files | Lines
* mesa: free current ComputeProgram state in _mesa_free_context_data | Tapani Pälli | 2017-09-21 | 1 | -0/+2

  This is already done for other program stages; fixes a leak when using compute programs.

  Signed-off-by: Tapani Pälli <[email protected]>
  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102844
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  (cherry picked from commit 589457d97fa8e95f227e7179e9c89a01dff495a0)
* st/glsl->tgsi: fix u64 to bool comparisons. | Dave Airlie | 2017-09-20 | 1 | -1/+15

  Otherwise we end up using a 32-bit comparison which didn't end well. Timothy caught this while playing around with some opt passes.

  Fixes: 278580729a (st/glsl_to_tgsi: add support for 64-bit integers)
  Reviewed-by: Timothy Arceri <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit a7a7bf21bdf0cf8e59f8c8e17c2580a363be7055)
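The failure mode named in this message, a 64-bit value tested with a 32-bit comparison, can be illustrated outside the driver. The sketch below is plain C, not the TGSI code the commit touches, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Correct conversion: all 64 bits take part in the comparison. */
bool u64_to_bool(uint64_t v)
{
    return v != 0;
}

/* Buggy analogue: only the low 32 bits are compared against zero. */
bool u64_to_bool_truncated(uint64_t v)
{
    return (uint32_t)v != 0;
}

/* Hypothetical demonstration of the difference. */
void u64_to_bool_demo(void)
{
    uint64_t high_only = UINT64_C(1) << 32;   /* low 32 bits are all zero */

    assert(u64_to_bool(high_only) == true);
    assert(u64_to_bool_truncated(high_only) == false);
}
```

Any value whose low 32 bits are zero but whose high bits are set is non-zero, yet the truncated comparison reports it as false.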
* i965/blorp: Set r8stencil_needs_update when writing stencil | Jason Ekstrand | 2017-09-20 | 1 | -0/+6

  This fixes a crash on Haswell when we try to upload a stencil texture with blorp. It would also be a problem if someone tried to texture from stencil after glBlitFramebuffers.

  Cc: "17.2 17.1" <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  (cherry picked from commit a43d379000260485fc4b2b03b069aedc46879557)
  [Juan A. Suarez: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>

  Conflicts: src/mesa/drivers/dri/i965/brw_blorp.c
* st/glsl_to_tgsi: only the first (inner-most) array reference can be a 2D index | Nicolai Hähnle | 2017-09-19 | 1 | -1/+1

  Don't get distracted by record dereferences between array references.

  Fixes dEQP-GLES31.functional.tessellation.user_defined_io.per_vertex_block.*

  Cc: [email protected]
  Reviewed-by: Marek Olšák <[email protected]>
  (cherry picked from commit 03203b74486357c2bc77c53302f0f667f1df3ffa)
* st/mesa: fix view template initialization in try_pbo_readpixels | Roland Scheidegger | 2017-09-06 | 1 | -1/+1

  I think this is what the code was meant to do, although as far as I can tell the redundant initialization some analyzers complain about should work just fine as well (only the first layer will be used, so whether the view contains one or more layers doesn't really matter).

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102467
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 2b2c61f0df5c18355b65772d21be36339ba5e1d9)
* vbo: fix offset in minmax cache key | Charmaine Lee | 2017-09-06 | 1 | -3/+5

  Instead of saving primitive offset in the minmax cache key, save the actual buffer offset which is used in the cache lookup.

  Fixes rendering artifact seen with GoogleEarth when run with VMware driver.

  v2: Per Brian's comment, initialize offset to avoid compiler warning.

  Cc: [email protected]
  Reviewed-by: Brian Paul <[email protected]>
  (cherry picked from commit 2d93b462b4d978b0da417b35a7470e336bc4e783)
  [Andres Gomez: resolve trivial conflicts]
  Signed-off-by: Andres Gomez <[email protected]>

  Conflicts: src/mesa/vbo/vbo_minmax_index.c

  Squashed with:

  vbo: fix build errors on android

  incompatible pointer to integer conversion assigning to 'GLintptr' (aka 'int') from 'const char *' [-Werror,-Wint-conversion]

      offset = indices;
             ^ ~~~~~~~

  Fixes: 2d93b462b4d ("vbo: fix offset in minmax cache key")
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit 0986f686328216fa201769c630372fd4b6f8877a)
* st/mesa: fix handling of vertex array double inputs | Ilia Mirkin | 2017-09-06 | 1 | -1/+3

  The is_double_vertex_input needs to be set for arrays of doubles as well.

  Fixes KHR-GL45.enhanced_layouts.varying_array_locations

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Cc: [email protected]
  (cherry picked from commit ae53bff8b13b433ca79904dfbda7264eb7188fa7)
* mesa: only copy requested compressed teximage cubemap faces | Christoph Haag | 2017-08-25 | 1 | -2/+2

  This is analogous to commit 2259b11, which only fixed the regular case.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102308
  Signed-off-by: Christoph Haag <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 87556a650ad363b41d86f4e25d5c4696f9af4550)
  [Andres Gomez: helpers had not yet been refactored]
  Signed-off-by: Andres Gomez <[email protected]>

  Conflicts: src/mesa/main/teximage.c
* i965: Stop looking at NewDriverState when emitting 3DSTATE_URB | Jason Ekstrand | 2017-08-25 | 3 | -5/+11

  Looking at NewDriverState is not safe in general. The state atom system is set up to ensure that new bits that get added to NewDriverState get accumulated into the set of bits used when emitting atoms but it doesn't go the other way. If we read NewDriverState, we may not get the full picture because the per-pipeline state (3D or compute) does not get added to NewDriverState before state emit is done. It's especially dangerous to do this from BLORP (either explicitly or implicitly when BLORP calls gen7_upload_urb) because that does not happen during one of the normal state upload paths.

  This commit solves the problem by whacking all of the per-shader-stage URB sizes to zero whenever we change the total URB size. We still have to flag BRW_NEW_URB_SIZE to ensure that the gen7_urb atom triggers but the actual decision in gen7_upload_urb can now be based entirely on URB sizes rather than on state atoms. This also makes BLORP correct because it just asks for a new URB config whenever the vsize is too small and so any change to the total URB size will trigger blorp to re-emit as well because 0 < vs_entry_size.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/102289
  Cc: [email protected]
  (cherry picked from commit d5e217dbfda2a87e149bdc8586c25143fc0ac183)
* i965: perf: minimize the chances to spread queries across batchbuffers | Lionel Landwerlin | 2017-08-25 | 1 | -0/+8

  Counters related to timings will be sensitive to any delay introduced by the software. In particular, if the begin and end of a performance query end up in different batches, time-related counters will exhibit bigger values caused by the time it takes for the kernel driver to load new requests into the hardware.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit adafe4b733c0242720ccfe10d391e5d44c0e7401)
* st/mesa: fix a null pointer access | Frank Richter | 2017-08-19 | 1 | -1/+1

  Fixes crash with llvmpipe on Windows.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102148
  Cc: [email protected]
  Reviewed-by: Brian Paul <[email protected]>
  (cherry picked from commit 496a691e3544d082670ac1f33059692510a2a86d)
* i965/blit: Remember to include miptree buffer offset in relocs | Chris Wilson | 2017-08-19 | 2 | -3/+3

  Remember to add the offset to the start of the buffer in the relocation or else we write 0xff into random bytes elsewhere.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Cc: [email protected]
  (cherry picked from commit fb63c43fd1b7adb5cb4f34e7616e7d564ca178e5)
  [Andres Gomez: resolve trivial conflicts]
  Signed-off-by: Andres Gomez <[email protected]>

  Conflicts: src/mesa/drivers/dri/i965/intel_pixel_bitmap.c
* i965: Delete pitch alignment assertion in get_blit_intratile_offset_el. | Kenneth Graunke | 2017-08-19 | 1 | -1/+0

  The cacheline alignment restriction is on the base address; the pitch can be anything.

  Fixes assertion failures when using primus (say, on glxgears, which creates a 300x300 linear BGRX surface with a pitch of 1200):

      intel_blit.c:190: get_blit_intratile_offset_el: Assertion `mt->surf.row_pitch % 64 == 0' failed.

  Cc: [email protected]
  Reviewed-by: Chris Wilson <[email protected]>
  (cherry picked from commit 595a47b8293b1d97a3ae7dbfa8db703bfb4e7aae)
  [Andres Gomez: resolve trivial conflicts]
  Signed-off-by: Andres Gomez <[email protected]>

  Conflicts: src/mesa/drivers/dri/i965/intel_blit.c
* i965: use strtol to convert the integer deviceID override | Emil Velikov | 2017-08-03 | 1 | -1/+1

  One can override the deviceID by setting the INTEL_DEVID_OVERRIDE variable. A few symbolic names or a numerical value for the actual device ID are accepted.

  At the same time we're using strtod (string to double) to convert the string to a decimal numeral. A seeming thinko, made by the original commit that introduced the code in libdrm_intel, which got here with the import.

  Fixes: 514db96c117a ("i965: Import libdrm_intel.")
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit 647b5a18df6e423e1a15d92bc767ba0cf04493a3)
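A minimal sketch of integer parsing with strtol, in the spirit of this change. The helper name, the choice of base 0, the 16-bit range check, and the -1 error value are all assumptions made for illustration; the commit message does not state how the driver calls strtol.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a numeric device ID override string.
 * Returns the ID, or -1 if the string is not a complete integer
 * in the 16-bit range assumed here for PCI device IDs. */
int parse_devid_override(const char *s)
{
    assert(s != NULL);

    char *end = NULL;
    errno = 0;
    /* Base 0 lets strtol accept "0x..." hex as well as decimal input. */
    long v = strtol(s, &end, 0);

    if (errno != 0 || end == s || *end != '\0' || v < 0 || v > 0xffff)
        return -1;
    return (int)v;
}
```

With base 0, `parse_devid_override("0x1916")` and `parse_devid_override("6422")` both yield 6422, while a symbolic name such as "skl" returns -1 so that a caller could fall back to a name lookup.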
* i965: Resolve framebuffers before signaling the fence | Chris Wilson | 2017-08-03 | 1 | -0/+32

  From KHR_fence_sync:

      When the condition of the sync object is satisfied by the fence command, the sync is signaled by the associated client API context, causing any eglClientWaitSyncKHR commands (see below) blocking on <sync> to unblock. The only condition currently supported is EGL_SYNC_PRIOR_COMMANDS_COMPLETE_KHR, which is satisfied by completion of the fence command corresponding to the sync object, and all preceding commands in the associated client API context's command stream. The sync object will not be signaled until all effects from these commands on the client API's internal and framebuffer state are fully realized. No other state is affected by execution of the fence command.

  If clients are passing the fence fd (from EGL_ANDROID_native_fence_sync) to a compositor, that fence must only be signaled once the framebuffer is resolved and not before as is currently the case.

  v2: fixup assert to use GL_SYNC_GPU_COMMANDS_COMPLETE (Chad)

  Reported-by: Sergi Granell <[email protected]>
  Fixes: c636284ee8ee ("i965/sync: Implement DRI2_Fence extension")
  Signed-off-by: Chris Wilson <[email protected]>
  Cc: Sergi Granell <[email protected]>
  Cc: Rob Clark <[email protected]>
  Cc: Chad Versace <[email protected]>
  Cc: Daniel Stone <[email protected]>
  Cc: Kenneth Graunke <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
  (cherry picked from commit 618be8cc1ad1760103930b69ffbf528d7b861ab3)
* i965: perf: flush batchbuffers at the beginning of queries | Lionel Landwerlin | 2017-08-03 | 1 | -0/+8

  As Chris commented, it makes more sense to have batch buffer flushes before the query. Usually applications like frame_retrace do a series of queries and in that case, with flushes at the end of the queries, we might still have the first query contained in 2 different batches. More generally it would be quite usual to have the query contained in 2 batch buffers because we never know what the fill rate of the current batch buffer is. If we move the flushing to the beginning of the queries, it's pretty much guaranteed that queries will be contained in a single batch buffer (unless the amount of commands is huge, but then it's only fair to include reloading request times in the measurements).

  Fixes: adafe4b733c02 ("i965: perf: minimize the chances to spread queries across batchbuffers")
  Reported-by: Chris Wilson <[email protected]>
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: "17.2 17.1" <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit 9f439ae1201cb049ffedb9b0e2d4f393fb0a761e)
* swrast: add dri2ConfigQueryExtension to the correct extension list | Emil Velikov | 2017-08-03 | 1 | -1/+1

  The extension should be in the list as returned by getExtensions(). Seems to have gone unnoticed since close to nobody wants to change the vblank mode for the software driver.

  v2: Rebase

  Cc: [email protected]
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]> (v1)
  (cherry picked from commit 7791949dadd5af707055d0076874177e5e8e2133)
  [Emil Velikov: drop st/dri hunk, squash correct swrast piece]
  Signed-off-by: Emil Velikov <[email protected]>
* st/mesa: always unconditionally revalidate main framebuffer after SwapBuffers | Marek Olšák | 2017-08-03 | 1 | -0/+10

  This fixes the black Feral launcher window.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101867
  Cc: 17.2 <[email protected]>
  Tested-by: Edmondo Tommasina <[email protected]>
  (cherry picked from commit 7257c171e9eadc05903140cffa26a253f0d0178a)
* mesa/main: Move NULL pointer check. | Plamena Manolova | 2017-07-12 | 1 | -6/+6

  In blit_framebuffer we're already doing a NULL pointer check for readFb and drawFb so it makes sense to do it before we actually use the pointers.

  CID: 1412569
  Signed-off-by: Plamena Manolova <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
  (cherry picked from commit b3b61211157ab934f1898d3519e7288c1fd89d80)
  [Andres Gomez: resolve trivial conflicts]
  Signed-off-by: Andres Gomez <[email protected]>
* st/mesa: release EGLImage on EGLImageTarget* error | Philipp Zabel | 2017-07-08 | 1 | -0/+1

  The smapi->get_egl_image() call in st_egl_image_get_surface() stores a reference to the EGLImage's texture in stimg.texture. That reference is released via pipe_resource_reference(&stimg.texture, NULL) before stimg goes out of scope at the end of the function, but not in the error path if !is_format_supported().

  Fixes: 83e9de25f325 ("st/mesa: EGLImageTarget* error handling")
  Cc: [email protected]
  Signed-off-by: Philipp Zabel <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  (cherry picked from commit 7d7bcd65d6019dfb63f31138a426fe2a043016db)
* i965: Always set AALINEDISTANCE_TRUE on Sandybridge. | Kenneth Graunke | 2017-07-08 | 1 | -2/+1

  We set this unconditionally on every other platform. Zero (Manhattan) isn't even listed as an option in the Sandybridge docs - only "true".

  Reviewed-by: Plamena Manolova <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  (cherry picked from commit 4878ab9bd4281c4554254bbb0c62faae453bb863)
* i965: Use true AA line distance on G45/Ironlake. | Kenneth Graunke | 2017-07-08 | 1 | -1/+1

  The original Broadwater and Crestline platforms computed antialiased line distances using "manhattan" distance, aka a + b = c. Eaglelake and Cantiga added "true" distance, which apparently does something like max(a, b) + min(a, b) / 4. Not exactly "true", but at least more accurate.

  The G45 documentation indicates that the old manhattan distance setting is "only for debug purposes" and should never be used. The Ironlake documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it does still contain the narrative about the feature. At any rate, we should use the more accurate mode.

  Reviewed-by: Plamena Manolova <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  (cherry picked from commit b625bcc601a16fab1962f9ed569700d3d08738b9)
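To make the two distance measures in this message concrete, here is a small C sketch. The second formula is the one the message itself hedges with "apparently does something like", so it should be read as an approximation of the hardware behaviour rather than a documented fact. The function names are hypothetical.

```c
#include <assert.h>

/* "Manhattan" distance: a + b. Inputs are assumed non-negative. */
double aa_dist_manhattan(double a, double b)
{
    assert(a >= 0.0 && b >= 0.0);
    return a + b;
}

/* Approximation quoted in the commit message:
 * max(a, b) + min(a, b) / 4. Inputs are assumed non-negative. */
double aa_dist_approx(double a, double b)
{
    assert(a >= 0.0 && b >= 0.0);
    double hi = a > b ? a : b;
    double lo = a > b ? b : a;
    return hi + lo / 4.0;
}
```

For a 3-4-5 right triangle the Euclidean length is 5. The manhattan formula gives 7 and the approximation gives 4.75, which is why the message calls the latter more accurate.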
* i965: update MaxTextureRectSize to match PRMs and comply with OpenGL 4.1+ | Iago Toral Quiroga | 2017-06-28 | 1 | -1/+1

  We were exposing 4096, but we can do up to 8192 in Gen4-6 and up to 16384 in gen7+. OpenGL 4.1+ requires at least 16384.

  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit b72b7c541dd81890e04652373f24840f580123ed)
* i915: Fix wpos_tex vs. -1 comparison | Ville Syrjälä | 2017-06-28 | 3 | -8/+7

  wpos_tex used to be a GLuint so assigning -1 to it and later comparing with -1 worked correctly, but commit c349031c27b7 ("i915: Fix texcoord vs. varying collision in fragment programs") changed wpos_tex to uint8_t and hence broke the comparison. To fix this define a more explicit invalid value for wpos_tex.

  gcc warns us:

      i915_fragprog.c:1255:57: warning: comparison is always true due to limited range of data type [-Wtype-limits]
      if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
                                                           ^

  And clang says:

      i915_fragprog.c:1255:57: warning: comparison of constant -1 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
      if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
                                               ~~~~~~~~~~~ ^  ~~

  Cc: Chih-Wei Huang <[email protected]>
  Cc: Eric Anholt <[email protected]>
  Cc: Ian Romanick <[email protected]>
  Cc: [email protected]
  Fixes: c349031c27b7 ("i915: Fix texcoord vs. varying collision in fragment programs")
  Signed-off-by: Ville Syrjälä <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit c1eedb43f32f6a3733f26e7918eb028f68bd60a4)

  Squashed with commit:

  i915: Always emit W on gen3

  Unlike the older gen2 hardware, gen3 performs perspective correct interpolation even for the primary/secondary colors. To do that it naturally needs us to emit W for the vertices.

  Currently we emit W only when at least one texture coordinate set gets emitted. This means the interpolation of color will change depending on whether texcoords/varyings are used or not. That's probably not what anyone would expect, so let's just always emit W to get consistent behaviour. Trying to avoid emitting W seems like more hassle than it's worth, especially as bspec seems to suggest that the hardware will perform the perspective division anyway.

  This used to be broken until it was accidentally fixed in commit c349031c27b7 ("i915: Fix texcoord vs. varying collision in fragment programs") by introducing a bug that made the driver always emit W. After fixing that bug in commit c1eedb43f32f ("i915: Fix wpos_tex vs. -1 comparison") we went back to the old behaviour and caused an apparent regression.

  Fixes: c1eedb43f32f ("i915: Fix wpos_tex vs. -1 comparison")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101451
  Signed-off-by: Ville Syrjälä <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  (cherry picked from commit 0eef03a6f2f7fa7968accaa2ab2c3d7431e984b8)
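The tautological comparison that gcc and clang flag above follows from C's integer promotions and can be reproduced in isolation. In this sketch the struct, the sentinel name `WPOS_TEX_INVALID`, its value 0xff, and the function names are hypothetical stand-ins; the message only says that "a more explicit invalid value" was defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel; the real name and value are not given above. */
#define WPOS_TEX_INVALID 0xff

struct prog {
    uint8_t wpos_tex;
};

/* Buggy check: the uint8_t field promotes to an int in 0..255,
 * so it can never compare equal to -1. */
bool wpos_tex_is_set_buggy(const struct prog *p)
{
    int promoted = p->wpos_tex;   /* what the comparison really sees */
    return promoted != -1;        /* always true */
}

/* Fixed check: compare against a value the field can actually hold. */
bool wpos_tex_is_set(const struct prog *p)
{
    return p->wpos_tex != WPOS_TEX_INVALID;
}

/* Hypothetical demonstration. */
void wpos_tex_demo(void)
{
    struct prog p = { .wpos_tex = (uint8_t)-1 };   /* stores 255 */

    assert(wpos_tex_is_set_buggy(&p) == true);     /* "unset" is missed */
    assert(wpos_tex_is_set(&p) == false);
}
```

Storing -1 into the 8-bit field yields 255, so only a comparison against a value in the 0..255 range can detect the unset state.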
* i965: Clamp clear colors to the representable range | Jason Ekstrand | 2017-06-28 | 1 | -0/+40

  Starting with Sky Lake, we can clear to arbitrary floats or integers. Unfortunately, the hardware isn't particularly smart when it comes to sampling from that clear color. If the clear color is out of range for the surface format, it will happily return whatever we put in the surface state packet unmodified. In order to avoid returning bogus values for surfaces with a limited range, we need to do some clamping.

  Cc: "17.1" <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
  (cherry picked from commit f1fa4be871e13c68b50685aaf64dc095b49ed0b5)
  [Andres Gomez: override_color still a gl_color_union]
  Signed-off-by: Andres Gomez <[email protected]>

  Conflicts: src/mesa/drivers/dri/i965/brw_meta_util.c
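As an illustration of what clamping a clear color to a format's representable range can look like, the sketch below handles three generic cases: unsigned normalized, signed normalized, and unsigned integer channels of a given bit width. It is not the code added by this commit; the function names and the set of cases are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* UNORM channels represent values in [0, 1]. NaN is passed through. */
float clamp_unorm(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* SNORM channels represent values in [-1, 1]. NaN is passed through. */
float clamp_snorm(float v)
{
    return v < -1.0f ? -1.0f : (v > 1.0f ? 1.0f : v);
}

/* Unsigned integer channels with `bits` bits hold 0 .. 2^bits - 1. */
uint32_t clamp_uint_bits(uint32_t v, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t max = bits == 32 ? UINT32_MAX : (UINT32_C(1) << bits) - 1;
    return v > max ? max : v;
}
```

For example, clearing an 8-bit unsigned integer channel to 300 would be clamped to 255, and a UNORM channel cleared to 2.5 would be clamped to 1.0.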
* i915: Fix gl_Fragcoord interpolation | Ville Syrjälä | 2017-06-28 | 5 | -16/+21

  gl_FragCoord contains the window coordinates so it seems to me that we should not use perspective correct interpolation for it. At least now I get similar output as i965/swrast/llvmpipe produce.

  This fixes dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_w. dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz was already passing, though I'm not quite sure how it managed to do that.

  v2: Add definitions for the S3 "wrap shortest" bits as well (Ian)

  Cc: [email protected]
  Reviewed-by: Ian Romanick <[email protected]>
  Signed-off-by: Ville Syrjälä <[email protected]>
  (cherry picked from commit 1c409fe4c144f11ce6c6a4548ac5c6ba37980058)
* st/mesa: fix pipe_rasterizer_state::scissor with multiple viewports | Marek Olšák | 2017-06-28 | 1 | -1/+1

  Cc: 17.1 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit 2ec1e32d11ed788dfed229a569a238743b9b1f9f)
* mesa: flush vertices before updating ctx->_Shader | Marek Olšák | 2017-06-28 | 1 | -2/+2

  Cc: 17.1 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 0b70d6ec568a2c5d7b2ff814e6e26b6d1379c829)
* mesa: flush vertices before changing viewports | Marek Olšák | 2017-06-28 | 1 | -2/+4

  Cc: 17.1 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit c8363eb0276c863100a457b18fee5ef900cf6f74)
* i965: Ignore anisotropic filtering in nearest mode. | Kenneth Graunke | 2017-06-28 | 1 | -2/+4

  This fixes both Europa Universalis IV and Stellaris rendering on i965. This was tested on SKL.

  This fix was discovered by Jakub Szuppe at Stream HPC (https://streamhpc.com/).

  bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96958
  bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95530
  Signed-off-by: Kenneth Graunke <[email protected]>
  Tested-by: Dylan Baker <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Cc: 17.1 <[email protected]>
  (cherry picked from commit 6a7c5257cac23cd9767aa4bc8fdab68925b11157)
* i965/gen4: Set depth offset when there is stencil attachment only | Topi Pohjolainen | 2017-06-28 | 1 | -0/+6

  Current version fails to set depthstencil.depth_offset when there is only stencil attachment (it does set the intra tile offsets though).

  Fixes piglits:

      g45,g965,ilk: depthstencil-render-miplevels 1024 s=z24_s8
      g45,ilk: depthstencil-render-miplevels 273 s=z24_s8

  CC: [email protected]
  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
  (cherry picked from commit 69672859814f36e9b8756b8f1c4655c49b9f6f4f)
* i965: Set step_rate = 0 for interleaved vertex buffers | Jason Ekstrand | 2017-06-28 | 1 | -0/+1

  Before, we weren't setting step rate so we got whatever old value happened to be lying around. This can lead to some interesting rendering errors. In particular, if you run the OpenGL ES CTS with dEQP-GLES3.functional.instanced.types.mat2x4 immediately followed by one of the dEQP-GLES3.functional.transform_feedback.* tests, the transform feedback test gets stale instancing data from the other test and fails.

  The only thing that is causing this to not be a problem today is that we use meta for clears and meta is setting up vertex buffers via the VBO or non-interleaved path and setting step_rate to 0 for us. When blorp depth/stencil clears are enabled, meta is no longer sitting between the two tests and the stale data starts causing noticeable problems.

  Cc: "17.1" <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
  (cherry picked from commit f762962f7ffd280ee1fd4280744800f73e133901)
* i965: Disable the interleaved vertex optimization when instancing | Jason Ekstrand | 2017-06-28 | 1 | -5/+6

  Instance divisor is a property of the vertex buffer and not the vertex element so if we ever see anything other than 0, bail.

  Cc: "17.1" <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
  (cherry picked from commit b3569e74451e3b913a2f3b327db430edbcd8f42e)
* i965: Do an end-of-pipe sync after flushes | Jason Ekstrand | 2017-06-28 | 1 | -3/+3

  According to the docs, a simple CS stall is insufficient to ensure that the memory from the flush is visible and an end-of-pipe sync is needed.

  Cc: "17.1" <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit d9261275cc1328d6a30e19b92db21df23adf7219)
* i965/blorp: Do an end-of-pipe sync around CCS ops | Jason Ekstrand | 2017-06-28 | 1 | -12/+4

  Cc: "17.1" <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit 314ec7b46ffa1640c0d9448e7752c2d7f6c18734)
* i965: Do an end-of-pipe sync prior to STATE_BASE_ADDRESS | Jason Ekstrand | 2017-06-28 | 1 | -6/+12

  Cc: "17.1" <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit 96e7b7ac54bd2220905656a0304eed2a753fceee)
* i965: Add an end-of-pipe sync helper | Topi Pohjolainen | 2017-06-28 | 2 | -1/+100

  v2 (Jason Ekstrand):
   - Take a flags parameter to control the flushes
   - Refactoring

  Cc: "17.1" <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit 7b607aae3fea4c7a3022641115aa01a05b434448)
* i965: Unify the two emit_pipe_control functions | Jason Ekstrand | 2017-06-28 | 1 | -73/+64

  These two functions contain almost identical logic except for one SNB workaround required for render target cache flushes. They may as well call into the same code so we only have to handle the work-arounds in one place.

  Cc: "17.1" <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit b771d9a136715fdf8ba0b478380e19b63f1e491b)
* i965: Take a uint64_t immediate in emit_pipe_control_write | Jason Ekstrand | 2017-06-28 | 5 | -18/+15

  It's a 64-bit value. Splitting it up just makes the function arguments awkward.

  Cc: "17.1" <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit a8ea68bc930f212dddf78a4e2073bcbd698b9140)
  [Andres Gomez: modified remaining uses of the new API]
  Signed-off-by: Andres Gomez <[email protected]>
* i965: Flush around state base address | Jason Ekstrand | 2017-06-28 | 2 | -1/+33

  Cc: "17.1" <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit 86da08367b90a5a4fef90723c97a988e73130389)
* i965: Mark depth surfaces as needing a HiZ resolve after blitting | Jason Ekstrand | 2017-06-14 | 1 | -0/+2

  Cc: "17.0 17.1" <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
  (cherry picked from commit 5097fcbfdc8dc5aab779af92022f9b5ff16026f0)
* i965: Perform HiZ flush/stall prior to HiZ resolves | Jason Ekstrand | 2017-06-14 | 1 | -13/+26

  Cc: "17.1" <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
  (cherry picked from commit acbd02450bfd53f61bbe468a6f0e8bf5e4507095)
* i965: Move the pre-depth-clear flush/stalls to intel_hiz_exec | Jason Ekstrand | 2017-06-14 | 2 | -56/+58

  Cc: "17.1" <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
  (cherry picked from commit acb9a2ef8f5d92002ed7eb7676c4a96db661ba3a)
* i965/blorp: Take a layer range in intel_hiz_exec | Jason Ekstrand | 2017-06-14 | 5 | -18/+16

  Cc: "17.1" <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
  (cherry picked from commit 252b004a51d951391846ec5644abe88bfffb72bd)
* st/mesa: don't load cached TGSI shaders on demand | Marek Olšák | 2017-06-14 | 1 | -1/+6

  This fixes a performance issue with the shader cache that delayed Gallium shader create calls until draw calls.

  I'd like this in stable, but it's not a showstopper.

  Cc: 17.1 <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 2ec50f98a9be9ee94aa0dd82fb7560c00153b03f)
* xlib: fix glXGetCurrentDisplay() failure | Brian Paul | 2017-06-14 | 4 | -5/+18

  glXGetCurrentDisplay() has been broken for years and nobody noticed until recently. This change adds a new XMesaGetCurrentDisplay() that the GLX emulation API can call, just as we did for glXGetCurrentContext().

  Tested by hacking glxgears to call glXGetCurrentContext() before and after glXMakeCurrent() to verify the return value is NULL beforehand and the same as the opened display afterward. Also tested by Tom Hudson with his test programs.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100988
  Cc: [email protected]
  Tested-by: Tom Hudson <[email protected]>
  Signed-off-by: Brian Paul <[email protected]>
  (cherry picked from commit c6ba85a8c0f02b3b7058dae7afb6c49f56567319)
* automake: Link all libGL.so variants with -Bsymbolic. | Jose Fonseca | 2017-06-14 | 1 | -0/+1

  We were linking src/glx with -Bsymbolic, but not the classic/gallium X11 libGL.so.

  But it's always a good idea to build all libGL.so and all DRI drivers with -Bsymbolic, otherwise they might resolve symbols from the 3rd party application executable or shared libraries, which is _never_ what we want.

  In particular, this can happen when intercepting OpenGL calls with apitrace, before https://github.com/apitrace/apitrace/commit/63194b2573176ef34efce1a5c8b08e624b8dddf5

  Cc: [email protected]
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit ce5e83b8a0c757072075e781a090d35d9dc0e285)
* i965/dri: Fix bad GL error in intel_create_winsys_renderbuffer() | Chad Versace | 2017-06-14 | 1 | -5/+1

  This function never occurs in the callchain of a GL function. It occurs only in the callchain of eglCreate*Surface and the analogous paths for GLX. Therefore, even if a thread does have a bound GL context, emitting a GL error here is wrong. A misplaced GL error, when no GL call is made, can confuse clients.

  Cc: [email protected]
  Reviewed-by: Ian Romanick <[email protected]>
  (cherry picked from commit 9d996e94fbbfdb3692061009f5441cf61bba36ae)
  [Emil Velikov: resolve trivial conflicts]
  Signed-off-by: Emil Velikov <[email protected]>

  Conflicts: src/mesa/drivers/dri/i965/intel_fbo.c
* i965: Rework Sandy Bridge HiZ and stencil layouts | Jason Ekstrand | 2017-06-02 | 5 | -29/+134

  Sandy Bridge does not technically support mipmapped depth/stencil. In order to work around this, we allocate what are effectively completely separate images for each miplevel, ensure that they are page-aligned, and manually offset to them. Prior to layered rendering, this was a simple matter of setting a large enough halign/valign.

  With the advent of layered rendering, however, things got more complicated. Now, things weren't as simple as just handing a surface off to the hardware. Any miplevel of a normally mipmapped surface can be considered as just an array surface given the right qpitch. However, the hardware gives us no capability to specify qpitch so this won't work. Instead, the chosen solution was to use a new "all slices at each LOD" layout which laid things out as a mipmap of arrays rather than an array of mipmaps. This way you can easily offset to any of the miplevels and each is a valid array.

  Unfortunately, the "all slices at each lod" concept missed one fundamental thing about SNB HiZ and stencil hardware: It doesn't just always act as if you're always working with a non-mipmapped surface, it acts as if you're always working on a non-mipmapped surface of the same size as LOD0. In other words, even though it may only write the upper-left corner of each array slice, the qpitch for the array is for a surface the size of LOD0 of the depth surface. This mistake causes us to under-allocate HiZ and stencil in some cases and also to accidentally allow different miplevels to overlap. Sadly, piglit test coverage didn't quite catch this until I started making changes to the resolve code that caused additional HiZ resolves in certain tests.

  This commit switches Sandy Bridge HiZ and stencil over to a new scheme that lays out the non-zero miplevels horizontally below LOD0. This way they can all have the same qpitch without interfering with each other. Technically, the miplevels still overlap, but things are spaced out enough that each page is only in the "written area" of one LOD.

  Cc: "17.0 17.1" <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
  (cherry picked from commit 10903d228919085cdb160c563c481ed1cc09e34c)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* r100: Use _mesa_get_format_base_format in radeon_update_wrapper | Ian Romanick | 2017-06-01 | 1 | -1/+1

  The wrapper is for a renderbuffer around a texture. Textures can have formats (e.g., 3) that aren't valid for API generated renderbuffers. _mesa_base_fbo_format will return 0, but _mesa_get_format_base_format will return the base format of RGB.

  Fixes crashes in piglit tests fbo-alphatest-formats (all subtests pass) and fbo-colormask-formats (some subtests pass, some fail).

  Signed-off-by: Ian Romanick <[email protected]>
  Cc: [email protected]
  Reviewed-by: Kenneth Graunke <[email protected]>
  (cherry picked from commit 303b47f253f595ca0f708bef1059cbb4996f83a0)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>