path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: Disable the unlit centroid workaround on Gen7. | Matt Turner | 2016-08-02 | 1 | -3/+0
    Once upon a time (commit 8313f44409) Paul added code for the unlit centroid workaround (WaCopyUnlitCentroidBarys). His commit message claims it fixed the EXT_framebuffer_multisample/interpolation {2,4} {centroid-deriv,centroid-deriv-disabled} piglit tests but does not say on which platform, though he cites the IVB PRM.

    "3DSTATE_WM [DevIVB, DevHSW]" says

        "[DevIVB]: Workaround: When Centroid Barycentric mode is required, HW may produce incorrect interpolation results when a 2X2 pixels have unlit pixels."

    I later disabled it for Haswell (commit f6db414f3c) with no known ill effects. The Sandybridge page does not have this text, but the workarounds database (see WaCopyUnlitCentroidBarys) says the issue applies *only* to Sandybridge, and in fact in commit 1a2de7dce8fc I note that disabling the workaround on Sandybridge causes the tests Paul originally mentioned to fail.

    So this is, and always has been, a huge confusing mess. Disabling the workaround indeed causes the tests Paul originally mentioned to fail on Sandybridge but not on Ivybridge/Baytrail.

    On Ivybridge:

    total instructions in shared programs: 6914901 -> 6909599 (-0.08%)
    instructions in affected programs: 106766 -> 101464 (-4.97%)
    helped: 884

    total cycles in shared programs: 70874764 -> 70813774 (-0.09%)
    cycles in affected programs: 794144 -> 733154 (-7.68%)
    helped: 688
    HURT: 186

    LOST: 1
    GAINED: 6

    Reviewed-by: Kenneth Graunke <[email protected]>
* i915: Avoid aliasing violation. | Matt Turner | 2016-08-01 | 1 | -1/+3
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl_to_tgsi: Avoid aliasing violations. | Matt Turner | 2016-08-01 | 1 | -4/+2
    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: silence missing braces warning in st_program.c | Brian Paul | 2016-08-01 | 1 | -1/+1
    Silence a gcc warning:

    state_tracker/st_program.c: In function 'st_create_fp_variant':
    state_tracker/st_program.c:957:10: warning: missing braces around initializer [-Wmissing-braces]
       nir_lower_drawpixels_options options = {0};
              ^
    state_tracker/st_program.c:957:10: warning: (near initialization for 'options.texcoord_state_tokens') [-Wmissing-braces]

    Reviewed-by: Marek Olšák <[email protected]>
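The warning above comes from initializing a struct whose first member is itself an aggregate with a single-level `{0}`. A minimal sketch of one common way to silence it follows. The struct layout, array sizes, and all `_sketch` names are assumptions made for illustration; only the member name `texcoord_state_tokens` is taken from the warning text, and the log does not say which fix the commit actually used.

```c
#include <assert.h>

/* Hypothetical stand-in for a struct whose first member is an array.
 * The real nir_lower_drawpixels_options layout is not shown in this log. */
struct drawpixels_options_sketch {
   int texcoord_state_tokens[5];   /* first member is an aggregate */
   int scale_state_tokens[5];
   unsigned pixel_maps;
};

static struct drawpixels_options_sketch
make_zeroed_options_sketch(void)
{
   /* The inner braces initialize the array member explicitly, which is what
    * -Wmissing-braces asks for. Members without an initializer are still
    * zero-initialized, so the result matches the `{0}` form. */
   struct drawpixels_options_sketch options = {{0}};
   return options;
}
```

Both spellings produce an all-zero object; the difference is only whether the compiler considers the brace nesting complete.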
* i965: fix comparison warning | Timothy Arceri | 2016-08-01 | 1 | -1/+1
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Remove set but not used gl_client_array::Stride. | Mathias Fröhlich | 2016-07-31 | 8 | -9/+1
    The field is only read for printing today and there it was probably a leftover.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* mesa: Remove set but not used gl_client_array::Enabled. | Mathias Fröhlich | 2016-07-31 | 8 | -10/+2
    The way it is used today does not care about the Enabled flag anymore.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* vbo: Use the VAO array enabled flags in vbo_exec_array. | Mathias Fröhlich | 2016-07-31 | 1 | -7/+8
    Instead of gl_client_array::Enabled inside a VAO, directly use the gl_vertex_attrib_array::Enabled value which is the origin of the above.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* vbo: Walk the VAO in check_array_data. | Mathias Fröhlich | 2016-07-31 | 1 | -20/+29
    Only a debugging function, but move away from gl_client_array and use the first order information from the VAO.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* vbo: Walk the VAO in print_draw_arrays. | Mathias Fröhlich | 2016-07-31 | 1 | -20/+20
    Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Also make use of gl_vert_attrib_name.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* mesa: Walk the VAO in _mesa_print_arrays. | Mathias Fröhlich | 2016-07-31 | 1 | -32/+20
    Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Also make use of gl_vert_attrib_name.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* vbo: Walk the VAO to check for mapped buffers. | Mathias Fröhlich | 2016-07-31 | 1 | -10/+23
    Similarly to _mesa_all_varyings_in_vbos, walk the VAO to check if we have an illegal mapped buffer object instead of walking all gl_client_arrays.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* vbo: Walk the VAO to see if all varyings are in vbos. | Mathias Fröhlich | 2016-07-31 | 1 | -2/+2
    In vbo_draw_transform_feedback we currently look at exec->array.inputs to determine if all varying vertex attributes reside in vbos. But the vbo_bind_arrays call only happens after the vbo_all_varyings_in_vbos query. Thus we may work on a stale set of client arrays. Using the current VAO's content for this query feels much more logical to me.

    Additionally, with this change mesa makes more use of the information already tracked in the VAO instead of looping across VERT_ATTRIB_MAX vertex arrays.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* mesa: Implement _mesa_all_varyings_in_vbos. | Mathias Fröhlich | 2016-07-31 | 2 | -0/+39
    Implement the equivalent of vbo_all_varyings_in_vbos for vertex array objects.

    v2: Update comment.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* mesa: Unbind deleted vbo using _mesa_bind_vertex_buffer. | Mathias Fröhlich | 2016-07-31 | 1 | -4/+7
    When a vertex buffer object gets deleted, it is unbound at the VAO. To do this, use _mesa_bind_vertex_buffer instead of simply unreferencing the buffer object. This keeps the VAO's internal state consistent. In this case the problem showed up as gl_vertex_array_object::VertexAttribBufferMask getting out of sync.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* mesa: remove dd_function_table::UseProgram | Marek Olšák | 2016-07-30 | 3 | -10/+0
    finally unused

    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: update sampler states when shaders are changed | Marek Olšák | 2016-07-30 | 1 | -6/+12
    This bug seems to have always been there. Applications changing shaders but not textures between draw calls would have gotten undefined behavior.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: don't dirty sample shading on _NEW_PROGRAM | Marek Olšák | 2016-07-30 | 1 | -2/+1
    Already done as part of ST_NEW_FRAGMENT_PROGRAM in st_validate_state.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: remove excessive shader state dirtying | Marek Olšák | 2016-07-30 | 7 | -57/+33
    This just needs to be done by st_validate_state.

    v2: add "shaders_may_be_dirty" flags for not skipping st_validate_state on _NEW_PROGRAM to detect real shader changes

    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: unreference optional shaders when unbinding | Marek Olšák | 2016-07-30 | 1 | -0/+4
    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: skip updates of states that have no effect | Marek Olšák | 2016-07-30 | 2 | -9/+28
    v2:
    - also don't check edge flags for GLES

    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: completely rewrite state atoms | Marek Olšák | 2016-07-30 | 33 | -516/+381
    The goal is to do this in st_validate_state:

       while (dirty)
          atoms[u_bit_scan(&dirty)]->update(st);

    That implies that atoms can't specify which flags they consume. There is exactly one ST_NEW_* flag for each atom. (58 flags in total)

    There are macros that combine multiple flags into one for easier use.

    All _NEW_* flags are translated into ST_NEW_* flags in st_invalidate_state. st/mesa doesn't keep the _NEW_* flags after that.

    torcs is 2% faster between the previous patch and the end of this series.

    v2:
    - add st_atom_list.h to Makefile.sources

    Reviewed-by: Nicolai Hähnle <[email protected]>
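The dispatch loop quoted in the message above can be sketched in isolation. This is only an illustration under stated assumptions: `bit_scan_sketch` is a local stand-in for `u_bit_scan`, there are three atoms instead of 58, and every type and function name with a `_sketch` suffix is hypothetical rather than taken from st/mesa.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical context that records how often each atom was updated. */
struct st_sketch {
   int update_count[3];
};

/* Hypothetical atom: one update callback per dirty bit. */
struct atom_sketch {
   void (*update)(struct st_sketch *st);
};

static void update_atom0(struct st_sketch *st) { st->update_count[0]++; }
static void update_atom1(struct st_sketch *st) { st->update_count[1]++; }
static void update_atom2(struct st_sketch *st) { st->update_count[2]++; }

/* Atom i is driven by dirty bit i, so bit index and array index coincide. */
static const struct atom_sketch atoms_sketch[3] = {
   { update_atom0 }, { update_atom1 }, { update_atom2 },
};

/* Stand-in for u_bit_scan: return the index of the lowest set bit and
 * clear it. The caller must pass a nonzero mask. */
static int bit_scan_sketch(uint64_t *mask)
{
   int i = 0;
   while (((*mask >> i) & 1) == 0)
      i++;
   *mask &= ~(UINT64_C(1) << i);
   return i;
}

/* Run exactly the atoms whose bits are set. Only bits 0..2 may be set
 * here, because this sketch has three atoms. */
static void validate_state_sketch(struct st_sketch *st, uint64_t dirty)
{
   while (dirty)
      atoms_sketch[bit_scan_sketch(&dirty)].update(st);
}
```

With one flag per atom, the loop touches only the atoms that are dirty and needs no per-atom flag comparison, which is the property the message describes.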
* st/mesa: remove st_tracked_state::name | Marek Olšák | 2016-07-30 | 20 | -58/+0
    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: remove atom debugging code | Marek Olšák | 2016-07-30 | 1 | -67/+3
    This won't be needed after the rewrite.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: Fix move_interpolation_to_top() pass. | Kenneth Graunke | 2016-07-29 | 1 | -21/+29
    The pass I introduced in commit a2dc11a7818c04d8dc0324e8fcba98d60bae was entirely broken. A missing "break" made the load_interpolated_input case always fall through to "default" and hit a "continue", making it not actually move any load_interpolated_input intrinsics at all. It would only move the simple load_barycentric_* intrinsics, which don't emit any code anyway, making it basically useless.

    The initial version I sent of the pass worked, but I apparently failed to verify that the simplified version in v2 actually worked.

    With the obvious fix applied (so we actually tried to move load_interpolated_input intrinsics), I discovered a second bug: we weren't moving the offset SSA def to the top, breaking SSA validation.

    The new version of the pass actually moves load_interpolated_input intrinsics and all their dependencies, as intended.

    Papers over GPU hangs on Ivybridge and Baytrail caused by the recent NIR FS input rework by restoring the old behavior. (I'm not honestly sure why they hang with PLN not at the top.)

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97083
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
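The first bug described above, a missing "break" that lets one case fall into "default" and hit "continue", can be reproduced in a few lines. The sketch below is not the actual pass: the enum, the counting functions, and the exact arrangement of cases are assumptions chosen only to show the control flow the message describes.

```c
#include <assert.h>

/* Hypothetical intrinsic kinds standing in for the NIR intrinsics. */
enum intrin_sketch {
   LOAD_BARYCENTRIC_SKETCH,
   LOAD_INTERPOLATED_INPUT_SKETCH,
   OTHER_SKETCH
};

/* Buggy shape: the interpolated-input case has no "break", so it falls
 * into "default" and is skipped by the "continue". */
static int count_moved_buggy(const enum intrin_sketch *ins, int n)
{
   int moved = 0;
   for (int i = 0; i < n; i++) {
      switch (ins[i]) {
      case LOAD_BARYCENTRIC_SKETCH:
         break;
      case LOAD_INTERPOLATED_INPUT_SKETCH:
         /* fall through (this is the bug) */
      default:
         continue;
      }
      moved++;   /* stands in for "move this instruction to the top" */
   }
   return moved;
}

/* Fixed shape: both interesting cases leave the switch and get moved. */
static int count_moved_fixed(const enum intrin_sketch *ins, int n)
{
   int moved = 0;
   for (int i = 0; i < n; i++) {
      switch (ins[i]) {
      case LOAD_BARYCENTRIC_SKETCH:
      case LOAD_INTERPOLATED_INPUT_SKETCH:
         break;
      default:
         continue;
      }
      moved++;
   }
   return moved;
}
```

Because "continue" inside a switch applies to the enclosing loop, the fall-through silently skips the rest of the loop body, which is why the buggy pass appeared to run while moving none of the instructions that mattered.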
* st_glsl_to_tgsi: only skip over slots of an input array that are present | Nicolai Hähnle | 2016-07-28 | 1 | -1/+5
    When an application declares varying arrays but does not actually do any indirect indexing, some array indices may end up unused in the consuming shader, so the number of input slots that correspond to the array ends up less than the array_size.

    Cc: [email protected]
    Reviewed-by: Marek Olšák <[email protected]>
* i965: remove unnecessary null check | Timothy Arceri | 2016-07-28 | 1 | -4/+1
    We would have hit a segfault already if this could be null. Fixes Coverity warning spotted by Matt.

    Reviewed-by: Matt Turner <[email protected]>
* vbo: Fix handling of POS/GENERIC0 attributes. | Mathias Fröhlich | 2016-07-27 | 1 | -3/+16
    In case of split primitives we need to restore the original setting of the vtx.attrsz array to make immediate mode attribute array tracking work.

    v2: Use bool instead of boolean.

    Signed-off-by: Mathias Fröhlich <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Tested-by: Brian Paul <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96950
* mesa: standardize naming Mesa3D, MESA -> Mesa | Vedran Miletić | 2016-07-26 | 1 | -1/+1
    Signed-off-by: Vedran Miletić <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* mesa: Make MESA_SHADER_CAPTURE_PATH skip shaders with Name == -1. | Kenneth Graunke | 2016-07-26 | 1 | -1/+1
    Shaders with shProg->Name == ~0 (aka 4294967295) are internal meta shaders that we don't really want to capture.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
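The subject says "-1" while the body says "~0 (aka 4294967295)"; for a 32-bit unsigned name these are the same value. A small sketch of the check, assuming the name is a 32-bit unsigned integer; `should_capture_sketch` is a hypothetical helper, not a Mesa function, and any other conditions the real code applies are not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filter: capture every program except those whose name is
 * the all-ones sentinel used for internal meta shaders. */
static bool should_capture_sketch(uint32_t name)
{
   /* ~0 is the int value -1; converting it to uint32_t yields
    * 4294967295, i.e. all 32 bits set. */
   return name != (uint32_t)~0;
}
```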
* mesa: Avoid aliasing violation in uniform_query.cpp. | Matt Turner | 2016-07-26 | 1 | -14/+31
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: Avoid aliasing violation in FXT1. | Matt Turner | 2016-07-26 | 1 | -2/+2
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* swrast: Avoid aliasing violation. | Matt Turner | 2016-07-26 | 1 | -2/+2
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Separate overlapping sentinel nodes in exec_list. | Matt Turner | 2016-07-26 | 3 | -3/+3
    I do appreciate the cleverness, but unfortunately it prevents a lot more cleverness in the form of additional compiler optimizations brought on by -fstrict-aliasing.

    No difference in OglBatch7 (n=20).

    Co-authored-by: Davin McCall <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
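A doubly linked list with two separate sentinel nodes, the layout this commit's subject points to, can be sketched as follows. The struct and function names are hypothetical and the real exec_list API is not reproduced here; the point is only that each sentinel is a complete node of the same type as every other node, so no pointer field is ever read through a differently typed, overlapping view.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node: plain next/prev links. */
struct node_sketch {
   struct node_sketch *next;
   struct node_sketch *prev;
};

/* Two full sentinel nodes instead of three pointers that double as the
 * fields of two overlapping nodes. */
struct list_sketch {
   struct node_sketch head_sentinel;
   struct node_sketch tail_sentinel;
};

static void list_init_sketch(struct list_sketch *l)
{
   l->head_sentinel.next = &l->tail_sentinel;
   l->head_sentinel.prev = NULL;
   l->tail_sentinel.next = NULL;
   l->tail_sentinel.prev = &l->head_sentinel;
}

/* Link n in just before the tail sentinel. */
static void list_push_tail_sketch(struct list_sketch *l, struct node_sketch *n)
{
   n->next = &l->tail_sentinel;
   n->prev = l->tail_sentinel.prev;
   n->prev->next = n;
   l->tail_sentinel.prev = n;
}

/* Count real nodes; the tail sentinel is the only node whose next is NULL. */
static int list_length_sketch(const struct list_sketch *l)
{
   int count = 0;
   for (const struct node_sketch *n = l->head_sentinel.next;
        n->next != NULL; n = n->next)
      count++;
   return count;
}
```

The cost of this layout is one extra pointer per list compared with the overlapping form; the message reports no measurable difference in OglBatch7.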
* i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations | Jason Ekstrand | 2016-07-26 | 1 | -17/+2
    intel_mipmap_tree::logical_depth0 is now in number of 2D slices so we no longer need to be multiplying by 6.

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
    Cc: "12.0" <[email protected]>
* i965/miptree/isl: Stop multiplying depth by 6 for cubes | Jason Ekstrand | 2016-07-26 | 1 | -5/+0
    Now that the logical_depth0 field is in number of 2D slices, we don't need to be multiplying by 6 when creating the surface. It wasn't hurting anything primarily because we get the actual length from the view which was already handling it correctly.

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* i965/blorp/gen8: Stop multiplying depth by 6 for cubes | Jason Ekstrand | 2016-07-26 | 1 | -4/+1
    intel_mipmap_tree::logical_depth0 is now in 2-D slices so there is no need for us to multiply by 6 when we go to fill out a blorp surface state.

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* main: memcpy larger chunks in _mesa_propagate_uniforms_to_driver_storage | Nils Wallménius | 2016-07-25 | 1 | -6/+23
    When possible, do the memcpy on larger blocks. This reduces cycles spent in _mesa_propagate_uniforms_to_driver_storage from 1.51% to 0.62% according to perf during the Unigine Heaven benchmark. It did not affect the framerate of the benchmark. The system used for testing was an i5 6600K with a Radeon R9 380.

    Piglit hangs randomly on this system both with and without the patch, so I could not make a comparison.

    v2: fixed whitespace

    Signed-off-by: Nils Wallménius <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
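The idea of copying in larger blocks when possible can be sketched with a strided copy that collapses into a single memcpy when source and destination are both tightly packed. This is an assumed illustration of the general technique, not the code from the commit; the function name, its parameters, and the exact condition for taking the fast path are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `count` elements of `elem_size` bytes each. Consecutive elements
 * start `src_stride` / `dst_stride` bytes apart in their buffers. */
static void copy_elements_sketch(uint8_t *dst, size_t dst_stride,
                                 const uint8_t *src, size_t src_stride,
                                 size_t elem_size, size_t count)
{
   if (dst_stride == elem_size && src_stride == elem_size) {
      /* Both sides are contiguous: one large copy replaces `count`
       * small ones. */
      memcpy(dst, src, elem_size * count);
      return;
   }

   /* Otherwise copy element by element, honoring the strides. */
   for (size_t i = 0; i < count; i++)
      memcpy(dst + i * dst_stride, src + i * src_stride, elem_size);
}
```

Both paths write the same bytes; the fast path only reduces the number of memcpy calls, which is consistent with the message reporting fewer cycles but an unchanged framerate.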
* glsl: reuse main extension table to appropriately restrict extensions | Ilia Mirkin | 2016-07-23 | 5 | -26/+62
    Previously we were only restricting based on ES/non-ES-ness and whether the overall enable bit had been flipped on. However we have been adding more fine-grained restrictions, such as based on compat profiles, as well as specific ES versions. Most of the time this doesn't matter, but it can create awkward situations and duplication of logic.

    Here we separate the main extension table into a separate object file, linked to the glsl compiler, which makes use of it with a custom function which takes the ES-ness of the shader into account (thus allowing desktop shaders to properly use ES extensions that would otherwise have been disallowed.)

    We can also now use this logic to generate #define's for all supported extensions automatically, removing the duplicate (and often inaccurate) list in glcpp.

    The effect of this change should be nil in most cases. However in some situations, extensions like GL_ARB_gpu_shader5 which were formerly available in compat contexts on the GLSL side of things will now become inaccessible.

    This regresses two ES CTS tests:

      ES3-CTS.shaders.shader_integer_mix.define
      ES31-CTS.shader_integer_mix.define

    however that is due to them using #version 100 instead of 300 es. As the extension is only defined for ES3, I believe this is the correct behavior.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]> (v2)

    v2 -> v3: integrate glcpp defines into the same mechanism
* gallium: split transfer_inline_write into buffer and texture callbacks | Marek Olšák | 2016-07-23 | 2 | -9/+6
    to reduce the call indirections with u_resource_vtbl.

    The worst call tree you could get was:
      - u_transfer_inline_write_vtbl
        - u_default_transfer_inline_write
          - u_transfer_map_vtbl
            - driver_transfer_map
          - u_transfer_unmap_vtbl
            - driver_transfer_unmap

    That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites.

    The new interface is:
      pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
      pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride)

    v2: fix whitespace, correct ilo's behavior

    Reviewed-by: Nicolai Hähnle <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
* mesa: Don't call GenerateMipmap if Width or Height == 0. | Kenneth Graunke | 2016-07-22 | 1 | -0/+5
    One of the WebGL 2.0 conformance tests is trying to call glGenerateMipmap with a width and height of 0. With the meta implementation, this generates a "framebuffer attachment incomplete" status, and falls back to the CPU path, calling MapTextureImage. Except that there's no actual texture to map, and we hit an assertion failure.

    There's no work to do in this case. The test expects it to succeed, so just return early with no error and avoid hassling the driver.

    Cc: [email protected]
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
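The early return described above amounts to a guard placed before the driver hook is invoked. A minimal sketch, assuming a hypothetical base-image struct and a counter that stands in for the driver callback; none of these names come from Mesa.

```c
#include <assert.h>

/* Hypothetical description of a texture's base level. */
struct image_sketch {
   unsigned width;
   unsigned height;
};

/* Counts calls so the guard's effect can be observed. */
static int driver_calls_sketch = 0;

/* Stand-in for the driver's mipmap generation hook. */
static void driver_generate_mipmap_sketch(const struct image_sketch *img)
{
   (void)img;
   driver_calls_sketch++;
}

static void generate_mipmap_sketch(const struct image_sketch *base)
{
   /* A zero-sized base level has no texels, so there are no mip levels
    * to produce: succeed silently and leave the driver alone. */
   if (base->width == 0 || base->height == 0)
      return;

   driver_generate_mipmap_sketch(base);
}
```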
* i965: Get rid of the do_lower_unnormalized_offsets pass | Jason Ekstrand | 2016-07-22 | 4 | -109/+0
    We can do this in NIR now. No need to keep a GLSL pass lying around for it.

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: "12.0" <[email protected]>
* i965/nir: Enable NIR lowering of txf and rect offsets | Jason Ekstrand | 2016-07-22 | 1 | -0/+2
    This fixes the following piglit tests on gen6+:

      tex-miplevel-selection textureProjGradOffset 2DRect
      tex-miplevel-selection textureGradOffset 2DRect
      tex-miplevel-selection textureGradOffset 2DRectShadow
      tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4
      tex-miplevel-selection textureProjGradOffset 2DRectShadow

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: "12.0" <[email protected]>
* gallium: add PIPE_FLUSH_DEFERRED | Marek Olšák | 2016-07-22 | 1 | -1/+1
    There are 2 uses:
    - Asynchronous flushing for multithreaded drivers.
    - Return a fence without flushing (mid-command-buffer fence). The driver can defer flushing until fence_finish is called.

    This is required to make Bioshock Infinite faster, which creates 1000 fences (flushes) per frame.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* i965: fix varying output setup | Timothy Arceri | 2016-07-23 | 1 | -1/+1
    Since 7f53fead5c we treat every location as using all four components, so we only need special handling for doubles when they cross multiple locations.

    This fixes a crash in GL45-CTS.enhanced_layouts.varying_locations where the outputs array would overflow when a dmat2 was stored at the max varying location, i.e. 30.

    Reviewed-by: Iago Toral Quiroga <[email protected]>
* mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats. | Kenneth Graunke | 2016-07-21 | 1 | -0/+5
    The GL_EXT_texture_format_BGRA8888 extension specification defines a GL_BGRA_EXT unsized internal format (which is a little odd - usually BGRA is a pixel transfer format). The extension is written against the ES 1.0 specification, so it's a little hard to map, but I believe it's effectively adding it to the table used here, so we should allow it here as well.

    Note that GL_EXT_texture_format_BGRA8888 is always enabled (dummy_true), so we don't need to check if it's enabled here.

    This fixes mipmap generation in Skia and ChromeOS.

    Signed-off-by: Kenneth Graunke <[email protected]>
    References: https://bugs.chromium.org/p/chromium/issues/detail?id=630371
    Reviewed-by: Ian Romanick <[email protected]>
    Reported-by: Stéphane Marchesin <[email protected]>
    Cc: [email protected]
* i965: Fix "operation operation" in comment. | Kenneth Graunke | 2016-07-21 | 1 | -1/+1
    From the redundant redundant department.

    Reported-by: Michael Schellenberger Costa <[email protected]>
* i965: Fix shared atomic intrinsics to pay attention to base. | Kenneth Graunke | 2016-07-21 | 1 | -1/+12
    Cc: "12.0" <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* i965: Include VUE handles for GS with invocations > 1. | Kenneth Graunke | 2016-07-21 | 1 | -1/+1
    We always resort to the pull model for instanced GS inputs. So, we'd better include the VUE handles, or else we can't actually pull anything.

    Ian reports that on his branch with OES_geometry_shader enabled, this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests:

    - instanced.draw_2_instances_geometry_2_invocations
    - instanced.draw_2_instances_geometry_8_invocations
    - instanced.draw_4_instances_geometry_2_invocations
    - instanced.draw_4_instances_geometry_8_invocations
    - instanced.draw_8_instances_geometry_2_invocations
    - instanced.draw_8_instances_geometry_8_invocations
    - instanced.geometry_2_invocations
    - instanced.geometry_32_invocations
    - instanced.geometry_8_invocations
    - instanced.geometry_max_invocations
    - instanced.geometry_output_different_2_invocations
    - instanced.geometry_output_different_32_invocations
    - instanced.geometry_output_different_8_invocations
    - instanced.geometry_output_different_max_invocations
    - instanced.invocation_output_vary_by_attribute
    - instanced.invocation_output_vary_by_texture
    - instanced.invocation_output_vary_by_uniform
    - query.primitives_generated_instanced

    Cc: [email protected]
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Tested-by: Ian Romanick <[email protected]>
* i965: print error messages if gs fails to compile | Timothy Arceri | 2016-07-21 | 1 | -0/+6
    We do this for all other stages.

    Reviewed-by: Kenneth Graunke <[email protected]>