path: root/src
Commit message (author, date; files changed, lines removed/added)
* i965: stop passing stage as a function parameter (Timothy Arceri, 2016-09-26; 1 file, -5/+3)
    We already pass the shader so we can just get the stage from this.
    Reviewed-by: Jason Ekstrand <[email protected]>
* aubinator: fix resource leak (Nayan Deshmukh, 2016-09-25; 1 file, -1/+3)
    CovID: 1373370
    Signed-off-by: Nayan Deshmukh <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* osmesa: Unbind the current context when given a null context and buffer. (Emilio Cobos Álvarez, 2016-09-23; 1 file, -0/+7)
    This is needed to be consistent with other drivers.
    Signed-off-by: Emilio Cobos Álvarez <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* st/mesa: small optimization in swizzle_swizzle() (Brian Paul, 2016-09-23; 1 file, -0/+5)
    Usually, there's no user-specified texture swizzle so we can optimize the swizzle_swizzle() function and skip the loop/switch.
    Reviewed-by: Charmaine Lee <[email protected]>
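The fast path described above can be sketched as follows. This is only an illustration, not the st/mesa code: the component codes, the 3-bit packing, and the names `compose_swizzle`, `MAKE_SWZ`, `GET_SWZ` and `SWZ_IDENTITY` are assumptions made for this sketch.

```c
#include <assert.h>

/* Hypothetical component codes for this sketch; Mesa's real values may differ. */
enum { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

/* Pack four 3-bit components into one word, and read one back. */
#define MAKE_SWZ(a, b, c, d) ((a) | ((b) << 3) | ((c) << 6) | ((d) << 9))
#define GET_SWZ(swz, i) (((swz) >> ((i) * 3)) & 0x7)
#define SWZ_IDENTITY MAKE_SWZ(SWZ_X, SWZ_Y, SWZ_Z, SWZ_W)

/* Apply swizzle2 on top of swizzle1. */
static unsigned compose_swizzle(unsigned swizzle1, unsigned swizzle2)
{
   unsigned out = 0;

   /* Fast path: an identity second swizzle leaves swizzle1 unchanged,
    * so the per-component loop can be skipped entirely. */
   if (swizzle2 == SWZ_IDENTITY)
      return swizzle1;

   for (unsigned i = 0; i < 4; i++) {
      unsigned c = GET_SWZ(swizzle2, i);
      assert(c <= SWZ_ONE);
      /* X..W pick a component of swizzle1; ZERO and ONE pass through. */
      unsigned r = (c <= SWZ_W) ? GET_SWZ(swizzle1, c) : c;
      out |= r << (i * 3);
   }
   return out;
}
```

With these assumed encodings, composing any swizzle with `SWZ_IDENTITY` returns it unchanged without entering the loop, which is the common case the commit message refers to.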
* st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj() (Brian Paul, 2016-09-23; 1 file, -6/+4)
    Some demos, like Heaven, were creating and destroying a large number of sampler views because of a swizzle issue.
    Basically, we compute the sampler view's swizzle by examining the texture format, user swizzle, depth mode, etc. Later, during validation we recompute that swizzle (in case something like depth mode changes) and see if it matches the view's swizzle.
    In the case of PIPE_FORMAT_RGTC2_UNORM, get_texture_format_swizzle returned SWIZZLE_XYZW but the u_sampler_view_default_template() function was setting the sampler view's swizzle to SWIZZLE_XY01. This mismatch caused the validation step to always "fail" so we'd destroy the old sampler view and create a new one.
    By removing the conditional, the sampler view's swizzle and the computed texture swizzle match and validation "passes". When creating a new sampler view, we always want to use the texture swizzle which we just computed.
    Fixes VMware issue 1733389.
    Cc: [email protected]
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: set PIPE_BIND_DEPTH_STENCIL flag for new resources when possible (Brian Paul, 2016-09-23; 1 file, -1/+11)
    When we create a depth/stencil texture, also check if we can render to it and set the PIPE_BIND_DEPTH_STENCIL flag. We were previously doing this for color textures (PIPE_BIND_RENDER_TARGET).
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: don't special case caps for SVGA3D_R32_FLOAT (Brian Paul, 2016-09-23; 1 file, -6/+2)
    This may have been needed years ago during development, but not now. Prevents some regressions after introducing the next patch.
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: use new adjust_z_layer() helper in svga_pipe_blit.c (Brian Paul, 2016-09-23; 1 file, -44/+28)
    To handle z/layer fix-ups for blitting and copying. Note that we weren't doing this properly in svga_blit() before.
    Also, remove redundant stex, dtex assignments.
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: simplify/improve the format compatibility check for region copies (Brian Paul, 2016-09-23; 1 file, -5/+25)
    The util_is_format_compatible() function didn't quite do what we wanted for vgpu10. This check is more flexible and allows copies between formats such as R32G32B32A32_FLOAT and R32G32B32A32_INT.
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: add const qualifier on svga_translate_format() (Brian Paul, 2016-09-23; 2 files, -2/+2)
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: eliminate unneeded gotos in svga_validate_surface_view() (Brian Paul, 2016-09-23; 1 file, -7/+4)
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: disable srgb format related code from svga_blit() (Neha Bhende, 2016-09-23; 1 file, -12/+0)
    With the latest Mesa and the latest piglit tests, srgb<->linear conversion is not required per the GL 4.4 rules.
    See commit b662c70aeab6a92751514f30719c13a6de253b40.
    Reviewed-by: Charmaine Lee <[email protected]>
* Revert "glsl: move xfb BufferStride into gl_transform_feedback_info" (Timothy Arceri, 2016-09-24; 3 files, -9/+8)
    This reverts commit f5a6aab4031bc4754756c1773411728ad9a73381.
    This broke some tests. It seems gl_transform_feedback_info gets memset to 0 so we were losing the values in BufferStride before we used them.
* glsl: Delete linker stuff relating to built-in functions. (Kenneth Graunke, 2016-09-23; 2 files, -58/+16)
    Now that we generate built-in functions inline, there's no need to link against the built-in shader, and no built-in prototypes to consider. This lets us delete a bunch of code.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Delete ftransform support from builtin_functions.cpp. (Kenneth Graunke, 2016-09-23; 1 file, -26/+4)
    This is now handled directly by ast_function.cpp.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Immediately inline built-ins rather than generating calls. (Kenneth Graunke, 2016-09-23; 1 file, -24/+22)
    In the past, we imported the prototypes of built-in functions, generated calls to those, and waited until link time to resolve the calls and import the actual code for the built-in functions.
    This severely limited our compile-time optimization opportunities: even trivial functions like dot() were represented as function calls. We also had no way of reasoning about those calls; they could have been 1,000 line functions with side-effects for all we knew.
    Practically all built-in functions are trivial translations to ir_expression opcodes, so it makes sense to just generate those inline. Since we eventually inline all functions anyway, we may as well just do it for all built-in functions.
    There's only one snag: built-in functions that refer to built-in global variables need those remapped to the variables in the shader being compiled, rather than the ones in the built-in shader. Currently, ftransform() is the only function matching those criteria, so it seemed easier to just make it a special case.
    On Skylake:
    total instructions in shared programs: 12023491 -> 12024010 (0.00%)
    instructions in affected programs: 77595 -> 78114 (0.67%)
    helped: 97
    HURT: 309
    total cycles in shared programs: 137239044 -> 137295498 (0.04%)
    cycles in affected programs: 16714026 -> 16770480 (0.34%)
    helped: 4663
    HURT: 4923
    While these statistics are in the wrong direction, the number of hurt programs is small (309 / 41282 = 0.75%), and I don't think anything can be done about it. A change like this significantly alters the order in which optimizations are performed.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Check TCS barrier restrictions at ast_to_hir time, not link time. (Kenneth Graunke, 2016-09-23; 2 files, -99/+19)
    We want to check prior to optimization - otherwise we might fail to detect cases where barrier() is in control flow which is always taken (and therefore gets optimized away).
    We don't currently loop unroll if there are function calls inside; otherwise we might have a problem detecting barrier() in loops that get unrolled as well.
    Tapani's switch handling code adds a loop around switch statements, so even with the mess of if ladders, we'll properly reject it.
    Enforcing these rules at compile time makes more sense than link time. Doing it at ast-to-hir time (rather than as an IR pass) allows us to emit an error message with proper line numbers. (Otherwise, I would have preferred the IR pass...)
    Fixes spec/arb_tessellation_shader/compiler/barrier-switch-always.tesc.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: move xfb BufferStride into gl_transform_feedback_info (Timothy Arceri, 2016-09-24; 3 files, -8/+9)
    It makes more sense to have this here where we store the other values from xfb qualifiers. The struct it was previously part of is now only used to store values that come from the api.
    Reviewed-by: Alejandro Piñeiro <[email protected]>
* Revert "mapi: export all GLES 3.2 functions in libGLESv2.so" (Dylan Baker, 2016-09-23; 1 file, -12/+0)
    This reverts commit e66a2b879b73bf48800fec7353dafe8fc693ecdb, which breaks the scons build in an interesting way, particularly when BlendBarrier and PrimitiveBoundingBox are added to static_data.py's functions list. This seems to be related to the fact that the unsuffixed names are only in GLES3.2, but Desktop GL only has suffixed versions.
    Signed-off-by: Dylan Baker <[email protected]>
* i965: Enable EGL_KHR_gl_texture_3D_image (Adam Jackson, 2016-09-23; 1 file, -0/+3)
    Reviewed-by: Eric Anholt <[email protected]>
    Signed-off-by: Adam Jackson <[email protected]>
* i915: Enable EGL_KHR_gl_texture_3D_image (Adam Jackson, 2016-09-23; 1 file, -0/+3)
    Reviewed-by: Eric Anholt <[email protected]>
    Signed-off-by: Adam Jackson <[email protected]>
* anv: Check for VK_WHOLE_SIZE in anv_CmdFillBuffer (Nicolas Koch, 2016-09-23; 1 file, -0/+6)
    From the Vulkan spec:
        Size is the number of bytes to fill, and must be either a multiple of 4, or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer. If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a multiple of 4, then the nearest smaller multiple is used.
    Reviewed-by: Jason Ekstrand <[email protected]>
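The rule quoted from the spec amounts to a small piece of arithmetic, sketched below. This is not the anv code: `resolve_fill_size` and `WHOLE_SIZE` are names made up for this sketch, with `WHOLE_SIZE` mirroring Vulkan's `VK_WHOLE_SIZE` (`~0ULL`) so that no Vulkan headers are needed.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for VK_WHOLE_SIZE (~0ULL), defined locally for this sketch. */
#define WHOLE_SIZE (~UINT64_C(0))

/* Return the number of bytes a fill should actually cover. */
static uint64_t resolve_fill_size(uint64_t buffer_size, uint64_t offset,
                                  uint64_t size)
{
   assert(offset <= buffer_size);

   if (size == WHOLE_SIZE) {
      /* Fill from offset to the end of the buffer, rounded down to the
       * nearest multiple of 4. */
      size = (buffer_size - offset) & ~UINT64_C(3);
   }
   return size;
}
```

For a 22-byte buffer filled from offset 4, the remaining 18 bytes round down to 16; an explicit size is passed through untouched.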
* anv: get rid of duplicated values from gen_device_info (Lionel Landwerlin, 2016-09-23; 6 files, -43/+28)
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: get rid of duplicated values from gen_device_info (Lionel Landwerlin, 2016-09-23; 26 files, -79/+71)
    Now that we have gen_device_info mutable, we can update its values and drop all copies we had in brw_context.
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/i965: make gen_device_info mutable (Lionel Landwerlin, 2016-09-23; 25 files, -106/+111)
    Make gen_device_info a mutable structure so we can update the fields that can be refined by querying the kernel (like subslices and EU numbers).
    This patch does not make any functional change, it just makes gen_get_device_info() fill a structure rather than returning a const pointer.
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
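The API shape described above (fill a caller-owned structure instead of returning a const pointer into a static table) might look roughly like this. Everything here is hypothetical: the reduced `struct device_info`, the `known_devices` table with its made-up device IDs and values, and `get_device_info()` are illustrations only, not the real gen_device_info code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced device description. */
struct device_info {
   int gen;
   unsigned num_subslices;
};

/* Hypothetical table of static defaults; IDs and values are made up. */
static const struct {
   int devid;
   struct device_info info;
} known_devices[] = {
   { 0x1000, { 8, 2 } },
   { 0x2000, { 9, 3 } },
};

/* Copy the defaults for devid into a caller-owned struct. The caller may
 * then refine individual fields without touching the shared table. */
static bool get_device_info(int devid, struct device_info *out)
{
   assert(out != NULL);

   for (size_t i = 0; i < sizeof(known_devices) / sizeof(known_devices[0]); i++) {
      if (known_devices[i].devid == devid) {
         *out = known_devices[i].info;
         return true;
      }
   }
   return false;
}
```

Because the caller owns the copy, a value learned later (for example a subslice count reported by the kernel) can simply be assigned to `info.num_subslices`, and other users of the table still see the defaults.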
* gallium: remove unused PIPE_CC_GCC_VERSION (Timothy Arceri, 2016-09-23; 1 file, -1/+0)
    Acked-by: Edward O'Callaghan <[email protected]>
* util: remove Sun C Compiler support (Timothy Arceri, 2016-09-23; 1 file, -1/+1)
    Support for this compiler was dropped in 51564f04b77e6
    Acked-by: Edward O'Callaghan <[email protected]>
* st/mesa: turn on OES_viewport_array when dependencies are met (Ilia Mirkin, 2016-09-22; 1 file, -0/+5)
    Signed-off-by: Ilia Mirkin <[email protected]>
* mesa: add implementations for new float depth functions (Ilia Mirkin, 2016-09-22; 1 file, -1/+18)
    This just up-converts them to doubles. Not great, but this is what all the other variants also do.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* mesa: move ARB_viewport_array params to a GLES 3.1-accessible section (Ilia Mirkin, 2016-09-22; 1 file, -6/+6)
    This is needed for GL_OES_viewport_array.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* mesa: add GL_OES_viewport_array to the extension string (Ilia Mirkin, 2016-09-22; 3 files, -0/+3)
    The expectation is that drivers will set this based on OES_geometry_shader and ARB_viewport_array support. This is a separate enable on the same reasoning as for OES_texture_cube_map_array.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* glsl: add OES_viewport_array enables and use them to expose gl_ViewportIndex (Ilia Mirkin, 2016-09-22; 2 files, -3/+8)
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* mesa: add new entrypoints for GL_OES_viewport_array (Ilia Mirkin, 2016-09-22; 5 files, -6/+85)
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* mapi: export all GLES 3.2 functions in libGLESv2.so (Dylan Baker, 2016-09-22; 1 file, -0/+12)
    See commit 5921f372c89a68fac6ddefc009442721d9df4db2 for the rationale of this commit.
    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* mapi: sort static_data.py functions (Dylan Baker, 2016-09-22; 1 file, -2/+2)
    Sorted by vim's builtin "sort i" (keeping the sorting case insensitive)
    v2: - uses case insensitive sorting (Ken)
    Signed-off-by: Dylan Baker <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]> (v1)
    Reviewed-by: Kenneth Graunke <[email protected]>
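For readers unfamiliar with vim's `sort i`, a case-insensitive sort can be sketched in C with `qsort()` and POSIX `strcasecmp()`. This only illustrates the ordering idea: the helper names are invented here, and the exact tie-breaking and handling of non-letter characters may differ from what vim does.

```c
#include <assert.h>
#include <stdlib.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* qsort comparator: compare two C strings, ignoring case. */
static int compare_nocase(const void *a, const void *b)
{
   const char *const *sa = a;
   const char *const *sb = b;
   return strcasecmp(*sa, *sb);
}

/* Sort an array of names in place, ignoring case. */
static void sort_names_nocase(const char **names, size_t count)
{
   assert(names != NULL);
   qsort(names, count, sizeof(names[0]), compare_nocase);
}
```

A plain byte-wise sort would put every uppercase name ahead of every lowercase one; ignoring case interleaves them, so `accum` sorts before `BlendBarrier`.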
* mapi: retab static_data.py to be consistent (Dylan Baker, 2016-09-22; 1 file, -1285/+1285)
    This file currently uses a mixture of 3 and 4 space indent. I have changed it all to 4 space indent, matching the settings in $ROOT/.editorconfig.
    This was generated with sed:
    sed -i -e 's@^   "@    "@g'
    Signed-off-by: Dylan Baker <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* spirv: fix AtomicLoad/Store on images (Lionel Landwerlin, 2016-09-22; 1 file, -10/+3)
    OpAtomicLoad/Store should have pointer to images just like the rest of the atomic operators. These couple of lines were poorly copied from the ssbo/shared_vars cases (the only ones currently tested by the CTS).
    Fixes 2afb950161f8 ("spirv/nir: Add support for OpAtomicLoad/Store")
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Allow opt_peephole_sel to be more aggressive in flattening IFs. (Eric Anholt, 2016-09-22; 4 files, -31/+57)
    VC4 was running into a major performance regression from enabling control flow in the glmark2 conditionals test, because of short if statements containing an ffract.
    This pass seems like it was trying to ensure that we only flattened IFs that should be entirely a win by guaranteeing that there would be fewer bcsels than there were MOVs otherwise. However, if the number of ALU ops is small, we can avoid the overhead of branching (which itself costs cycles) and still get a win, even if it means moving real instructions out of the THEN/ELSE blocks.
    For now, just turn on aggressive flattening on vc4. i965 will need some tuning to avoid regressions. It does look like this may be useful to replace freedreno code.
    Improves glmark2 -b conditionals:fragment-steps=5:vertex-steps=0 from 47 fps to 95 fps on vc4.
    vc4 shader-db:
    total instructions in shared programs: 101282 -> 99543 (-1.72%)
    instructions in affected programs: 17365 -> 15626 (-10.01%)
    total uniforms in shared programs: 31295 -> 31172 (-0.39%)
    uniforms in affected programs: 3580 -> 3457 (-3.44%)
    total estimated cycles in shared programs: 225182 -> 223746 (-0.64%)
    estimated cycles in affected programs: 26085 -> 24649 (-5.51%)
    v2: Update shader-db output.
    Reviewed-by: Ian Romanick <[email protected]> (v1)
* gallium/util: add comment on util_is_format_compatible() (Brian Paul, 2016-09-21; 1 file, -0/+24)
    From reading the code, it's not obvious what is src/dest compatible. The list of a->b copy-compatible formats comes from Jose's original check-in message, with some format name updates.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* svga: minor simplification in svga_validate_surface_view() (Brian Paul, 2016-09-21; 1 file, -3/+2)
    Get rid of unneeded local var.
    Reviewed-by: Charmaine Lee <[email protected]>
* svga: remove disable_shader debug variable (Brian Paul, 2016-09-21; 3 files, -10/+0)
    Never used, AFAIK.
    Reviewed-by: Charmaine Lee <[email protected]>
* i965: Enable ES 3.2 on Skylake. (Kenneth Graunke, 2016-09-21; 1 file, -1/+2)
    It's already advertised because the version.c extension checks are fulfilled, but we didn't actually claim support, so trying to create an ES 3.2 context would fail.
    It's all done, and the CTS results look good, so let's turn it on.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir/spirv/glsl450: Add support for the InterpolateAt opcodes (Jason Ekstrand, 2016-09-21; 1 file, -1/+53)
    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* nir/spirv: Claim support for SampleRateShading (Jason Ekstrand, 2016-09-21; 1 file, -1/+1)
    We already support all of the decorations that require this capability.
    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* nir/spirv: Bring back the spirv2nir helper binary (Jason Ekstrand, 2016-09-21; 2 files, -0/+73)
    This was something that I wrote in the early days of the spirv_to_nir code but deleted once we had a real driver. However, in the absence of a shader_runner equivalent, it's extremely useful for debugging the spirv_to_nir code so let's bring it back.
    Signed-off-by: Jason Ekstrand <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* i965: implement querying __DRI_IMAGE_ATTRIB_OFFSET. (Chuanbo Weng, 2016-09-21; 1 file, -2/+7)
    Implement querying this attribute in intelImageExtension and bump version of intelImageExtension.
    Signed-off-by: Chuanbo Weng <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* egl: return corresponding offset of EGLImage instead of 0. (Chuanbo Weng, 2016-09-21; 1 file, -2/+9)
    The offset should not always be 0. For example, if EGLImage is created from a 2D texture with EGL_GL_TEXTURE_LEVEL=1, then the offset should be the actual start of miplevel 1 in bo.
    v2: Add version check of __DRIimageExtension implementation (Suggested by Axel Davy).
    v3: Don't add version check of __DRIimageExtension implementation. Set the offset only when queryImage() succeeds. (Suggested by Emil Velikov)
    Signed-off-by: Chuanbo Weng <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    [Emil Velikov: coding style fixes]
    Signed-off-by: Emil Velikov <[email protected]>
* i965/ir: Test thread dispatch packing assumptions. (Francisco Jerez, 2016-09-21; 1 file, -0/+30)
    Not [originally] intended for upstream. Should cause a GPU hang if some thread is executed with a non-contiguous dispatch mask breaking assumptions of brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or Piglit regressions, while replacing brw_stage_has_packed_dispatch() with a dummy implementation that unconditionally returns true on top of this patch causes multiple GPU hangs.
    v2: Refactor into a separate function instead of emitting the test code directly from emit_nir_code(), drop VEC4 test and clean up slightly for upstream. (Jason)
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965/ir: Pass identity mask to brw_find_live_channel() in the packed dispatch case. (Francisco Jerez, 2016-09-21; 2 files, -3/+11)
    This avoids emitting a few extra instructions required to take the dispatch mask into account when it's known to be tightly packed.
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965/ir: Skip eliminate_find_live_channel() for stages with sparse thread dispatch. (Francisco Jerez, 2016-09-21; 3 files, -0/+65)
    The eliminate_find_live_channel optimization eliminates FIND_LIVE_CHANNEL instructions in cases where control flow is known to be uniform, and replaces them with 'MOV 0', which in turn unblocks subsequent elimination of the BROADCAST instruction frequently used on the result of FIND_LIVE_CHANNEL. This is however not correct in per-sample fragment shader dispatch because the PSD can dispatch a fully unlit sample under certain conditions. Disable the optimization in that case.
    Reviewed-by: Jason Ekstrand <[email protected]>
    v2: Add devinfo argument to brw_stage_has_packed_dispatch() to implement hardware generation check.
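Why 'MOV 0' is only safe for packed dispatch can be illustrated with plain bit masks. This is a CPU-side sketch, not the i965 backend: `find_live_channel()` and `is_packed()` are hypothetical helpers, and "packed" is taken here to mean a non-empty mask whose enabled channels form a contiguous run starting at channel 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Index of the lowest enabled channel in a dispatch mask. */
static unsigned find_live_channel(uint32_t dispatch_mask)
{
   unsigned channel = 0;

   assert(dispatch_mask != 0);
   while (!(dispatch_mask & 1)) {
      dispatch_mask >>= 1;
      channel++;
   }
   return channel;
}

/* True when the enabled channels are 0..n-1 for some n >= 1. */
static bool is_packed(uint32_t dispatch_mask)
{
   return dispatch_mask != 0 && (dispatch_mask & (dispatch_mask + 1)) == 0;
}
```

For a packed mask such as 0x0F the first live channel is always 0, so a constant 0 is a valid substitute. For a sparse mask such as 0x0C the first live channel is 2, and substituting 0 would read from a disabled channel.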