path: root/src/mesa
Commit message | Author | Age | Files | Lines
* glsl: Add all system variables to the input resource list. | Kenneth Graunke | 2016-04-01 | 1 | -8/+1
  System values are just built-in input variables that we've opted to special-case out of convenience. We need to consider all inputs, regardless of how we've classified them.

  Unfortunately, there's one exception: we shouldn't add gl_BaseVertex unless ARB_shader_draw_parameters is enabled, because it doesn't actually exist in the language, and shouldn't be counted in the GL_ACTIVE_RESOURCES query.

  Fixes dEQP-GLES31.functional.program_interface_query.program_input.resource_list.compute.empty, which expects gl_NumWorkGroups to appear in the resource list.

  v2: Delete more code

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Make _mesa_choose_tex_format() handle stencil textures. | Kenneth Graunke | 2016-04-01 | 1 | -0/+5
  This is necessary for ARB_texture_stencil8 support on classic drivers. Presumably Gallium works because it implements its own ChooseTexFormat.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* gallium: distinguish between shader IR in get_compute_param | Bas Nieuwenhuizen | 2016-04-02 | 1 | -6/+7
  For radeonsi, native and TGSI use different compilers, and this results in different limits for different IRs.

  The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few others as shader related that seemed like they would also typically depend on the compiler.

  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* gallium: add threads per block TGSI property | Bas Nieuwenhuizen | 2016-04-02 | 1 | -0/+18
  The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct.

  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
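The "0 means unknown" convention can be illustrated with a minimal sketch. The struct and function names below (shader_info_sketch, scan_sketch, block_size_known) are hypothetical stand-ins, not Mesa's tgsi_scan_shader types; the only point shown is that zero-initializing the struct makes an absent property indistinguishable from "unknown" without extra bookkeeping.

```c
#include <string.h>

/* Hypothetical stand-in for scanned shader info; not Mesa's actual struct. */
struct shader_info_sketch {
    unsigned block_size[3];   /* 0 means "unknown / property not declared" */
};

/* Zero-initialize first, then record only the properties actually present.
 * 'declared' may be NULL when the shader carries no block-size property. */
static void scan_sketch(struct shader_info_sketch *info,
                        const unsigned *declared)
{
    memset(info, 0, sizeof(*info));
    if (declared) {
        for (int i = 0; i < 3; i++)
            info->block_size[i] = declared[i];
    }
}

/* A consumer needs no separate "property seen" flag: zero already says so. */
static int block_size_known(const struct shader_info_sketch *info)
{
    return info->block_size[0] != 0;
}
```

A driver-side caller would then fall back to its own limits whenever block_size_known() returns 0.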
* gallium: add compute shader IR type | Bas Nieuwenhuizen | 2016-04-02 | 1 | -0/+1
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* i965: Add an implementation of nir_op_fquantize2f16 | Jason Ekstrand | 2016-04-01 | 2 | -0/+53
  Reviewed-by: Matt Turner <[email protected]>
* Android: fix x86 gallium builds | Rob Herring | 2016-04-01 | 5 | -5/+55
  Builds with gallium enabled fail on x86 with linker error:

    external/mesa3d/src/mesa/vbo/vbo_exec_array.c:127: error: undefined reference to '_mesa_uint_array_min_max'

  The problem is sse_minmax.c is not included in the libmesa_st_mesa library. Since the SSE4.1 files are needed for both libmesa_st_mesa and libmesa_dricore, move SSE4.1 files into a separate static library that can be used by both.

  Cc: "11.1 11.2" <[email protected]>
  Signed-off-by: Rob Herring <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* mesa: add GL_OES/EXT_draw_buffers_indexed support | Ilia Mirkin | 2016-03-31 | 2 | -0/+12
  This is the same ext as ARB_draw_buffers_blend (plus some core functionality that already exists). Add the alias entrypoints.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* i965: Use brw->urb.min_vs_urb_entries instead of 32 for BLORP. | Kenneth Graunke | 2016-03-31 | 1 | -4/+1
  Haswell GT2 and GT3 have a minimum of 64 entries. Hardcoding 32 is not legal.

  v2: Delete stale comment (caught by Alejandro).

  Cc: [email protected]
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Fix textureSize() depth value for 1 layer surfaces on Gen4-6. | Kenneth Graunke | 2016-03-31 | 2 | -6/+18
  According to the Sandybridge PRM's description of the resinfo message, the .z value returned will be Depth == 0 ? 0 : Depth + 1. The earlier PRMs have the same table. This means we return 0 for array textures with a single slice, when we ought to return 1. Just override it to max(depth, 1).

  Fixes 10 dEQP-GLES3.functional tests on Sandybridge:

    shaders.texture_functions.texturesize.sampler2darray_fixed_vertex
    shaders.texture_functions.texturesize.sampler2darray_fixed_fragment
    shaders.texture_functions.texturesize.sampler2darray_float_vertex
    shaders.texture_functions.texturesize.sampler2darray_float_fragment
    shaders.texture_functions.texturesize.isampler2darray_vertex
    shaders.texture_functions.texturesize.isampler2darray_fragment
    shaders.texture_functions.texturesize.usampler2darray_vertex
    shaders.texture_functions.texturesize.usampler2darray_fragment
    shaders.texture_functions.texturesize.sampler2darrayshadow_vertex
    shaders.texture_functions.texturesize.sampler2darrayshadow_fragment

  Cc: [email protected]
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
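The override described in this commit message can be written out as a small sketch. The function names are hypothetical and the code models only the formula quoted from the PRM and the max(depth, 1) clamp; it is not the i965 backend code, which emits the equivalent operation in generated shader instructions.

```c
/* Value the resinfo message reports in .z according to the quoted table,
 * where depth_field is the raw Depth value from that table. */
static unsigned resinfo_z(unsigned depth_field)
{
    return depth_field == 0 ? 0 : depth_field + 1;
}

/* textureSize() depth after the override: never report fewer than 1 layer. */
static unsigned texture_size_depth(unsigned depth_field)
{
    unsigned z = resinfo_z(depth_field);
    return z > 1 ? z : 1;   /* max(depth, 1) */
}
```

With this clamp, a single-slice array texture (Depth == 0) reports 1 instead of 0, while every other value passes through unchanged.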
* ptn: Fix all users of ptn_swizzle | Ian Romanick | 2016-03-31 | 1 | -6/+6
  None of the callers actually wanted what it did. In ptn_xpd, you only ever want a vec3 swizzle. In ptn_tex, you want a swizzle that matches the number of required texture coordinates.

  shader-db results:

  G45:
    total instructions in shared programs: 4011240 -> 4010911 (-0.01%)
    instructions in affected programs: 59232 -> 58903 (-0.56%)
    helped: 114
    HURT: 0
    total cycles in shared programs: 84314194 -> 84313220 (-0.00%)
    cycles in affected programs: 779150 -> 778176 (-0.13%)
    helped: 110
    HURT: 13

  Ironlake:
    total instructions in shared programs: 6397262 -> 6396605 (-0.01%)
    instructions in affected programs: 117402 -> 116745 (-0.56%)
    helped: 227
    HURT: 0
    total cycles in shared programs: 128889798 -> 128888524 (-0.00%)
    cycles in affected programs: 1214644 -> 1213370 (-0.10%)
    helped: 179
    HURT: 44

  Sandy Bridge:
    total instructions in shared programs: 8467391 -> 8467384 (-0.00%)
    instructions in affected programs: 3107 -> 3100 (-0.23%)
    helped: 10
    HURT: 6
    total cycles in shared programs: 117580120 -> 117573448 (-0.01%)
    cycles in affected programs: 103158 -> 96486 (-6.47%)
    helped: 84
    HURT: 11

  Ivy Bridge:
    total instructions in shared programs: 7774255 -> 7774258 (0.00%)
    instructions in affected programs: 1677 -> 1680 (0.18%)
    helped: 8
    HURT: 6
    total cycles in shared programs: 65743828 -> 65739190 (-0.01%)
    cycles in affected programs: 89312 -> 84674 (-5.19%)
    helped: 78
    HURT: 23

  Haswell:
    total instructions in shared programs: 7107172 -> 7107150 (-0.00%)
    instructions in affected programs: 2048 -> 2026 (-1.07%)
    helped: 16
    HURT: 0
    total cycles in shared programs: 64653636 -> 64647486 (-0.01%)
    cycles in affected programs: 86836 -> 80686 (-7.08%)
    helped: 85
    HURT: 17

  Broadwell and Skylake:
    total instructions in shared programs: 8447529 -> 8447507 (-0.00%)
    instructions in affected programs: 2038 -> 2016 (-1.08%)
    helped: 16
    HURT: 0
    total cycles in shared programs: 66418670 -> 66413416 (-0.01%)
    cycles in affected programs: 90110 -> 84856 (-5.83%)
    helped: 83
    HURT: 20

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* ptn: Silence unused parameter warning | Ian Romanick | 2016-03-31 | 1 | -2/+2
  The KIL instruction doesn't have a destination, so ptn_kil never uses dest.

    program/prog_to_nir.c: In function ‘ptn_kil’:
    program/prog_to_nir.c:547:38: warning: unused parameter ‘dest’ [-Wunused-parameter]
     ptn_kil(nir_builder *b, nir_alu_dest dest, nir_ssa_def **src)
     ^

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: add GL_EXT_copy_image support | Ilia Mirkin | 2016-03-30 | 1 | -0/+1
  The extension is identical to GL_OES_copy_image. But dEQP has tests that want the EXT variant.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* mesa: add GL_OES_copy_image support | Ilia Mirkin | 2016-03-30 | 6 | -1/+128
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* mesa: remove duplicate MAX_GEOMETRY_SHADER_INVOCATIONS entry | Ilia Mirkin | 2016-03-30 | 1 | -3/+0
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: add ES sample-shading support | Ilia Mirkin | 2016-03-30 | 1 | -0/+6
  We require the full ARB_gpu_shader5 for now, but in the future some other CAP could get exposed to indicate that only the multisample-related behavior of ARB_gpu_shader5 is available.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* mesa: add GL_OES_shader_multisample_interpolation support | Ilia Mirkin | 2016-03-30 | 3 | -3/+14
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: add GL_OES_sample_shading support | Ilia Mirkin | 2016-03-30 | 4 | -3/+8
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: add OES_sample_variables to extension table, add enable bit | Ilia Mirkin | 2016-03-30 | 2 | -0/+2
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Don't add barrier deps for FB write messages. | Matt Turner | 2016-03-30 | 1 | -1/+2
  Ken did this earlier, and this is just me reimplementing his patch a little differently.

  Reviewed-by: Francisco Jerez <[email protected]>
* i965: Add and use is_scheduling_barrier() function. | Matt Turner | 2016-03-30 | 1 | -4/+17
* i965: Remove NOP insertion kludge in scheduler. | Matt Turner | 2016-03-30 | 1 | -20/+5
  Instead of removing every instruction in add_insts_from_block(), just move the instruction to its scheduled location. This is a step towards doing both bottom-up and top-down scheduling without conflicts.

  Note that this patch changes cycle counts for programs because it begins including control flow instructions in the estimates.

  Reviewed-by: Francisco Jerez <[email protected]>
* i965: Assert that an instruction is not inserted around itself. | Matt Turner | 2016-03-30 | 1 | -0/+4
  Reviewed-by: Francisco Jerez <[email protected]>
* i965: Relax restriction on scheduling last instruction. | Matt Turner | 2016-03-30 | 1 | -20/+3
  I think when this code was written, basic blocks were always ended by a control flow instruction or an end-of-thread message. That's no longer the case, and removing this restriction actually helps things:

    instructions in affected programs: 7267 -> 7244 (-0.32%)
    helped: 4

    total cycles in shared programs: 66559580 -> 66431900 (-0.19%)
    cycles in affected programs: 28310152 -> 28182472 (-0.45%)
    helped: 9577
    HURT: 879
    GAINED: 2

  The addition of the is_control_flow() checks is not a functional change, since the add_insts_from_block() does not put them in the list of instructions to schedule. I plan to change this in a later patch.

  Reviewed-by: Francisco Jerez <[email protected]>
* i965/vec4/tcs: Set conditional mod on TCS_OPCODE_SRC0_010_IS_ZERO. | Matt Turner | 2016-03-30 | 2 | -2/+3
  Missing this causes an assertion failure in the scheduler with the next patch. Additionally, this gives cmod propagation enough information to optimize code better.

    total instructions in shared programs: 7112991 -> 7112852 (-0.00%)
    instructions in affected programs: 25704 -> 25565 (-0.54%)
    helped: 139

    total cycles in shared programs: 64812898 -> 64810674 (-0.00%)
    cycles in affected programs: 127224 -> 125000 (-1.75%)
    helped: 139

  Acked-by: Francisco Jerez <[email protected]>
* Revert "i965: Don't add barrier deps for FB write messages." | Matt Turner | 2016-03-30 | 1 | -4/+3
  This reverts commit d0e1d6b7e27bf5f05436e47080d326d7daa63af2.

  The change in the vec4 code is a mistake: there's never an FS_OPCODE_FB_WRITE in vec4 code.

  The change in the fs code had the (harmless) effect of not recognizing an FB_WRITE as a scheduling barrier even if it was marked EOT. This was harmless because the scheduler marked the last instruction of a block as a barrier, something I'm changing in the following patches.

  This will be reimplemented later in the series.
* i965: Simplify full scheduling-barrier conditions. | Matt Turner | 2016-03-30 | 1 | -27/+8
  All of these were simply code for "architecture register file" (and in the case of destinations, "not the null register").

  Reviewed-by: Francisco Jerez <[email protected]>
* i965: Remove incorrect cycle estimates. | Matt Turner | 2016-03-30 | 1 | -10/+0
  These printed the cycle count of the last basic block (sched.time is set per basic block!). We have accurate, full-program data printed elsewhere.

  Reviewed-by: Francisco Jerez <[email protected]>
* st/mesa: fix fallout from xfb changes. | Dave Airlie | 2016-03-31 | 1 | -2/+2
  Failed to update state tracker with new buffer interface.

  Reviewed-by: Timothy Arceri <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* mesa: add query support for GL_TRANSFORM_FEEDBACK_BUFFER interface | Timothy Arceri | 2016-03-31 | 3 | -2/+51
  Reviewed-by: Dave Airlie <[email protected]>
* glsl: add transform feedback buffers to resource list | Timothy Arceri | 2016-03-31 | 3 | -3/+3
  Reviewed-by: Dave Airlie <[email protected]>
* mesa: add support to query GL_TRANSFORM_FEEDBACK_BUFFER_INDEX | Timothy Arceri | 2016-03-31 | 2 | -0/+7
  Reviewed-by: Dave Airlie <[email protected]>
* mesa: add support to query GL_OFFSET for GL_TRANSFORM_FEEDBACK_VARYING | Timothy Arceri | 2016-03-31 | 2 | -3/+12
  Reviewed-by: Dave Airlie <[email protected]>
* mesa: rename transform feedback varying macro XFB to XFV | Timothy Arceri | 2016-03-31 | 1 | -6/+6
  A later patch will use XFB for buffers.

  Reviewed-by: Dave Airlie <[email protected]>
* glsl: validate global out xfb_stride qualifiers and set stride on empty buffers | Timothy Arceri | 2016-03-31 | 1 | -0/+7
  Here we use the built-in validation in ast_layout_expression::process_qualifier_constant() to check for mismatching global out strides on buffers in a single shader.

  From the ARB_enhanced_layouts spec:

    "While *xfb_stride* can be declared multiple times for the same buffer, it is a compile-time or link-time error to have different values specified for the stride for the same buffer."

  For intrastage validation a new helper link_xfb_stride_layout_qualifiers() is created. We also take this opportunity to make sure stride is at least a multiple of 4; we will validate doubles at a later stage.

  From the ARB_enhanced_layouts spec:

    "If the buffer is capturing any double-typed outputs, the stride must be a multiple of 8, otherwise it must be a multiple of 4, or a compile-time or link-time error results."

  Finally we update store_tfeedback_info() to apply the strides to LinkedTransformFeedback and update the buffers bitmask to mark any global buffers with a stride as active. For example a shader with:

    layout (xfb_buffer = 0, xfb_offset = 0) out vec4 gs_fs;
    layout (xfb_buffer = 1, xfb_stride = 64) out;

  Is expected to have a buffer bound to both 0 and 1.

  From the ARB_enhanced_layouts spec:

    "A binding point requires a bound buffer object if and only if its associated stride in the program object used for transform feedback primitive capture is non-zero."

  Reviewed-by: Dave Airlie <[email protected]>
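The two spec rules quoted in this commit message (consistent stride per buffer, and 4- or 8-byte alignment) can be sketched in a few lines. The helper names below are hypothetical and do not correspond to link_xfb_stride_layout_qualifiers() or any other Mesa function; the sketch also assumes that a recorded stride of 0 means "no stride declared yet", and it covers the double-typed case that the commit itself defers to a later stage.

```c
#include <stdbool.h>

/* Alignment rule from the quoted spec text: multiple of 8 when the buffer
 * captures any double-typed output, otherwise multiple of 4. */
static bool xfb_stride_aligned(unsigned stride, bool captures_doubles)
{
    unsigned align = captures_doubles ? 8 : 4;
    return stride % align == 0;
}

/* Merge a newly declared stride into the one recorded for the same buffer.
 * Assumption: *recorded == 0 means nothing has been declared so far.
 * Returns false when two declarations disagree, which the spec makes a
 * compile-time or link-time error. */
static bool xfb_stride_merge(unsigned *recorded, unsigned declared)
{
    if (*recorded != 0 && *recorded != declared)
        return false;
    *recorded = declared;
    return true;
}
```

Repeating the same xfb_stride for a buffer is accepted by xfb_stride_merge(), while a second declaration with a different value is rejected.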
* mesa: split transform feedback buffer into its own struct | Timothy Arceri | 2016-03-31 | 6 | -20/+28
  This will be used in a following patch to implement interface query support for TRANSFORM_FEEDBACK_BUFFER.

  Reviewed-by: Dave Airlie <[email protected]>
* glsl: use bitmask of active xfb buffer indices | Timothy Arceri | 2016-03-31 | 4 | -22/+24
  This allows us to print the correct binding point when not all buffers declared in the shader are bound.

  For example if we use a single buffer:

    layout(xfb_buffer=2, offset=0) out vec4 v;

  We now print '2' when the buffer is not bound rather than '0'.

  Reviewed-by: Dave Airlie <[email protected]>
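A rough sketch of why a bitmask reports the right binding point: each set bit carries the buffer's real index, so a shader that uses only buffer 2 is checked at index 2 rather than at a packed position 0. The function name and the MAX_XFB_BUFFERS_SKETCH limit are hypothetical and not taken from Mesa.

```c
#define MAX_XFB_BUFFERS_SKETCH 4   /* assumed limit, for illustration only */

/* Return the index of the first buffer that the shader marks active but that
 * has no buffer object bound, or -1 when every active buffer is bound. */
static int first_unbound_buffer(unsigned active_mask, unsigned bound_mask)
{
    for (int i = 0; i < MAX_XFB_BUFFERS_SKETCH; i++) {
        unsigned bit = 1u << i;
        if ((active_mask & bit) && !(bound_mask & bit))
            return i;
    }
    return -1;
}
```

For the single-buffer shader in the message, the active mask is 1u << 2, so an error path built on this lookup would name binding point 2.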
* i965: Don't inline intel_batchbuffer_require_space(). | Matt Turner | 2016-03-30 | 2 | -26/+28
  It's called by the inline intel_batchbuffer_begin() function which itself is used in BEGIN_BATCH. So in a sequence of code emitting multiple packets, we have inlined this ~200 byte function multiple times. Making it an out-of-line function presumably improved icache usage.

  Improves performance of Gl32Batch7 by 3.39898% +/- 0.358674% (n=155) on Ivybridge.

  Reviewed-by: Abdiel Janulgue <[email protected]>
* mesa: allow mutable buffer textures to back GL ES images | Ilia Mirkin | 2016-03-29 | 1 | -1/+6
  Since there is no way to create immutable texture buffers in GL ES, mutable buffer textures are allowed to back images. See issue 7 of the GL_OES_texture_buffer specification.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa: make _mesa_prepare_mipmap_level() static | Brian Paul | 2016-03-29 | 2 | -15/+8
  No longer called from any other file.

  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Tested-by: Ian Romanick <[email protected]>
* meta: use _mesa_prepare_mipmap_levels() | Brian Paul | 2016-03-29 | 1 | -24/+8
  The prepare_mipmap_level() wrapper for _mesa_prepare_mipmap_level() is not needed. It only served to undo the GL_TEXTURE_1D_ARRAY height/depth change that was made before the call to prepare_mipmap_level().

  Said another way, regardless of how the meta code manipulates the height/depth dims for GL_TEXTURE_1D_ARRAY, the gl_texture_image dimensions are correctly set up by _mesa_prepare_mipmap_levels().

  Tested by plugging _mesa_meta_GenerateMipmap() into the swrast driver and testing with piglit.

  v2 (idr): Early out of the mipmap generation loop when dstImage is NULL. This can occur for immutable textures that have a limited range of levels or in the presence of memory allocation failures. Fixes arb_texture_view-mipgen on Intel platforms.

  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Tested-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* xlib: add support for GLX_ARB_create_context | Brian Paul | 2016-03-29 | 3 | -0/+77
  This adds the glXCreateContextAttribsARB() function for the xlib/swrast driver. This allows more piglit tests to run with this driver.

  For example, without this patch we get:

    $ bin/fbo-generatemipmap-1d -auto
    piglit: error: waffle_config_choose failed due to WAFFLE_ERROR_UNSUPPORTED_ON_PLATFORM: GLX_ARB_create_context is required in order to request an OpenGL version not equal to the default value 1.0
    piglit: error: Failed to create waffle_config for OpenGL 2.0 Compatibility Context
    piglit: info: Failed to create any GL context
    PIGLIT: {"result": "skip" }

  Reviewed-by: Jose Fonseca <[email protected]>
  Acked-by: Roland Scheidegger <[email protected]>
* st/mesa: simplify st_generate_mipmap() | Brian Paul | 2016-03-29 | 1 | -78/+24
  The whole st_generate_mipmap() function was overly complicated. Now we just call the new _mesa_prepare_mipmap_levels() function to prepare the texture mipmap memory, then call the generate function which fills in the texture images.

  This fixes a failed assertion in llvmpipe/softpipe which is hit with the new piglit generatemipmap-base-change test. Also fixes some device errors (format mismatches) with the VMware svga driver.

  v2: fix a comment typo, per Sinclair

  Reviewed-by: Sinclair Yeh <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: new _mesa_prepare_mipmap_levels() function for mipmap generation | Brian Paul | 2016-03-29 | 2 | -31/+62
  Simplifies the loops in generate_mipmap_uncompressed() and generate_mipmap_compressed(). Will be used in the state tracker too. Could probably be used in the meta code. If so, some additional clean-ups can be done after that.

  v2: use unsigned types instead of GLuint, per Ian

  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
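For context on what "preparing mipmap levels" involves, here is a minimal sketch of the standard GL size rule that any such preparation step has to follow: each level halves every dimension, rounding down, with a floor of 1, until the image is 1x1. The function names are hypothetical and this is not the body of _mesa_prepare_mipmap_levels(), which also handles texture targets, formats, and memory allocation.

```c
/* Size of one dimension at the next smaller mipmap level. */
static unsigned next_mip_dim(unsigned d)
{
    return d > 1 ? d / 2 : 1;
}

/* Number of levels below the base level of a width x height 2D image,
 * i.e. how many destination images a generation pass would need. */
static unsigned levels_below_base(unsigned width, unsigned height)
{
    unsigned levels = 0;
    while (width > 1 || height > 1) {
        width = next_mip_dim(width);
        height = next_mip_dim(height);
        levels++;
    }
    return levels;
}
```

An 8x2 base image, for example, needs three further levels: 4x1, 2x1, and 1x1.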
* i965: Don't use CUBE wrap modes for integer formats on IVB/BYT. | Kenneth Graunke | 2016-03-29 | 1 | -1/+5
  There is no linear filtering for integer formats, so we should always be using CLAMP_TO_EDGE mode.

  Fixes 46 dEQP cases on Ivybridge (which were likely broken by commit 0faf26e6a0a34c3544644852802484f2404cc83e).

  This workaround doesn't appear to be necessary on any other hardware; I haven't found any documentation mentioning errata in this area.

  v2: Only apply on Ivybridge/Baytrail to avoid regressing GLES3.1 tests.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]> [v1]
* Revert "i965: Set address rounding bits for GL_NEAREST filtering as well." | Kenneth Graunke | 2016-03-29 | 1 | -6/+3
  This reverts commit 60d6a8989ab44cf47accee6bc692ba6fb98f6a9f.

  It's pretty sketchy, and apparently regressed a bunch of dEQP tests on Sandybridge.
* st/mesa: implement new DMA-buf based VDPAU interop v2 | Christian König | 2016-03-29 | 1 | -49/+132
  Avoid using internal structures from another API.

  v2: rebased and moved includes so they don't cause problems when VDPAU isn't installed.

  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]> (v1)
  Reviewed-by: Leo Liu <[email protected]>
* st/mesa: enable OES_texture_buffer when all components available | Ilia Mirkin | 2016-03-29 | 1 | -0/+6
  OES_texture_buffer combines bits from a number of desktop extensions. When they're all available, turn it on.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* mesa: add OES_texture_buffer and EXT_texture_buffer support | Ilia Mirkin | 2016-03-28 | 7 | -44/+55
  Allow ES 3.1 contexts to access the texture buffer functionality.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: add OES_texture_buffer and EXT_texture_buffer extension to table | Ilia Mirkin | 2016-03-28 | 2 | -0/+3
  We need to add a new bit since the GL ES exts require functionality from a combination of texture buffer extensions as well as images (for imageBuffer) support. Additionally, not all GPUs support all the texture buffer functionality (e.g. rgb32 isn't supported by nv50).

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>