summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* glsl/ir: Add printing support for doublesDave Airlie2015-02-191-0/+11
| | | | | | | Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl/ir: Add builtin function support for doublesDave Airlie2015-02-194-10/+206
| | | | | | | v2: add d2b, more ir_constant stuff (Ilia) Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: fix uniform linking logic in the presence of structsIlia Mirkin2015-02-193-31/+57
| | | | | | | | | | | | | Add a enter/leave record callback so that the offset may be aligned to the proper value. Otherwise only leaf fields are called, and the first field needs to be aligned to the outer struct's base alignment while the last field needs to be aligned to the inner struct's base alignment. This removes most usage of the last field/record type values passed into visit_field. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* glsl: teach std140_base_alignment about samplersIlia Mirkin2015-02-191-0/+9
| | | | | | | | | These functions are about to be used more aggressively for determining uniform layout. Samplers may be inside of structs, and it's easier to reuse the existing base alignment logic. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* glsl: Uniform linking support for doublesDave Airlie2015-02-191-1/+6
| | | | | Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Add double builtin type generationDave Airlie2015-02-195-23/+151
| | | | | | | Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: add ARB_gpu_shader_fp64 to the glsl extensions. (v2)Dave Airlie2015-02-194-0/+7
| | | | | | | | | | | | | v2: add define bit (Tapani Pälli) Patch makes following Piglit tests pass: arb_gpu_shader_fp64/preprocessor/define.vert arb_gpu_shader_fp64/preprocessor/define.frag Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: add double uniform support. (v5)Dave Airlie2015-02-194-32/+228
| | | | | | | | | | | | | | | | This adds support for the new uniform interfaces from ARB_gpu_shader_fp64. v2: support ARB_separate_shader_objects ProgramUniform*d* (Ian) don't allow boolean uniforms to be updated (issue 15) (Ian) v3: fix size_mul v4: Teach uniform update to take into account double precision (Topi) v5: add transpose for double case (Ilia) Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Add double builtin typeDave Airlie2015-02-191-0/+9
| | | | | | | | | | This causes a lot of warnings about unchecked type in switch statements - fix them later. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: add ARB_gpu_shader_fp64 extension info (v2)Dave Airlie2015-02-192-0/+2
| | | | | | | | | | | This just adds the entries to extensions.c and mtypes.h v2: use core profile only (Ian) Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glapi: add ARB_gpu_shader_fp64 (v2)Dave Airlie2015-02-197-37/+465
| | | | | | | | | | | | | | | Just add the xml file covering this extension, and dummy interface files in mesa, and fix up sanity tests. v2: Enable ProgramUniform*d* from ARB_separate_shader_objects (Ian) use 40 instead of 43 for dispatch_sanity.cpp (Chris) uncomment PU sanity tests. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* freedreno: add missing PIPE_CAP_RESOURCE_FROM_USER_MEMORY to switchIlia Mirkin2015-02-191-0/+1
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add ARB_instanced_arrays supportIlia Mirkin2015-02-193-2/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add support for vertexid and instanceid sysvalsIlia Mirkin2015-02-195-16/+120
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: pass number of instances to drawIlia Mirkin2015-02-198-18/+22
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add ETC2 decoding supportIlia Mirkin2015-02-192-4/+17
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* st/mesa: pass etc2 textures to driver if supportedIlia Mirkin2015-02-194-11/+40
| | | | | | | If the driver actually supports ETC2, don't decode it in software. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe,softpipe: only support ETC1, not the upcoming ETC2Ilia Mirkin2015-02-182-0/+8
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add ETC2 format supportIlia Mirkin2015-02-187-114/+104
| | | | | | | No actual decoding is added, similar faking mechanism to bptc. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* freedreno/a3xx: add hardware ETC1 supportIlia Mirkin2015-02-182-0/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium/dri: Shut up a compiler warning.Eric Anholt2015-02-181-1/+1
| | | | | | | The compiler doesn't see that buffers is set in the !image case and used in the !image case. Reviewed-by: Matt Turner <[email protected]>
* nir: Recognize and reduce duplicated fsats.Eric Anholt2015-02-181-0/+2
| | | | | | | | No effect on vc4 shader-db. v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering fsat.Eric Anholt2015-02-182-1/+3
| | | | | | | | | | vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44356 -> 44354 (-0.00%) instructions in affected programs: 55 -> 53 (-3.64%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering ffma.Eric Anholt2015-02-182-1/+3
| | | | | | | | | | | | vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13966 -> 13791 (-1.25%) uniforms in affected programs: 435 -> 260 (-40.23%) total instructions in shared programs: 44732 -> 44356 (-0.84%) instructions in affected programs: 9599 -> 9223 (-3.92%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering fneg/ineg.Eric Anholt2015-02-182-0/+12
| | | | | | | | | | | vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44911 -> 44732 (-0.40%) instructions in affected programs: 11371 -> 11192 (-1.57%) v2: Fix broken iabs(isub(0, a)) transformation. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering fsqrt(x) to frcp(frsqrt(x)).Eric Anholt2015-02-182-1/+3
| | | | | | | | | | | | vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13972 -> 13966 (-0.04%) uniforms in affected programs: 408 -> 402 (-1.47%) total instructions in shared programs: 44973 -> 44911 (-0.14%) instructions in affected programs: 1551 -> 1489 (-4.00%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add lowering of POW instructions if the lower flag is set.Eric Anholt2015-02-181-0/+1
| | | | | | | | | | This could be done in a separate pass like we do in GLSL IR, but it seems to me like having the definitions of the transformations in the two directions next to each other makes a lot of sense. v2: Reorder the comment about the transformation. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Conditionalize the POW reconstruction on shader compiler options.Eric Anholt2015-02-183-2/+6
| | | | | | | | | | | | | Mesa has a shader compiler struct flagging whether GLSL IR's opt_algebraic and other passes should try and generate certain types of opcodes or patterns. Extend that to NIR by defining our own struct, which is automatically generated from the Mesa struct in glsl_to_nir and provided directly by the driver in TGSI-to-NIR. v2: Split out the previous two prep patches. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v2)
* nir: Add an optional expression controlling nir_algebraic xforms.Eric Anholt2015-02-181-7/+32
| | | | | | | | | | | | This will be used so that we can customize the transforms for the target GPU, so we don't un-lower expressions that had already been lowered (or introduce new lowering transformations that not all GPUs want) v2: Drop the complication of having the condition->index dictionary, since we don't actually expect there to be many different conditions (change by Kenneth). Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add a nir_shader_compiler_options struct pointed to by the shaders.Eric Anholt2015-02-184-4/+40
| | | | | | | | | This will be used to give the optimization passes a chance to customize behavior for the particular target device. v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* i965/simd8vs: Fix SIMD8 atomics (read-only)Jordan Justen2015-02-181-8/+16
| | | | | | | | | | | | | | | | An update for d9cd982d556be560af3bcbcdaf62b6b93eb934a5. A similar change was needed for CS to allow the piglit test tests/spec/arb_compute_shader/execution/simple-barrier-atomics.shader_test to pass. The previous change (d9cd982d) should fix cases that write atomics, such as atomicCounterIncrement, and this change will fix cases than only read atomics, such as atomicCounter. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* ilo: fix PCB alloc asserts on Gen7.5 GT3Chia-I Wu2015-02-181-1/+5
| | | | GT3 has two slices and all limits are doubled.
* ilo: fix compiler warningsChia-I Wu2015-02-183-8/+12
| | | | | Fix -Wmaybe-uninitialized warnings. The change to ilo_blit_resolve_slices_for_hiz() is a potential bug fix.
* i915: For the love of all that is holy, stop saying "IGD"Adam Jackson2015-02-182-9/+9
| | | | | | | a001 and a011 are pineview chips. Say so. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Adam Jackson <[email protected]>
* auxiliary/vl: honour the DRI2PROTO_CFLAGSEmil Velikov2015-02-181-0/+1
| | | | | | | | Otherwise for non-default installations the build will fail to find the headers and error out. Cc: "10.5" <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* auxiliary/vl: Build vl_winsys_dri.c only when needed.Emil Velikov2015-02-182-1/+13
| | | | | | | | | | | | With commit c39dbfdd0f7(auxiliary/vl: bring back the VL code for the dri targets) we did not fully consider users of dri-swrast alone. Thus we ended up trying to compile the dri2 specific code on platform which lack it - Cygwin for example. Cc: "10.5" <[email protected]> Reported-by: Jon TURNEY <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
* automake: Use AM_DISTCHECK_CONFIGURE_FLAGSEmil Velikov2015-02-181-1/+1
| | | | | | | | | | Currently we use DISTCHECK_CONFIGURE_FLAGS, which is reserved for the user. As with other variables, one should use the AM_ variable within the makefile. Cc: "10.5" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glx: do not leak the dri2 extension informationEmil Velikov2015-02-181-1/+2
| | | | | | | | | | | | The XExtensionInfo is allocated dynamically (if the pointer is NULL) in the XEXT_GENERATE_FIND_DISPLAY macro. On the other hand the macro XEXT_GENERATE_CLOSE_DISPLAY does not check/free the memory. Follow the example set by dri1 and appledri, and use a static variable. Spotted while hunting "still reachable" leaks in Waffle. Signed-off-by: Emil Velikov <[email protected]>
* Revert "radeon/llvm: enable unsafe math for graphics shaders"Michel Dänzer2015-02-181-4/+0
| | | | | | | | | | | This reverts commit 0e9cdedd2e3943bdb7f3543a3508b883b167e427. It caused the grass to disappear in The Talos Principle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89069 Cc: "10.5 10.4" <[email protected]> Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: add ARB_pipeline_statistics_query supportIlia Mirkin2015-02-183-5/+56
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965: implement ARB_pipeline_statistics_queryBen Widawsky2015-02-174-0/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NOTE: The implementation was initially one patch, this. All the history is kept here, even though all the core mesa changes were moved to the parent of this patch. This patch implements ARB_pipeline_statistics_query. This addition to GL does not add a new API. Instead, it adds new tokens to the existing query APIs. The work to hook up the new tokens is trivial due to it's similarity to the previous work done for the query APIs. I've implemented all the new tokens to some degree, but have stubbed out the untested ones at the entry point for Begin(). Doing this should allow the remainder of the code to be left in. The new tokens give GL clients a way to obtain stats about the GL pipeline. Generally, you get the number of things going in, invocations, and number of things coming out, primitives, of the various stages. There are two immediate uses for this, performance information, and debugging various types of misrendering. I doubt one can use these for debugging very complex applications, but for piglit tests, it should be quite useful. Tessellation shaders, and compute shaders are not addressed in this patch because there is no upstream implementation. I've implemented how I believe tessellation shader stats will work for Intel hardware (though there is a bit of ambiguity). Compute shaders are a bit more interesting though, and I don't yet know what we'll do there. For the lazy, here is a link to the relevant part of the spec: https://www.opengl.org/registry/specs/ARB/pipeline_statistics_query.txt Running the piglit tests http://lists.freedesktop.org/archives/piglit/2014-November/013321.html (http://cgit.freedesktop.org/~bwidawsk/piglit/log/?h=pipe_stats) yield the following results: > piglit-run.py -t stats tests/all.py output/pipeline_stats > [5/5] pass: 5 Running Test(s): 5 v2: - Don't allow pipeline_stats to be per stream (Ilia). This may (not sure) be needed for AMD_transform_feedback4, which we do not support. > If AMD_transform_feedback4 is supported then GEOMETRY_SHADER_PRIMITIVES_- > EMITTED_ARB counts primitives emitted to any of the vertex streams for > which STREAM_RASTERIZATION_AMD is enabled. - Remove comment from GL3.txt because it is only used for extensions that are part of required versions (Ilia) - Move the new tokens to a new XML doc instead of using the main GL4x.xml (Ilia) - Add a fallthrough comment (Ilia) - Only divide PS invocations by 4 on HSW+ (Ben) v3: - Add ARB_pipeline_statistics_query to relnotes.html - Add ARB_pipeline_statistics_query.xml to the Makefile.am, and master XML (Ilia) - Correct extension number (Ilia) - Add link to xml in the main GL API xml (Ilia) - remove special GS case from gen6_end_query (Ian) - Make lookup table static so gcc doesn't initialized it on every call (Ian) - Use if (_mesa_has_geometry_shaders(ctx)) instead of explicit checks (Ian) - Core mesa parts moved into a prep patch (Ilia) v4: - Change to 10.6 relnotes since we missed 10.5 window - Moved compute shader stuff into the switch statement (Jordan) - Jordan: Add compute shader support v5: - Fixed relnote style (Ilia) v6: - Rebased on master which beat me to adding the first relnotes - essentially this undoes v5 (which had a typo anyway) - Some code style fixes (Ken) - Remove some excess comments (Ken) - Unify tessellation failure style - unreachable (Ken) - Fix workaround comment for PS invocations (Ken) Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add support for the ARB_pipeline_statistics_query extensionBen Widawsky2015-02-177-0/+136
| | | | | | | | | | | | | | | | | | | | | | | | | This was originally part of a single patch which added the extension, and implemented it for i965 classic. For information about the evolution of the patch, please see the subsequent commit. One difference here as compared to the original mega patch is this does build support for the compute shader query. Since it cannot be tested on any platform, it will always return NULL for now. Jordan has already written a patch to address this, and when that patch lands, this logic can be modified. v2: Fix typo in subject (Brian Paul) Add checks for desktop gl (Ilia) Fail for any callers for now (Ilia) Update QueryCounterBits for new tokens (Ilia) Jordan: Use _mesa_has_compute_shaders Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> v3: Rebased on patch which adds the proper information to unstub tessellation shaders. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add _mesa_has_compute_shadersJordan Justen2015-02-171-0/+11
| | | | | | | | | | v2 (Ben): Change GLboolean to bool as requested by Ian Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Ben Widawsky <[email protected]>
* mesa: Add ARB_tessellation_shader to extension table.Fabian Bieler2015-02-172-0/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Prefer Meta over the BLT for BlitFramebuffer.Kenneth Graunke2015-02-171-7/+7
| | | | | | | | | | | | | | | | | There's some debate about whether we should use Meta or BLORP, but either should run circles around the BLT engine. In particular, this means that Gen8+ will use the 3D engine for blits, like we do on Gen6-7. Improves performance in "copypixrate -blit -back" (from Mesa demos) by 232.037% +/- 3.15795% (n=10) on Broadwell GT3e. v2: Rebase on Laura's changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.5" <[email protected]>
* i965/fs: Add algebraic optimizations for MAD.Matt Turner2015-02-171-0/+43
| | | | | | | | | total instructions in shared programs: 5764176 -> 5763808 (-0.01%) instructions in affected programs: 25121 -> 24753 (-1.46%) helped: 164 HURT: 2 Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Emit MAD instructions when possible.Matt Turner2015-02-172-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we didn't emit MAD instructions since they cannot take immediate arguments, but with the opt_combine_constants() pass we can handle this properly. total instructions in shared programs: 5920017 -> 5733278 (-3.15%) instructions in affected programs: 3625153 -> 3438414 (-5.15%) helped: 22017 HURT: 870 GAINED: 91 LOST: 49 Without constant pooling, this patch is a complete loss: total instructions in shared programs: 5912589 -> 5987888 (1.27%) instructions in affected programs: 3190050 -> 3265349 (2.36%) helped: 1564 HURT: 17827 GAINED: 27 LOST: 101 And since the constant pooling patch by itself hurt a bunch of things, from before constant pooling to this patch the results are: total instructions in shared programs: 5895414 -> 5747946 (-2.50%) instructions in affected programs: 3617993 -> 3470525 (-4.08%) helped: 20478 HURT: 4469 GAINED: 54 LOST: 146 Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Allow immediates in MAD and LRP instructions.Matt Turner2015-02-172-3/+33
| | | | | | | | | | | | | | | And then the opt_combine_constants() pass will pull them out into registers. This will allow us to do some algebraic optimizations on MAD and LRP. total instructions in shared programs: 5946656 -> 5931320 (-0.26%) instructions in affected programs: 778247 -> 762911 (-1.97%) helped: 3780 HURT: 6 GAINED: 12 LOST: 12 Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add pass to combine immediates.Matt Turner2015-02-174-0/+287
| | | | | | | | | | | | | total instructions in shared programs: 5885407 -> 5940958 (0.94%) instructions in affected programs: 3617311 -> 3672862 (1.54%) helped: 3 HURT: 23556 GAINED: 31 LOST: 165 ... but will allow us to always emit MAD instructions. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Remove force_writemask_all assertion for execsize < 8.Matt Turner2015-02-171-1/+0
| | | | | | | This doesn't seem to be necessary. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974 Reviewed-by: Kenneth Graunke <[email protected]>