summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* r600: Use DMA transfers in r600_copy_global_bufferNiels Ole Salscheider2014-10-072-17/+43
| | | | | | v2: Do not demote items that are already in the pool Signed-off-by: Niels Ole Salscheider <[email protected]>
* glsl: Optimize min/max expression treesIago Toral Quiroga2014-10-074-0/+478
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Original patch by Petri Latvala <[email protected]>: Add an optimization pass that drops min/max expression operands that can be proven to not contribute to the final result. The algorithm is similar to alpha-beta pruning on a minmax search, from the field of AI. This optimization pass can optimize min/max expressions where operands are min/max expressions. Such code can appear in shaders by itself, or as the result of clamp() or AMD_shader_trinary_minmax functions. This optimization pass improves the generated code for piglit's AMD_shader_trinary_minmax tests as follows: total instructions in shared programs: 75 -> 67 (-10.67%) instructions in affected programs: 60 -> 52 (-13.33%) GAINED: 0 LOST: 0 All tests (max3, min3, mid3) improved. A full shader-db run: total instructions in shared programs: 4293603 -> 4293575 (-0.00%) instructions in affected programs: 1188 -> 1160 (-2.36%) GAINED: 0 LOST: 0 Improvements happen in Guacamelee and Serious Sam 3. One shader from Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of dropping of a (constant float (0.00000)) operand, which was compiled to a saturate modifier. Version 2 by Iago Toral Quiroga <[email protected]>: Changes from review feedback: - Squashed various cosmetic changes sent by Matt Turner. - Make less_all_components return an enum rather than setting a class member. (Suggested by Mat Turner). Also, renamed it to compare_components. - Make less_all_components, smaller_constant and larger_constant static. (Suggested by Mat Turner) - Change mixmax_range to call its limits "low" and "high" instead of "range[0]" and "range[1]". (Suggested by Connor Abbot). - Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by Connor Abbot). - Make the logic more clearer by rearrenging the code and commenting. (Suggested by Connor Abbot). - Added comment to explain why we need to recurse twice. (Suggested by Connor Abbot). - If we cannot prune an expression, do not return early. Instead, attempt to prune its children. (Suggested by Connor Abbot). Other changes: - Instead of having a global "valid" visitor member, let the various functions that can determine this status return a boolean and check for its value to decide what to do in each case. This is more flexible and allows to recurse into children of parents that could not be prunned due to invalid ranges (so related to the last bullet in the review feedback). - Make sure we always check if a range is valid before working with it. Since any use of get_range, combine_range or range_intersection can invalidate a range we should check for this situation every time we use any of these functions. Version 3 by Iago Toral Quiroga <[email protected]>: Changes from review feedback: - Now we can make get_range, combine_range and range_intersection static too (suggested by Connor Abbot). - Do not return NULL when looking for the larger or greater constant into mixed vector constants. Instead, produce a new constant by doing a component-wise minmax. With this we can also remove of the validations when we call into these functions (suggested by Connor Abbot). - Add a comment explaining the meaning of the baserange argument in prune_expression (suggested by Connor Abbot). Other changes: - Eliminate minmax expressions operating on constant vectors with mixed values by resolving them. No piglit regressions observed with Version 3. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861 Reviewed-by: Connor Abbott <[email protected]>
* glsl: do not emit error for non written varyings on OpenGL ESTapani Pälli2014-10-071-2/+16
| | | | | | | | | | | Patch fixes following test case from 'shaders-with-varyings' WebGL conformance suite: "vertex shader with unused varying and fragment shader with used varying must succeed" v2: emit still a warning if the condition happens (Ian) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* radeonsi: Use dummy pixel shader if compilation of the real shader failedMichel Dänzer2014-10-073-7/+22
| | | | | | | Instead of crashing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79155#c5 Reviewed-by: Marek Olšák <[email protected]>
* ilo: let shaders determine surface countsChia-I Wu2014-10-069-202/+267
| | | | | | | | When a shader needs N surfaces, we should upload N surfaces and not depend on how many are bound. This commit is larger than it should be because we did not export how many surfaces a surface uses before. Signed-off-by: Chia-I Wu <[email protected]>
* ilo: let shaders determine sampler countsChia-I Wu2014-10-0413-87/+98
| | | | | | | When a shader needs N samplers, we should upload N samplers and not depend on how many are bound. Signed-off-by: Chia-I Wu <[email protected]>
* tgsi: change tgsi_shader_info::properties to a one-dimensional arrayMarek Olšák2014-10-0413-24/+23
| | | | | | Reviewed-by: Roland Scheidegger <[email protected]> v2: fix svga too
* radeonsi: set number of userdata SGPRs of GS copy shader to 4Marek Olšák2014-10-043-10/+23
| | | | | | | It only needs the constant buffer with clip planes and read-write resources for the GS->VS ring and streamout. That's 2 pointers. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: pass the GS shader directly to si_generate_gs_copy_shaderMarek Olšák2014-10-041-3/+3
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: set LLVMByValAttribute for all descriptor arraysMarek Olšák2014-10-041-10/+7
| | | | | | I hope this is correct. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: make the vertex shader key smallerMarek Olšák2014-10-041-1/+2
| | | | | | We only support 16 vertex attribs, not 32. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't flush shader caches when building PM4 shader statesMarek Olšák2014-10-041-8/+0
| | | | | | | | | This is a wrong place to flush caches to say the least. I don't think we need to flush the instruction caches if we don't patch shaders with DMA. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove interp_at_sample from the key, use TGSI_INTERPOLATE_LOC_SAMPLEMarek Olšák2014-10-043-5/+2
| | | | | | | | | st/mesa has the same flag in its shader key, we don't need to do it in the driver anymore. Instead, use TGSI_INTERPOLATE_LOC_SAMPLE, which is what st/mesa sets. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move geometry shader properties from si_shader to si_shader_selectorMarek Olšák2014-10-044-29/+38
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: always compile shaders on demandMarek Olšák2014-10-041-13/+3
| | | | | | | The first compiled shader is sometimes useless, because the key doesn't match the key for the draw call where it's used. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove unused variable si_shader::gs_input_primMarek Olšák2014-10-042-3/+0
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* tgsi: remove some not so useful variables from tgsi_shader_infoMarek Olšák2014-10-046-20/+14
|
* radeonsi: get fs_write_all from tgsi_shader_info directlyMarek Olšák2014-10-043-16/+3
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* tgsi: simplify shader properties in tgsi_shader_infoMarek Olšák2014-10-049-136/+70
| | | | Use an array of properties indexed by TGSI_PROPERTY_* definitions.
* radeonsi: get tgsi_shader_info only once before compilationMarek Olšák2014-10-043-21/+16
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* gallium/util: add util_bitcount64Marek Olšák2014-10-041-0/+12
| | | | | | | | I'll need this in radeonsi. v2: use __builtin_popcountll if available Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix CS tracing and remove excessive CS dumpingMarek Olšák2014-10-043-35/+25
|
* gk110/ir: add dnz flag emission for fmul/fmadIlia Mirkin2014-10-031-0/+4
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* gm107/ir: add dnz emission for fmulIlia Mirkin2014-10-031-1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.3" <[email protected]>
* st/wgl: add WINAPI qualifiers on wgl function typedefsBrian Paul2014-10-031-2/+2
| | | | | | | | Fixes a release build segfault when wglCreateContextAttribsARB() calls the wglCreateContext() function. Cc: "10.3" <[email protected]> Reviewed-by: Matthew McClure <[email protected]>
* freedreno: query fixesRob Clark2014-10-033-8/+13
| | | | | | | Fixes a few issues, including a potential empty-IB (which triggers gpu hangs in piglit occlusion_query_meta_no_fragments) Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: handle VS only outputting BCOLORRob Clark2014-10-031-2/+10
| | | | | | | Possibly we should map the front color to black (zeroes). But not sure there is a way to do that without generating a shader variant. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix lockups with lame FRAG shadersRob Clark2014-10-034-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | Shaders like: FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0], LOCAL IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D 1: MOV OUT[0], IMM[0].xyxx 2: END cause unhappyness. They have an IN[], but once this is compiled the useless TEX instruction goes away. Leaving a varying that is never fetched, which makes the hw unhappy. In the process fix a signed vs unsigned compare. If the vertex shader has max_reg=-1, MAX2() vs an unsigned would not give the desired result. Signed-off-by: Rob Clark <[email protected]>
* i965/compaction: Disable compaction on SNB temporarily.Matt Turner2014-10-031-0/+6
| | | | Will investigate after XDC.
* Revert "i965: Emit ELSE/ENDIF JIP with type D on Gen 7."Matt Turner2014-10-031-2/+2
| | | | | | | | This reverts commit 54e30dbf4db437748509d1319c3f6e4185f76c69. Will investigate after XDC. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84557
* i965/fs: Remove dead generate_rep_fb_write prototype.Matt Turner2014-10-031-1/+0
| | | | Added in commit f9dc7aab.
* mesa: fix spurious wglGetProcAddress / GL_INVALID_OPERATION errorBrian Paul2014-10-031-1/+35
| | | | | | | | | | | | | | | | | | On Windows, the Piglit primitive-restart test was failing a glGetError()==0 assertion when it was run w/out any command line arguments. Piglit's all.py script only runs primitive-restart with arguments so this case isn't normally hit during a full piglit run. The basic problem is Microsoft's opengl32.dll calls glFlush from wglGetProcAddress() and Piglit uses wglGetProcAddress() to resolve glPrimitiveRestartNV() which is called inside glBegin/End. See comments in the code for more info. Plus, improve the comments for _mesa_alloc_dispatch_table(). Cc: <[email protected]> Acked-by: Sinclair Yeh <[email protected]>
* freedreno/ir3: add TXF supportIlia Mirkin2014-10-021-1/+39
| | | | | | | Still failing a bunch of the fairly picky texelFetch tests, but the 1D(Array) ones are full passes. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add TXD support and expose ARB_shader_texture_lodIlia Mirkin2014-10-023-9/+56
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add texture offset supportIlia Mirkin2014-10-021-4/+45
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: shadow comes before arrayIlia Mirkin2014-10-021-2/+2
| | | | | | | Experimentally, this makes *ArrayShadow tex-miplevel-selection tests pass. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: make TXQ return integers, not floatsIlia Mirkin2014-10-021-1/+1
| | | | | | We're still doing something wrong for array textures. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add UMAD supportIlia Mirkin2014-10-021-4/+15
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add ISSG supportIlia Mirkin2014-10-021-0/+39
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add MOD supportIlia Mirkin2014-10-021-8/+12
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add UMOD support, based on UDIVIlia Mirkin2014-10-021-6/+31
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add IDIV/UDIV supportIlia Mirkin2014-10-021-3/+197
| | | | | | Logic shamelessly copied from nv50 lowering pass. Signed-off-by: Ilia Mirkin <[email protected]>
* radeonsi: Clear sampler view flags when binding a bufferMichel Dänzer2014-10-031-0/+5
| | | | | | | | | Fixes assertion failure while running the Unreal Engine 4 Elemental demo: .../si_blit.c:322:si_decompress_color_textures: Assertion `tex->cmask.size || tex->fmask.size' failed. Cc: "10.2 10.3" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vc4: Add support for framebuffer sRGB encoding.Eric Anholt2014-10-021-2/+31
|
* vc4: Add support for sampling from sRGB.Eric Anholt2014-10-022-9/+51
| | | | | | | | This isn't perfect -- the filtering is happening on the srgb values, and we're decoding afterwards, which is not what you want. I think that's the cause of some additional texwrap(GL_CLAMP, LINEAR) failures, though many other texwrap tests on srgb start to pass since unfiltered values come out correct.
* freedreno/ir3: avoid fan-in sources referring to same instructionIlia Mirkin2014-10-021-2/+10
| | | | | | | | | | Since the RA has to be done s.t. each one gets its own (adjacent) register, it would complicate matters if instructions were allowed to be repeated. This enables copy-propagation use in situations where previously that might have happened. Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: emit all immediates in one shotRob Clark2014-10-021-8/+16
| | | | | | | Makes the command stream a bit tighter when there are lots of immediates. Signed-off-by: Rob Clark <[email protected]>
* freedreno: instanced drawing/compute not yet supportedIlia Mirkin2014-10-021-3/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* mesa: fix GetTexImage for 1D array depth texturesDave Airlie2014-10-031-2/+7
| | | | | | | | | | | | | | | | While running piglit in virgl, I hit an assert in intel driver. "qemu-system-x86_64: intel_tex.c:219: intel_map_texture_image: Assertion `tex_image->TexObject->Target != 0x8C18 || h == 1' failed." Thanks to Eric and Ken for pointing me in the right direction, Fix the get_tex_depth to do the same fixup as get_tex_rgba does for 1D array textures. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: [email protected] Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: Fix paths used in Android buildsTomasz Figa2014-10-033-0/+6
| | | | | | | | | | | | | | | | | | With current makefiles the build fails because source and build paths are generated incorrectly. With Android build system the top_srcdir and top_builddir variables are undefined and all paths are relative to where Android.mk is located. This ends up with path likes external/mesa/src/mesa/src/mesa/ for both source and build paths, which are obviously wrong. This patch fixes this by overriding resulting SRCDIR and BUILDDIR variables with empty string, so that paths end up being relative to Android.mk file again. Appending correct build path to generated files is already done in Android.gen.mk. Signed-off-by: Tomasz Figa <[email protected]> CC: <[email protected]> Reviewed-by: Emil Velikov <[email protected]>