Commit message | Author | Age | Files | Lines
* nir/phi_builder: Don't recurse in value_get_block_def | Jason Ekstrand | 2016-08-25 | 1 | -29/+36
    In some programs, we can have very deep dominance trees, and the recursion puts us at risk of stack overflows. Instead, we replace the recursion with a pair of loops, one at the start and one at the end. This is functionally equivalent to what we had before, and it's actually a bit easier to read in the new form without the recursion.
    Signed-off-by: Jason Ekstrand <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225
    Reviewed-by: Connor Abbott <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
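The "pair of loops" idea can be sketched as follows. This is a minimal illustration, not the nir_phi_builder code: `struct block`, `NEEDS_DEF`, `MAX_BLOCKS`, and `get_block_def()` are hypothetical names, and definitions are modeled as plain integers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 8
#define NEEDS_DEF (-1)   /* hypothetical sentinel: no definition cached yet */

/* Hypothetical, simplified stand-in for a block in the dominance tree. */
struct block {
   int index;
   struct block *imm_dom;   /* immediate dominator, NULL for the root */
};

/* Returns the definition visible in `block`, caching it in every block
 * visited on the way up. `undef` is returned if no dominator has a def. */
static int
get_block_def(int defs[MAX_BLOCKS], struct block *block, int undef)
{
   /* First loop: climb the dominance tree until a def is found. */
   struct block *dom = block;
   while (dom != NULL && defs[dom->index] == NEEDS_DEF)
      dom = dom->imm_dom;

   int def = (dom != NULL) ? defs[dom->index] : undef;

   /* Second loop: walk the same path again and fill in the cache. */
   for (dom = block; dom != NULL && defs[dom->index] == NEEDS_DEF;
        dom = dom->imm_dom)
      defs[dom->index] = def;

   return def;
}
```

Stack usage stays constant regardless of tree depth, because both passes only follow `imm_dom` pointers.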
* .mailmap: Update my address again | Chad Versace | 2016-08-25 | 1 | -4/+5
    I joined Google's Chrome OS graphics team.
* nir: Walk blocks in source code order in lower_vars_to_ssa. | Matt Turner | 2016-08-25 | 2 | -106/+106
    Prior to this commit, rename_variables_block() was called recursively, performing a depth-first traversal of the control flow graph. The function uses a non-trivial amount of stack space for local variables, which puts us in danger of smashing the stack, given a sufficiently deep dominance tree.
    XCOM: Enemy Within contains a shader with such a dominance tree (1574 nir_blocks in total, depth of at least 143).
    Jason tells me that he believes that any walk over the nir_blocks that respects dominance is sufficient (a DFS might have been necessary prior to the introduction of nir_phi_builder).
    In fact, the introduction of nir_phi_builder made the problem worse: rename_variables_block() walks to the bottom of the dominance tree before calling nir_phi_builder_value_get_block_def(), which walks back to the top of the dominance tree... In any case, this patch ensures we avoid that problem as well.
    Cc: [email protected]
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225
    Reviewed-by: Connor Abbott <[email protected]>
* radeonsi: don't use allocas for arrays with LLVM 3.8 | Marek Olšák | 2016-08-25 | 1 | -1/+3
    It crashes.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97413
* gallium/radeon: unify and simplify checking for an empty gfx IB | Marek Olšák | 2016-08-25 | 3 | -27/+23
    We can take advantage of the fact that multi_fence does the obvious thing with NULL fences. This fixes unflushed fences that can get stuck due to empty IBs.
* mesa: Drop sed of now dead Plo files. | Matt Turner | 2016-08-25 | 1 | -3/+0
    gen6/7/8_blorp.c were removed in commits c8bc1ae96a, e198983c61, and 16a9fcbbb6 respectively.
* meta: Always do GenerateMipmaps in linear colorspace. | Kenneth Graunke | 2016-08-25 | 1 | -2/+10
    When generating mipmaps for sRGB textures, force both decode and encode, so the filtering is done in linear colorspace, regardless of settings.
    Fixes a WebGL conformance test in Chrome: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/misc/tex-srgb-mipmap.html?webglVersion=2
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97322
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Topi Pohjolainen <[email protected]>
* configure.ac: raise Mako required version to 0.8.0 | Eric Engestrom | 2016-08-25 | 1 | -1/+1
    It seems [0] old versions of Mako are no longer supported. Emil mentioned it might need v0.8.0 [1] for isl_format_layout [2], although I didn't get a confirmation that it's really the minimum. Let's raise it to that to avoid getting other bugs. We might lower it a bit again later if it turns out we can.
    [0] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122772.html
    [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122775.html
    [2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123278.html
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* swrast: fix incorrectly positioned putImage() in swrast driver | Brian Paul | 2016-08-25 | 1 | -2/+2
    Some front buffer rendering was in the wrong position. This included scissored clears, glDrawPixels and glCopyPixels.
    The problem was that the y coordinate passed to putImage() didn't match the y coordinate passed to getImage(). We fix this by setting xrb->map_y to the inverted coordinate in swrast_map_renderbuffer(), which is used later by the putImage() call. Also pass xrb->map_y to getImage() to be symmetric.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97426
    Cc: <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
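As a rough illustration of what an "inverted coordinate" means here: the formula below is the usual flip between a bottom-left and a top-left origin, and is an assumption about the fix, not code taken from the commit. `invert_y()` is a hypothetical helper.

```c
#include <assert.h>

/* Convert the y origin of a rectangle of height `h` between a bottom-left
 * and a top-left coordinate system on a drawable `drawable_height` rows
 * tall. Hypothetical helper, not part of the swrast driver. */
static int
invert_y(int drawable_height, int y, int h)
{
   return drawable_height - y - h;
}
```

Because the same value is computed once and reused for both calls, getImage() and putImage() cannot disagree about where the mapped region sits.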
* radeonsi: disable SDMA texture copying on Carrizo | Marek Olšák | 2016-08-25 | 1 | -0/+6
    Cc: 12.0 <[email protected]>
    Reviewed-by: Michel Dänzer <[email protected]>
* gallium/noop: use 3-space indentation | Marek Olšák | 2016-08-25 | 2 | -292/+292
    Reviewed-by: Brian Paul <[email protected]>
* gallium: add a pipe_context parameter to resource_get_handle | Marek Olšák | 2016-08-25 | 21 | -16/+46
    radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL interop and this is the only way to make it coherent with the current context. It can optionally be set to NULL.
    Reviewed-by: Brian Paul <[email protected]>
* st/mesa: fix sRGB BlitFramebuffer regression | Nicolai Hähnle | 2016-08-25 | 1 | -16/+18
    Broken since: 3190c7ee9727161d627f107c2e7f8ec3a11941c1
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97285
    Tested-by: Edmondo Tommasina <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* loader/dri3: Overhaul dri3_update_num_back | Michel Dänzer | 2016-08-25 | 1 | -9/+6
    Always use 3 buffers when flipping. With only 2 buffers, we have to wait for a flip to complete (which takes non-0 time even with asynchronous flips) before we can start working on the next frame. We were previously only using 2 buffers for flipping if the X server supports asynchronous flips, even when we're not using asynchronous flips. This could result in bad performance (the referenced bug report is an extreme case, where the inter-frame stalls were preventing the GPU from reaching its maximum clocks).
    I couldn't measure any performance boost using 4 buffers with flipping. Performance actually seemed to go down slightly, but that might have been just noise.
    Without flipping, a single back buffer is enough for swap interval 0, but we need to use 2 back buffers when the swap interval is non-0, otherwise we have to wait for the swap interval to pass before we can start working on the next frame. This condition was previously reversed.
    Cc: "12.0 11.2" <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97260
    Reviewed-by: Frank Binns <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
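The buffer-count policy described above reduces to a small decision function. This is a sketch of the stated rules only; `num_back_buffers()` and its parameters are hypothetical and do not mirror the loader's actual structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of back buffers to allocate, following the rules in the commit
 * message. Hypothetical helper; the real code operates on the drawable. */
static int
num_back_buffers(bool flipping, int swap_interval)
{
   if (flipping)
      return 3;   /* never wait for a flip to complete */
   if (swap_interval != 0)
      return 2;   /* don't wait for the swap interval to pass */
   return 1;      /* swap interval 0 without flipping */
}
```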
* anv: Include the pipeline layout in the shader hash | Jason Ekstrand | 2016-08-24 | 4 | -4/+40
    The pipeline layout affects shader compilation because it is what determines binding table locations as well as whether or not a particular buffer has dynamic offsets. Since this affects the generated shader, it needs to be in the hash. This fixes a bunch of CTS tests now that the CTS is using a pipeline cache.
    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
    Cc: "12.0" <[email protected]>
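To see why the layout has to feed the cache key, consider this toy sketch. It uses FNV-1a as a stand-in hash and a made-up `struct toy_binding`; neither is what anv uses, and the names are purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened view of one pipeline layout entry. */
struct toy_binding {
   uint32_t set;
   uint32_t binding;
   uint32_t dynamic_offset;   /* 1 if the buffer uses a dynamic offset */
};

/* FNV-1a, used here only as a simple stand-in for the real hash. */
static uint64_t
fnv1a(uint64_t h, const void *data, size_t size)
{
   const unsigned char *p = data;
   for (size_t i = 0; i < size; i++) {
      h ^= p[i];
      h *= 0x100000001b3ULL;
   }
   return h;
}

/* The key covers the shader source AND everything in the layout that can
 * change the compiled result. */
static uint64_t
shader_cache_key(const void *code, size_t code_size,
                 const struct toy_binding *layout, size_t count)
{
   uint64_t h = fnv1a(0xcbf29ce484222325ULL, code, code_size);
   for (size_t i = 0; i < count; i++) {
      h = fnv1a(h, &layout[i].set, sizeof(layout[i].set));
      h = fnv1a(h, &layout[i].binding, sizeof(layout[i].binding));
      h = fnv1a(h, &layout[i].dynamic_offset,
                sizeof(layout[i].dynamic_offset));
   }
   return h;
}
```

If the layout were left out, two pipelines with identical shader code but different binding tables would share a key, and the cache would hand back a shader compiled for the wrong layout.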
* anv: Add a --disable-vulkan-icd-full-driver-path option | Jason Ekstrand | 2016-08-25 | 3 | -2/+15
    This option makes installed Vulkan ICD files contain only a driver library name and not a path. This is intended for distros to help them work around multi-arch issues.
    Reviewed-by: Dave Airlie <[email protected]>
* i965/fs: Don't consider the stencil output to be a color output. | Francisco Jerez | 2016-08-24 | 1 | -1/+2
    This would cause gl_FragStencilRef to be counted as a color output incorrectly during the precompile phase, which leads to unnecessary recompilation on master and could trigger an assertion failure in fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Keep track of the set of fragment outputs read by a GL program. | Francisco Jerez | 2016-08-24 | 2 | -0/+4
    This is the set of shader outputs whose initial value is provided to the shader by some external means when the shader is executed, rather than computed by the shader itself.
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Don't consider read-only fragment outputs to be written to. | Francisco Jerez | 2016-08-24 | 1 | -1/+1
    Since they cannot be written. This prevents adding fragment outputs to the OutputsWritten set that are only read from via the gl_LastFragData array but never written to.
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/linker: Allow fragment output overlap for gl_LastFragData. | Francisco Jerez | 2016-08-24 | 1 | -0/+3
    gl_LastFragData overlaps gl_FragData by definition.
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/ast: Allow redeclaration of gl_LastFragData with different precision qualifier. | Francisco Jerez | 2016-08-24 | 1 | -0/+12
    v2: No need to check the GLSL version. (Ken)
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Don't attempt to do dead varying elimination on gl_LastFragData arrays. | Francisco Jerez | 2016-08-24 | 1 | -3/+4
    Apparently this pass can only handle elimination of a single built-in fragment output array, so the presence of gl_LastFragData (which it wouldn't split correctly anyway) could prevent it from splitting the actual gl_FragData array. Just match gl_FragData by name since it's the only built-in it can handle.
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Define a gl_LastFragData built-in for older GLSL versions. | Francisco Jerez | 2016-08-24 | 1 | -0/+10
    The EXT_shader_framebuffer_fetch extension defines alternative language for GLES2 shaders where user-defined fragment outputs are not allowed. Instead of using inout user-defined fragment outputs, the shader is expected to read from the gl_LastFragData built-in array. In addition, this allows using the same language on desktop GLSL versions prior to 4.2 that support the deprecated gl_FragData built-in, in preparation for the MESA_shader_framebuffer_fetch desktop GL extension.
    Both legacy and user-defined inout outputs have a common representation at the GLSL IR level, so it shouldn't make any difference for optimization passes and back-ends whether the application is using gl_LastFragData or user-defined outputs: all they'll see is a variable dereference of a fragment output at a certain interface location with the fb_fetch_output bit set to one.
    v2: Don't define the built-in variable on GLSL versions for which gl_FragData exists but is deprecated. (Ken)
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Handle the inout qualifier in fragment shader output declarations. | Francisco Jerez | 2016-08-24 | 2 | -1/+16
    According to the EXT_shader_framebuffer_fetch extension, the inout qualifier can be used on ESSL 3.0+ shaders to declare a special kind of fragment output that gets implicitly initialized with the previous framebuffer contents at the current fragment coordinates. In addition, we allow using the same language to define FB fetch outputs in GLSL 1.3+ shaders, in preparation for the desktop MESA_shader_framebuffer_fetch extensions.
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add support for representing framebuffer fetch in the GLSL IR. | Francisco Jerez | 2016-08-24 | 2 | -0/+9
    The GLSL IR representation of framebuffer fetch amounts to a single bit in the ir_variable object applicable to fragment shader outputs. The flag indicates that the variable will be implicitly initialized to the previous contents of the render buffer at the same fragment coordinates and sample index.
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add parser state enables for the framebuffer fetch extensions. | Francisco Jerez | 2016-08-24 | 2 | -0/+14
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add blend barrier entry point and driver hook. | Francisco Jerez | 2016-08-24 | 3 | -0/+29
    Both MESA_shader_framebuffer_fetch_non_coherent and the non-coherent variant of KHR_blend_equation_advanced will use this driver hook to request coherency between framebuffer reads and writes. This intentionally doesn't hook up glBlendBarrierMESA to the dispatch layer since the extension isn't exposed to applications yet, see [1] for more details.
    [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Move shader memory barrier functions into barrier.c. | Francisco Jerez | 2016-08-24 | 4 | -57/+57
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Rename "texturebarrier" source files to "barrier". | Francisco Jerez | 2016-08-24 | 5 | -15/+15
    In preparation for collecting all pipeline barrier GL entry points into a single source file.
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add support for querying GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT. | Francisco Jerez | 2016-08-24 | 3 | -0/+14
    This can currently only return true, since the only way you can expose EXT_shader_framebuffer_fetch right now is by flipping the MESA_shader_framebuffer_fetch bit, but that could potentially change in the future, see [1] for an explanation.
    [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add extension enables for framebuffer fetch extensions. | Francisco Jerez | 2016-08-24 | 2 | -0/+3
    This allows drivers to expose EXT_shader_framebuffer_fetch in GLES2+ contexts if desired. Note that this adds boolean flags for two MESA extensions, but only the EXT GLES-only extension is exposed for the moment, see the cover letter of this series [1] for the rationale.
    [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html
    Reviewed-by: Kenneth Graunke <[email protected]>
* glapi: Add XML for GL_EXT_shader_framebuffer_fetch. | Francisco Jerez | 2016-08-24 | 1 | -0/+5
    Reviewed-by: Kenneth Graunke <[email protected]>
* nvc0: invalidate textures/samplers on GK104+ | Samuel Pitoiset | 2016-08-24 | 2 | -12/+22
    As on Fermi, textures and samplers are aliased between 3D and compute, especially the TIC_FLUSH/TSC_FLUSH methods, and we have to re-validate these resources when switching between the two pipelines.
    This fixes a GPU hang with Elemental (and most likely with other UE4 demos). Tested on GK107 and GM107.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    CC: <[email protected]>
* gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization | Rhys Kidd | 2016-08-24 | 1 | -1/+0
    The duplicate line is currently on line 1535. Identified by Clang, when run through Eric Anholt's Travis harness.
    Signed-off-by: Rhys Kidd <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers. | Eric Anholt | 2016-08-24 | 1 | -6/+7
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Rhys Kidd <[email protected]>
* travis: Enable vc4 in libdrm to satisfy vc4 test build dependency. | Eric Anholt | 2016-08-24 | 1 | -1/+1
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Rhys Kidd <[email protected]>
* travis: Update to the Ubuntu Trusty image. | Eric Anholt | 2016-08-24 | 1 | -1/+2
    This will hopefully fix wget from x.org (no real reason explained in Travis CI bug reports), and may also mean that we can enable LLVM driver builds.
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Rhys Kidd <[email protected]>
* travis: Parse configure.ac to pick an updated LIBDRM_VERSION. | Eric Anholt | 2016-08-24 | 1 | -0/+10
    Travis has been broken a couple of times by configure.ac updates. To make it useful, auto-update the version necessary. This could potentially be used for other dependencies, too, but those get bumped less frequently.
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Rhys Kidd <[email protected]>
* anv: meta_blit2d: adapt texel fetch pitch for fake w-tiled | Lionel Landwerlin | 2016-08-24 | 1 | -1/+3
    We need to compute detiling coordinates using the physical size of W tiling (128x32) rather than the logical size (64x64).
    v2: Correct comment (Jason)
    Fixes dEQP-VK.api.copy_and_blit.image_to_image_stencil
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97448
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* vc4: Fix GPU hangs with >16 varying values. | Eric Anholt | 2016-08-24 | 2 | -19/+68
    Fixes glsl-routing in piglit and hangs in glbenchmark 2.0.2.
* vl/rbsp: fix another three byte not detected | Leo Liu | 2016-08-24 | 1 | -1/+1
    This happens when the three-byte sequence "00 00 03" is only partly loaded into vlc->buffer: the valid bits at the bottom of the buffer end in "00" or "00 00", and the remaining "00 03" or "03" is still in the data, so the three-byte emulation check does not detect the sequence. The reason is that the escaped bit was set to 0 by the rbsp init.
    Signed-off-by: Leo Liu <[email protected]>
    Acked-by: Christian König <[email protected]>
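The boundary case can be illustrated with a byte-wise sketch that carries its state across chunks. This is not the vl/rbsp implementation, which works on a bit buffer; `struct rbsp_state` and `rbsp_unescape()` are hypothetical names for a simplified technique that shows the same split-sequence problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State that must survive between chunks: how many consecutive 0x00
 * bytes were seen at the end of the data processed so far. */
struct rbsp_state {
   unsigned zeros;
};

/* Copy `n` bytes from `in` to `out`, dropping each emulation prevention
 * byte (the 0x03 that follows two 0x00 bytes). Returns bytes written. */
static size_t
rbsp_unescape(struct rbsp_state *st, const uint8_t *in, size_t n,
              uint8_t *out)
{
   size_t written = 0;
   for (size_t i = 0; i < n; i++) {
      if (st->zeros >= 2 && in[i] == 0x03) {
         st->zeros = 0;
         continue;   /* skip the emulation prevention byte */
      }
      st->zeros = (in[i] == 0x00) ? st->zeros + 1 : 0;
      out[written++] = in[i];
   }
   return written;
}
```

Resetting `zeros` at every chunk boundary would reproduce the bug described above: a "00 00 03" split across two loads would pass through undetected.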
* radeonsi: fix VM faults due NULL internal const buffers on CIK | Marek Olšák | 2016-08-24 | 1 | -2/+11
    They are harmless, but the interrupts do decrease performance.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039
    Cc: 12.0 <[email protected]>
* gallium/winsys/kms: Look up the GEM handle after importing a prime FD | Tomasz Figa | 2016-08-24 | 1 | -0/+4
    drmPrimeHandleToFD() will return the same GEM handle every time the same buffer is imported, even from a different prime FD. Since GEM handles are not reference counted, we need to make sure that each GEM handle is referenced only by one display target struct, by looking it up in kms_sw->bo_list first and bumping the refcount of the found dt on hit and falling back to creating a new dt only on miss.
    v2: Split into separate function. Use helper function for lookup.
    v3 [Emil Velikov]: Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)
    Signed-off-by: Tomasz Figa <[email protected]>
    CC: <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]> (v2)
    Signed-off-by: Emil Velikov <[email protected]>
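The lookup-then-create pattern described above might look roughly like this. The types and names (`struct toy_dt`, `toy_import()`) are hypothetical stand-ins, not the kms winsys structures, and a plain singly linked list replaces kms_sw->bo_list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical display target: one per GEM handle, reference counted. */
struct toy_dt {
   unsigned handle;
   int ref_count;
   struct toy_dt *next;
};

/* Return the display target for `handle`, reusing an existing one if the
 * handle is already tracked. Returns NULL on allocation failure. */
static struct toy_dt *
toy_import(struct toy_dt **list, unsigned handle)
{
   /* Hit: the same buffer was imported before, so share the struct. */
   for (struct toy_dt *dt = *list; dt != NULL; dt = dt->next) {
      if (dt->handle == handle) {
         dt->ref_count++;
         return dt;
      }
   }

   /* Miss: create a new entry that owns the handle. */
   struct toy_dt *dt = calloc(1, sizeof(*dt));
   if (dt == NULL)
      return NULL;
   dt->handle = handle;
   dt->ref_count = 1;
   dt->next = *list;
   *list = dt;
   return dt;
}
```

With this shape, the handle is released exactly once, when the single owning struct drops its last reference.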
* gallium/winsys/kms: Move display target handle lookup to separate function | Tomasz Figa | 2016-08-24 | 1 | -9/+24
    As a preparation to use the lookup in more than one place, move the code that looks up a given KMS/GEM handle to a separate function. This change should not introduce any functional changes.
    v2: Split into separate patch. Move lookup code into separate function.
    v3 [Emil Velikov]: Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)
    Signed-off-by: Tomasz Figa <[email protected]>
    CC: <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]> (v2)
    Signed-off-by: Emil Velikov <[email protected]>
* gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2) | Tomasz Figa | 2016-08-24 | 1 | -7/+11
    Currently kms_sw_displaytarget_add_from_prime() allocates the struct and fills in only some of the fields, resulting in a half-baked struct that needs to be further completed by the caller. To make this a bit more consistent, pass width, height and stride to this function and fill in everything there, so that the caller can take the returned struct as is.
    v2: Split from one big patch into four fixing one thing at a time.
    Signed-off-by: Tomasz Figa <[email protected]>
    CC: <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* gallium/winsys/kms: Fix double refcount when importing from prime FD (v2) | Tomasz Figa | 2016-08-24 | 1 | -1/+0
    Currently the code creates a display target struct with refcount field initialized to 1 and then the caller again increments it, leading to a leaked reference. Let's remove the unnecessary increment.
    v2: Split from one big patch into four fixing one thing at a time.
    Signed-off-by: Tomasz Figa <[email protected]>
    CC: <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* shaderapi: don't generate not linked error on GetProgramStage in general | Alejandro Piñeiro | 2016-08-24 | 1 | -1/+18
    Neither ARB_shader_subroutine nor the GL core spec lists any error for when the program is not linked. We left an error generation in place for the uniform location, in order to be consistent with other methods from the spec that generate one.
    Reviewed-by: Tapani Pälli <[email protected]>
* gallium/cso: avoid unnecessary null dereference | Eric Engestrom | 2016-08-24 | 1 | -1/+1
    The label `out:` calls `destroy()` which dereferences `ctx`. This is unnecessary as there is nothing to destroy. Immediately return instead.
    CovID: 1258255
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* .gitignore: Ignore tags generated by `make tags` | Eric Engestrom | 2016-08-24 | 1 | -0/+1
    Signed-off-by: Eric Engestrom <[email protected]>
    [Emil Velikov: rebase]
    Reviewed-by: Emil Velikov <[email protected]>
* st/xvmc: fix a couple 'unused-but-set-variable' warnings | Eric Engestrom | 2016-08-24 | 1 | -2/+3
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>