Commit message  Author  Age  Files  Lines
* radeonsi: generate an explicit switch instruction over vertex streamsNicolai Hähnle2016-12-121-8/+13
| | | | | | | | | | | | SimplifyCFG generates a switch instruction anyway when all four streams are present, but is simultaneously not smart enough to eliminate some redundant jumps that it generates. The generated assembly is still a bit silly, probably because the control flow annotation doesn't know how to handle a switch with uniform condition. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fetch only outputs of current vertex stream from the GSVS ringNicolai Hähnle2016-12-121-16/+25
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: only export from GS copy shader for vertex stream 0Nicolai Hähnle2016-12-121-12/+19
| | | | | | | | When running the copy shader for vertex streams != 0, the SX does not need any data from us (there is no rasterization for the higher vertex streams, only streamout). Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: do not export VS outputs from vertex streams != 0Nicolai Hähnle2016-12-121-0/+6
| | | | | | | | | | | | This affects GS copy shaders. When an output is meant for vertex stream != 0, then we don't have to make it available to the pixel shader. There is a minor inefficiency here because the GLSL varying packing pass does not group varyings of the same vertex stream together, but it shouldn't be important in practice. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pull iteration over vertex streams into GS copy shader logicNicolai Hähnle2016-12-121-25/+37
| | | | | | The iteration is not needed for normal vertex shaders. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: group streamout writes by vertex streamNicolai Hähnle2016-12-121-10/+22
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: load the streamout buf descriptors closer to their useNicolai Hähnle2016-12-121-14/+11
| | | | | | LLVM can still decide to hoist the loads since they're marked invariant. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract writing of a single streamout outputNicolai Hähnle2016-12-121-39/+52
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: separate the call to si_llvm_emit_streamout from exportsNicolai Hähnle2016-12-121-4/+4
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: plumb the output vertex_stream through to si_shader_output_valuesNicolai Hähnle2016-12-121-1/+9
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: rename members of si_shader_output_valuesNicolai Hähnle2016-12-121-8/+8
| | | | | | Be a bit more verbose and avoid confusion in future patches. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix an off-by-one error in the bounds check for max_verticesNicolai Hähnle2016-12-121-1/+1
| | | | | | | | | | | The spec actually says that calling EmitStreamVertex is undefined when you exceed max_vertices. But we do need to avoid trampling over memory outside the GSVS ring. Cc: [email protected] Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: do not kill GS with memory writesNicolai Hähnle2016-12-121-8/+22
| | | | | | | | | | | Vertex emits beyond the specified maximum number of vertices are supposed to have no effect, which is why we used to always kill GS that reached the limit. However, if the GS also writes to memory (SSBO, atomics, shader images), then we must keep going and only skip the vertex emit itself. Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
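The emit guard described in the two commits above can be sketched in plain C. This is only an illustration under stated assumptions: `struct gs_ring` and `emit_vertex` are hypothetical names standing in for the GSVS ring and the generated shader code, not radeonsi identifiers. It shows the strict bounds check and the "skip the emit, keep running" behaviour.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the GSVS ring: one slot per emitted vertex. */
struct gs_ring {
    float *slots;          /* storage with exactly max_vertices entries */
    size_t max_vertices;   /* declared output limit of the shader */
    size_t next_vertex;    /* number of vertices written so far */
};

/*
 * Write one vertex if there is room. Returns true if the vertex was stored.
 * An emit past the limit is skipped, but the caller keeps executing, so any
 * memory writes it performs afterwards still happen.
 */
static bool emit_vertex(struct gs_ring *ring, float value)
{
    /* Strict '<': index max_vertices would be one past the end of the ring. */
    bool can_emit = ring->next_vertex < ring->max_vertices;

    if (can_emit) {
        ring->slots[ring->next_vertex] = value;
        ring->next_vertex++;
    }
    return can_emit;
}
```

With a two-slot ring, the third call returns false and leaves the storage untouched instead of writing out of bounds.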
* radeonsi: update all GSVS ring descriptors for new buffer allocationsNicolai Hähnle2016-12-121-1/+6
| | | | | | | | Fixes GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_geometry_instanced. Cc: [email protected] Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/glsl_to_tgsi: plumb the GS output stream qualifier through to TGSINicolai Hähnle2016-12-123-1/+31
| | | | | | Allow drivers to emit GS outputs in a smarter way. Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: collect information about output usagemasksNicolai Hähnle2016-12-122-0/+2
| | | | Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: collect information about output vertex streamsNicolai Hähnle2016-12-122-0/+19
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium: extract individual streamout output structureNicolai Hähnle2016-12-121-8/+13
| | | | | | So that we can pass pointers to individual array entries around. Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add Stream{X,Y,Z,W} fields to tgsi_declaration_semanticNicolai Hähnle2016-12-124-3/+81
| | | | | | | | | | | This is for geometry shader outputs. Without it, drivers have no way of knowing which stream each output is intended for, and have to conservatively write all outputs to all streams. Separate stream numbers for each component are required due to output packing. Reviewed-by: Marek Olšák <[email protected]>
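To illustrate why one stream number per component is useful, here is a minimal sketch in C. The struct below is hypothetical: the field names and the 2-bit widths are assumptions made for the example (four streams fit in two bits), not a copy of the actual tgsi_declaration_semantic layout.

```c
/* Hypothetical per-component stream record: 2 bits each, streams 0..3. */
struct output_streams {
    unsigned stream_x : 2;
    unsigned stream_y : 2;
    unsigned stream_z : 2;
    unsigned stream_w : 2;
};

/*
 * Return a 4-bit mask (bit 0 = X ... bit 3 = W) of the components of one
 * packed output that belong to the given vertex stream. A driver could use
 * such a mask to write only the relevant components for each stream.
 */
static unsigned components_for_stream(struct output_streams s, unsigned stream)
{
    unsigned mask = 0;

    if (s.stream_x == stream) mask |= 1u << 0;
    if (s.stream_y == stream) mask |= 1u << 1;
    if (s.stream_z == stream) mask |= 1u << 2;
    if (s.stream_w == stream) mask |= 1u << 3;
    return mask;
}
```

For an output whose X and Y were packed from a stream 0 varying and whose Z and W came from a stream 1 varying, the mask is 0x3 for stream 0 and 0xC for stream 1.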
* glsl: remember per-component vertex streams for packed varyingsNicolai Hähnle2016-12-123-2/+24
| | | | Reviewed-by: Marek Olšák <[email protected]>
* i965/blorp: fix release build unused variable warningGrazvydas Ignotas2016-12-121-3/+1
| | | | | Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* virgl: Fix a strict-aliasing violation in the encoderEdward O'Callaghan2016-12-121-1/+7
| | | | | | | | | | | | | | | | As per the C spec, it is illegal to alias pointers to different types. This results in undefined behaviour after optimization passes, resulting in very subtle bugs that happen only on a full moon. Use a memcpy() as a well defined coercion between the double and uint64_t interpretations of the memory. V.2: Use static_assert() instead of assert(). V.3: Use C99 compat STATIC_ASSERT() over C11 static_assert(). Signed-off-by: Edward O'Callaghan <[email protected]> Acked-by: Dave Airlie <[email protected]>
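The memcpy() technique named in this commit message looks roughly like the following. This is a generic sketch, not the virgl encoder code: `double_bits` is a hypothetical helper name, and the example uses C11 `_Static_assert` where the commit mentions a C99-compatible STATIC_ASSERT() macro.

```c
#include <stdint.h>
#include <string.h>

/*
 * Reinterpret the bytes of a double as a 64-bit integer without violating
 * strict aliasing. Casting &d to (uint64_t *) and dereferencing it would be
 * undefined behaviour; copying the bytes is well defined.
 */
static uint64_t double_bits(double d)
{
    uint64_t bits;

    /* Compile-time check that both types have the same size. */
    _Static_assert(sizeof(uint64_t) == sizeof(double),
                   "double and uint64_t must have the same size");

    memcpy(&bits, &d, sizeof(bits));
    return bits;
}
```

Compilers typically turn a fixed-size memcpy() like this into a single register move, so the well-defined version usually costs nothing at run time.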
* i965: Print out cycle estimates at the start of block annotations.Kenneth Graunke2016-12-111-1/+1
| | | | | | | | | | | | We now print START B15 <-B14 (42774 cycles) indicating that we estimate B15 will take 42,774 cycles. Printing this should make it easier to see where time is spent in the program. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: Return LINEAR encoding for winsys FBO depth/stencil.Kenneth Graunke2016-12-111-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | GetFramebufferAttachmentParameteriv should return GL_LINEAR for the window system default framebuffer's GL_DEPTH or GL_STENCIL attachments when there are zero depth or stencil bits. The GL 4.5 spec's GetFramebufferAttachmentParameteriv section says: "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is not NONE, these queries apply to all other framebuffer types: [...] If attachment is not a color attachment, or no data storage or texture image has been specified for the attachment, then params will contain the value LINEAR." Note that we already return LINEAR for the case where there is an actual depth or stencil renderbuffer attached. In the case modified by this patch, FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE returns FRAMEBUFFER_DEFAULT rather than NONE. Fixes a CTS test when run in a visual without depth / stencil buffers: GL45-CTS.gtf30.GL3Tests.framebuffer_srgb.framebuffer_srgb_default_encoding Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* intel/aubinator: fix 32bit shift overflow warningGrazvydas Ignotas2016-12-111-1/+1
| | | | | | | | This doesn't look like it can work on 32-bit; the change just gets rid of an annoying warning. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* anv: fix release build unused variable warningsGrazvydas Ignotas2016-12-112-2/+3
| | | | | Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* radv/ac: fix some maybe-uninitialized warningsGrazvydas Ignotas2016-12-101-1/+4
| | | | | | | | Mark some paths unreachable so that the compiler knows variables are initialized in all valid paths. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
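A minimal sketch of the pattern, assuming a GCC or Clang toolchain: the enum and function names are hypothetical, and the example calls the compiler builtin `__builtin_unreachable()` directly rather than a project-specific wrapper macro.

```c
/* Hypothetical enum used only for this illustration. */
enum channel { CH_X, CH_Y };

static int channel_index(enum channel c)
{
    int idx;

    switch (c) {
    case CH_X:
        idx = 0;
        break;
    case CH_Y:
        idx = 1;
        break;
    default:
        /*
         * Tells the compiler this path can never be taken, so it no longer
         * considers 'idx' possibly uninitialized at the return below.
         */
        __builtin_unreachable();
    }
    return idx;
}
```

Without the default case, a compiler may warn that `idx` can be used uninitialized, because it cannot prove that the switch covers every value the enum variable might hold.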
* radv/meta: use VK_NULL_HANDLE for handlesGrazvydas Ignotas2016-12-103-4/+4
| | | | | | | | Otherwise we get 32-bit warnings because the handle is a plain uint64_t there and NULL is not suitable for initializing it. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix release build unused variable warningsGrazvydas Ignotas2016-12-102-19/+21
| | | | | | | Just mark with MAYBE_UNUSED. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
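The kind of warning these commits silence typically comes from variables that are only read inside assert(). A small sketch, assuming GCC or Clang: the macro definition below is an assumption written for this example (the real MAYBE_UNUSED macro lives in the project's headers), and `count_set` is a hypothetical function.

```c
#include <assert.h>

/* Assumed definition for illustration; GCC/Clang attribute syntax. */
#define MAYBE_UNUSED __attribute__((unused))

/* Count the non-zero entries of an array of n flags. */
static int count_set(const int *flags, int n)
{
    /* Only read by the assert below, which vanishes when NDEBUG is set. */
    MAYBE_UNUSED int visited = 0;
    int count = 0;

    for (int i = 0; i < n; i++) {
        count += flags[i] != 0;
        visited++;
    }

    assert(visited == n);
    return count;
}
```

In a release build the assert() expands to nothing, so `visited` would otherwise be reported as set but never used.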
* softpipe: fix release build unused variable warningGrazvydas Ignotas2016-12-101-1/+1
| | | | | Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: fix release build unused variable warningsGrazvydas Ignotas2016-12-102-2/+2
| | | | | Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* i965/mt: Disable HiZ when sharing depth buffer externally (v2)Chad Versace2016-12-101-7/+22
| | | | | | | | | | | | | | | | | intel_miptree_make_shareable() discarded and disabled CCS. Fix it so that it discards and disables HiZ too. Fixes dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer on Skylake. v2: Actually do what the commit message says. Discard the HiZ buffer. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98329 Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: Nanley Chery <[email protected]> Cc: Haixia Shi <[email protected]> Cc: [email protected]
* i965/mt: Disable aux surfaces after making miptree shareableChad Versace2016-12-101-0/+2
| | | | | | | | | | | | | | The entire goal of intel_miptree_make_shareable() is to permanently disable the miptree's aux surfaces. So set intel_mipmap_tree:disable_aux_buffers once the function is done discarding the aux surfaces. References: https://bugs.freedesktop.org/show_bug.cgi?id=98329 Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: Nanley Chery <[email protected]> Cc: Haixia Shi <[email protected]> Cc: [email protected]
* spirv: Use a simpler and more correct implementation of tanh()Jason Ekstrand2016-12-091-9/+14
| | | | | | | | | | The new implementation is more correct because it clamps the incoming value to 10 to avoid floating-point overflow. It also uses a much reduced version of the formula which only requires 1 exp() rather than 2. This fixes all of the dEQP-VK.glsl.builtin.precision.tanh.* tests. Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0" <[email protected]>
* glsl: Use a simpler formula for tanhJason Ekstrand2016-12-091-8/+10
| | | | | | | | | | | | The formula we have used in the past is a trivial reduction from the definition by simply multiplying both the numerator and denominator of the formula by 2. However, multiplying by e^x, you can further reduce it. This allows us to get rid of one side of the clamp and two of the exponential functions which should make it faster. The new formula still passes the dEQP precision tests for tanh so it should be fine. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
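The reduction described in this message can be written out as follows. This is a reconstruction from the text, not a quote from the patch: the first step multiplies numerator and denominator by 2, the second multiplies them by e^x.

```latex
\tanh x
  = \frac{\sinh x}{\cosh x}
  = \frac{(e^{x} - e^{-x})/2}{(e^{x} + e^{-x})/2}
  = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
  = \frac{e^{2x} - 1}{e^{2x} + 1}
```

In the last form, a very negative x sends e^{2x} to 0 and the quotient to -1 without any overflow, which would explain why only the upper side of the clamp has to stay.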
* anv: Clean up some unused variablesEdward O'Callaghan2016-12-101-15/+0
| | | | | | | Following on from the spirit of commit 011e5570f. Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* swr: [rasterizer common/core/jitter] fetch support for GL_FIXEDTim Rowley2016-12-095-34/+188
| | | | | | v2: use fmul(1/65536) instead of fdiv(65535) Reviewed-by: Bruce Cherniak <[email protected]>
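For reference, GL_FIXED values are signed 16.16 fixed-point numbers, so converting one to float means scaling by 2^-16. A minimal scalar sketch of the multiply-by-reciprocal form mentioned in the v2 note follows; the function name is hypothetical, and the real change emits the equivalent operation in the swr fetch jitter rather than plain C.

```c
#include <stdint.h>

/*
 * Convert a signed 16.16 fixed-point value to float by multiplying with the
 * constant 1/65536 instead of dividing. Because 65536 is a power of two, the
 * reciprocal is exactly representable and the result matches a division.
 */
static float fixed_to_float(int32_t value)
{
    return (float)value * (1.0f / 65536.0f);
}
```

A multiply is generally cheaper than a floating-point divide, which is the usual motivation for this substitution.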
* configure: cleanup GLX_USE_TLS handlingEmil Velikov2016-12-091-2/+3
| | | | | | | | | | Mesa requires ax_pthread_ok = yes, thus we can fold/rewrite the conditional to follow the more common "if test" pattern. No functional change intended. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* configure: enable glx-tls by defaultEmil Velikov2016-12-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In the (not too) distant future we'd want to remove this option and effectively drop the other codepath(s) we have in our dispatch. Linux distributions have been using --enable-glx-tls for a number of years. Some/most BSD platforms still don't support this, yet this should serve as an encouragement to move things forwards. Note: many bug reports were opened due to the wrong default option. See the list below for details. v2: - Correct default option in help string (Andreas) - Add bugzilla references. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70623 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72902 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73778 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89043 Cc: Jean-Sébastien Pédron <[email protected]> Cc: Jonathan Gray <[email protected]> Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Andreas Boll <[email protected]>
* docs: document how to (self-) reject stable patchesEmil Velikov2016-12-091-0/+7
| | | | | | | | | Document what has been the unofficial way to self-reject stable patches. Namely: drop the mesa-stable tag and push the commit. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* egl: add and enable EGL_KHR_config_attribsEmil Velikov2016-12-092-0/+7
| | | | | | | Extension is already implemented in the main code. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl/surfaceless: remove duplicate KHR_image_base enablementEmil Velikov2016-12-091-2/+0
| | | | | | | | Already set by the core code - dri2_create_screen/dri2_setup_screen Cc: Chad Versace <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl: unexport _eglConvertIntsToAttribsEric Engestrom2016-12-092-4/+1
| | | | | | | | Nobody else makes use of this function. We can always re-export it if someone ever needs it. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* egl: rename static functions to match conventionEric Engestrom2016-12-091-9/+9
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* compiler/glsl: fix precision problem of tanhHaixia Shi2016-12-091-2/+10
| | | | | | | | | | | | | | | | Clamp input scalar value to range [-10, +10] to avoid precision problems when the absolute value of input is too large. Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test failures. v2: added more explanation in the comment. v3: fixed a typo in the comment. Signed-off-by: Haixia Shi <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0" <[email protected]>
* swr: [rasterizer core/memory] Finish R24_UNORM_X8_TYPELESS for AVX512Tim Rowley2016-12-092-26/+24
| | | | | | This one-off specialization was missed. Reviewed-by: Bruce Cherniak <[email protected]>
* radv: Use enum for memory types.Bas Nieuwenhuizen2016-12-092-28/+21
| | | | | | | | | Inspired by patches from Eric Engestrom. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Cc: Eric Engestrom <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Use enum for memory heaps.Bas Nieuwenhuizen2016-12-092-8/+17
| | | | | | | | | Inspired by patches from Eric Engestrom. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Cc: Eric Engestrom <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Clean up some unused variables.Bas Nieuwenhuizen2016-12-091-16/+0
| | | | | | | | Leftovers from anv? Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* i965: delay adding built-in uniforms to Parameters listTimothy Arceri2016-12-091-23/+19
| | | | | | | | | | This is a step towards using NIR optimisations over GLSL IR optimisations. Delaying adding built-in uniforms until after we convert to NIR gives it a chance to optimise them away. V2: move the new code back to brw_link_shader() Reviewed-by: Kenneth Graunke <[email protected]>