Commit message | Author | Age | Files | Lines
* freedreno/a5xx: more formats | Rob Clark | 2016-12-06 | 1 | -41/+41
| | | | | | Bunch of stuff we can at least turn on for vbo formats. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix fragface | Rob Clark | 2016-12-06 | 1 | -2/+4
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix fragcoord | Rob Clark | 2016-12-06 | 1 | -4/+11
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2016-12-06 | 7 | -20/+129
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix alpha test | Rob Clark | 2016-12-06 | 3 | -5/+1
| | | | | | | | GRAS_SU_DEPTH_PLANE_CNTL doesn't in fact seem to be anything to do with alpha test. This fixes xonotic and (other than some iommu faults) gets gnome-shell working. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix VPC_VAR[n].DISABLE bits | Rob Clark | 2016-12-06 | 1 | -13/+13
| | | | | | | | We don't need varying interpolators enabled for pos/psize out of the VS (despite the fact that they show up in VS_OUT map), so emit these before we append pos/psize to the linkage. Signed-off-by: Rob Clark <[email protected]>
* anv/TODO: Document sampling from HiZ | Nanley Chery | 2016-12-06 | 1 | -0/+1
| | | | Acked-by: Jason Ekstrand <[email protected]>
* i965: Don't force SSO layout for VS->TCS. | Kenneth Graunke | 2016-12-06 | 2 | -4/+3
| | | | | | | | | | | | | | | | | | This was a hack which worked around the VS and TCS disagreeing on their shared interface due to the lack of varying packing. In particular, it was needed by Piglit's tcs-input-read-array-interface test. However, that was just one case where things could go awry, so the previous commit forcibly made interfaces match. This hack is no longer necessary. It also seems to be broken, though I'm not sure why. It fixes Piglit regressions in spec/arb_shader_image_load_store/semantics from commit ec1f159ac81ed964415d102eed4a0a29be8e7937. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98893 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965: Unify shader interfaces explicitly. | Kenneth Graunke | 2016-12-06 | 1 | -0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A while ago, I made i965 start compiling shaders independently. The VUE map layouts were based entirely on each shader's input/output bitfields. Assuming the interfaces match, this works out well - both sides will compute the same layout, and outputs are correctly routed to inputs. At the time, I had assumed that the linker would guarantee that the interfaces match. While it usually succeeds, it unfortunately seems to fail in some cases. For example, Piglit's tcs-input-read-array-interface test has a VS output array with two elements, but the TCS only reads one. The linker isn't able to eliminate the unused element from the VS, which makes the interfaces not match. Another case is where a shader other than the last writes clip/cull distances. These should be demoted to ordinary varyings, but they currently aren't - so we think they still have some special meaning, and prevent them from being eliminated. Fixing the linker to guarantee this in all cases is complicated. It needs to be able to optimize out dead code. It's tied into varying packing and other messiness. While we can certainly improve it---and should---I'd rather not rely on it being correct in all cases. This patch ORs adjacent stages' input/output bitfields together, ensuring that their interface (and hence VUE map layout) will be compatible. This should safeguard us against linker insufficiencies. Fixes line rendering in Dolphin, and the Piglit test based on it: spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97232 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
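The OR-ing step described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `struct stage_info`, its fields, and `unify_interface` are hypothetical names, and the real i965 data structures differ.

```c
#include <stdint.h>

/* Hypothetical, simplified per-stage description (not the real i965 structs). */
struct stage_info {
    uint64_t inputs_read;     /* bitfield of varying slots this stage reads  */
    uint64_t outputs_written; /* bitfield of varying slots this stage writes */
};

/* OR the producer's outputs with the consumer's inputs so that both
 * stages see the same set of slots and would compute the same layout. */
void unify_interface(struct stage_info *producer, struct stage_info *consumer)
{
    uint64_t shared = producer->outputs_written | consumer->inputs_read;

    producer->outputs_written = shared;
    consumer->inputs_read = shared;
}
```

With a producer that writes two slots and a consumer that reads only one, both sides end up with the two-slot set, which mirrors the array example in the message above.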
* genxml/gen9: Change the default of MI_SEMAPHORE_WAIT::RegisterPoleMode | Jason Ekstrand | 2016-12-06 | 1 | -1/+1
| | | | | | | | | | We would really like it to be false as that's what you get on hardware that doesn't have RegisterPoleMode (Sky Lake for example). While we're at it, we change it to a boolean. This fixes dEQP-VK.synchronization.smoke.events on Broxton. Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0" <[email protected]>
* gallivm: optimize 16bit->32bit gather path a bit | Roland Scheidegger | 2016-12-06 | 1 | -3/+39
| | | | | | | | | | LLVM can't really optimize anything which crosses scalar/vector boundaries, so help a bit with some particular gather operations when the width is expanded (only do it for 16->32bit expansion for now), by doing expansion after fetch. That is probably a better solution anyway even if llvm would recognize it, makes for cleaner IR... Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: handle 16bit float fetches in lp_build_fetch_rgba_soa | Roland Scheidegger | 2016-12-06 | 1 | -4/+18
| | | | | | | | | | | | | | | | | | | | Note that we really want to _never_ reach the bottom of the function, which resorts to AoS fetch. Half floats can be handled just like other formats which fit into 32bit vectors (so, only 1x16 and 2x16 formats, albeit with more channels things are not THAT bad), with minimal plumbing. I've seen code size go down nearly by a factor of 3 for a complete texture sampling function (including bilinear filtering) using R16F. (What we should do for everything not special cased is to do AoS gather, shuffle/shift things into SoA vectors, and then do the conversion there. Otherwise it's particularly bad with 1 or 2 channel formats - that r16f format with either 4 or 8-wide vectors was still doing one element at a time, essentially doing exactly the same work as for rgba16f. Also replacing the channels with SWIZZLE0/1 (particularly the latter) adds even more work, as it has to be done per aos vector, and not just straightforward at the end with the SoA vector.) Reviewed-by: Jose Fonseca <[email protected]>
* util: (trivial) ETC1 meets the criteria for fitting into unorm8 | Roland Scheidegger | 2016-12-06 | 1 | -0/+5
| | | | | | Just like other similar compressed formats. Reviewed-by: Jose Fonseca <[email protected]>
* i965: Emit proper NOPs. | Matt Turner | 2016-12-06 | 1 | -4/+2
| | | | | | | | | | | The PRMs for HSW and newer say that other than the opcode and DebugCtrl bits of the instruction word, the rest must be zero. By zeroing the instruction word manually, we avoid using any of the state inherited through brw_codegen. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96959 Reviewed-by: Ian Romanick <[email protected]>
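The idea of zeroing the instruction word before setting the opcode can be sketched as below. Everything here is an assumption made for illustration: `struct toy_inst`, `TOY_OPCODE_NOP`, `TOY_OPCODE_MASK`, and `emit_nop` are hypothetical, and the opcode value and bit position are not taken from the PRM or from the driver.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit instruction word; the real i965 type differs. */
struct toy_inst {
    uint32_t dw[4];
};

#define TOY_OPCODE_NOP  0x7eu /* assumed value, for illustration only */
#define TOY_OPCODE_MASK 0x7fu /* assumed: opcode in the low bits of dw[0] */

/* Clear the whole word first so that no previously inherited default
 * state survives, then set only the opcode field. */
void emit_nop(struct toy_inst *inst)
{
    memset(inst, 0, sizeof(*inst));
    inst->dw[0] = TOY_OPCODE_NOP & TOY_OPCODE_MASK;
}
```

Clearing first and writing the opcode afterwards means the result does not depend on whatever the instruction slot held before.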
* glsl: (trivial) fix type typo | Roland Scheidegger | 2016-12-06 | 1 | -1/+1
| | | | | Accidentally changed the type of a constant in df33f11b39abf313a0db7b9fefaf739b88133161 causing assertion failures.
* i965: Allocate at least some URB space even when max_vertices = 0. | Kenneth Graunke | 2016-12-05 | 1 | -1/+7
| | | | | | | | | | | | | | | | | | | | Allocating zero URB space is a really bad idea. The hardware has to give threads a handle to their URB space, and threads have to use that to terminate the thread. Having it be an empty region just breaks a lot of assumptions. Hence, why we asserted that it isn't possible. Unfortunately, it /is/ possible prior to Gen8, if max_vertices = 0. In theory a geometry shader could do SSBO/image access and maybe still accomplish something. In reality, this is tripped up by conformance tests. Gen8+ already avoids this problem by placing the vertex count DWord in the URB entry header. This fixes things on earlier generations. Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Tested-by: Ian Romanick <[email protected]>
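A minimal sketch of the "never allocate zero" rule described above, assuming the entry size is simply the per-vertex size multiplied by `max_vertices`. The function name, its parameters, and the unit of the result are hypothetical; the actual size computation in the driver is more involved.

```c
/* Hypothetical helper: compute an output entry size in some allocation
 * unit, never returning zero even if the shader declares max_vertices = 0. */
unsigned urb_entry_size(unsigned vertex_size, unsigned max_vertices)
{
    unsigned size = vertex_size * max_vertices;

    /* An empty region would leave threads without a usable handle,
     * so fall back to the smallest non-zero allocation. */
    return size > 0 ? size : 1;
}
```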
* main: allow NEAREST_MIPMAP_NEAREST for stencil texturing | Roland Scheidegger | 2016-12-06 | 1 | -15/+8
| | | | | | | | | | | | As per GL 4.5 rules, which fixed a spec mistake in GL_ARB_stencil_texturing. The extension spec wasn't updated, but just allow it with older GL versions as well, hoping there aren't any crazy tests which want to see an error there... (Compile tested only.) Reported by Józef Kucia <[email protected]> Acked-by: Józef Kucia <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: fix ldexp lowering if bitfield insert lowering is also requested | Roland Scheidegger | 2016-12-06 | 1 | -5/+16
| | | | | | | | | | Trivial, this just resurrects the code which was there once upon a time (the code can't lower instructions generated in the lowering pass there, and even if it could it would probably be suboptimal). This fixes piglit mesa_shader_integer_functions fs-ldexp.shader_test and vs-ldexp.shader_test with llvmpipe. Reviewed-by: Matt Turner <[email protected]>
* radv: fix resource leak in radv_amdgpu_ctx_create | Nayan Deshmukh | 2016-12-06 | 1 | -0/+1
| | | | | | | | | CovID: 1396387 V2. Fixup bad whitespace. Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* st/omx/enc Raise default encode level | Andy Furniss | 2016-12-05 | 1 | -1/+1
| | | | | | | | Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91281 Signed-off-by: Andy Furniss <[email protected]> Reviewed-by: Christian König <[email protected]>
* radeon/vce Handle H.264 level 5.2 | Andy Furniss | 2016-12-05 | 1 | -1/+2
| | | | | | | | | Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91281 v2: explicitly add case 52 Signed-off-by: Andy Furniss <[email protected]> Reviewed-by: Christian König <[email protected]>
* nir: Remove some unused fields from nir_variable | Jason Ekstrand | 2016-12-05 | 3 | -43/+0
| | | | | | | All of these are happily set from glsl_to_nir or spirv_to_nir but their values are never used for anything. Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Delete most of the constant_initializer support | Jason Ekstrand | 2016-12-05 | 5 | -146/+12
| | | | | | | | | Constant initializers have been a constant (ha!) pain for quite some time. While they're useful from a language perspective, people writing passes or backends really don't want to deal with them most of the time. This commit removes most of the constant initializer support from NIR. It is expected that you call nir_lower_constant_initializers VERY EARLY to ensure that they're gone before you do anything interesting. Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Simplify nir_lower_gs_intrinsics | Jason Ekstrand | 2016-12-05 | 1 | -21/+16
| | | | | | | It's only ever called on single-function shaders. At this point, there are a lot of helpers that can make it all much simpler. Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir/lower_returns: Stop using constant initializers | Jason Ekstrand | 2016-12-05 | 1 | -4/+5
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl/nir: Call nir_lower_constant_initializers | Jason Ekstrand | 2016-12-05 | 1 | -0/+2
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/pipeline: Call nir_lower_constant_initializers | Jason Ekstrand | 2016-12-05 | 1 | -0/+13
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add a pass for lowering away constant initializers | Jason Ekstrand | 2016-12-05 | 3 | -0/+115
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* Revert "i965: use nir_lower_indirect_derefs() for GLSL" | Jason Ekstrand | 2016-12-05 | 3 | -10/+23
| | | | | This reverts commit 9404439a754e5640ccd98df40fa694835c0d8759. I didn't intend to push it and it breaks clip and cull distance.
* i965: Delete the meta-base CopyImageSubData implementation | Jason Ekstrand | 2016-12-05 | 4 | -328/+0
| | | | | | | | | | | | | | | | | | | When I originally implemented the ARB_copy_image extension, the fast-path was written in meta using texture views. This path only worked if both images were uncompressed color images. All of the other cases fell back to the blitter or, in the worst case, mapping and memcpy on the CPU. Now that we have the blorp path, it handles all copies ever and the old meta, blitter, and CPU paths are only used on gen5 and below. The primary reason why we needed the meta path (apart from having a slow blitter on later hardware) was to handle multisampling which gen5 and earlier don't support anyway. Since the blitter is reasonably fast on gen5, we can just delete the meta path and get rid of all that terrible code. If we decide that we're ok with just disabling ARB_copy_image on gen5 and earlier (I personally am), then we could get rid of another 300 lines or so of semi-hairy code. Reviewed-by: Anuj Phogat <[email protected]>
* i965/copy_image: Re-implement the blitter path with emit_miptree_blit | Jason Ekstrand | 2016-12-05 | 3 | -97/+80
| | | | | | | | | | By using emit_miptree_blit which does chunking, this fixes the blitter path for the case where the image is too tall to blit normally. We also pull it into intel_blit as intel_miptree_copy. This matches the naming of the blorp blit and copy functions brw_blorp_blit and brw_blorp_copy. Reviewed-by: Anuj Phogat <[email protected]> Cc: "13.0" <[email protected]>
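The chunking mentioned above can be illustrated with a small sketch that splits a tall copy into row bands, each no taller than some limit. This only shows the general technique: `struct chunk`, `split_rows`, and the idea of a single `max_height` limit are assumptions, and the sketch does not reproduce what emit_miptree_blit actually does.

```c
/* Hypothetical description of one band of rows to copy. */
struct chunk {
    unsigned y;      /* first row of the band  */
    unsigned height; /* number of rows in band */
};

/* Split total_height rows into bands of at most max_height rows.
 * Writes up to out_cap bands into out and returns how many were written. */
unsigned split_rows(unsigned total_height, unsigned max_height,
                    struct chunk *out, unsigned out_cap)
{
    unsigned count = 0;
    unsigned y = 0;

    if (max_height == 0)
        return 0; /* nothing sensible to do with a zero limit */

    while (y < total_height && count < out_cap) {
        unsigned remaining = total_height - y;
        unsigned h = remaining < max_height ? remaining : max_height;

        out[count].y = y;
        out[count].height = h;
        count++;
        y += h;
    }
    return count;
}
```

A caller would then issue one blit per returned band instead of a single blit over the whole image.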
* i965/blit: Break the guts of intel_miptree_blit into a helper | Jason Ekstrand | 2016-12-05 | 1 | -67/+84
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Cc: "13.0" <[email protected]>
* i965: use nir_lower_indirect_derefs() for GLSL | Timothy Arceri | 2016-12-05 | 3 | -23/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so that it is called by both OpenGL and Vulkan, and removes the call to the old GLSL IR pass lower_variable_index_to_cond_assign(). We want to do this pass in nir to be able to move loop unrolling to nir. There is an increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space Program shaders that increase by 32 instructions. Shader-db results BDW: total instructions in shared programs: 8705873 -> 8706194 (0.00%) instructions in affected programs: 32515 -> 32836 (0.99%) helped: 3 HURT: 79 total cycles in shared programs: 74618120 -> 74583476 (-0.05%) cycles in affected programs: 528104 -> 493460 (-6.56%) helped: 47 HURT: 37 LOST: 2 GAINED: 0
* swr: mark PIPE_CAP_NATIVE_FENCE_FD unsupported | Tim Rowley | 2016-12-05 | 1 | -0/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: include llvm version and vector width in renderer string | Tim Rowley | 2016-12-05 | 1 | -1/+11
| | | | | | Uses llvmpipe's string formatting. Reviewed-by: Bruce Cherniak <[email protected]>
* gallivm: use getHostCPUFeatures on x86/llvm-4.0+. | Tim Rowley | 2016-12-05 | 1 | -0/+15
| | | | | | | Use the LLVM-provided API based on cpuid rather than our own manually maintained list of mattr enabling/disabling. Reviewed-by: Roland Scheidegger <[email protected]>
* st/va: declare vlVaBuffer before vlVaContext | Juan A. Suarez Romero | 2016-12-05 | 1 | -15/+15
| | | | | | | | | | And declare coded_buf in vlVaContext as "vlVaBuffer *" instead of "struct vlVaBuffer *". This fixes several warnings later about assignment from incompatible pointer type. Reviewed-by: Emil Velikov <[email protected]>
* st/va: remove unused variable pbuff | Juan A. Suarez Romero | 2016-12-05 | 1 | -1/+0
| | | | | Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* st/va: automake: cleanup C{PP,}FLAGS | Emil Velikov | 2016-12-05 | 1 | -12/+0
| | | | | | | | Remove some transitional left overs from the gallium pipe-loader rework and kill off unneeded AM_CPPFLAGS. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Christian König <[email protected]>
* add EGL_TEXTURE_EXTERNAL_WL to WL_bind_wayland_display spec | Rob Clark | 2016-12-05 | 2 | -0/+6
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* docs: add news item and link release notes for 12.0.5 | Emil Velikov | 2016-12-05 | 2 | -0/+11
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 12.0.5 | Emil Velikov | 2016-12-05 | 1 | -1/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 6b1c3c3aa0a2b643dbb9964b7001097eed3c4888)
* docs: add release notes for 12.0.5 | Emil Velikov | 2016-12-05 | 1 | -0/+137
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 01579a9d007830f2f905646c9d1f9bd0a03caa89)
* configure.ac: Create correct LLVM_VERSION_INT with minor >= 10 | Tobias Droste | 2016-12-05 | 1 | -1/+5
| | | | | | | This makes sure that we handle LLVM minor version >= 10 correctly. Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
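Why a two-digit minor version can break an integer version code is easy to see in a small sketch. The commit message does not spell out the previous scheme, so `version_int_concat` below is only an assumption about the kind of text concatenation that goes wrong; both function names are hypothetical, and the real fix lives in configure.ac shell code, not in C.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed fragile scheme: paste "<major>0<minor>" together as text.
 * Works for 3.9 (309) but yields 3010 for 3.10. */
int version_int_concat(int major, int minor)
{
    char buf[32];

    snprintf(buf, sizeof(buf), "%d0%d", major, minor);
    return atoi(buf);
}

/* Arithmetic scheme: the minor version always occupies two digits,
 * so 3.10 becomes 310 and still sorts below 4.0 (400). */
int version_int_padded(int major, int minor)
{
    return major * 100 + minor;
}
```

Under the concatenating scheme a hypothetical 3.10 would compare as newer than 4.0, while the padded scheme keeps the ordering intact.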
* configure.ac: Get complete LLVM version from header | Tobias Droste | 2016-12-05 | 1 | -6/+4
| | | | | | | | | | | | | | Major and minor version are included in the header file since LLVM version 3.1.0. Since the minimal required version is 3.3.0 we can remove the workaround if no values for major/minor were found in the header. Since LLVM 3.6.0 the patch version is inside the header file of LLVM. Only radeon drivers need the patch version and they depend on LLVM >= 3.6.0, so this is safe too. Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: Add required LLVM versions to the top | Tobias Droste | 2016-12-05 | 1 | -14/+54
| | | | | | | | | Consolidate the required LLVM versions at the top where the other versions for dependencies are listed. v5: Split out separate changes (see patch 19 and 20) Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: Only add default LLVM components if needed | Tobias Droste | 2016-12-05 | 1 | -3/+12
| | | | | | | | | | | | LLVM components are only added when LLVM is needed. This means gallium adds this as soon as "--enable-gallium-llvm" is "yes" and radv + opencl add it explicitly. v5: Removed hunk that disabled LLVM for gallium if it was not found. Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: Reorder arguments in radeon_llvm_check | Tobias Droste | 2016-12-05 | 1 | -8/+8
| | | | | | | Use the same order as llvm_check_version_for. Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: Move radv check to the Vulkan section | Tobias Droste | 2016-12-05 | 1 | -4/+1
| | | | | | | | This moves the LLVM check for radv to the corresponding driver section. No functional change. Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: Move LLVM ac_subst closer to usage | Tobias Droste | 2016-12-05 | 1 | -15/+14
| | | | | | | | | This moves llvm_set_environment_variables to its final destination and moves all the LLVM AC_SUBST() below the function call. No functional change. Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Emil Velikov <[email protected]>