path: root/src
Commit message | Author | Age | Files | Lines
* radv: add support for Vega12Samuel Pitoiset2018-03-283-1/+6
| | | | | | | Based on RadeonSI. Untested. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* autotools: Include intel/dev/meson.build in tarballDylan Baker2018-03-281-0/+1
| | | | | | | Fixes: 272bef0601a1bdb5292771aefc8d62fcbdf4c47f ("intel: Split gen_device_info out into libintel_dev") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radeonsi: add support for Vega12Marek Olšák2018-03-287-8/+35
| | | | Reviewed-by: Alex Deucher <[email protected]>
* amd/addrlib: update to the latest version for Vega12Marek Olšák2018-03-2817-148/+439
| | | | Reviewed-by: Alex Deucher <[email protected]>
* gbm: remove never-implemented functionEric Engestrom2018-03-282-3/+0
| | | | | | | | | | I assume this was implemented in a previous version of that commit, but was removed in the version that actually landed. Fixes: 8430af5ebe1ee8119e14 "Add support for swrast to the DRM EGL platform" Cc: Giovanni Campagna <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* android: Use new nir intrinsics python scriptsStefan Schake2018-03-281-0/+9
| | | | | | | Fixes: 76dfed8ae2d5 ("nir: mako all the intrinsics") Signed-off-by: Stefan Schake <[email protected]> Acked-by: Rob Clark <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* broadcom/vc5: Fix padding of NPOT miplevels >= 2.Eric Anholt2018-03-271-3/+8
| | | | | | | The power-of-two padded size that gets minified is based on level 1's dimensions, not level 0's, which starts to differ at a width of 9. Fixes all failures on texelFetch fs sampler2D 1x1x1-64x64x1
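The width-9 case mentioned above can be worked through with a small sketch. It assumes the pre-fix code minified the power-of-two padding of level 0, and that the fixed code pads level 1 to a power of two, doubles it, and minifies that. The function names are hypothetical, and the sketch ignores any tile alignment the real layout code applies.

```python
def next_power_of_two(x: int) -> int:
    """Return the smallest power of two that is >= x (x must be >= 1)."""
    return 1 << (x - 1).bit_length()


def minify(size: int, level: int) -> int:
    """Size of a mip level: halve once per level, never going below 1."""
    return max(1, size >> level)


def padded_width_from_level0(width: int, level: int) -> int:
    """Assumed pre-fix behaviour: minify the padded level 0 width."""
    return minify(next_power_of_two(width), level)


def padded_width_from_level1(width: int, level: int) -> int:
    """Assumed post-fix behaviour: pad level 1, scale back up, then minify."""
    pot_width = 2 * next_power_of_two(minify(width, 1))
    return minify(pot_width, level)


if __name__ == "__main__":
    for width in (8, 9):
        print(width,
              padded_width_from_level0(width, 2),
              padded_width_from_level1(width, 2))
```

Under these assumptions a width of 8 gives the same level 2 size either way, while a width of 9 gives 4 when padding from level 0 and 2 when padding from level 1.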
* ac/radeonsi: pass bindless bool to load_sampler_desc()Timothy Arceri2018-03-284-5/+14
| | | | | | | | We also fix the base_index for bindless by using the driver location. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/glsl_to_nir: set driver location for bindless images and samplersTimothy Arceri2018-03-281-1/+2
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: set uses_bindless_samplers for samplersTimothy Arceri2018-03-281-0/+3
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir: add bindless to nir dataTimothy Arceri2018-03-282-0/+7
| | | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* i965: Drop unnecessary bo->align field.Kenneth Graunke2018-03-273-10/+0
| | | | | | | | bo->align is always 0; there's no need to waste 8 bytes storing it. Thanks to C99 initializers zeroing fields, we can completely drop the only read of the field altogether. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Drop unused alignment parameter from brw_bo_alloc().Kenneth Graunke2018-03-2714-26/+25
| | | | | | brw_bo_alloc no longer uses this parameter, so there's no point. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Drop alignment parameter from bo_alloc_internal().Kenneth Graunke2018-03-271-7/+6
| | | | | | | | | | | Buffers are always page aligned on 965+ hardware; I believe this extra parameter is a vestige from the Gen2-3 era. All callers pass 0, and in fact we assert that the alignment is 0 unless BO_ALLOC_BUSY is set (for some reason). We can just drop the parameter and set the value to 0 explicitly. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Drop BO_ALLOC_BUSY in intel_miptree_create_for_bo().Kenneth Graunke2018-03-271-2/+2
| | | | | | | intel_miptree_create_for_bo does not actually allocate a BO, so specifying allocation flags accomplishes nothing and is confusing. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Drop PIPE_CONTROL_NO_WRITE from various calls.Kenneth Graunke2018-03-274-11/+4
| | | | | | | This is just zero - passing nothing already gives us a post-sync operation of "nothing". Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/intrinsics: Don't report negative dest_componentsJason Ekstrand2018-03-271-1/+1
| | | | | | | | I have no idea why but having dest_components == -1 was causing a memory leak somewhere. Without this, you can't get through a full shader-db run without running out of memory. Reviewed-by: Rob Clark <[email protected]>
* intel/fs: Don't emit a dest copy for image ops with has_dest == falseJason Ekstrand2018-03-271-3/+6
| | | | | | | | | | This was causing us to walk dest_components times over a thing with no destination. This happened to work because all of the image intrinsics without a destination also happened to have dest_components == 0. We shouldn't be reading dest_components if has_dest == false. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nvc0/ir: fix INTERP_* with indirect inputsIlia Mirkin2018-03-271-3/+4
| There were two problems, both of which are fixed now:
|  - The indirect address was not being shifted by 4
|  - The indirect address was being placed as an argument in the offset case
|
| This fixes some of the new interpolateAt* piglits which now test for these situations.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Karol Herbst <[email protected]>
* nir: fix crash in loop unroll corner caseTimothy Arceri2018-03-281-5/+12
| When an if nested inside another if is optimised away we can end up with a loop terminator and following block that looks like this:
|
|         if ssa_596 {
|                 block block_5:
|                 /* preds: block_4 */
|                 vec1 32 ssa_601 = load_const (0xffffffff /* -nan */)
|                 break
|                 /* succs: block_8 */
|         } else {
|                 block block_6:
|                 /* preds: block_4 */
|                 /* succs: block_7 */
|         }
|         block block_7:
|         /* preds: block_6 */
|         vec1 32 ssa_602 = phi block_6: ssa_552
|         vec1 32 ssa_603 = phi block_6: ssa_553
|         vec1 32 ssa_604 = iadd ssa_551, ssa_66
|
| The problem is the phis. Loop unrolling expects the last block in the loop to be empty once we splice the instructions in the last block into the continue branch. The problem is we can't move phis, so here we lower the phis to regs when preparing the loop for unrolling. As it could be possible to have multiple additional blocks/ifs following the terminator, we just convert all phis at the top level of the loop body for simplicity.
|
| We also add some comments to loop_prepare_for_unroll() while we are here.
|
| Fixes: 51daccb289eb "nir: add a loop unrolling pass"
| Reviewed-by: Jason Ekstrand <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
* st/glsl_to_nir: correctly handle arrays packed across multiple varsTimothy Arceri2018-03-281-1/+23
| | | | | | | Fixes piglit test: tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: fix input processing for packed varyingsTimothy Arceri2018-03-281-3/+2
| The location was only being incremented the first time we processed a location. This meant we would incorrectly skip some elements of an array if the first element was packed and processed previously but other elements were not.
|
| Reviewed-by: Marek Olšák <[email protected]>
* ac/nir_to_llvm: fix component packing for double outputsTimothy Arceri2018-03-281-1/+3
| | | | | | | | | | We need to wait until after the writemask is widened before we adjust it for component packing. Together with the previous patch this fixes a number of arb_enhanced_layouts component layout piglit tests. Reviewed-by: Marek Olšák <[email protected]>
* st/glsl_to_nir: fix driver location for dual-slot packed doublesTimothy Arceri2018-03-281-6/+16
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: fix scanning of multi-slot output varyingsTimothy Arceri2018-03-281-109/+127
| This fixes tcs/tes varying arrays where we don't lower indirects and therefore don't split arrays.
|
| Here we also fix usagemask for dual-slot doubles.
|
| Fixes a number of arb_tessellation_shader piglit tests.
|
| Reviewed-by: Marek Olšák <[email protected]>
* broadcom/vc5: Fix RG16I/UI texture sampling.Eric Anholt2018-03-271-2/+2
| | | | | | | How many times did I look at this table without noticing the missing 'G' in the texture column? Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.
* nir: fix generated nir_intrinsics.c for MSVCRob Clark2018-03-271-0/+4
| Apparently MSVC is not happy about things like:
|
|    .foo = {}
|
| So skip over initializers for empty lists.
|
| Fixes: 76dfed8ae2d5c6c509eb2661389be3c6a25077df
| Reported-by: Roland Scheidegger <[email protected]>
| Signed-off-by: Rob Clark <[email protected]>
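The idea of skipping empty lists in a generator can be sketched as follows. This is a minimal illustration, not the actual generator script: `emit_initializer` and the field names are hypothetical. It relies on the fact that fields left out of a C designated initializer are zero-initialized, so dropping an empty list does not change the resulting table.

```python
def emit_initializer(fields: dict) -> str:
    """Render a C designated initializer, omitting fields whose list is empty.

    `fields` maps a (hypothetical) struct member name to a list of values.
    """
    lines = []
    for name, values in fields.items():
        if not values:
            # An empty list would render as `.name = {}`; leave the member
            # out so it is implicitly zero-initialized instead.
            continue
        rendered = ", ".join(str(v) for v in values)
        lines.append("   .%s = { %s }," % (name, rendered))
    return "{\n" + "\n".join(lines) + "\n}"


if __name__ == "__main__":
    # Hypothetical member names, for illustration only.
    print(emit_initializer({"src_components": [1, 2], "index_map": []}))
```

Only `src_components` appears in the printed initializer; the empty `index_map` is skipped.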
* nir: mako all the intrinsicsRob Clark2018-03-2711-619/+727
| I threatened to do this a long time ago. I probably *should* have done it a long time ago, when there were far fewer intrinsics. But the system of macro/#include magic for dealing with intrinsics is a bit annoying, and Python has the nice property of optional function parameters, making it possible to define new intrinsics while ignoring parameters that are not applicable (and naming optional parameters). And not having to specify various array lengths explicitly is nice too.
|
| I think the end result makes it easier to add new intrinsics.
|
| v2: couple of small fixes found with a test program to compare the old and new tables
| v3: misc comments, don't rely on capture=true for meson.build, get rid of system_values table to avoid return value of intrinsic() and *mostly* remove side-effects, add autotools build support
| v4: scons build
|
| Signed-off-by: Rob Clark <[email protected]>
| Acked-by: Dylan Baker <[email protected]>
| Acked-by: Jason Ekstrand <[email protected]>
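A rough sketch of why optional, named parameters help here. The `define_intrinsic` helper, its parameter names, and the intrinsic names below are hypothetical and are not the signature used by the real script. The point is that each definition names only the parameters that apply, and array lengths are derived rather than written out.

```python
INTRINSICS = {}


def define_intrinsic(name, src_components=(), dest_components=0,
                     indices=(), flags=()):
    """Register a (hypothetical) intrinsic description.

    Every parameter except `name` is optional, so a definition only
    mentions what applies to it. Lengths such as num_srcs and
    num_indices are computed instead of being specified by hand.
    """
    INTRINSICS[name] = {
        "num_srcs": len(src_components),
        "src_components": list(src_components),
        "has_dest": dest_components != 0,
        "dest_components": dest_components,
        "num_indices": len(indices),
        "indices": list(indices),
        "flags": list(flags),
    }


# An intrinsic with no sources, destination, or indices needs only a name.
define_intrinsic("example_barrier")

# One with a destination and two indices names just those parameters.
define_intrinsic("example_load", src_components=[1], dest_components=4,
                 indices=["BASE", "COMPONENT"])

if __name__ == "__main__":
    for name, info in INTRINSICS.items():
        print(name, info["num_srcs"], info["num_indices"], info["has_dest"])
```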
* nir: fix per_vertex_output intrinsicRob Clark2018-03-271-1/+1
| This is supposed to have both BASE and COMPONENT but num_indices was inadvertently set to 1.
|
| Cc: <[email protected]>
| Signed-off-by: Rob Clark <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* glsl_types: fix build break with intel/msvc compilerRob Clark2018-03-271-83/+24
| | | | | | | | | | | | | | | | | | | | The VECN() macro was taking advantage of a GCC specific feature that is not available on lesser compilers, mostly for the purposes of avoiding a macro that encoded a return statement. But as suggested by Ian, we could just have the macro produce the entire method body and avoid the need for this. So let's do that instead. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740 Fixes: f407edf3407396379e16b0be74b8d3b85d2ad7f0 Cc: Emil Velikov <[email protected]> Cc: Timothy Arceri <[email protected]> Cc: Roland Scheidegger <[email protected]> Cc: Ian Romanick <[email protected]> Signed-off-by: Rob Clark <[email protected]> Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: add GL_HALF_FLOAT as supported type to readpixelsLin Johnson2018-03-271-0/+2
| The EXT_color_buffer_float spec states:
|
|    "An INVALID_OPERATION error is generated ... if the color buffer is a floating-point format and type is not FLOAT, HALF FLOAT, or UNSIGNED_INT_10F_11F_11F_REV."
|
| This means that the GL_HALF_FLOAT type should be supported when the color buffer has a floating-point format.
|
| Fixes Android CTS test android.view.cts.PixelCopyTest.
|
| v2: remove comments about EXT_color_buffer_half_float, as EXT_color_buffer_float can use type GL_HALF_FLOAT
|
| Signed-off-by: Lin Johnson <[email protected]>
| Reviewed-by: Tapani Pälli <[email protected]>
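The quoted rule can be modelled as a small check. This is only a sketch of the single sentence quoted above, not of the full readpixels validation; the function name and the use of strings for GL enums are assumptions made for illustration.

```python
# Types the quoted sentence allows when the color buffer is floating-point.
FLOAT_BUFFER_TYPES = {
    "GL_FLOAT",
    "GL_HALF_FLOAT",
    "GL_UNSIGNED_INT_10F_11F_11F_REV",
}


def float_buffer_type_error(is_float_buffer: bool, pixel_type: str):
    """Return "GL_INVALID_OPERATION" if the quoted rule is violated.

    Returns None when this particular rule raises no error; other
    validation rules are outside the scope of this sketch.
    """
    if is_float_buffer and pixel_type not in FLOAT_BUFFER_TYPES:
        return "GL_INVALID_OPERATION"
    return None


if __name__ == "__main__":
    print(float_buffer_type_error(True, "GL_HALF_FLOAT"))
    print(float_buffer_type_error(True, "GL_UNSIGNED_BYTE"))
```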
* broadcom/vc5: Fix swizzling of RGB10_A2UI render targets.Eric Anholt2018-03-261-1/+1
| | | | | | | This is the actual hardware layout, and we were only swizzling R/B back around in texturing. Fixes part of KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in simulation.
* broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes.Eric Anholt2018-03-261-0/+1
| | | | Just like TLB without a config uniform, we don't have a register index.
* broadcom/vc5: Implement workaround for GFXH-1431.Eric Anholt2018-03-261-1/+5
| | | | | This should fix some blending errors, but doesn't impact any testcases in the CTS.
* broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well.Eric Anholt2018-03-265-21/+111
| | | | | | | Once we've disabled EZ for some draws, we need to not use EZ on future draws. Implementing that made implementing the GT/GE direction trivial. Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.
* broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled.Eric Anholt2018-03-262-0/+8
| | | | | | On 3.x, we just don't flag the primitive as needing TF, but those primitive bits are now allocated to the new primitive types. Now we need to actually update the enable flag at draw time.
* broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job.Eric Anholt2018-03-263-5/+29
| | | | | | The next job from this client will turn it back on unless TF gets disabled, but we don't want the state to leak from this client to another (which causes GPU hangs).
* broadcom/vc5: Move the BCL epilogue code to a per-version compile.Eric Anholt2018-03-265-24/+67
| | | | I need to do some new packets for transform feedback on 4.1.
* broadcom/vc5: Fix transform feedback in the presence of point size.Eric Anholt2018-03-263-4/+23
| | | | | | | I had this note to myself, and it turns out that a lot of CTS tests use XFB with points to get data out without using a fragment shader. Keep track of two sets of precomputed TF specs (point size in VPM prologue or not), and switch between them when we enable/disable point size.
* broadcom/vc5: Split transform feedback specs update from buffers.Eric Anholt2018-03-261-27/+32
| | | | | The specs update will be changing based on additional state flags in the next commit, and this unindents the buffer update code.
* broadcom/vc5: Limit each transform feedback data spec to 16 dwords.Eric Anholt2018-03-262-14/+31
| | | | | | | | | The length-1 field only has 4 bits, so we need to generate separate specs when there's too much TF output per buffer. Fixes GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type and transform_feedback_max_interleaved.
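Since a 4-bit length-1 field can encode lengths from 1 to 16, a longer run of output has to be split across several specs. The sketch below shows one way to do that; the dictionary layout and the function name are hypothetical and do not reflect the actual packet format.

```python
# A 4-bit "length - 1" field encodes lengths 1..16.
MAX_SPEC_DWORDS = 1 << 4


def split_into_specs(first_dword: int, num_dwords: int) -> list:
    """Split a run of output dwords into specs of at most 16 dwords each."""
    specs = []
    while num_dwords > 0:
        length = min(num_dwords, MAX_SPEC_DWORDS)
        specs.append({"first": first_dword, "length_minus_1": length - 1})
        first_dword += length
        num_dwords -= length
    return specs


if __name__ == "__main__":
    for spec in split_into_specs(0, 40):
        print(spec)
```

For 40 dwords this produces three specs of 16, 16, and 8 dwords, each of which fits the 4-bit field.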
* gallium/u_vbuf: Protect against overflow with large instance divisors.Eric Anholt2018-03-261-1/+10
| | | | | | | | | | | | GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1 as a divisor, so we would overflow to count=0 and upload no data, triggering the assert below. We want to upload 1 element in this case, fixing the test on VC5. v2: Use some more obvious logic, and explain why we don't use the normal round_up(). Reviewed-by: Brian Paul <[email protected]>
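The overflow can be reproduced by emulating 32-bit unsigned arithmetic. The sketch assumes the pre-fix code used the usual `(n + d - 1) / d` round-up pattern, and shows one overflow-free alternative; both function names are hypothetical.

```python
U32_MASK = 0xFFFFFFFF


def div_round_up_u32(n: int, d: int) -> int:
    """Usual round-up division, with the addition wrapping at 32 bits."""
    return ((n + d - 1) & U32_MASK) // d


def instance_element_count(num_instances: int, divisor: int) -> int:
    """Round-up division without the addition, so it cannot overflow."""
    count = num_instances // divisor
    if count * divisor != num_instances:
        count += 1
    return count


if __name__ == "__main__":
    divisor = 0xFFFFFFFF  # a divisor of -1 seen as an unsigned 32-bit value
    print(div_round_up_u32(4, divisor))        # wraps and yields 0 elements
    print(instance_element_count(4, divisor))  # yields 1 element
```

For ordinary divisors both functions agree, for example 10 instances with a divisor of 3 give 4 elements either way.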
* st: Allow accelerated CopyTexImage from RGBA to RGB.Eric Anholt2018-03-261-6/+26
| | | | | | | | | | | | | | | | There's nothing to worry about here -- the A channel just gets dropped by the blit. This avoids a segfault in the fallback path when copying from a RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an RGBA16_SINT texture (the fallback path tries to get/fetch to float buffers, but the float pack/unpack functions are NULL for SINT/UINT). Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5. v2: Extract the logic to a helper function and explain what's going on better. v3: const-qualify args Reviewed-by: Brian Paul <[email protected]>
* winsys/amdgpu: always allow GTT placements on APUsMarek Olšák2018-03-261-7/+5
| | | | Reviewed-by: Christian König <[email protected]>
* radeonsi: don't reallocate on DMABUF export if local BOs are disabledMarek Olšák2018-03-264-5/+9
|
* glsl: fix infinite loop caused by bug in loop unrolling passTimothy Arceri2018-03-271-1/+1
| Just checking for 2 jumps is not enough to be sure we can do a complex loop unroll. We also need to make sure we have found 2 loop terminators.
|
| Without this we were attempting to unroll a loop where the second jump was nested inside multiple ifs, which loop analysis is unable to detect as a terminator. We ended up splicing out the first terminator but failed to actually unroll the loop; this resulted in the creation of a possible infinite loop.
|
| Fixes: 646621c66da9 "glsl: make loop unrolling more like the nir unrolling path"
| Tested-by: Gert Wollny <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
* gallium: Do not add -Wframe-address option for gcc <= 4.4.Vinson Lee2018-03-261-1/+1
| This patch fixes these build errors with GCC 4.4.
|
|   Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
|   src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
|   src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
|   src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
|   src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions
|
| Fixes: 370e356ebab4 ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* gallium: Correct minor typo in header commentsAlyssa Rosenzweig2018-03-261-1/+1
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* intel/aubinator_error_decode: Decode more registers.Rafael Antognolli2018-03-261-0/+12
| | | | | | Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add SAMPLER_INSTDONE register.Rafael Antognolli2018-03-266-0/+139
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>