path: root/src/gallium
Commit message | Author | Age | Files | Lines
* radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input | Marek Olšák | 2018-02-26 | 4 | -20/+53
| so that it can be removed and replaced with inline VBO descriptors, and the pointer can be packed in unused bits of VBO descriptors.
| This also removes the pointer from merged TES-GS where it's useless.
| Reviewed-by: Nicolai Hähnle
* radeonsi: set correct num_input_sgprs for VS prolog in merged shaders | Marek Olšák | 2018-02-26 | 1 | -24/+24
| We need to take num_input_sgprs from VS, not the second shader. No apps suffered from this.
| Reviewed-by: Nicolai Hähnle
* radeonsi: allow fewer input SGPRs in 2nd shader of merged shaders | Marek Olšák | 2018-02-26 | 1 | -1/+5
| Reviewed-by: Nicolai Hähnle
* radeonsi: don't use struct si_descriptors for vertex buffer descriptors | Marek Olšák | 2018-02-26 | 6 | -33/+46
| VBO descriptor code will change a lot one day.
| Reviewed-by: Nicolai Hähnle
* r600: fix tgsi clock last setting | Dave Airlie | 2018-02-26 | 1 | -0/+1
| On cayman this was hitting an assert later, which probably wasn't seen on non-cayman due to having the t slot.
| Fixes: 9041730d1 (r600: add support for ARB_shader_clock.)
* r600: add time lo/hi debugging output. | Dave Airlie | 2018-02-26 | 2 | -0/+12
| This just adds these to the debug prints.
* radeonsi/nir: enable lowering of fpow | Timothy Arceri | 2018-02-26 | 1 | -0/+1
| Lowering fpow in NIR rather than LLVM can be beneficial.
| Polaris results:
| Totals from affected shaders:
| SGPRS: 124928 -> 124896 (-0.03 %)
| VGPRS: 68616 -> 68332 (-0.41 %)
| Spilled SGPRs: 394 -> 413 (4.82 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 0 -> 0 (0.00 %) dwords per thread
| Code Size: 3668912 -> 3658368 (-0.29 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 18575 -> 18593 (0.10 %)
| Wait states: 0 -> 0 (0.00 %)
| Fixes: d6b753920677 "ac/nir: remove emission of nir_op_fpow"
| Tested-by: Dieter Nützel
| Reviewed-by: Samuel Pitoiset
| Reviewed-by: Marek Olšák
* gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info | Timothy Arceri | 2018-02-26 | 2 | -7/+0
| Seems to have not been used since 16be87c90429
| Reviewed-by: Marek Olšák
| Reviewed-by: Samuel Pitoiset
* radeonsi/nir: fix loading of doubles for tess varyings | Timothy Arceri | 2018-02-26 | 1 | -2/+10
| Reviewed-by: Marek Olšák
* radeonsi/nir: fix lds store in tcs outputs handling | Timothy Arceri | 2018-02-26 | 1 | -1/+1
| We were ignoring the channel offset.
| Reviewed-by: Marek Olšák
* r600: Take ALU_EXTENDED into account when evaluating jump offsets | Gert Wollny | 2018-02-26 | 1 | -2/+7
| ALU_EXTENDED needs 4 DWORDS instead of the usual 2. Hence, if the last ALU clause within an IF-JUMP or ELSE branch is ALU_EXTENDED, the target jump offset needs to be adjusted accordingly.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654
| Cc: <[email protected]>
| Signed-off-by: Gert Wollny
| Reviewed-by: Dave Airlie
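The size rule described in this commit message can be illustrated with a minimal sketch. This is not the r600 driver code: `struct cf_instr`, `cf_size_dwords` and `jump_target_after` are hypothetical names, and the sketch assumes each control-flow instruction records its own position in dwords.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a control-flow instruction record. */
struct cf_instr {
    uint32_t id;           /* position of this instruction, in dwords (assumed) */
    bool     alu_extended; /* true if the clause is encoded as ALU_EXTENDED */
};

/* A normal CF instruction occupies 2 dwords, ALU_EXTENDED occupies 4. */
static uint32_t cf_size_dwords(const struct cf_instr *cf)
{
    assert(cf != NULL);
    return cf->alu_extended ? 4u : 2u;
}

/* Position of the first instruction after `last`, i.e. where a jump that
 * skips the branch body has to land. */
static uint32_t jump_target_after(const struct cf_instr *last)
{
    return last->id + cf_size_dwords(last);
}
```

With a last clause at dword 10, the sketch yields 12 for a regular clause and 14 for an ALU_EXTENDED one, which is the two-dword adjustment the message refers to.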
* radeonsi: remove si_descriptors parameter from emit_shader_pointer functions | Marek Olšák | 2018-02-24 | 1 | -12/+13
| Reviewed-by: Nicolai Hähnle
* radeonsi: preload the tess offchip ring in TES | Marek Olšák | 2018-02-24 | 2 | -12/+10
| so that it's not done multiple times in branches
| Reviewed-by: Nicolai Hähnle
* radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs | Marek Olšák | 2018-02-24 | 5 | -91/+70
| TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address aligned to 512KB. Hey, it's a 13-bit pointer!
| Reviewed-by: Nicolai Hähnle
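The arithmetic behind the "13-bit pointer" is that 512 KB is 2^19 bytes, so a 512KB-aligned 32-bit address has its low 19 bits equal to zero and only 32 - 19 = 13 significant bits. The sketch below packs such an address into the top 13 bits of a 32-bit word. Placing the field in the top bits is an assumption made for illustration, and `pack_ring_addr` and `unpack_ring_addr` are hypothetical names, not the radeonsi functions.

```c
#include <assert.h>
#include <stdint.h>

#define RING_ALIGN_SHIFT 19u                      /* 512 KB == 1u << 19 */
#define RING_FIELD_BITS  (32u - RING_ALIGN_SHIFT) /* 13 significant bits */
#define RING_FIELD_SHIFT 19u                      /* assumed position of the free bits */
#define RING_FIELD_MASK  (((1u << RING_FIELD_BITS) - 1u) << RING_FIELD_SHIFT)

/* Store a 512KB-aligned 32-bit address in the free bits of `layout`,
 * leaving the other bits of `layout` untouched. */
static uint32_t pack_ring_addr(uint32_t layout, uint32_t addr)
{
    assert((addr & ((1u << RING_ALIGN_SHIFT) - 1u)) == 0); /* must be aligned */
    uint32_t field = addr >> RING_ALIGN_SHIFT;             /* fits in 13 bits */
    return (layout & ~RING_FIELD_MASK) | (field << RING_FIELD_SHIFT);
}

/* Recover the full 32-bit address from the packed word. */
static uint32_t unpack_ring_addr(uint32_t layout)
{
    uint32_t field = (layout & RING_FIELD_MASK) >> RING_FIELD_SHIFT;
    return field << RING_ALIGN_SHIFT;
}
```

Because the assumed field position equals the alignment shift, packing here amounts to OR-ing the address into the word; a field located elsewhere would only change `RING_FIELD_SHIFT`.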
* radeonsi: move 2nd-shader descriptor pointers into s[0:1] | Marek Olšák | 2018-02-24 | 3 | -74/+140
| If 32-bit pointers are supported, both pointers can be moved into s[0:1] and then ESGS has exactly the same user data SGPR declarations as VS.
| If 32-bit pointers are not supported, only one pointer can be moved into s[0:1]. In that case, the 2nd pointer is moved before TCS constants, so that the location is the same in HS and GS.
| Reviewed-by: Nicolai Hähnle
* radeonsi: change si_descriptors::shader_userdata_offset type to short | Marek Olšák | 2018-02-24 | 2 | -9/+9
| We will want to use SH registers outside of user data SGPRs, like the GFX9 special SGPRs.
| Reviewed-by: Nicolai Hähnle
* radeonsi: put both tessellation rings into 1 buffer | Marek Olšák | 2018-02-24 | 4 | -29/+18
| Reviewed-by: Nicolai Hähnle
* radeonsi: move tessellation ring info into si_screen | Marek Olšák | 2018-02-24 | 3 | -45/+52
| Reviewed-by: Nicolai Hähnle
* radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits | Marek Olšák | 2018-02-24 | 3 | -5/+6
| For a later patch.
| Reviewed-by: Nicolai Hähnle
* nvir: dont optimize mad with subops to shladd | Karol Herbst | 2018-02-24 | 1 | -1/+2
| Signed-off-by: Karol Herbst
| Reviewed-by: Ilia Mirkin
* broadcom/vc5: Fix layout of 3D textures. | Eric Anholt | 2018-02-23 | 2 | -32/+81
| Cube maps are entire miptrees repeated, while 3D textures store all of a level's layers next to each other.
| Fixes tex3d and tex-miplevel-selection GL2:texture() 3D.
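The two layouts contrasted in this message can be sketched as two offset calculations. This is an illustration of the general idea only, not the vc5 code: `struct level_info`, `cube_layer_offset` and `tex3d_slice_offset` are hypothetical, and the per-level offsets and slice sizes are assumed to be computed elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-miplevel bookkeeping. */
struct level_info {
    uint32_t offset;     /* byte offset of the level inside one miptree */
    uint32_t slice_size; /* byte size of one 2D slice at this level */
};

/* Cube map: the whole miptree is repeated once per face, so a face
 * selects a miptree and the level selects a position inside it. */
static uint32_t cube_layer_offset(const struct level_info *lv,
                                  uint32_t miptree_size,
                                  uint32_t level, uint32_t face)
{
    assert(lv != NULL);
    return face * miptree_size + lv[level].offset;
}

/* 3D texture: one miptree, and every level stores all of its depth
 * slices back to back. */
static uint32_t tex3d_slice_offset(const struct level_info *lv,
                                   uint32_t level, uint32_t z)
{
    assert(lv != NULL);
    return lv[level].offset + z * lv[level].slice_size;
}
```

For the 3D case the level offsets themselves have to account for every slice of the preceding levels, which is why the two texture types cannot share one layout routine.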
* broadcom/vc5: Ignore unused usage flags in is_format_supported. | Eric Anholt | 2018-02-23 | 1 | -27/+16
| Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to match. Just drop the whole retval == usage thing and return early when we hit a known unsupported case.
| Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")
* swr: remove dead LLVM code paths | Emil Velikov | 2018-02-23 | 3 | -28/+0
| LLVM requirement was bumped to 4.0.0 with earlier commit. Hence any code tailored for older versions is now unreachable.
| Signed-off-by: Emil Velikov
| Reviewed-By: George Kyriazis
| Reviewed-by: Andres Gomez
* broadcom/vc4: Remove the retval==usage check in is_format_supported(). | Eric Anholt | 2018-02-23 | 1 | -26/+13
| This got us into trouble recently, so just remove it entirely.
* broadcom/vc4: Add support for YUV textures using unaccelerated blits. | Eric Anholt | 2018-02-23 | 3 | -3/+35
| Previously we would fail an assertion about having no hardware format. This is enough to get kmscube -M nv12-2img working.
* broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows. | Eric Anholt | 2018-02-23 | 1 | -6/+11
| When we set up the shadow resource we were copying the original resource as the template, including its prsc->next field. When we shadowed the first YUV plane's resource for linear-to-tiled conversion, we would end up unbalancing the refcount on the shadow resource's destruction.
* broadcom/vc4: Add pipe_reference debugging for vc4_bos. | Eric Anholt | 2018-02-23 | 2 | -5/+24
| Trying to track down the YUV EGLImage use-after-free, it helps to see what the mystery objects are that are being refcounted.
* broadcom/vc4: Remove dead vc4_bo_set_reference(). | Eric Anholt | 2018-02-23 | 1 | -8/+0
| It would be broken if NULL was passed to it anyway, since it wouldn't participate in screen->bo_handles management.
* broadcom/vc4: Use pipe_resource_reference in sampler views. | Eric Anholt | 2018-02-23 | 1 | -2/+2
| Improves u_debug_refcount output.
* broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride. | Eric Anholt | 2018-02-23 | 1 | -8/+25
| This is part of supporting YUV textures -- MMAL will be handing us a single GEM BO with the planes at offsets within it, and MMAL-decided stride.
* broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported(). | Eric Anholt | 2018-02-23 | 1 | -0/+2
| We were failing the retval == usage check at the end.
| Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")
* etnaviv: fix in-place resolve tile count | Lucas Stach | 2018-02-23 | 2 | -2/+4
| TS tiles map to a fixed amount of bytes in the color/depth surface, so the blocksize of the format needs to be taken into account when calculating the number of tiles to fill. The simplest fix is to just use the layer stride, which is the surface size in bytes.
| Signed-off-by: Lucas Stach
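The reasoning in this message, that the tile count follows from the surface size in bytes and not from its pixel dimensions, can be shown with a small sketch. `ts_tile_count` is a hypothetical helper, the bytes-per-tile value is passed in as a parameter because the real hardware constant is not stated here, and rounding up is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Number of TS tiles needed to cover one layer of a surface.
 * layer_stride_bytes: size of the layer in bytes, which already includes
 *                     the format's block size.
 * bytes_per_ts_tile:  fixed number of surface bytes covered by one TS tile
 *                     (hardware-specific, supplied by the caller). */
static uint32_t ts_tile_count(uint32_t layer_stride_bytes,
                              uint32_t bytes_per_ts_tile)
{
    assert(bytes_per_ts_tile > 0);
    /* Round up so a partially covered tile is still counted (assumption). */
    return (layer_stride_bytes + bytes_per_ts_tile - 1u) / bytes_per_ts_tile;
}
```

With an illustrative 64 bytes per tile, a 64x64 surface needs 128 tiles at 16 bits per pixel and 256 tiles at 32 bits per pixel, so a count derived from width and height alone would be wrong for one of the two formats.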
* etnaviv: switch magic single buffer state to "3" | Lucas Stach | 2018-02-23 | 1 | -1/+1
| Some of the 16bit formats misrender with missing tiles with the current "2" state. As all the previously working formats also work with the "3" state, just always use that one.
| Signed-off-by: Lucas Stach
* etnaviv: add debug switch to disable single buffer feature | Lucas Stach | 2018-02-23 | 2 | -0/+4
| This feature has caused some trouble already. Add a debug switch to allow users to quickly check if a specific issue is caused by this feature.
| Signed-off-by: Lucas Stach
| Reviewed-by: Christian Gmeiner
* etnaviv: npot_tex_any_wrap needs one bit only | Christian Gmeiner | 2018-02-23 | 1 | -1/+1
| Reduces size of struct etna_specs from 100 to 94 bytes.
| Signed-off-by: Christian Gmeiner
| Reviewed-by: Lucas Stach
* nv50,nvc0: fix integer MS resolves using 2d engine | Ilia Mirkin | 2018-02-22 | 1 | -1/+2
| We don't want filtering for integer textures, same as depth/stencil.
| Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample
| Signed-off-by: Ilia Mirkin
| Tested-by: Karol Herbst
* nvc0: fix writing query results into buffer | Ilia Mirkin | 2018-02-22 | 1 | -4/+10
| We need to mark the range as valid, and validate the resource using a helper to ensure that the buffer status is marked properly.
| Fixes some CTS pipeline stats query tests, and KHR-GL45.direct_state_access.queries_functional
| Signed-off-by: Ilia Mirkin
| Tested-by: Karol Herbst
* nv50,nvc0: fix clear buffer acceleration | Ilia Mirkin | 2018-02-22 | 2 | -28/+17
| Two things were off:
| - valid range was not updated, which could affect waiting for future maps
| - fencing was done manually instead of using the *_resource_validate helper, which resulted in a missed dirty buffer flag being set
| Fixes: KHR-GL45.direct_state_access.buffers_clear
| Signed-off-by: Ilia Mirkin
| Tested-by: Karol Herbst
* glx: Properly handle cases where screen creation fails | Chuck Atkins | 2018-02-22 | 3 | -30/+33
| This fixes a segfault exposed by a29d63ecf7 which occurs when swr is used on an unsupported architecture.
| v2: re-work to place logic in xmesa_init_display
| Signed-off-by: Chuck Atkins
| Reviewed-by: Emil Velikov
| Cc: [email protected]
| Cc: George Kyriazis
| Cc: Bruce Cherniak
* radeonsi/nir: collect more accurate output_usagemask | Timothy Arceri | 2018-02-22 | 1 | -13/+43
| Fixes assert in the glsl-1.50-gs-max-output-components piglit test.
| Note that the double handling will only work for doubles that don't take up multiple slots i.e. double and dvec2. However dual slot double handling is an existing bug which is made no worse by this patch.
| Reviewed-by: Marek Olšák
* radeonsi/nir: disable GLSL IR loop unrolling | Timothy Arceri | 2018-02-22 | 1 | -0/+2
| Delaying unrolling and allowing NIR to do it instead has been shown to result in better code in drivers such as i965. shader-db results appear to show the same is true for radeonsi.
| The other advantage is that using NIR unrolling improves compile times significantly.
| Totals from affected shaders:
| SGPRS: 9624 -> 10016 (4.07 %)
| VGPRS: 6800 -> 6464 (-4.94 %)
| Spilled SGPRs: 0 -> 2 (0.00 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 0 -> 0 (0.00 %) dwords per thread
| Code Size: 359176 -> 332264 (-7.49 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 1355 -> 1432 (5.68 %)
| Wait states: 0 -> 0 (0.00 %)
| Reviewed-by: Marek Olšák
* radeonsi/nir: fix tess varying loads for doubles | Timothy Arceri | 2018-02-22 | 1 | -2/+2
| Fixes the following piglit tests:
| tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test
| tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test
| Reviewed-by: Marek Olšák
* ac/radeonsi: pass type to load_tess_varyings() | Timothy Arceri | 2018-02-22 | 2 | -0/+3
| We need this to be able to load 64bit varyings.
| Reviewed-by: Marek Olšák
* radeonsi: don't flush when si_eliminate_fast_color_clear is no-op | Marek Olšák | 2018-02-21 | 1 | -1/+5
* radeonsi: make texture_discard_cmask/eliminate functions non-static | Marek Olšák | 2018-02-21 | 2 | -11/+13
* radeonsi: enable uvd encode for HEVC main | James Zhu | 2018-02-21 | 1 | -1/+3
| Enable UVD encode for HEVC main profile
| Signed-off-by: James Zhu
| Reviewed-by: Boyuan Zhang
* radeonsi:create uvd hevc enc entry | James Zhu | 2018-02-21 | 1 | -3/+12
| Add UVD hevc encode pipe video codec creation entry
| Signed-off-by: James Zhu
| Reviewed-by: Boyuan Zhang
* radeon/uvd:add uvd hevc enc functions | James Zhu | 2018-02-21 | 3 | -0/+383
| Implement UVD hevc encode functions
| Signed-off-by: James Zhu
| Reviewed-by: Boyuan Zhang
* radeon/uvd:add uvd hevc enc hw ib implementation | James Zhu | 2018-02-21 | 3 | -0/+1134
| Implement required IBs for UVD HEVC encode.
| Signed-off-by: James Zhu
| Reviewed-by: Boyuan Zhang
* radeon/uvd:add uvd hevc enc hw interface header | James Zhu | 2018-02-21 | 3 | -0/+471
| Add hevc encode hardware interface for UVD
| Signed-off-by: James Zhu
| Reviewed-by: Boyuan Zhang