summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* radeon/vcn: support picture parameters for HEVCBoyuan Zhang2018-02-053-21/+64
| | | | | | | | | Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that it can be used for different codecs. Add functions to handle picture parameters that will be used for HEVC encode. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add vcn encode interface for HEVCBoyuan Zhang2018-02-051-2/+79
| | | | | | | | Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* broadcom/vc5: Ignore samplers for finding uniform offsets.Eric Anholt2018-02-051-1/+12
| | | | | | | | Fixes: KHR-GLES3.shaders.struct.uniform.sampler_array_fragment KHR-GLES3.shaders.struct.uniform.sampler_array_vertex KHR-GLES3.shaders.struct.uniform.sampler_nested_fragment KHR-GLES3.shaders.struct.uniform.sampler_nested_vertex
* broadcom/vc5: Fix non-mipfiltered sampling.Eric Anholt2018-02-051-1/+6
| | | | | We need to clamp the LOD to 0 if mip filtering is disabled. This is part of fixing KHR-GLES3.shaders.struct.uniform.sampler_array_fragment.
* r600: fix resq for buffer images.Dave Airlie2018-02-051-1/+4
| | | | | | | | | | | If this is an image buffer, we need to calculate the correct resource id. Fixes: KHR-GL45.shader_image_size.* Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/eg: fix cube map array buffer images.Dave Airlie2018-02-051-1/+1
| | | | | | | | This fixes a crash in: KHR-GL45.texture_cube_map_array.texture_size_compute_sh. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* broadcom/vc5: Enable UIF XOR on textures.Eric Anholt2018-02-023-7/+40
| | | | | | | | | | This should increase performance by reducing SDRAM bank conflicts when crossing between UIF columns (particularly on power-of-two height textures). The uif_xor_disable setup is dropped, since we need to allow XOR on lower miplevels even when level 0 is XOR. The level 0 force UIF and level 0 XOR flags should handle setting XOR properly on imported buffers.
* broadcom/vc5: Fix alignment of miplevel 1 with UIF.Eric Anholt2018-02-023-15/+31
| | | | | | | The alignment here means that we can't get back the padded height from the size/stride any more, so it's now a field in the slice as well. Fixes piglit fbo-generatemipmap-formats RGBA16 NPOT.
* broadcom/vc5: Switch our RGBA4 support to the new gallium format.Eric Anholt2018-02-021-2/+1
| | | | | Fixes fbo-generatemipmap-formats, fbo-alphatest-formats, etc. tests for GL_RGBA4, GL_RGB4, GL_RGBA2, etc.
* gallium: Add a new A4B4G4R4 pipe format for Broadcom.Eric Anholt2018-02-021-0/+1
| | | | | | | | The VC5 HW puts A in the low bits and R in the high bits. We can't just swizzle in the shaders because the blending HW can't pick what channel A is in, so make a new format to match it. Reviewed-by: Marek Olšák <[email protected]>
* meson/swr: Updated copyright datesGeorge Kyriazis2018-02-024-4/+4
| | | | | | | cc: [email protected] cc: [email protected] Reviewed-by: Dylan Baker <[email protected]>
* meson/swr: re-shuffle generated filesGeorge Kyriazis2018-02-024-80/+126
| | | | | | | | | | | | Move generated files from codegen/meson.build to other directories, in order to satisfy generated include file dependencies Add correct file lists for architecture-specific libraries. cc: [email protected] cc: [email protected] Reviewed-by: Dylan Baker <[email protected]>
* amd: remove support for LLVM 3.9Marek Olšák2018-02-024-16/+5
| | | | | | | | | | | Only these are supported: - LLVM 4.0 - LLVM 5.0 - LLVM 6.0 - master (7.0) Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsicsMarek Olšák2018-02-021-113/+39
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* r600/eg: add crap indirect compute support.Dave Airlie2018-02-021-7/+19
| | | | | | | | I think the cp packets can be made work, but I think it might need a kernel change, so for now just do the worst thing. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: don't do stack workarounds for hemlockRoland Scheidegger2018-02-021-0/+1
| | | | | | | | | By the looks of it it seems hemlock is treated separately to cypress, but certainly it won't need the stack workarounds cedar/redwood (and seemingly every other eg chip except cypress/juniper) need. (Discovered by accident.) Acked-by: Alex Deucher <[email protected]>
* r600: initial attempt at gl_HelperInvocation (v3)Dave Airlie2018-02-026-3/+108
| | | | | | | | | | | | | This passes the CTS and piglit tests. This also disable sb for helper invocations until it doesn't mess up the VPM flags. Thanks to Ilia and Glenn for advice, and Roland for working out the working evergreen path. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallivm/llvmpipe: add const qualifiers on sampler variablesBrian Paul2018-02-011-1/+1
| | | | | | | Once a lp_build_sampler_soa or lp_build_sampler_aos object is created, it should never be modified. Found by inspection. Reviewed-by: Roland Scheidegger <[email protected]>
* svga: remove unneeded #includes in svga_pipe_draw.cBrian Paul2018-02-011-7/+0
| | | | Reviewed-by: Neha Bhende <[email protected]>
* svga: whitespace/formatting fixes in svga_pipe_draw.cBrian Paul2018-02-011-33/+34
| | | | Reviewed-by: Neha Bhende <[email protected]>
* svga: clean up retry_draw_range_elements(), retry_draw_arrays()Brian Paul2018-02-011-54/+27
| | | | | | | Get rid of a bunch of goto spaghetti. Remove unneeded do_retry parameter. No Piglit changes. Also tested w/ Google Earth and other apps. Reviewed-by: Neha Bhende <[email protected]>
* svga: remove unused min/max_index params to draw_vgpu10()Brian Paul2018-02-011-4/+3
| | | | Reviewed-by: Neha Bhende <[email protected]>
* broadcom/vc5: Fix image_h setup for both loads and stores.Eric Anholt2018-02-011-3/+2
| | | | | | | The image_h for the tiling algorithm needs to be the padded-to-a-uifblock height of the level, not the unpadded height or the height of level 0. Fixes some cases of KHR-GLES3.texture_repeat_mode.* and depthstencil-render-miplevels.
* broadcom/vc5: Add appropriate height padding for bank conflicts.Eric Anholt2018-02-014-0/+63
| | | | | | | I thought I didn't need this because I was doing level-0-always-UIF and that the pad there would propagate down, but it turns out that for level 1 the padding ends up being chosen by the HW. This brings us closer to being able to turn on UIF XOR for increased performance, as well.
* broadcom/vc5: Simplify separate stencil surface setup.Eric Anholt2018-02-013-99/+77
| | | | | | | | | If we just make another gallium surface for the separate stencil, it's a lot easier to keep track of which set of fields we're using in RCL setup. This also incidentally fixes a little bug in setting up the surface's padded height for separate stencil when the UIF-ness changes at different levels of Z versus stencil.
* broadcom/vc5: Rename the UIFCFG register in the UAPI.Eric Anholt2018-02-012-2/+2
| | | | | | This matches the naming of the other hub regs we get, and I don't know for sure if UIFCFG will be the same register between the hub and the cores on all versions.
* broadcom/vc5: Skip over missing color buffers for a couple of checks.Eric Anholt2018-02-012-0/+6
| | | | Fixes crashes in piglit alpha-to-coverage-no-draw-buffer-zero 2
* broadcom/vc5: Add the missing PIPE_CAP_FENCE_SIGNAL.Eric Anholt2018-02-011-0/+1
|
* radeonsi: use ac_build_buffer_load_format for image buffer loadsMarek Olšák2018-02-011-4/+10
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: add glc parameter to ac_build_buffer_load_formatMarek Olšák2018-02-012-2/+2
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: load the right number of components for VS inputs and TBOsMarek Olšák2018-02-012-5/+16
| | | | | | | | | | | | | | | | | | | | | | | The supported counts are 1, 2, 4. (3=4) The following snippet loads float, vec2, vec3, and vec4: Before: buffer_load_format_x v9, v4, s[0:3], 0 idxen ; E0002000 80000904 buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen ; E00C2000 80020005 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen ; E00C2000 80010507 After: buffer_load_format_x v10, v4, s[0:3], 0 idxen ; E0002000 80000A04 buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen ; E0042000 80020805 buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen ; E00C2000 80010307 Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: remove unused si_shader_context membersMarek Olšák2018-02-012-11/+0
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* r600/eg: make sure we allow vpm bit on other CF ops.Dave Airlie2018-02-011-0/+1
| | | | | | | the vpm bit wasn't being applied to the push/pop instructions. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/sb: just add some missing debug bitsDave Airlie2018-02-011-0/+15
| | | | Signed-off-by: Dave Airlie <[email protected]>
* r600: fix buffer resinfo opcode translation.Dave Airlie2018-02-012-2/+2
| | | | | | | | | The vtx operations never got translated, so things worked by 0 being equal to 0, translate them so we can use the proper buffer resinfo code. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* svga: use opcode local var to simplify some codeBrian Paul2018-01-311-4/+2
| | | | Reviewed-by: Charmaine Lee <[email protected]>
* svga: s/unsigned/VGPU10_OPCODE_TYPE/Brian Paul2018-01-311-10/+11
| | | | Reviewed-by: Charmaine Lee <[email protected]>
* virgl: also remove dimension on indirect.Dave Airlie2018-01-311-1/+0
| | | | | | | | | This fixes some dEQP tests that generated bad shaders. Fixes: b6f6ead19 (virgl: drop const dimensions on first block.) Reviewed-by: Gurchetan Singh <[email protected]> Tested-by: Gurchetan Singh <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: remove DBG_PRECOMPILEMarek Olšák2018-01-313-51/+0
| | | | | | it's useless and shader-db stats only report the main shader part. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print shader-db stats for main parts, not final binariesMarek Olšák2018-01-313-13/+23
| | | | | | This is needed to get shader-db stats for LS,HS,ES,GS stages on gfx9. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move max_simd_waves computation into a separate functionMarek Olšák2018-01-312-12/+23
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: add lower_all_io_to_temps flagTimothy Arceri2018-01-312-0/+2
| | | | | | | This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: add input support for arrays that have not been copied to ↵Timothy Arceri2018-01-311-67/+79
| | | | | | | | | temps and split We need this to be able to support the interpolateAt builtins in a sane way. It also leads to the generation of more optimal code. Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: add lookup_interp_param and load_sample_position to the abiTimothy Arceri2018-01-311-0/+2
| | | | | | | This will enable the interpolateAt builtins to work on the radeonsi nir backend. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: add prim_mask to the abiTimothy Arceri2018-01-311-3/+4
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: adjust load_sample_position() to be shared between backendsTimothy Arceri2018-01-311-2/+3
| | | | | | | With this interface change it can be shared between the tgsi and nir backends. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: add si_nir_lookup_interp_param() helperTimothy Arceri2018-01-312-0/+40
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: move the interpolation qualifier scanningTimothy Arceri2018-01-311-16/+36
| | | | | | | | We need to collect this when scanning over the instruction rather than when scanning over the inputs otherwise we might get confliting values for inputs that are use by the interpolateAt* builtins. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: add interpolate at intrinsics to scan_instruction()Timothy Arceri2018-01-311-0/+30
| | | | | | V2: use the uses_*_opcode_interp_* flags Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix fence_server_sync() holding up extra work v2Andres Rodriguez2018-01-302-25/+28
| | | | | | | | | | | | | | | When calling si_fence_server_sync(), the wait operation is associated with the next kernel submission. Therefore, any unflushed work submitted previous to fence_server_sync() will also be affected by the wait. To avoid adding the dependency to the unflushed work, we flush before emitting the fence dependency. v2: s/semaphore/fence Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Marek Olšák <[email protected]>