path: root/src/gallium
Commit message | Author | Age | Files | Lines
* st/omx_bellagio: add picture profile and entry pointBoyuan Zhang2018-03-021-0/+2
| | | | | | | | | Profile and entry point were missing in the picture structure. Therefore, add them back. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]>
* radeonsi: fix radeon create encoder returnBoyuan Zhang2018-03-021-1/+1
| | | | | | | | | | | Previous patch missed a "return" when trying to modify the create encoder function, which made the whole logic fail. Therefore, add the return back. Fixes: b38b208ff8886e799d6a2 "radeonsi:create uvd hevc enc entry" Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* r600/cayman: fix fragcoord loading recip generation.Dave Airlie2018-03-021-1/+1
| | | | | | | | This fixes some hangs seen where the recip_ieee opcodes would end up split across the wrong slots. Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/util: use sockets on PIPE_OS_UNIX in u_networkJonathan Gray2018-03-012-10/+4
| | | | | | | | Instead of listing all the UNIX PIPE_OS platforms just use PIPE_OS_UNIX. Makes BSD sockets available on PIPE_OS_BSD. Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* radeonsi/nir: increase values to 8 for gs fetch.Dave Airlie2018-03-011-1/+1
| | | | | | | | This stops a crash when running (still fails): tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: set some context vars for nir pathTimothy Arceri2018-03-011-6/+10
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium: remove llvm from ir structTimothy Arceri2018-03-011-1/+0
| | | | | | | This was added in 425dc4c4b366 but never used. Also since 100796c15c3a native has superseded llvm. Acked-by: Dave Airlie <[email protected]>
* broadcom/vc5: Fix regression in the page-cache slice size alignment.Eric Anholt2018-02-281-3/+6
| | | | | | | We need to align the size of the slice, not the offset of the next slice. Fixes KHR-GLES3.texture_repeat_mode.rgba32ui_11x131_2_clamp_to_edge. Fixes: b4b4ada7616d ("broadcom/vc5: Fix layout of 3D textures.")
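The difference between aligning the slice size and aligning the next slice's offset can be seen in a minimal sketch. All names here (`sketch_align`, `SKETCH_PAGE_CACHE_SIZE`, and both layout helpers) are hypothetical and not taken from the vc5 driver, and the alignment value is only illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical alignment granule; the real driver's value may differ. */
#define SKETCH_PAGE_CACHE_SIZE 4096u

/* Round value up to a power-of-two alignment. */
uint32_t sketch_align(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Variant A: align the slice size itself, then advance by that size. */
uint32_t sketch_next_offset_size_aligned(uint32_t offset, uint32_t slice_size)
{
    return offset + sketch_align(slice_size, SKETCH_PAGE_CACHE_SIZE);
}

/* Variant B: leave the slice size alone and align only the next offset. */
uint32_t sketch_next_offset_offset_aligned(uint32_t offset, uint32_t slice_size)
{
    return sketch_align(offset + slice_size, SKETCH_PAGE_CACHE_SIZE);
}
```

With a starting offset that is not itself a multiple of the granule, the two variants yield different layouts, which is the kind of discrepancy the commit message describes.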
* r600/shader: when using images always load thread id gpr at start (v2)Dave Airlie2018-02-281-15/+7
| | | | | | | | | | | | The delayed loading code would fail if we had control flow. This fixes: tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test v2: don't use temp_reg before setting temp_reg up. Tested-by: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: fix whitespace in recent 1d texture commit.Dave Airlie2018-02-281-1/+1
| | | | trivial fix.
* swr/rast: revert clip distance precisionGeorge Kyriazis2018-02-282-4/+17
| | | | | | Fixes piglit tests that broke with 8a64593bde Reviewed-By: Bruce Cherniak <[email protected]>
* swr/rast: Faster frustum prim cullingGeorge Kyriazis2018-02-281-3/+7
| | | | | | | | | Fix clipper validMask setting. We don't need to run frustum rejected primitives through the clipper. Perform frustum culling with only frustum clip codes. Guardband clip codes cannot be used because they overlap frustum codes. Reviewed-By: Bruce Cherniak <[email protected]>
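Frustum rejection with clip codes is commonly done by ANDing the per-vertex codes: if every vertex is outside the same plane, the primitive can be dropped before clipping. The sketch below assumes a hypothetical bit layout (`SKETCH_*` names are not from swr) and masks out the guardband bits so that only frustum codes take part in the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clip-code bits; the rasterizer's real layout may differ. */
enum {
    SKETCH_FRUSTUM_LEFT   = 1u << 0,
    SKETCH_FRUSTUM_RIGHT  = 1u << 1,
    SKETCH_FRUSTUM_TOP    = 1u << 2,
    SKETCH_FRUSTUM_BOTTOM = 1u << 3,
    SKETCH_FRUSTUM_ZNEAR  = 1u << 4,
    SKETCH_FRUSTUM_ZFAR   = 1u << 5,
    SKETCH_FRUSTUM_MASK   = 0x3f,
    SKETCH_GUARDBAND_LEFT = 1u << 6
};

/* Returns true when all vertices lie outside one common frustum plane.
 * Guardband bits are ignored on purpose. */
bool sketch_frustum_reject(const uint32_t *codes, unsigned num_verts)
{
    uint32_t common = SKETCH_FRUSTUM_MASK;

    assert(num_verts > 0);
    for (unsigned i = 0; i < num_verts; i++)
        common &= codes[i];

    return common != 0;
}
```

A primitive rejected this way never needs to reach the clipper, which matches the intent stated in the commit message.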
* swr/rast: Consolidate TRANSLATE_ADDRESSGeorge Kyriazis2018-02-284-6/+28
| | | | | | Translate is now part of an overloaded LOAD call. This required changing the code generator to skip the load functions so that they can be handled manually and made virtual. Reviewed-By: Bruce Cherniak <[email protected]>
* swr/rast: Code generation cleanupGeorge Kyriazis2018-02-281-15/+21
| | | | | | Generate more compact code from gen_llvm.hpp. Reviewed-By: Bruce Cherniak <[email protected]>
* swr/rast: Remove draw type from event definitionsGeorge Kyriazis2018-02-283-12/+8
| | | | | | | | | | | - Have the draw type sent to DrawInfoEvent in handlers created in archrast.cpp. The draw type no longer needs to be sent during the AR_API_EVENT() call in api.cpp. - Remove draw type from event definitions in events_private.proto, no longer needed Reviewed-By: Bruce Cherniak <[email protected]>
* swr/rast: whitespace changeGeorge Kyriazis2018-02-281-1/+1
| | | | Reviewed-By: Bruce Cherniak <[email protected]>
* swr/rast: Fix index buffer overfetch issue for non-indexed drawsGeorge Kyriazis2018-02-281-0/+15
| | | | | | | | Populate pLastIndex, even for the non-indexed case. A zero pLastIndex can cause the index offsets inside the fetcher to have nonsensical values that can be either very large positive or very large negative numbers. Reviewed-By: Bruce Cherniak <[email protected]>
* softpipe: don't iterate through PIPE_MAX_SHADER_SAMPLER_VIEWSRoland Scheidegger2018-02-281-2/+2
| | | | | | | | We were setting view to NULL if the iteration was larger than i. But in fact if the view is NULL the code did nothing anyway... Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroyRoland Scheidegger2018-02-281-1/+3
| | | | | | | There's no point, we know the highest non-null one. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: don't needlessly iterate through all sampler view slotsRoland Scheidegger2018-02-281-1/+1
| | | | | | | We already stored the highest (potentially) used number. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
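The three sampler-view commits above share one idea: loop only up to the highest slot that may be in use instead of the compile-time maximum. A minimal sketch of that pattern follows; the struct layout, the `SKETCH_MAX_SAMPLER_VIEWS` value, and the function name are hypothetical and do not mirror the softpipe, cso, or draw code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound standing in for PIPE_MAX_SHADER_SAMPLER_VIEWS. */
#define SKETCH_MAX_SAMPLER_VIEWS 128

struct sketch_view {
    int refcount;
};

struct sketch_state {
    struct sketch_view *views[SKETCH_MAX_SAMPLER_VIEWS];
    unsigned num_views; /* highest potentially used slot + 1 */
};

/* Drop all bound views, visiting only slots below num_views.
 * Returns how many slots were visited. */
unsigned sketch_release_views(struct sketch_state *st)
{
    unsigned visited = 0;

    assert(st->num_views <= SKETCH_MAX_SAMPLER_VIEWS);
    for (unsigned i = 0; i < st->num_views; i++) {
        visited++;
        if (st->views[i] != NULL) {
            st->views[i]->refcount--;
            st->views[i] = NULL;
        }
    }
    st->num_views = 0;
    return visited;
}
```

The loop cost then scales with the number of bound views rather than with the array size.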
* winsys/amdgpu: request high addressesChristian König2018-02-281-4/+12
| | | | | | | | | We now have hopefully fixed all bugs regarding high addresses on Vega10 and Raven. Start to use the high range to make room for SVM in the low range. Signed-off-by: Christian König <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* r600: partly revert disabling tiling for 1d texture.Dave Airlie2018-02-281-0/+5
| | | | | | | | | | | | Previously we had a check for 1D or narrow 2D textures. Narrow 2D textures caused GPU hangs, but the check was correct for 1D textures. This fixes a bunch of 1D image piglits for me. Fixes: 7b8e1c089d (r600/texture: drop lowering 1d/2d images to linear.) Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir: add lower_ldexp to nir compiler optionsTimothy Arceri2018-02-282-0/+2
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: add load_base_vertex() to the abiTimothy Arceri2018-02-281-0/+1
| | | | | | | | | | Fixes the following piglit tests: ./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo ./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: create get_base_vertex() helperTimothy Arceri2018-02-281-14/+20
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: disable vertex_id_zero_based loweringTimothy Arceri2018-02-281-1/+0
| | | | | | | | | | The lowering is incompatible with how the radeonsi backend works. Fixes piglit test: ./bin/arb_shader_draw_parameters-basevertex vertexid-zerobased -auto Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: collapse output slots to have adjacent registersIlia Mirkin2018-02-271-2/+12
| | | | | | | | | | The hardware skips over unallocated slots, so we have to make sure those registers are packed together. Fixes KHR-GL45.enhanced_layouts.fragment_data_location_api Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Karol Herbst <[email protected]>
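Packing used output slots into adjacent registers can be sketched as a simple renumbering pass. This is only an illustration of the general idea under assumed names (`sketch_pack_outputs`, `SKETCH_MAX_OUTPUTS`); it is not the nvc0 implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of output slots used for this illustration. */
#define SKETCH_MAX_OUTPUTS 8

/* Assign consecutive register indices to the slots set in used_mask.
 * Unused slots get -1. Returns the number of registers assigned. */
unsigned sketch_pack_outputs(uint32_t used_mask,
                             int reg_for_slot[SKETCH_MAX_OUTPUTS])
{
    unsigned next_reg = 0;

    assert(reg_for_slot != NULL);
    for (unsigned slot = 0; slot < SKETCH_MAX_OUTPUTS; slot++) {
        if (used_mask & (1u << slot))
            reg_for_slot[slot] = (int)next_reg++;
        else
            reg_for_slot[slot] = -1;
    }
    return next_reg;
}
```

For example, if only slots 1 and 3 are used, they receive registers 0 and 1, leaving no gap for the hardware to skip over.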
* nvir/gm107: consider FILE_FLAGS dependencies in SchedDataCalculatorGM107Karol Herbst2018-02-261-1/+14
| | | | | | | | | | | | | | | | | | | | Currently, while inserting barriers, writes and reads to FILE_FLAGS aren't considered. This can lead to WaR hazards in some situations. Together with the previous commit, this fixes shaders with instructions like this: mad u32 $r2 $r4 $r11 $r2 mad u32 { $r5 $c0 } $r4 $r10 $r6 mad (SUBOP:1) u32 $r3 $r4 $r10 $r2 $c0 Affects OpenCL CTS tests on Maxwell+: basic/test_basic intmath_long basic/test_basic intmath_long2 basic/test_basic intmath_long4 v2: only put barriers on instructions which actually read flags Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* nvir/gm107: iterate over all defs in SchedDataCalculatorGM107::findFirstUseKarol Herbst2018-02-261-16/+18
| | | | | | | | | | In the sched data calculator we have to track first use of defs by iterating over all defs of an instruction, not just the first one. v2: fix minGRP and maxGRP values Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointersMarek Olšák2018-02-264-11/+35
| | | | | | The effect of the last 13 commits on user SGPR counts: Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR inputMarek Olšák2018-02-264-20/+53
| | | | | | | | so that it can be removed and replaced with inline VBO descriptors, and the pointer can be packed in unused bits of VBO descriptors. This also removes the pointer from merged TES-GS where it's useless. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set correct num_input_sgprs for VS prolog in merged shadersMarek Olšák2018-02-261-24/+24
| | | | | | | We need to take num_input_sgprs from VS, not the second shader. No apps suffered from this. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow fewer input SGPRs in 2nd shader of merged shadersMarek Olšák2018-02-261-1/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't use struct si_descriptors for vertex buffer descriptorsMarek Olšák2018-02-266-33/+46
| | | | | | VBO descriptor code will change a lot one day. Reviewed-by: Nicolai Hähnle <[email protected]>
* r600: fix tgsi clock last settingDave Airlie2018-02-261-0/+1
| | | | | | | On cayman this was hitting an assert later, which probably wasn't seen on non-cayman due to having the t slot. Fixes: 9041730d1 (r600: add support for ARB_shader_clock.)
* r600: add time lo/hi debugging output.Dave Airlie2018-02-262-0/+12
| | | | This just adds these to the debug prints.
* radeonsi/nir: enable lowering of fpowTimothy Arceri2018-02-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Lowering fpow in NIR rather than LLVM can be beneficial. Polaris results: Totals from affected shaders: SGPRS: 124928 -> 124896 (-0.03 %); VGPRS: 68616 -> 68332 (-0.41 %); Spilled SGPRs: 394 -> 413 (4.82 %); Spilled VGPRs: 0 -> 0 (0.00 %); Private memory VGPRs: 0 -> 0 (0.00 %); Scratch size: 0 -> 0 (0.00 %) dwords per thread; Code Size: 3668912 -> 3658368 (-0.29 %) bytes; LDS: 0 -> 0 (0.00 %) blocks; Max Waves: 18575 -> 18593 (0.10 %); Wait states: 0 -> 0 (0.00 %). Fixes: d6b753920677 "ac/nir: remove emission of nir_op_fpow" Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_infoTimothy Arceri2018-02-262-7/+0
| | | | | | | Seems to have not been used since 16be87c90429 Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/nir: fix loading of doubles for tess varyingsTimothy Arceri2018-02-261-2/+10
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: fix lds store in tcs outputs handlingTimothy Arceri2018-02-261-1/+1
| | | | | | We were ignoring the channel offset. Reviewed-by: Marek Olšák <[email protected]>
* r600: Take ALU_EXTENDED into account when evaluating jump offsetsGert Wollny2018-02-261-2/+7
| | | | | | | | | ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU clause within an IF-JUMP or ELSE branch is ALU_EXTENDED the target jump offset needs to be adjusted accordingly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654 Cc: <[email protected]> Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
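The offset adjustment described above amounts to using the real size of the last control-flow instruction when computing where a jump lands. The following sketch assumes a simplified, hypothetical instruction record (`struct sketch_cf`) with offsets counted in DWORDs; the r600 bytecode structures are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified control-flow instruction record. */
struct sketch_cf {
    uint32_t id;          /* DWORD offset of this instruction */
    bool is_alu_extended; /* true if encoded as ALU_EXTENDED */
};

/* ALU_EXTENDED occupies 4 DWORDs, other instructions 2. */
uint32_t sketch_cf_size_dwords(const struct sketch_cf *cf)
{
    assert(cf != NULL);
    return cf->is_alu_extended ? 4u : 2u;
}

/* The jump target is the first DWORD after the last instruction
 * of the branch. */
uint32_t sketch_jump_target(const struct sketch_cf *last_in_branch)
{
    return last_in_branch->id + sketch_cf_size_dwords(last_in_branch);
}
```

Assuming a fixed size of 2 DWORDs would place the target in the middle of an ALU_EXTENDED instruction, which is the failure mode the commit addresses.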
* radeonsi: remove si_descriptors parameter from emit_shader_pointer functionsMarek Olšák2018-02-241-12/+13
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: preload the tess offchip ring in TESMarek Olšák2018-02-242-12/+10
| | | | | | so that it's not done multiple times in branches Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRsMarek Olšák2018-02-245-91/+70
| | | | | | | TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address aligned to 512KB. Hey, it's a 13-bit pointer! Reviewed-by: Nicolai Hähnle <[email protected]>
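The arithmetic behind the "13-bit pointer" is that 512 KB equals 2^19 bytes, so a 512KB-aligned 32-bit address has 19 zero low bits and only 32 - 19 = 13 significant bits. The sketch below assumes, purely for illustration, that the 13 unused bits are the top bits of the layout word; the function names and that bit placement are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 512 KB = 1 << 19, so an aligned address has 19 zero low bits. */
#define SKETCH_RING_ALIGN_SHIFT 19u
#define SKETCH_LOW_MASK ((1u << SKETCH_RING_ALIGN_SHIFT) - 1u)

/* Combine the low layout bits with a 512KB-aligned 32-bit address.
 * Assumes (hypothetically) that the free 13 bits are bits 19..31. */
uint32_t sketch_pack_layout(uint32_t layout_low_bits, uint32_t ring_addr)
{
    assert((ring_addr & SKETCH_LOW_MASK) == 0);        /* 512KB aligned */
    assert((layout_low_bits & ~SKETCH_LOW_MASK) == 0); /* fits in 19 bits */
    return layout_low_bits | ring_addr;
}

/* Recover the ring address from the packed word. */
uint32_t sketch_unpack_ring_addr(uint32_t packed)
{
    return packed & ~SKETCH_LOW_MASK;
}

/* Recover the original layout bits from the packed word. */
uint32_t sketch_unpack_layout_bits(uint32_t packed)
{
    return packed & SKETCH_LOW_MASK;
}
```

Under this assumed placement no shifting is needed at all, because the aligned address already occupies exactly the bits the layout leaves free.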
* radeonsi: move 2nd-shader descriptor pointers into s[0:1]Marek Olšák2018-02-243-74/+140
| | | | | | | | | | | If 32-bit pointers are supported, both pointers can be moved into s[0:1] and then ESGS has exactly the same user data SGPR declarations as VS. If 32-bit pointers are not supported, only one pointer can be moved into s[0:1]. In that case, the 2nd pointer is moved before TCS constants, so that the location is the same in HS and GS. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: change si_descriptors::shader_userdata_offset type to shortMarek Olšák2018-02-242-9/+9
| | | | | | | We will want to use SH registers outside of user data SGPRs, like the GFX9 special SGPRs. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: put both tessellation rings into 1 bufferMarek Olšák2018-02-244-29/+18
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move tessellation ring info into si_screenMarek Olšák2018-02-243-45/+52
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bitsMarek Olšák2018-02-243-5/+6
| | | | | | For a later patch. Reviewed-by: Nicolai Hähnle <[email protected]>
* nvir: don't optimize mad with subops to shladdKarol Herbst2018-02-241-1/+2
| | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>