path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* virgl: Fix flush in virgl_encoder_inline_write. | Lepton Wu | 2018-07-17 | 1 | -1/+1
    The current code is buggy: if only 12 dwords are left in cbuf, we emit a command with zero data length, which virglrenderer rejects. Fix it by calling flush in this case.

    Cc: [email protected]
    Reviewed-by: Dave Airlie <[email protected]>
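The boundary case behind this fix can be sketched as follows. This is a minimal illustration, not the virgl code: `HDR_DWORDS`, `needs_flush` and `next_chunk_bytes` are hypothetical names, and the assumption that the header and fixed arguments occupy exactly 12 dwords is inferred from the commit message only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: dwords taken by the command header and fixed arguments. */
#define HDR_DWORDS 12u

/* Flush first unless the header AND at least one payload dword fit.
 * Using '<' instead of '<=' here would let a zero-length command through. */
static bool needs_flush(uint32_t dwords_left)
{
    return dwords_left <= HDR_DWORDS;
}

/* Payload bytes to place in the next command; never zero. */
static uint32_t next_chunk_bytes(uint32_t dwords_left, uint32_t bytes_wanted)
{
    assert(bytes_wanted > 0);
    assert(!needs_flush(dwords_left)); /* caller must flush beforehand */
    uint32_t room = (dwords_left - HDR_DWORDS) * 4u;
    return bytes_wanted < room ? bytes_wanted : room;
}
```

With exactly 12 dwords left, `needs_flush` reports true, so the caller flushes before encoding instead of emitting an empty payload.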
* virgl: implement set_min_samples | Erik Faye-Lund | 2018-07-17 | 5 | -0/+28
    This allows us to implement glMinSampleShading correctly, which up until now just got ignored.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* r600: fix build after the removal of RADEON_PRIO_* flags | Marek Olšák | 2018-07-16 | 7 | -21/+12
* radeonsi: rework RADEON_PRIO flags to be <= 31 | Marek Olšák | 2018-07-16 | 2 | -22/+23
    This decreases sizeof(struct amdgpu_cs_buffer) from 24 to 16 bytes.

    Reviewed-by: Samuel Pitoiset <[email protected]>
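Why limiting priorities to 31 can shrink a structure is sketched below. The two layouts are hypothetical stand-ins, not the real `struct amdgpu_cs_buffer`; the assumption is that the saving comes from narrowing a per-priority bitmask from 64 to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts; field names are illustrative only. */
struct buf_wide   { uint64_t bo; uint64_t priority_mask; uint32_t usage; };
struct buf_narrow { uint64_t bo; uint32_t priority_mask; uint32_t usage; };

/* With priorities limited to 0..31, one bit per priority fits in 32 bits. */
static uint32_t priority_bit(unsigned prio)
{
    assert(prio <= 31);
    return UINT32_C(1) << prio;
}
```

On a typical 64-bit ABI the wide layout is padded to 24 bytes, while the narrow one packs into 16.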
* radeonsi: merge DCC/CMASK/HTILE priority flags | Marek Olšák | 2018-07-16 | 6 | -12/+8
    For a later simplification.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: remove non-GFX BO priority flags | Marek Olšák | 2018-07-16 | 8 | -19/+7
    For a later simplification.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa/virgl: Fix off-by-one and copy-paste error in multisample position evaluation | Gert Wollny | 2018-07-16 | 1 | -3/+3
    Converting from a switch statement that would not allow intermediate sample counts to an if-else chain went a bit wrong, so that in some cases the range that should be inclusive was exclusive, and the line for 16 samples was copied wrongly.

    v2: elaborate commit message.

    Fixes: 91f48cdfe5c817158c533a8f67c60e9aabbe4479 virgl: Add support for glGetMultisample
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Erik Faye-Lund <[email protected]> (v1)
* nouveau: fix 3D blitter for unsigned to signed integer conversions | Karol Herbst | 2018-07-15 | 2 | -10/+22
    Fixes a couple of packed_pixel CTS tests. No regressions inside a CTS run.

    v2: simplify the changes a bit

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Karol Herbst <[email protected]>
* vc4: Tell NIR to lower fdiv instructions | Jason Ekstrand | 2018-07-13 | 1 | -0/+1
    This should allow us to use them in nir_lower_tex.

    Reviewed-by: Eric Anholt <[email protected]>
* vc4: Switch to using u_transfer_helper for MSAA maps. | Eric Anholt | 2018-07-13 | 2 | -100/+16
    No requirement, just reduces code duplication.
* v3d: Work around GFXH-1461 bug losing our Z/S clears. | Eric Anholt | 2018-07-13 | 1 | -0/+30
    If you load S and clear Z or vice versa, the clear may get lost. Just fall back to drawing a quad.

    Fixes KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8
* r600: Add spill output to group only if register or target index changes | Gert Wollny | 2018-07-13 | 1 | -24/+45
    The current spill code checks in each instruction of an instruction group whether spilling is needed. If so, it adds spilling for each component as a separate instruction and allocates a new temporary for each component. Since it takes the write mask from the TGSI representation, all components might be written each time, and as a result already written components might be overwritten with garbage like:

        ...
        y: MOV R9.y, [0x42140000 37].x
        t: MOV R8.x, [0x42040000 33].y
        ...
        MEM_SCRATCH WRITE_IND_ACK 0 R9.xy__, @R4.x ES:3
        MEM_SCRATCH WRITE_IND_ACK 0 R8.xy__, @R4.x ES:3
        ...

    To resolve this issue, accumulate spills to the same memory location so that only one memory write instruction is emitted for an instruction group that writes up to all four components.

    This fixes updated piglits (see https://patchwork.freedesktop.org/series/46064/):

        spec/glsl-1.30/execution
            fs-large-local-array-vec2.shader_test
            fs-large-local-array-vec3.shader_test
            fs-large-local-array-vec4.shader_test

    v2: fix some typos and add comment about piglits (Roland Scheidegger)

    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]> (v1)
* radeonsi: add support for Vega20 | Marek Olšák | 2018-07-12 | 4 | -1/+5
    Reviewed-by: Alex Deucher <[email protected]>
* u_blitter: Add an option to draw the triangles using an index buffer. | Eric Anholt | 2018-07-12 | 1 | -0/+1
    For V3D, the HW will interpolate slightly differently along the shared edge of the trifan. The conformance tests manage to catch this in the nearest_consistency_* group. To get interpolation to match, we need the last vertex of the triangle to be shared.

    I first tried implementing draw_rectangle to do triangles instead, but that was quite a bit (147 lines) of code duplication from u_blitter, and this seems much simpler and less likely to break as u_blitter changes.

    Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_* on V3D.

    Reviewed-by: Marek Olšák <[email protected]>
* vc4: Don't automatically reallocate a PERSISTENT-mapped buffer. | Eric Anholt | 2018-07-12 | 1 | -1/+1
    I had mistakenly used the COHERENT flag, which can only be set together with PERSISTENT, but is not always set when PERSISTENT is.

    Fixes: a2014c2eb9e0 ("vc4: Simplify the DISCARD_RANGE handling")
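The flag mix-up described here can be illustrated with a small sketch. The flag bits and function names below are hypothetical and do not match the real gallium definitions; the point is only that a check on COHERENT misses buffers that are PERSISTENT but not COHERENT.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, not the real gallium values. */
enum {
    MAP_DISCARD_RANGE = 1u << 0,
    MAP_PERSISTENT    = 1u << 1,
    MAP_COHERENT      = 1u << 2,
};

static_assert((MAP_PERSISTENT & MAP_COHERENT) == 0, "flags must be distinct bits");

/* Buggy variant: only refuses to reallocate when COHERENT is set. */
static bool may_reallocate_buggy(unsigned usage)
{
    return (usage & MAP_DISCARD_RANGE) && !(usage & MAP_COHERENT);
}

/* Fixed variant: any persistently mapped buffer keeps its storage. */
static bool may_reallocate_fixed(unsigned usage)
{
    return (usage & MAP_DISCARD_RANGE) && !(usage & MAP_PERSISTENT);
}
```

For a usage of `MAP_DISCARD_RANGE | MAP_PERSISTENT`, the buggy variant would still allow reallocation, which would invalidate the application's persistent mapping.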
* v3d: Don't automatically reallocate a PERSISTENT-mapped buffer. | Eric Anholt | 2018-07-12 | 1 | -1/+1
    I had mistakenly used the COHERENT flag, which can only be set together with PERSISTENT, but is not always set when PERSISTENT is.

    Fixes piglit bufferstorage-persistent read
* v3d: Fix stride of 1D_ARRAY mappings. | Eric Anholt | 2018-07-12 | 1 | -1/+1
    All of our other texture arrays will be tiled, but 1D is an array of raster mappings and we had the wrong value plugged in here.

    Fixes piglit getteximage-targets 1D_ARRAY
* v3d: Fix MRT blending with independent blending disabled. | Eric Anholt | 2018-07-12 | 2 | -6/+14
    We were only emitting the RT blend state for RT 0 and only enabling it for RT 0, when the gallium API for !independent_blend is for rt0's state to apply to all of them.

    Fixes piglit fbo-drawbuffers-blend-add.
* v3d: Implement noperspective varyings on V3D 4.x. | Eric Anholt | 2018-07-09 | 3 | -0/+31
    Fixes a bunch of piglit interpolation tests, and reduces my concern about some MSAA blit shaders with noperspective varyings.
* v3d: Refactor flat shade/centroid flag emission. | Eric Anholt | 2018-07-09 | 1 | -64/+76
    The logic was duplicated in a pretty gross way, when what we really need is just a helper function for stuffing the values in the packet. This will make implementing noperspective easier.
* r600: report incorrect max-vertex-attrib for GL 4.4 | Erik Faye-Lund | 2018-07-09 | 1 | -1/+2
    OpenGL 4.4 requires a max vertex attrib of 2048 or higher, but r600 only supports 2047. Technically, this makes it a GL 4.3 GPU, but it's currently exposing GL 4.4.

    To avoid regressing the GL version supported in the following patches, let's just lie and pretend that we support 2048. Any applications using 2048 are already broken anyway.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* r600/sb: fix crash in fold_alu_op3 | Roland Scheidegger | 2018-07-09 | 1 | -0/+2
    fold_assoc() called from fold_alu_op3() can lower the number of src to 2, which then leads to an invalid access to n.src[2]->gvalue(). This didn't seem to have caused much harm in the past, but on Fedora 28 it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was needed to show the issue).

    An alternative fix would be to instead call fold_alu_op2() from within fold_assoc() when the number of src is reduced and always return TRUE from fold_assoc() in this case, with the only actual difference being the return value from fold_alu_op3() then. I'm not sure what the return value actually should be in this case (or whether it even can make a difference).

    https://bugs.freedesktop.org/show_bug.cgi?id=106928

    Cc: [email protected]
    Reviewed-by: Dave Airlie <[email protected]>
* nv50/ir: fix Instruction::isActionEqual for PHI instructions | Karol Herbst | 2018-07-07 | 1 | -0/+6
    phi instructions don't have the same results by simply having the same sources. They need to be inside the same BasicBlock or share an equal condition resulting in a path through the shader selecting equal sources as well.

    short example:

        cond = ...;
        const0 = 0;
        const1 = 1;

        if (cond) {
           ssa_1 = const0;
        } else {
           ssa_2 = const1;
        }
        ssa_3 = phi ssa_1 ssa_2;

        if (!cond) {
           ssa_4 = const0;
        } else {
           ssa_5 = const1;
        }
        ssa_6 = phi ssa_4 ssa_5;

    Although both phis actually have sources with equal results, merging them would be wrong due to having a different condition selecting which source to take.

    For now we also stick an assert into GlobalCSE, because it should never end up having to merge phi instructions.

    Reviewed-by: Ilia Mirkin <[email protected]>
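The commit's example can be checked by evaluating both phis directly. The C functions below are a hypothetical model of the two phi nodes (the names `eval_ssa_3`, `eval_ssa_6` and `check_not_mergeable` are made up for illustration); each phi is modelled as the value that reaches it along the branch actually taken.

```c
#include <assert.h>
#include <stdbool.h>

/* ssa_3 = phi ssa_1 ssa_2, where the branch was taken on cond. */
static int eval_ssa_3(bool cond)
{
    const int const0 = 0, const1 = 1;
    return cond ? const0 : const1;
}

/* ssa_6 = phi ssa_4 ssa_5, where the branch was taken on !cond. */
static int eval_ssa_6(bool cond)
{
    const int const0 = 0, const1 = 1;
    return !cond ? const0 : const1;
}

/* Same sources, different selecting condition: the results never agree,
 * so treating the two phis as equal would be a miscompile. */
static void check_not_mergeable(void)
{
    assert(eval_ssa_3(true) != eval_ssa_6(true));
    assert(eval_ssa_3(false) != eval_ssa_6(false));
}
```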
* nvc0/ir: use the combined tid special register | Rhys Perry | 2018-07-07 | 9 | -0/+61
    total instructions in shared programs : 5804448 -> 5804690 (0.00%)
    total gprs used in shared programs    : 670065 -> 670065 (0.00%)
    total shared used in shared programs  : 548832 -> 548832 (0.00%)
    total local used in shared programs   : 21068 -> 21068 (0.00%)

                local  shared  gpr  inst  bytes
    helped      0      0       0    5     5
    hurt        0      0       0    191   191

    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Karol Herbst <[email protected]>
* python: Use the print function | Mathieu Bridon | 2018-07-06 | 2 | -28/+31
    In Python 2, `print` was a statement, but it became a function in Python 3.

    Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3.

    Signed-off-by: Mathieu Bridon <[email protected]>
    Acked-by: Eric Engestrom <[email protected]>
    Acked-by: Dylan Baker <[email protected]>
* v3d: Fix leak of the default attributes BOs. | Eric Anholt | 2018-07-05 | 1 | -1/+10
    The GLES3 CTS makes a lot more progress on a run now.
* v3d: Fix leak of the spill BO on context destruction. | Eric Anholt | 2018-07-05 | 1 | -0/+2
* v3d: Skip emitting per-RT blend state for RTs with blend disabled. | Eric Anholt | 2018-07-05 | 1 | -2/+8
    Cleans up the CL of fbo-drawbuffers2-blend a bit. We could do better on more complicated cases by noticing if multiple RTs have the same blend state and emitting them in a single packet.
* v3d: Add proper support for GL_EXT_draw_buffers2's blending enables. | Eric Anholt | 2018-07-05 | 4 | -25/+46
    I had flagged it as enabled on V3D 4.x, but not actually implemented the per-RT enables. Fixes piglit fbo_drawbuffers2-blend.
* r600: compare structure elements instead of doing a memcmp | Gert Wollny | 2018-07-05 | 1 | -1/+4
    Structures might be padded by the compiler and these padding bytes remain un-initialized, which in turn makes memcmp return a difference where from the logical point of view there is none.

    Fixes valgrind:
      Conditional jump or move depends on uninitialised value(s)
         at 0x4C32CBA: __memcmp_sse4_1 (vg_replace_strmem.c:1099)
         by 0xB8D2537: r600_set_vertex_buffers (r600_state_common.c:573)
         by 0xB71D44A: u_vbuf_set_driver_vertex_buffers (u_vbuf.c:1129)
         by 0xB71F7BB: u_vbuf_draw_vbo (u_vbuf.c:1153)
         by 0xB3B92CB: st_draw_vbo (st_draw.c:235)
         by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391)
         by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550)
         by 0x10A989: piglit_display (textureSize.c:157)
         by 0x4F8F174: run_test (piglit_fbo_framework.c:52)
         by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229)
         by 0x10A60A: main (textureSize.c:71)
      Uninitialised value was created by a stack allocation
         at 0xB3948FD: st_update_array (st_atom_array.c:388)

    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
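The padding problem can be reproduced with a small stand-alone sketch. `struct vb_desc` and `vb_desc_equal` are hypothetical and only resemble a vertex-buffer descriptor; the technique shown is the one the commit names, comparing members one by one instead of comparing raw bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor; common ABIs insert a padding byte after is_user. */
struct vb_desc {
    uint16_t    stride;
    bool        is_user;
    uint32_t    offset;
    const void *buffer;
};

/* Documents the assumption that this layout really contains padding. */
static_assert(sizeof(struct vb_desc) >
              sizeof(uint16_t) + sizeof(bool) + sizeof(uint32_t) + sizeof(void *),
              "expected padding bytes in struct vb_desc");

/* Member-wise comparison never reads the padding bytes. */
static bool vb_desc_equal(const struct vb_desc *a, const struct vb_desc *b)
{
    return a->stride == b->stride &&
           a->is_user == b->is_user &&
           a->offset == b->offset &&
           a->buffer == b->buffer;
}
```

Two descriptors filled in on the stack with identical members compare equal here even if their padding bytes differ, whereas a `memcmp` over `sizeof(struct vb_desc)` bytes would read those unspecified bytes.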
* r600: Add R4G4B4A4 and A1B5G5R5 to supported vertex formats | Gert Wollny | 2018-07-05 | 1 | -0/+15
    The tests below would fail with the error message "Vertex format (R4G4B4A4|R5G5B5A1) not supported." Add the formats to the translation routine to enable them.

    Fixes:
      dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_2d
      dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_cube
      dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_2d
      dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_cube
      dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_2d
      dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_cube
      dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_2d
      dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_cube
      dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_2d_array
      dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_3d
      dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_2d_array
      dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_3d
      dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_2d_array
      dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_3d
      dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_2d_array
      dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_3d

    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* r600: force LOD range to be only one value when mip.min filter is NONE | Gert Wollny | 2018-07-05 | 1 | -1/+9
    For a texture that has only one LOD defined, but for which GL_TEXTURE_MAX_LEVEL is the default (1000) and GL_TEXTURE_MIN_LOD != GL_TEXTURE_MAX_LOD, reading from the texture does not properly resolve the LOD level and the texture lookup might fail. Hence, when no mipmap filter is given (indicating that no mip-mapping takes place), force the LOD range to contain only one value.

    Fixes:
      dEQP-GLES3.functional.shaders.texture_functions.texture*.(i|u)sampler2d*
      dEQP-GLES3.functional.texture.format.sized.cube.rgb*
    out of VK_GL_CTS/android/cts/master/gles3-master.txt

    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* Shorten u_queue names | Marek Olšák | 2018-07-04 | 1 | -2/+2
    There is a 15-character limit for thread names shared by the queue name and process name. Shorten the thread name to make space for the process name.

    Reviewed-by: Timothy Arceri <[email protected]>
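How a 15-character budget might be shared between a process name and a queue name is sketched below. The `process:queueN` format and the helper `make_thread_name` are assumptions made for illustration and are not taken from u_queue; the sketch only shows that a shorter queue name leaves more room for the process name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define THREAD_NAME_MAX 15 /* visible characters; one more byte holds the NUL */

/* Builds "process:queueN", truncating the process name first (hypothetical). */
static void make_thread_name(char out[THREAD_NAME_MAX + 1],
                             const char *process, const char *queue, int index)
{
    char suffix[THREAD_NAME_MAX + 1];
    assert(out && process && queue);

    /* Queue name plus thread index; snprintf truncates if it is too long. */
    snprintf(suffix, sizeof(suffix), "%s%d", queue, index);

    int room = THREAD_NAME_MAX - (int)strlen(suffix) - 1; /* 1 for ':' */
    if (room <= 0) {
        strcpy(out, suffix); /* no space left for the process name */
        return;
    }
    snprintf(out, THREAD_NAME_MAX + 1, "%.*s:%s", room, process, suffix);
}
```

With a two-letter queue name, eleven characters remain for the process name; with a long queue name, the process name is dropped entirely.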
* ac: fold LLVMContext creation into ac_llvm_context_init | Marek Olšák | 2018-07-04 | 1 | -4/+1
    Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: reorder code in si_llvm_context_init | Marek Olšák | 2018-07-04 | 1 | -13/+13
    Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: use ac_compile_module_to_binary to reduce compile times | Marek Olšák | 2018-07-04 | 2 | -31/+4
    Compile times of simple shaders are reduced by ~20%. Compile times of prologs and epilogs are reduced by up to 40%.

    Reviewed-by: Dave Airlie <[email protected]>
* nvc0: implement multisampled images on Maxwell+ | Rhys Perry | 2018-07-04 | 6 | -39/+48
    Changes in v2:
    - make loadSuInfo32() protected without making the rest protected
    - move NVC0_SU_INFO_* into nv50_ir_lowering_nvc0.h instead of duplicating NVC0_SU_INFO_MS

    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Karol Herbst <[email protected]>
* r600/sb: cleanup if_conversion iterator to be legal C++ | Dave Airlie | 2018-07-04 | 1 | -7/+4
    The current code causes:

        /usr/include/c++/8/debug/safe_iterator.h:207: Error: attempt to copy from a singular iterator.

    This is due to the iterators getting invalidated. Fix the reverse iterator to use the return value from erase, and cast it properly.

    (used Mathias's suggestion)

    Cc: <[email protected]>
    Reviewed-by: Mathias Fröhlich <[email protected]>
* radeonsi: fix compiler breakage | Marek Olšák | 2018-07-04 | 1 | -0/+1
    Broken by d853d3a59bd5f8720a5b021bcd64a193d370b623.
* ac/radv: move llvm compiler info to struct and init in one place | Dave Airlie | 2018-07-04 | 1 | -1/+1
    This ports radv to the shared code. However, due to a bug in LLVM versions prior to 7, radv cannot add target info at this stage, as it would leak one for every shader compile. I'd prefer to keep this LLVM damage in the shared code, since it isn't the driver at fault here.

    We just add a flag to denote whether the driver can support leaking the target info or not, and the common code does the right thing depending on the LLVM version.

    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: port compiler init/destroy out of radeonsi. | Dave Airlie | 2018-07-04 | 1 | -25/+2
    We want to share this code with radv in the future, so port it out of radeonsi.

    Add a return value, as radv will want it to know whether this succeeds.

    Reviewed-by: Marek Olšák <[email protected]>
* radv/radeonsi: add a check ir tm options | Dave Airlie | 2018-07-04 | 1 | -2/+3
    This doesn't do much yet, but it makes it easier to move the code to a common shared code base.

    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: rename si_compiler -> ac_llvm_compiler | Dave Airlie | 2018-07-04 | 8 | -36/+30
    As precursor to moving init to common code, just rename the struct and move it.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: refactor out pass manager init to common code. | Dave Airlie | 2018-07-04 | 1 | -25/+2
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radv: split the non-common init_once code from the common target code. (v2) | Dave Airlie | 2018-07-04 | 1 | -0/+1
    This just splits out the non-shared code and reuses ac_get_llvm_target in radv.

    v2: rebase on Marek's patch
    - fixup brace position/whitespace

    Reviewed-by: Marek Olšák <[email protected]>
* v3d: Claim PIPE_CAP_TGSI_CAN_READ_OUTPUTS. | Eric Anholt | 2018-07-02 | 1 | -0/+1
    Fixes warning at screen creation. We store our outputs in normal temps and just emit them to shader I/O at the end, due to our I/O ordering requirements, so reading "outputs" in NIR is fine.
* ac: move all LLVM module initialization into ac_create_module | Marek Olšák | 2018-07-02 | 3 | -17/+4
    This removes some ugly code around module initialization.

    Reviewed-by: Dave Airlie <[email protected]>
* v3d: Emit a TF flush after each draw using TF. | Eric Anholt | 2018-07-02 | 1 | -0/+7
    This fixes GPU hangs on 7278 in transform feedback tests such as GTF-GLES3.gtf.GL3Tests.transform_feedback2.transform_feedback2_basic
* nv50/ir: handle clipvertex for geom and tess shaders as well | Karol Herbst | 2018-07-02 | 1 | -1/+6
    This will be needed for compatibility profiles.

    v2: handle tess shaders

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Karol Herbst <[email protected]>
* virgl: Add support for glGetMultisample | Gert Wollny | 2018-07-02 | 2 | -0/+39
    Use caps to obtain the multisample sample positions for up to 16 positions and implement the corresponding Gallium interface.

    This implementation (plus its counterpart in virglrenderer) assumes that the fixed sample positions are always the same for a given number of samples over the whole lifetime of a qemu session. It also assumes that sample series are only given for 2, 4, 8, and 16 samples, and that for intermediate numbers N of samples the next higher supported set from the above list is picked and the sample positions for the first N samples are returned accordingly.

    Fixes (when run on GL host):
      dEQP-GLES31.functional.texture.multisample.samples_1.sample_position
      dEQP-GLES31.functional.texture.multisample.samples_2.sample_position
      dEQP-GLES31.functional.texture.multisample.samples_3.sample_position
      dEQP-GLES31.functional.texture.multisample.samples_4.sample_position
      dEQP-GLES31.functional.texture.multisample.samples_8.sample_position
      dEQP-GLES31.functional.texture.multisample.samples_10.sample_position
      dEQP-GLES31.functional.texture.multisample.samples_12.sample_position
      dEQP-GLES31.functional.texture.multisample.samples_13.sample_position
      dEQP-GLES31.functional.texture.multisample.samples_16.sample_position

    v2: remove unrelated chunk (thanks Ilia Mirkin)
    v3: - also return positions for intermediate sample counts
        - fix unused variable warning
        - update description
    v4: explain better what this patch assumes and how it handles sample numbers that are not directly advertised (thanks go to Erik Faye-Lund for making me aware that this should be documented)

    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Erik Faye-Lund <[email protected]>
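The selection rule described above (pick the next higher supported set for an intermediate sample count) can be sketched as an if-else chain with inclusive upper bounds. `sample_set_for` is a hypothetical helper, not the virgl function, and the handling of a single sample is an assumption.

```c
#include <assert.h>

/* Returns which supported sample set (2, 4, 8 or 16) serves a request for
 * n samples, 1 for a single sample, or 0 if n exceeds the supported range.
 * Every upper bound is inclusive: n == 4 must select the 4-sample set. */
static int sample_set_for(int n)
{
    assert(n >= 1);
    if (n == 1)
        return 1;
    if (n <= 2)
        return 2;
    if (n <= 4)
        return 4;
    if (n <= 8)
        return 8;
    if (n <= 16)
        return 16;
    return 0;
}
```

A request for 3 samples maps to the 4-sample set, and requests for 10, 12 or 13 samples map to the 16-sample set, from which the first N positions would then be returned.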