summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nir, glsl: move pixel_center_integer/origin_upper_left to shader_info.fsAlejandro Piñeiro2019-02-2124-81/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On GLSL that info is set as a layout qualifier when redeclaring gl_FragCoord, so somehow tied to a specific variable. But in practice, they behave as a global of the shader. On ARB programs they are set using a global OPTION (defined at ARB_fragment_coord_conventions), and on SPIR-V using ExecutionModes, that are also not tied specifically to the builtin. This patch moves that info from nir variable and ir variable to nir shader and gl_program shader_info respectively, so the map is more similar to SPIR-V, and ARB programs, instead of more similar to GLSL. FWIW, shader_info.fs already had pixel_center_integer, so this change also removes some redundancy. Also, as struct gl_program also includes a shader_info, we removed gl_program::OriginUpperLeft and PixelCenterInteger, as it would be superfluous. This change was needed because recently spirv_to_nir changed the order in which execution modes and variables are handled, so the variables didn't get the correct values. Now the info is set on the shader itself, and we don't need to go back to the builtin variable to set it. Fixes: e68871f6a ("spirv: Handle constants and types before execution modes") v2: (Jason) * glsl_to_nir: get the info before glsl_to_nir, while all the rest of the info gathering is happening * prog_to_nir: gather the info on a general info-gathering pass, not on variable setup. v3: (Jason) * Squash with the patch that removes that info from ir variable * anv: assert that OriginUpperLeft is true. It should be already set by spirv_to_nir. * blorp: set origin_upper_left on its core "compile fragment shader", not just on some specific places (for this we added an helper on a previous patch). * prog_to_nir: no need to gather specifically this fragcoord modes as the full gl_program shader_info is copied. * spirv_to_nir: assert that we are a fragment shader when handling this execution modes. v4: (reported by failing gitlab pipeline #18750) * state_tracker: update too due changes on ir.h/gl_program v5: * blorp: minor change after change on previous patch * radeonsi: update due this change. v6: (Timothy Arceri) * prog_to_nir: remove extra whitespace * shader_info: don't use :1 on origin_upper_left * glsl: program.fs.origin_upper_left/pixel_center_integer can be move out of the shader list loop
* blorp: introduce helper method blorp_nir_init_shaderAlejandro Piñeiro2019-02-213-7/+16
| | | | | | | | This initializes the nir shader that will be used by blorp. Right now it doesn't do too much beyond calling nir_builder_init_simple_shader, and setting a name. More stuff will be added on following patches. v2: there is a case were it is used a VERTEX_SHADER (Alejandro)
* panfrost: Verify and print brx condition in disasmAlyssa Rosenzweig2019-02-211-1/+10
| | | | | | | | | | | | The condition code in extended branches is repeated 8 times for unclear reasons; accordingly, the code would be disassembled as "unknown5555", "unknownAAAA", etc. This patch correctly masks off the lower two bits to find the true code to print, verifying that the code is repeated as believed to be necessary (providing some assurance for compiler quality and an assert trip in case we encounter a shader in the wild that breaks the convention). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Dynamically set discard branch targetsAlyssa Rosenzweig2019-02-211-20/+46
| | | | | | | | | | | | | discard and discard_if are both implemented with the branching pipeline on Midgard; essentially, we branch to the end of the fragment shader in a special "discard" mode, setting the condition as necessary. Previously, we hardcoded the form of this instruction, which worked for very simple shaders but was incorrect for anything remotely interesting. This patch instead emits logical branches in the IR, which are flattened to real discard ops the same way other branches are, allowing targets to be computed correctly. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Emit extended branchesAlyssa Rosenzweig2019-02-212-17/+84
| | | | | | | | | | | | | | | Previously, we only emitted compact branches; however, the offset range of these branches is too small for many real world shaders. This patch implements support for emitting extended branches and switches to always using them for control flow. This incurs a code size and possibly performance penalty, but expands the range of working shaders and provides opportunity for further optimization. Support for emitting compact branches is retained but this code path is presently unused. In the future, we'll want to heuristically determine which type of branch should be emitted for optimal codegen. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Rectify doubleplusungood extended branchAlyssa Rosenzweig2019-02-212-8/+8
| | | | | | | | | | | | | | Midgard features "compact branches" and "extended branches", i.e. corresponds to short jumps and far jumps. The form of the extended branch was previously incorrect in the ISA headers; this patch corrects it and updates the disassembler (simultaneous to preserve bisectability). Additionally, we fix some a corner case in the disassembly of extended branches, and we now prefix extended branches with "brx", to visually differentiate from compact branches prefixed with "br". Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix nested/chained if-elseAlyssa Rosenzweig2019-02-211-3/+3
| | | | | | | | | | | | An if-else statement is compiled to a conditional branch (from the start to the second block) and an unconditional branch (from the end of the first block to the end of the else). We previously incorrectly computed the block index of the unconditional branch to be exactly one after that of the conditional branch, valid for a single if-else statement but nothing fancier. This patch correctly computes the unconditional branch target, fixing more complex if-else chains. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Refactor tag lookahead codeAlyssa Rosenzweig2019-02-211-27/+31
| | | | | | | | | | | | Each Midgard instruction is scheduled to a particular instruction type ("tag"). Presumably the hardware prefetches memory based on tag, so it is required to report out the first tag to the command stream and the next tag of a branch target. This procedure was implemented in two separate parts of the compiler (one time with a slight bug relating to empty blocks); this patch refactors to unite the two routines and solve the bug when branching to empty blocks. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement pantrace (command stream dump)Alyssa Rosenzweig2019-02-215-0/+188
| | | | | | | | | | Historically, Panfrost debugging entailed the use of the LD_PRELOADable `panwrap` tool. This setup is a tad fragile; Panfrost can be traced directly without the intermediate layer. pantrace implements the quivalent functionality of panwrap into Panfrost proper, allowing dumps to work regardless of the kernel layer in use. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add pandecode (command stream debugger)Alyssa Rosenzweig2019-02-216-3/+2289
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The `panwrap` utility can be LD_PRELOAD'd into a GLES app, intercepting communication between the driver and the kernel. Modern panwrap versions do no processing of their own; instead, they create a trace directory. This directory contains the following files: - control.log: a line-by-line plain text file, denoting important syscalls (mmaps and job submits) along with their arguments - memory_*.bin, shader_*.bin: binary dumps of mapped memory Together, these files contain enough information to reconstruct the command stream and shaders of (at minimum) a single frame. The `pandecode` utility takes this directory structure as input, reconstructing the mapped memory and using the job submit command as an entrypoint. It then walks the descriptors as the hardware would, parsing and pretty-printing. Its final output is the pretty-printed command stream interleaved with the disassembled shaders, suitable for driver debugging. For instance, the behaviour of two driver versions (one working, one broken) can be compared by diff'ing their decoded logs. pandecode/decode.c was originally a part of `panwrap`; it is the oldest living code in the project. Its history is generally not worth preserving. panwrap itself will continue to live downstream for the foreseeable future, as it is specifically written for the vendor kernel. It is possible, however, to produce equivalent traces directly from Panfrost, bypassing the intermediate wrapping layer for well-behaved drivers. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stub out separate stencil functionsAlyssa Rosenzweig2019-02-212-4/+25
| | | | | | | This is not yet functional, but it resolves a crash in various apps and provides a framework for further work. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* radeonsi: use SDMA for uploading data through const_uploaderMarek Olšák2019-02-205-28/+143
| | | | | | | | v2: use tc.stream_uploader in si buffer_transfer_map if not called from the driver thread Reviewed-by: Nicolai Hähnle <[email protected]> (v1) Tested-by: Dieter Nützel <[email protected]>
* gallium/u_upload_mgr: allow use of FLUSH_EXPLICIT with persistent mappingsMarek Olšák2019-02-202-6/+31
| | | | | | | for radeonsi Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium/u_threaded: always unmap const_uploaderMarek Olšák2019-02-201-0/+1
| | | | | | | | radeonsi will require this. It's a no-op for drivers supporting persistent mappings. Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* st/mesa: always unmap the uploader in st_atom_array.cMarek Olšák2019-02-201-8/+6
| | | | | | | This is a no-op for drivers supporting persistent mappings. Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* nir/xfb: Handle compact arrays in gather_xfb_infoJason Ekstrand2019-02-211-11/+22
| | | | | | | This makes us properly handle gl_ClipDistance and gl_CullDistance. Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info" Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir/xfb: Work in terms of components rather than slotsJason Ekstrand2019-02-211-5/+5
| | | | | | | | | | | | | | | | | | We needed to better handle cases where a chunk of a variable starts at some non-zero location_frac and rolls over into the next slot but may not be more than 4 dwords. For example, if gl_CullDistance is an array of 3 things and has location_frac = 2, it will span across two vec4s but is not, itself, bigger than a vec4. If you ignore the clip/cull special case, it's not allowed to happen for anything else because the only things that can span more than one slot is dvec3 and dvec4 and they're both bigger than a vec4. The current code uses this attrib_slot thing where we count attribute slots and iterate over them. However, that doesn't work in the case above because gl_CullDistance will have an attrib_slot count of 1 even though it does span two slots. We could fix this by adjusting attrib_slot but we already have comp_mask and it's easier to just handle it that way. Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: Rewrite lower_clip_cull_distance_arrays to do a lot less loweringJason Ekstrand2019-02-211-113/+19
| | | | | | | | | | Instead of going to all the work of to combine them into one array, just make two arrays and use location_frac to colocate them within CLIP0. Then the back-end can sort things out and stack them on top of each other. Thanks to ef99f4c8, we also don't need to set compact anymore. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/xfb: Properly align 64-bit valuesJason Ekstrand2019-02-211-0/+4
| | | | | Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info" Reviewed-by: Alejandro Piñeiro <[email protected]>
* compiler/types: Add a contains_64bit helperJason Ekstrand2019-02-214-0/+29
| | | | Reviewed-by: Alejandro Piñeiro <[email protected]>
* freedreno/a6xx: samplerBuffer fixesRob Clark2019-02-203-8/+17
| | | | | | | | | | | | | | Use the 'UNK31' bit (which should probably be called 'BUFFER') for samplerBuffer case, which increases the size of supported buffer texture beyond 2^15 elements. Also need to fix the 2nd coord injected to handle the tex instructions that take integer coords. Fixes dEQP-GLES31.functional.texture.texture_buffer.render.as_fragment_texture.buffer_size_131071 and similar Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/a6xx: use ldib for ssbo readsRob Clark2019-02-201-24/+10
| | | | | | | | | | | | | ... instead of isam. It seems like when using isam, plus atomics, we can have the problem of old data being in the texture cache. Plus this way we don't have to load a component at a time. Note that blob still seems to use isam in some cases. I suppose it might be preferable in the case of loading a single component, when atomics are not in the picture (or that the ssbo does not need to otherwise be coherent). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: sync instr/disasm and add ldib encodingRob Clark2019-02-204-14/+42
| | | | | | | | | | Resync disasm and instr header from envytools, and add ldib encoding. This replaces an opcode from a3xx which was never seen in practice, since that seemed easier than dealing with the same opc # meaning a different thing on a6xx. (Not really sure if 'sti' was actually a real thing, I think it was only seen in fuzzing.) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/a6xx: fix load_ssbo barrier type.Rob Clark2019-02-201-2/+2
| | | | | | | Silly copy/pasta bug, since load_image is actually the same instruction but different barrier class. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rename put_dst()Rob Clark2019-02-203-9/+9
| | | | | | | This was overlooked when it moved to ir3_context.c and ceased to be static.. Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix crash w/ masked non-SSA dstRob Clark2019-02-201-0/+2
| | | | | | | | | Fixes dEQP-GLES3.functional.shaders.indexing.varying_array.vec3_dynamic_write_dynamic_loop_read regression. Fixes: c1a27ba9baf freedreno/ir3: HIGH reg w/a for a6xx Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: 3d and cube image fixesRob Clark2019-02-201-4/+31
| | | | | | | Fixes dEQP-GLES31.functional.image_load_store.{3d,cube}.store.* and a bunch more Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix crash in compile fail caseRob Clark2019-02-203-6/+9
| | | | | | | The variant will be NULL if RA failed. Which isn't ideal, but at least lets not segfault and bring down the rest of the dEQP run with us. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix legalize for vecN inputsRob Clark2019-02-202-0/+3
| | | | | | | | | The wrmask is handled in regmask_get()/regmask_set(), but it wasn't being propagated from SSA src to dst. So for example, an SSBO read value that is passed in as src2.y component to atomic op, wasn't getting the (sy) flag set. Causing lots of fail. Signed-off-by: Rob Clark <[email protected]>
* radv: Disable depth clamping even without EXT_depth_range_unrestricted.Bas Nieuwenhuizen2019-02-201-2/+1
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Implement VK_EXT_depth_clip_enable.Bas Nieuwenhuizen2019-02-203-2/+16
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: remove non-ssa support from nir_copy_prop()Timothy Arceri2019-02-211-36/+5
| | | | | | | | | Even in a very basic shader this reduces the time spent in nir_copy_prop() by ~17%. No shader-db changes for radeonsi NIR or i965. Reviewed-by: Jason Ekstrand <[email protected]>
* radv: Handle clip+cull distances more generally as compact arrays.Bas Nieuwenhuizen2019-02-204-99/+83
| | | | | | | | | | | | Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 . That MR keeps the clip and cull arrays split. So we have to handle - compact arrays with location_frac != 0 - VARYING_SLOT_CLIP_DIST1 Reviewed-by: Samuel Pitoiset <[email protected]>
* kmsro: Add the rest of the current set of tinydrm drivers.Eric Anholt2019-02-204-7/+33
| | | | | | | | | | | While I haven't tested them all, given that they're all using the same allocation paths and modifiers in the kernel they should be fine to use in the same way. v2: Rebase on other kmsro changes. v3: Skip repeated '[with_gallium_kmsro,' in the meson build. Acked-by: Alyssa Rosenzweig <[email protected]>
* i965: re-emit index buffer state on a reset option change.Andrii Simiklit2019-02-203-1/+13
| | | | | | | | | | | | | | Seems like we forget to update the index buffer (ib) status and IndexedDrawCutIndexEnable or CutIndexEnable flag is left unchanged it leads to ignoring of glEnable/glDisable functions for GL_PRIMITIVE_RESTART in some cases. The index buffer (ib) status should be re-emmited after the reset option change to avoid some unexpected behavior. Reviewed-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109451 Cc: <[email protected]> Signed-off-by: Andrii Simiklit <[email protected]> Signed-off-by: Andrii Simiklit <[email protected]>
* nir: Don't forget if-uses in new nir_opt_dead_cf liveness checkKenneth Graunke2019-02-201-0/+10
| | | | | | | | | | | | | | | | Commit 08bfd710a25c14df5f690cce9604617536d7c560. (nir/dead_cf: Stop relying on liveness analysis) introduced a new check that iterated through a SSA def's uses, to see if it's used. But it only checked normal uses, and not uses which are part of an 'if' condition. This led to it thinking more nodes were dead than possible. Fixes Piglit's variable-indexing/tcs-output-array-float-index-wr test (and related tests) with the out-of-tree Iris driver. Fixes: 08bfd710a25 nir/dead_cf: Stop relying on liveness analysis Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* freedreno/a6xx: Support MSAA resolve blits on blitterKristian H. Kristensen2019-02-201-3/+23
| | | | | | | | | | | | | | | This gets stencil and depth resolves working properly. Fixes: dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8 dEQP-GLES3.functional.fbo.msaa.2_samples.depth24_stencil8 dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8 dEQP-GLES3.functional.fbo.msaa.4_samples.depth24_stencil8 dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_color dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_color Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Copy stencil as R8_UINTKristian H. Kristensen2019-02-201-4/+14
| | | | | | | | | | | Blitter does support it after all. Previous attempt to use R8_UINT failed because we overwrote the a6xx format in emit_blit_texture(), but some of the later setup still looked at the gallium format. If we overwrite it in the pipe_blit_info before we even call into emit_blit_texture() it works properly. Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno: Update headersKristian H. Kristensen2019-02-207-7/+13
| | | | | | Add support for multisampled sources for the blitter. Signed-off-by: Kristian H. Kristensen <[email protected]>
* anv: use anv_shader_bin_write_to_blob()'s return valueEric Engestrom2019-02-201-3/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: drop unused importsEric Engestrom2019-02-201-2/+0
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: make sure the extensions stay sortedEric Engestrom2019-02-201-0/+20
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: sort vendors extensions after KHR and EXTEric Engestrom2019-02-201-2/+2
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: sort extensions alphabeticallyEric Engestrom2019-02-201-1/+1
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: anv: refactor error handling in anv_shader_bin_write_to_blob()Tapani Pälli2019-02-201-28/+26
| | | | | | | | | | v2: blob manages error state internally, just return true if errors did not occur (Jason) CID: 1442546 Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* wayland/egl: Ensure EGL surface is resized on DRI update_buffers()Carlos Garnacho2019-02-201-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fullscreening and unfullscreening a totem window while playing a video sometimes results in the video subsurface not changing size along. This is also reproducible with epiphany. If a surface gets resized while we have an active back buffer for it, the resized dimensions won't get neither immediately applied on the resize callback, nor correctly synchronized on update_buffers(), as the (now stale) surface size and currently attached buffer size still do match. There's actually 2 things to synchronize here, first the surface query size might not be updated yet to the wl_egl_window's (i.e. resize_callback happened while there is a back buffer), and second the wayland buffers would need dropping if new surface size differs with the currently attached buffer. These are done in separate steps now. https://bugzilla.redhat.com/show_bug.cgi?id=1650929 https://bugs.freedesktop.org/show_bug.cgi?id=109594 Fixes: a9fb331ea7d ("wayland/egl: update surface size on window resize") Signed-off-by: Carlos Garnacho <[email protected]> Reviewed-by: Juan A. Suarez <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Tested-by: Bastien Nocera <[email protected]> Tested-by: Denys Kostin <[email protected]>
* anv: implement VK_EXT_depth_clip_enableLionel Landwerlin2019-02-205-4/+23
| | | | | | | A new extension allowing the user to explictly specify the clipping behavior. Signed-off-by: Lionel Landwerlin <[email protected]>
* vulkan: Update the XML and headers to 1.1.101Lionel Landwerlin2019-02-202-34/+304
|
* isl: remove the cache line size alignment requirementSamuel Iglesias Gonsálvez2019-02-201-14/+0
| | | | | | | | | | | | The cacheline size was a requirement for using the BLT engine, which we don't use anymore except for a few things on old HW, so we drop it. Fixes CTS's CL#3500 test: dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r8g8b8_unorm Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radv: Clean up a bunch of compiler warnings.Bas Nieuwenhuizen2019-02-203-7/+0
| | | | | | Random unused vars. Reviewed-by: Timothy Arceri <[email protected]>