path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* etnaviv: untabify | Guido Günther | 2019-06-05 | 2 | -4/+4
  Two driver files had tabs mixed with spaces. Remove the tabs.

  Signed-off-by: Guido Günther <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* panfrost: bifrost: Fix format string in disassembler | Tomeu Vizoso | 2019-06-05 | 1 | -1/+1
  The compiler configuration was hardened to fail on format warnings and things stopped building.

  Fixes: c9c1e2610647 ("mesa: prevent common string formatting security issues")
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-By: Ryan Houdek <[email protected]>
* iris: Free the buffer when reading from the disk cache. | Kenneth Graunke | 2019-06-04 | 1 | -3/+8
* panfrost/midgard: Don't promote non-SSA to pipeline registers | Alyssa Rosenzweig | 2019-06-05 | 1 | -1/+3
  Fixes: 33800f4612 ("panfrost/midgard: Implement "pipeline register" prepass")
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* freedreno: Drop invalid scissor optimization. | Eric Anholt | 2019-06-04 | 1 | -7/+0
  We do support TF now, so it's no longer valid. Besides, if we want this optimization, we should probably have mesa/st doing it right for everyone.

  Reviewed-by: Rob Clark <[email protected]>
* virgl: resolve to correct level during texture read | Chia-I Wu | 2019-06-04 | 1 | -2/+2
  When PIPE_TRANSFER_READ requires a resolve, we blit from the host storage to a temporary storage, and do a format conversion from the temporary storage to the guest storage. This change makes sure we convert to the correct level of the guest storage.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: fix texture resolving with compressed formats | Chia-I Wu | 2019-06-04 | 1 | -12/+17
  util_format_translate_3d expects the source box to be aligned to the block size. When resolving, make sure the size of the staging buffer is aligned to the block size.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Alexandros Frantzis <[email protected]>
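The block-size alignment described in the virgl commit above can be sketched as follows. This is a minimal illustration, not the virgl code: `align_to_block` and `staging_size` are hypothetical helpers, and the block dimensions are passed in explicitly instead of being queried from a format table.

```c
#include <assert.h>

/* Hypothetical helper: round value up to the next multiple of block. */
unsigned align_to_block(unsigned value, unsigned block)
{
    assert(block > 0);
    return (value + block - 1) / block * block;
}

/* Hypothetical helper: bytes needed to stage a width x height region of a
 * compressed format whose blocks cover bw x bh texels and occupy
 * block_bytes bytes each. Partial blocks at the edges count as whole. */
unsigned staging_size(unsigned width, unsigned height,
                      unsigned bw, unsigned bh, unsigned block_bytes)
{
    unsigned aligned_w = align_to_block(width, bw);
    unsigned aligned_h = align_to_block(height, bh);
    return (aligned_w / bw) * (aligned_h / bh) * block_bytes;
}
```

For a 5x5 region of a format with 4x4 blocks, both dimensions round up to 8, so four whole blocks are staged instead of a fractional number.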
* freedreno: Add printf pattern string. | Bas Nieuwenhuizen | 2019-06-04 | 1 | -1/+1
  A new compiler flag setting disallows calling printf without a pattern string because it is a security risk.

  Fixes: c9c1e261064 "mesa: prevent common string formatting security issues"
  Reviewed-by: Rob Clark <[email protected]>
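The kind of fix made in the two format-string commits above can be sketched like this. It is an illustration only: `format_message` is a hypothetical helper, not a function from the Mesa tree, and the exact compiler flags Mesa enabled are not shown in this log (GCC and Clang report this pattern under `-Wformat-security`).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy an arbitrary message into buf through a
 * constant pattern string and return the number of characters written. */
int format_message(char *buf, size_t size, const char *msg)
{
    assert(buf != NULL && size > 0);
    /* The rejected form would be snprintf(buf, size, msg): any '%' inside
     * msg would then be interpreted as a conversion specifier. Passing the
     * message as an argument to "%s" keeps it inert. */
    return snprintf(buf, size, "%s", msg);
}
```

With the constant pattern, a message such as "100%s done" is copied verbatim rather than making the formatter read a nonexistent argument.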
* panfrost/midgard: .pos propagation | Alyssa Rosenzweig | 2019-06-04 | 1 | -8/+72
  A previous optimization converts fmax(x, 0.0) instructions to fmov.pos. This pass then propagates the .pos from the move up to the source instruction (when possible). From there, copy propagation will eliminate the move.

  In the future, we might prefer to do this in common NIR code like we do for saturate, as Bifrost can also benefit.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
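A toy version of the .pos propagation described above might look like the following. Everything here is an assumption made for illustration: the `toy_ins` layout, the `uses` counter, and the "single reader" condition standing in for the commit's "when possible" are not taken from the Midgard compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_op { TOY_FADD, TOY_FMOV };

/* Hypothetical, heavily simplified instruction record. */
struct toy_ins {
    enum toy_op op;
    int dest;  /* SSA index written by this instruction */
    int src;   /* SSA index read (only meaningful for TOY_FMOV here) */
    bool pos;  /* output modifier: clamp the result to >= 0 */
    int uses;  /* number of instructions reading dest */
};

/* Move .pos from each fmov.pos onto the instruction producing its source,
 * provided the move is that value's only reader. Returns the number of
 * moves rewritten; those become plain moves that copy propagation could
 * later remove. */
int toy_propagate_pos(struct toy_ins *ins, size_t count)
{
    assert(ins != NULL);
    int changed = 0;
    for (size_t i = 0; i < count; i++) {
        struct toy_ins *mov = &ins[i];
        if (mov->op != TOY_FMOV || !mov->pos)
            continue;
        for (size_t j = 0; j < i; j++) {
            struct toy_ins *def = &ins[j];
            if (def->dest != mov->src)
                continue;
            if (def->uses == 1) {
                def->pos = true;   /* clamp at the producer instead */
                mov->pos = false;  /* the move no longer clamps */
                changed++;
            }
            break;
        }
    }
    return changed;
}
```

If the producer had a second reader, clamping at the producer would change what that reader sees, so the sketch leaves such moves untouched.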
* panfrost/midgard: Cleanup copy propagation | Alyssa Rosenzweig | 2019-06-04 | 1 | -11/+4
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Implement "pipeline register" prepass | Alyssa Rosenzweig | 2019-06-04 | 4 | -2/+96
  This prepass, run after scheduling but before RA, specializes to pipeline registers where possible. It walks the IR, checking whether sources are ever used outside of the immediate bundle in which they are written. If they are not, they are rewritten to a pipeline register (r24 or r25), valid only within the bundle itself.

  This has theoretical benefits for power consumption and register pressure (and performance by extension). While this is tested to work, it's not clear how much of a win it really is, especially without an out-of-order scheduler (yet!).

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
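The core check of the prepass described above (is a value ever read outside the bundle that writes it?) can be sketched as below. The `sched_ins` record and the function name are hypothetical, and the sketch assumes SSA form, where each index is written exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scheduled-instruction record: which bundle it landed in,
 * the SSA index it writes, and up to two SSA indices it reads (-1 = none). */
struct sched_ins {
    int bundle;
    int dest;
    int src[2];
};

/* Returns true if the value written by ins[def] is read by any instruction
 * in a different bundle. Only values for which this is false are candidates
 * for a pipeline register (r24 or r25 in the commit message). */
bool used_outside_bundle(const struct sched_ins *ins, size_t count, size_t def)
{
    assert(ins != NULL && def < count);
    for (size_t i = 0; i < count; i++) {
        if (ins[i].bundle == ins[def].bundle)
            continue;
        for (int s = 0; s < 2; s++) {
            if (ins[i].src[s] == ins[def].dest)
                return true;
        }
    }
    return false;
}
```

A real pass would also have to respect that only two pipeline registers exist per bundle; this sketch covers the liveness question only.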
* panfrost/midgard: Helpers for pipeline | Alyssa Rosenzweig | 2019-06-04 | 5 | -9/+79
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Refactor schedule/emit pipeline | Alyssa Rosenzweig | 2019-06-04 | 6 | -707/+744
  First, this moves the scheduler and emitter out of midgard_compile.c into their own dedicated files.

  More interestingly, this slims down midgard_bundle to be essentially an array of _pointers_ to midgard_instructions (plus some bundling metadata), rather than the instructions and packing themselves. The difference is critical, as it means that (within reason, i.e. as long as it doesn't affect the schedule) midgard_instructions can now be modified _after_ scheduling while having changes updated in the final binary.

  On a more philosophical level, this removes an IR. Previously, the IR before scheduling (MIR) was separate from the IR after scheduling (post-schedule MIR), requiring a separate set of utilities to traverse, using different idioms. There was no good reason for this, and it restricts our flexibility with the RA. So unify all the things!

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Cleanup RA (stylistic changes) | Alyssa Rosenzweig | 2019-06-04 | 1 | -16/+30
  Trivial.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Share MIR utilities | Alyssa Rosenzweig | 2019-06-04 | 2 | -40/+46
  These are more generally useful than the files they were constrained to.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Misc. cleanup for readibility | Alyssa Rosenzweig | 2019-06-04 | 2 | -15/+35
  Mostly, this fixes a number of instances of lines >> 80 chars, refactoring them into something legible.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Extend RA to non-vec4 sources | Alyssa Rosenzweig | 2019-06-04 | 1 | -77/+278
  This represents a major break with the former RA design. We now use conflicting register classes to represent the subdivision of Midgard's 128-bit registers into varying sizes and arrangement. We determine class based on the number of components in the instructions' masks.

  To support this, we include a number of helpers in the RA to allow composing swizzles and masks, such that MIR written implicitly assuming .xyzw sources can be transformed to use actual (non-aligned) sources.

  The net result is a marked decrease in register pressure on non-vec4-exclusive shaders. We could still be doing much better. Not implemented yet are:

  - Register spilling
  - Per-component liveness

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Set masks on ld_vary | Alyssa Rosenzweig | 2019-06-04 | 1 | -1/+3
  These masks distinguish scalar/vec2/vec3 loads from the default vec4, which helps with assembly readability (since it's immediately obvious how many components are _actually_ affected, rather than doing mysterious things to an unknown number of unused components).

  Later in the series, this will enable smarter register allocation, as the unused components will not be interpreted abnormally.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Fix liveness analysis bugs | Alyssa Rosenzweig | 2019-06-04 | 1 | -2/+8
  This fixes liveness analysis with respect to inline constants and branching. In practice, the symptom is abnormally high register pressure.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Set int outmod for "pasted" code | Alyssa Rosenzweig | 2019-06-04 | 1 | -0/+4
  These snippets of integer assembly are injected for various purposes. Eventually, we'll want to implement these in NIR directly. Regardless, the "default" output modifier is different between floats and ints, so let's set the right one.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Hoist some utility functions | Alyssa Rosenzweig | 2019-06-04 | 3 | -64/+71
  These were static to midgard_compile.c but are more generally useful across the compiler.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Remove pinning | Alyssa Rosenzweig | 2019-06-04 | 2 | -27/+2
  This mechanism is only used by blend shaders, so just use a move here. Ideally, it'll be copy-propped and DCE'd away; this removes a source of considerable indirection and will simplify RA logic.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ryan Houdek <[email protected]>
* radeonsi/nir: Fix type in bindless address computation | Connor Abbott | 2019-06-04 | 1 | -2/+2
  Bindless handles in GL are 64-bit. This fixes an assert failure in LLVM.

  Reviewed-by: Marek Olšák <[email protected]>
* etnaviv: implement set_active_query_state(..) for hw queries | Christian Gmeiner | 2019-06-04 | 1 | -1/+10
  Clear w/ quad uses a normal draw which adds up to OQ. st/meta uses set_active_query_state(..) to tell the driver to pause queries in such cases.

  Fixes spec@arb_occlusion_query@occlusion_query_meta_save piglit.

  Signed-off-by: Christian Gmeiner <[email protected]>
* iris: Fix SO stride units for DrawTransformFeedback | Kenneth Graunke | 2019-06-03 | 2 | -2/+2
  Mesa measures in DWords. The hardware also claims to measure in DWords. Except the SO_WRITE_OFFSET field is actually bits 31:2, with 1:0 MBZ. Which means that it really measures in bytes. So, convert to bytes.

  Without this, our offset / stride denominator was 1/4th the size it should be, leading to 4x the vertex count that we should have had.

  Fixes GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_two_buffers
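The unit mismatch in the iris commit above comes down to one multiplication. As a sketch (the function name and parameters are hypothetical, not iris code): the write offset is in bytes, the stride is tracked in DWords, so the stride must be scaled by 4 before dividing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of vertices captured so far, given a write
 * offset in bytes and a per-vertex stride tracked in DWords. */
uint32_t so_vertex_count(uint32_t write_offset_bytes, uint32_t stride_dwords)
{
    assert(stride_dwords > 0);
    uint32_t stride_bytes = stride_dwords * 4; /* 1 DWord = 4 bytes */
    return write_offset_bytes / stride_bytes;
}
```

With a 3-DWord stride and 96 bytes written, the correct count is 96 / 12 = 8; dividing by the unconverted stride gives 96 / 3 = 32, the 4x overcount the commit message describes.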
* amd/common: use generated register header | Nicolai Hähnle | 2019-06-03 | 8 | -9/+6
* amd/common: use SH{0,1}_CU_EN definitions only of COMPUTE_STATIC_THREAD_MGMT_SE0 | Nicolai Hähnle | 2019-06-03 | 1 | -5/+5
  The automatic header generation unifies identical registers in a series and only emits definitions for the first one. This is mostly to avoid emitting excessive definitions for CB registers, but special-casing an exception for this family of registers doesn't seem worth it.
* amd/common: unify PITCH_GFX6 and PITCH_GFX9 | Nicolai Hähnle | 2019-06-03 | 2 | -7/+7
  The definition of the fields differs, but PITCH_GFX9 is a mere extension of PITCH_GFX6 that does not conflict with any other fields.

  This aligns the definitions with what will be generated from the register JSON. The information about how large the fields really are is preserved in the register database.
* amd/common: cleanup DATA_FORMAT/NUM_FORMAT field names | Nicolai Hähnle | 2019-06-03 | 2 | -8/+8
  The field layout wasn't actually changed in gfx9, so having the suffix isn't very useful. The field *contents* were changed, but this is reflected in the V_xxx_xxx definitions and is taken into account by the ac_debug logic based on the register JSON.

  This aligns the definitions with what will be generated from the register JSON.
* iris: Always reserve binding table space for NIR constants | Caio Marcelo de Oliveira Filho | 2019-06-03 | 2 | -9/+14
  Don't have a separate mechanism for NIR constants to be removed from the table. If unused, we will compact it away. The use_null_surface is needed when INTEL_DISABLE_COMPACT_BINDING_TABLE is set.

  Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Print binding tables when INTEL_DEBUG=bt | Caio Marcelo de Oliveira Filho | 2019-06-03 | 1 | -0/+53
  Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Compact binding tables | Caio Marcelo de Oliveira Filho | 2019-06-03 | 3 | -76/+234
  Change the iris_binding_table to keep track of what surfaces are actually going to be used, then assign binding table indices just for those. Reducing unused bytes in those tables is valuable because Iris uses a reduced space for them.

  The rest of the driver can go from "group indices" (i.e. UBO #2) to BTI and vice-versa using helper functions. The value IRIS_SURFACE_NOT_USED is returned to indicate a certain group index is not used or a certain BTI is not valid.

  The environment variable INTEL_DISABLE_COMPACT_BINDING_TABLE can be set to skip compacting the binding table.

  v2: (all from Ken) Use BITFIELD64_MASK helper. Improve comments. Assert all groups are marked as used when we have indirects.

  Reviewed-by: Kenneth Graunke <[email protected]>
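The compaction idea in the iris commit above can be sketched with a single 64-bit used mask. This is a simplification made for illustration: the function names, the sentinel value, and the single-group layout are assumptions, whereas the real driver tracks several surface groups.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel standing in for IRIS_SURFACE_NOT_USED. */
#define TOY_SURFACE_NOT_USED 0xffffffffu

/* Map a group index to its compacted binding table index (BTI): the number
 * of used entries that come before it. Unused entries get no slot. */
uint32_t toy_group_index_to_bti(uint64_t used_mask, unsigned index)
{
    assert(index < 64);
    uint64_t bit = UINT64_C(1) << index;
    if (!(used_mask & bit))
        return TOY_SURFACE_NOT_USED;
    uint64_t below = used_mask & (bit - 1);
    uint32_t bti = 0;
    while (below) {          /* count set bits below index */
        below &= below - 1;
        bti++;
    }
    return bti;
}

/* Inverse mapping: find the group index that owns a given BTI. */
uint32_t toy_bti_to_group_index(uint64_t used_mask, uint32_t bti)
{
    for (unsigned i = 0; i < 64; i++) {
        if (used_mask & (UINT64_C(1) << i)) {
            if (bti == 0)
                return i;
            bti--;
        }
    }
    return TOY_SURFACE_NOT_USED;
}
```

With entries 1, 2 and 4 marked as used, they receive BTIs 0, 1 and 2, and the table shrinks from five slots to three.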
* iris: Create an enum for the surface groups | Caio Marcelo de Oliveira Filho | 2019-06-03 | 3 | -35/+45
  This will make it convenient to handle compacting and printing the binding table.

  Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Handle binding table in the driver | Caio Marcelo de Oliveira Filho | 2019-06-03 | 6 | -121/+232
  Stop using brw_compiler to lower the final binding table indices for surface access. This is done by simply not setting the 'prog_data->binding_table.*_start' fields. Then make the driver perform this lowering.

  This is a better place to perform the binding table assignments, since the driver has more information and will also later consume those assignments to upload resources.

  This also prepares us for two changes: use ibc without having to implement binding table logic there; and remove unused entries from the binding table.

  Since the `block` field in brw_ubo_range now refers to the final binding table index, we need to adjust it before using it to index shs->constbuf.

  Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Pull brw_nir_analyze_ubo_ranges() call out setup_uniforms | Caio Marcelo de Oliveira Filho | 2019-06-03 | 1 | -3/+10
  We'll change iris to perform lowering of the binding table indices earlier (before the backend kicks in), but the backend compiler uses the result of the analysis to identify load_ubo intrinsics, so we do the analysis after the lowering to have the right indices.

  Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: fix counting and printing for half registers. | Hyunjun Ko | 2019-06-03 | 2 | -2/+2
  v2: define 0x100 and use it for setting the FS_OUTPUT_REG.HALF_PRECISION

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: Use output type size to set OUTPUT_REG_HALF_PRECISION | Neil Roberts | 2019-06-03 | 2 | -6/+2
  Previously the A5XX_SP_FS_OUTPUT_REG_HALF_PRECISION was set depending on whether half_precision was set in the shader key. With support for mediump precision, it is possible to have different outputs use different precisions. That means we can’t have a global shader state to specify it. Instead it now tries to copy the half-float-ness from the nir_variable for the output into the ir3_shader_variant. This is then used to decide whether to set half-precision for each output.

  The a6xx version is copied from the a5xx code but it has not been tested.

  v2. [Hyunjun Ko ([email protected])] There's the half flag recently added, which represents precision based on IR3_REG_HALF. Now use this flag to avoid duplication.

  Signed-off-by: Rob Clark <[email protected]>
* radeonsi: init sctx->dma_copy before using it | Pierre-Eric Pelloux-Prayer | 2019-06-03 | 1 | -3/+3
  Commit a1378639ab19 reordered the initialization of context functions but broke the sctx->b.resource_copy_region init when using AMD_DEBUG=forcedma.

  In this case sctx->dma_copy was assigned a value after being used in:

      sctx->b.resource_copy_region = sctx->dma_copy;

  This commit moves the FORCE_DMA special case after sctx->dma_copy initialization.

  See https://bugs.freedesktop.org/show_bug.cgi?id=110422

  Signed-off-by: Marek Olšák <[email protected]>
* ac: use amdgpu-flat-work-group-size | Marek Olšák | 2019-06-03 | 1 | -5/+2
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* etnaviv: drop a bunch of duplicated gallium PIPE_CAP default code | Christian Gmeiner | 2019-06-03 | 1 | -157/+0
  Now that we have the util function for the default values, we can get rid of the boilerplate.

  Signed-off-by: Christian Gmeiner <[email protected]>
* nir: copy intrinsic type when lowering load input/uniform and store output | Jonathan Marek | 2019-06-03 | 1 | -0/+1
  Fixes: c1275052 "nir: add type information to load uniform/input and store output intrinsics"
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Erico Nunes <[email protected]>
  Tested-by: Erico Nunes <[email protected]>
  Tested-by: Andreas Baierl <[email protected]>
* iris: Drop unused locals from iris_clear.c to avoid warning | Caio Marcelo de Oliveira Filho | 2019-05-31 | 1 | -3/+0
  Reviewed-by: Jordan Justen <[email protected]>
* nir: remove bool lowering from lower_int_to_float | Jonathan Marek | 2019-05-31 | 2 | -0/+3
  Removes the bool_to_float logic from the int_to_float pass, so that both can be used separately. By having separate passes we have better validation and it makes it possible to use with the lower_ftrunc option (int lowering generates ftrunc, but lower_ftrunc generates bools, ftrunc lowering should probably be reworked).

  For now we always expect lower_bool to come after lower_int.

  Also fixes f2i32 to become ftrunc and adds u2f/f2u cases.

  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add lower_bitshift option | Jonathan Marek | 2019-05-31 | 2 | -0/+2
  Add a "lower_bitshift" option, which disables optimizations introducing bitshifts and lowers ishl by constant to a multiply, so that we don't have to deal with bitshifts in int_to_float lowering.

  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
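The arithmetic identity behind lowering ishl by a constant to a multiply is that shifting left by c equals multiplying by 2^c, modulo the integer width. A minimal sketch in plain C (the function name is hypothetical and this is not the NIR pass itself):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute x << shift using only a multiply. For
 * unsigned 32-bit values both forms wrap around modulo 2^32, so they
 * agree for every shift in [0, 31]. */
uint32_t shl_as_mul(uint32_t x, unsigned shift)
{
    assert(shift < 32);
    return x * (UINT32_C(1) << shift);
}
```

Because the shift amount is a compile-time constant in the case the commit describes, the factor 2^c can be folded into an immediate, leaving no shift for a later int-to-float lowering to handle.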
* freedreno/a6xx: add 'type' to shader state key | Rob Clark | 2019-05-31 | 2 | -0/+2
  We could have identical texture state for both VS and FS, which would result in VS state getting created first and FS state mapping to the identical cmdstream. The result is VS state getting emitted twice and no FS state emitted.

  Fixes:
    dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_array.sampler2D_both
    dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.struct_in_array.sampler2D_samplerCube_both
    dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.array_in_struct.sampler2D_samplerCube_both
    dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both
    dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both
    dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_both
    dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both
    dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: fix GPU crash on small render targets | Rob Clark | 2019-05-31 | 1 | -0/+7
  Fixes dEQP-GLES2.functional.multisampled_render_to_texture.readpixels

  Signed-off-by: Rob Clark <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
* panfrost: Remove link stage for jobs | Tomeu Vizoso | 2019-05-31 | 2 | -68/+54
  And instead, link them as they are added.

  Makes things a bit clearer and prepares future work such as FB reload jobs.

  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Switch to kernel 5.2-rc2 | Tomeu Vizoso | 2019-05-31 | 1 | -4/+3
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Acked-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Update expectations | Tomeu Vizoso | 2019-05-31 | 1 | -8/+3
  A bunch of tests have been fixed, but some regressions have appeared on T760.

  Signed-off-by: Tomeu Vizoso <[email protected]>
* radeonsi/nir: Remove hack for builtins | Connor Abbott | 2019-05-31 | 1 | -11/+2
  We now bounds check properly in the uniform loading fast path, so there's no need to disable it by pretending there are other UBO bindings in use. The way this looks at the variable name was causing problems when two piglit shaders, one with a name that triggered the hack and one that didn't, got hashed to the same thing after stripping out the names.

  Reviewed-by: Timothy Arceri <[email protected]>