summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nir: fix lowering arrays to elements for XFB outputsSamuel Pitoiset2019-01-221-2/+11
| | | | | | | | | | | | | | | | | | | | If we have a transform feedback output like: float[2] x2_out (VARYING_SLOT_VAR1.x, 0, 0) which is lowered by nir_lower_io_arrays_to_elements to, float x2_out (VARYING_SLOT_VAR1.x, 0, 0) float x2_out@5 (VARYING_SLOT_VAR2.x, 0, 0) We have to update the destination offset to avoid overwriting the same value. v2 (Jason Ekstrand): - Compute the correct offsets for arrays of vectors and/or doubles Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: do not remove varyings used for transform feedbackSamuel Pitoiset2019-01-221-0/+3
| | | | | | | | When a xfb buffer is explicitely declared on a varying variable, we shouldn't remove it at link time. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Only set interface_type on blocksJason Ekstrand2019-01-221-9/+25
| | | | | | | | Instead of setting interface_type to whatever the per-vertex type is, we only set it on blocks. This allows later passes to tell the difference between variables that are in blocks and those that aren't. Reviewed-by: Alejandro Piñeiro <[email protected]>
* spirv: Only split blocksJason Ekstrand2019-01-221-3/+8
| | | | | | | | | Instead of splitting every per-vertex struct, just split the ones that are actually blocks. The reason for the split is so that we have separate variables for separate locations, qualifiers, and builtin decorations. The vulkan spec only allows these on members of blocks. Reviewed-by: Alejandro Piñeiro <[email protected]>
* spirv: Initialize struct member offsets to -1Jason Ekstrand2019-01-221-0/+1
| | | | | | | This is the "no offset specified" value. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Always emit at least one vertex elementJason Ekstrand2019-01-221-3/+1
| | | | | | | | | This seems to make the simulator happier. The early return wasn't really protecting anything and the code that follows will happily initialize the dummy element to STORE_0 and emit it. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* configure: EGL requirements only apply if EGL is builtEric Engestrom2019-01-221-0/+1
| | | | | | | | | Issue was hit with this configuration: --disable-{egl,gbm} --with-platform=drm Signed-off-by: Eric Engestrom <[email protected]> Fixes: 3208fd2e46b ("configure: move platform handling further up") Reviewed-by: Emil Velikov <[email protected]>
* freedreno: a2xx: add partial lower_scalar pass for ir2Jonathan Marek2019-01-225-0/+180
| | | | | | Some instructions can only be scalar on a2xx, lower these only Signed-off-by: Jonathan Marek <[email protected]>
* freedreno: a2xx: add ir2 copy propagationJonathan Marek2019-01-225-0/+236
| | | | | | | | Two cases: * replacing srcs which refer to MOV instructions * replacing MOVs used to write to exports Signed-off-by: Jonathan Marek <[email protected]>
* freedreno: a2xx: insert scalar MOV to allow 2 source scalarJonathan Marek2019-01-221-0/+132
| | | | | | | | | | | If we want to use a scalar instruction with two sources, both sources have to be in the same register. This covers a common case by inserting a scalar MOV into a previous instruction with only a vector alu instruction. A better method would be to have the sources end up in the same register in the first place, but when one source is a constant this is the only way. Signed-off-by: Jonathan Marek <[email protected]>
* freedreno: a2xx: NIR backendJonathan Marek2019-01-2221-2529/+3033
| | | | | | | | | | | | This patch replaces the a2xx TGSI compiler with a NIR compiler. It also adds several new features: -gl_FrontFacing, gl_FragCoord, gl_PointCoord, gl_PointSize -control flow (including loops) -texture related features (LOD/bias, cubemaps) -filling scalar ALU slot when possible Signed-off-by: Jonathan Marek <[email protected]>
* nir: cleanup glsl_get_struct_field_offset, glsl_get_explicit_strideTapani Pälli2019-01-222-5/+5
| | | | | | | | | Take away const qualifier from return type of these functions as -Wignored-qualifiers points out it is ignored for these cases. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* travis: fix autotools build after --enable-autotools switch additionEric Engestrom2019-01-221-1/+3
| | | | | | | Fixes: e68777c87ceed02ab199 "autotools: Deprecate the use of autotools" Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Tapani Pälli <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Update the JSON and headers from Khronos masterJason Ekstrand2019-01-212-124/+336
| | | | | | This corresponds to commit 79b6681aadcb53c27d1052e on GitHub. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Mark deref UBO and SSBO access as non-scalarJason Ekstrand2019-01-211-1/+3
| | | | | Fixes: 63b9aa2e2574 "spirv: Add support for using derefs for..." Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/spirv: handle ContractionOff execution modeKarol Herbst2019-01-216-3/+17
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/vtn: add caps for some cl related capabilitiesRob Clark2019-01-213-5/+20
| | | | | | | | | | | vtn supports these, so don't squalk if user is happy with enabling these. v2: add new members sorted Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* vtn: handle SpvExecutionModelKernelKarol Herbst2019-01-211-0/+2
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* mesa: add MESA_SHADER_KERNELKarol Herbst2019-01-2115-13/+41
| | | | | | | | used for CL kernels Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv/pipeline: Add a pdevice helper variableJason Ekstrand2019-01-211-9/+10
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* relnotes: Add newly added Vulkan extensionsJason Ekstrand2019-01-211-0/+7
| | | | | | | Both the Intel and RADV people have been really bad about adding things to the release notes. We should start actually paying attention. Acked-by: Tapani Pälli <[email protected]>
* anv: Only parse pImmutableSamplers if the descriptor has samplersJason Ekstrand2019-01-211-12/+31
| | | | | | | Cc: [email protected] Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* radv: prevent dirtying of dynamic state when it does not changeRhys Perry2019-01-211-16/+75
| | | | | | | | DXVK often sets dynamic state without actually changing it. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: avoid context rolls when binding graphics pipelinesRhys Perry2019-01-213-108/+141
| | | | | | | | | | | | | | | | | | It's common in some applications to bind a new graphics pipeline without ending up changing any context registers. This has a pipline have two command buffers: one for setting context registers and one for everything else. The context register command buffer is only emitted if it differs from the previous pipeline's. v2: ensure late scissor emission is done when radv_emit_rbplus_state() is called v2: make use of cmd_buffer->state.workaround_scissor_bug v3: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add missed situations for scissor bug workaroundRhys Perry2019-01-212-24/+43
| | | | | | | | v2: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: pass radv_draw_info to radv_emit_draw_registers()Rhys Perry2019-01-211-60/+58
| | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* freedreno: a2xx: sysmem renderingJonathan Marek2019-01-211-0/+45
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix non-zero texture base offsetsJonathan Marek2019-01-211-1/+1
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix VERTEX_REUSE/DEALLOC on a20xJonathan Marek2019-01-213-18/+34
| | | | | | | | | | On a20x, set VGT_VERTEX_REUSE_BLOCK_CNTL to 2 and don't change it. Small rearrangement on a220 to reduce the size of draw commands. Only set DEALLOC_CNTL on a20x because the correct a220 value is not known. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix gmem2mem viewportJonathan Marek2019-01-211-0/+7
| | | | | | | Fixes cases where previous viewport values might case gmem2mem to fail. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: cleanup REG_A2XX_PA_CL_VTE_CNTLJonathan Marek2019-01-212-18/+20
| | | | | | | | | | Doesn't change much, but reduces the size of fd2_emit_state gmem2mem does not need to change the value: no Z clipping on resolve mem2gmem now needs to restore the common value after rendering Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: cleanup init_shader_constJonathan Marek2019-01-213-14/+11
| | | | | | | | | Only 3 vertices are used so we can drop the data for vertex 4 It doesn't make sense to have 1.1 for some coordinates, use 1.0 instead Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: add bit_size parameter to system values with multiple allowed bit sizesKarol Herbst2019-01-215-9/+19
| | | | | | | | | v2: add assert to verify we have at least one valid bit_size v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: add legal bit_sizes to intrinsicsKarol Herbst2019-01-214-13/+30
| | | | | | | | | | | | | | | | | | | With OpenCL some system values match the address bits, but in GLSL we also have some system values being 64 bit like subgroup masks. With this it is possible to adjust the builder functions so that depending on the bit_sizes the correct bit_size is used or an additional argument is added in case of multiple possible values. v2: validate dest bit_size v3: generate hex values in python code remove useless imports rename and move bit_sizes v4: add 1 to legal bit_sizes for front_face Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/validate: allow to check against a bitmask of bit_sizesKarol Herbst2019-01-211-17/+17
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: replace more nir_load_system_value calls with builder functionsKarol Herbst2019-01-215-14/+12
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl/lower_output_reads: set invariant and precise flags on temporariesKarol Herbst2019-01-211-0/+4
| | | | | | | | | | | | | | | | | fixes a couple of deqp tests (on nvc0 and potential other drivers): dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3 CC: <[email protected]> Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: add missing CAPs for unsupported featuresRhys Kidd2019-01-202-0/+4
| | | | | Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nir/spirv: handle SpvStorageClassCrossWorkgroupKarol Herbst2019-01-195-0/+12
| | | | | | | | | | v2: rename nir_var_global to nir_var_mem_global Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_shared to nir_var_mem_sharedKarol Herbst2019-01-1911-23/+23
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_ssbo to nir_var_mem_ssboKarol Herbst2019-01-1915-37/+37
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_ubo to nir_var_mem_uboKarol Herbst2019-01-1910-14/+14
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_function to nir_var_function_tempKarol Herbst2019-01-1928-81/+81
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_private to nir_var_shader_tempKarol Herbst2019-01-1917-29/+29
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* intel/genxml: add missing MI_PREDICATE compare operationsLionel Landwerlin2019-01-197-1/+12
| | | | | | | | Doesn't save us a great deal of lines but at least they get decoded in aubinators. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* anv: document cache flushes & invalidationsLionel Landwerlin2019-01-191-0/+67
| | | | | | | | | | | | | | | | | A little bit of explanation regarding how vkCmdPipelineBarrier() works. v2: Avoid referring to data port cache when it's actually sampler caches (Jason) Complete explanation for indirect draws (Jason) v3: s/samplers/sampler/ (Jason) s/UBOs/data port/ Add documentation for VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Eric Engestrom <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]> (v2)
* anv: narrow flushing of the render target to buffer writesLionel Landwerlin2019-01-196-20/+15
| | | | | | | | | | | | | | In commit 9a7b3199037ac4 ("anv/query: flush render target before copying results") we tracked all the render target writes to apply a flushes in the vkCopyQueryResults(). But we can narrow this down to only when we write a buffer (which is the only input of vkCopyQueryResults). v2: Drop newer render target write flags introduce by 1952fd8d2ce905 ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (v1)
* glsl: be much more aggressive when skipping shader compilationTimothy Arceri2019-01-192-6/+10
| | | | | | | | | | | | | | | | | | | | Currently we only add a cache key for a shader once it is linked. However games like Team Fortress 2 compile a whole bunch of shaders which are never actually linked. These compiled shaders can take up a bunch of memory. This patch changes things so that we add the key for the shader to the cache as soon as it is compiled. This means on a warm cache we can avoid the wasted memory from these shaders. Worst case scenario is we need to compile the shaders at link time but this can happen anyway if the shader has been evicted from the cache. Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a warm cache from start up to the game menu. V2: only add key to cache when compilation is successful. Acked-by: Marek Olšák <[email protected]>
* intel/fs: Promote execution type to 32-bit when any half-float conversion is ↵Francisco Jerez2019-01-181-0/+21
| | | | | | | | | | | | | | | needed. The docs are fairly incomplete and inconsistent about it, but this seems to be the reason why half-float destinations are required to be DWORD-aligned on BDW+ projects. This way the regioning lowering pass will make sure that the destination components of W to HF and HF to W conversions are aligned like the corresponding conversion operation with 32-bit execution data type. Tested-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* ac/nir_to_llvm: fix interpolateAt* for arraysTimothy Arceri2019-01-191-19/+58
| | | | | | | | | | | This builds on the recent interpolate fix by Rhys ee8488ea3b99. This fixes the arb_gpu_shader5 interpolateAt* tests that contain arrays. Fixes: ee8488ea3b99 ("ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics") Reviewed-by: Bas Nieuwenhuizen <[email protected]>