Commit message | Author | Age | Files | Lines
* st/glsl_to_nir: use nir_lower_io_arrays_to_elements() to lower arraysTimothy Arceri2017-12-041-1/+1
| | | | | | | | This pass is more fully featured: it supports geom and tess shaders as well as interpolation intrinsics. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir: allow builtin arrays to be loweredTimothy Arceri2017-12-041-7/+10
| | | | | | | Gallium's NIR drivers expect this to be done. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir: add array lowering function that assumes there are no indirectsTimothy Arceri2017-12-042-1/+44
| | | | | | | | | | The gallium glsl->nir pass currently lowers away all indirects on both inputs and outputs. This function allows us to lower vs inputs and fs outputs and also lower things one stage at a time, as we don't need to worry about indirects on the other side of the shader's interface. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: enable nir varying array splittingTimothy Arceri2017-12-041-0/+3
| | | | Acked-by: Dave Airlie <[email protected]>
* st/glsl_to_nir: enable NIR link time optsTimothy Arceri2017-12-042-7/+105
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/nir: add support for packed inputsTimothy Arceri2017-12-041-21/+25
| | | | | | | | | Because NIR can create non-vec4 variables when implementing component packing, we need to make sure not to reprocess the same slot again. Also, we can drop the fs_attr_idx counter and just use driver_location. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_nir: move some calls out of st_glsl_to_nir_post_opts()Timothy Arceri2017-12-041-30/+37
| | | | | | | NIR component packing will be inserted between these calls and the calling of st_glsl_to_nir_post_opts(). Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_nir: call some lowering passes earlierTimothy Arceri2017-12-041-8/+12
| | | | | | This is required so that we can enable NIR linking optimisations. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_nir: add basic NIR opt loop helperTimothy Arceri2017-12-041-0/+31
| | | | | | | | We need to be able to do these NIR opts in the state tracker rather than the driver in order for the NIR linking opts to be useful. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_nir: make st_glsl_to_nir() staticTimothy Arceri2017-12-042-55/+51
| | | | | | Here we also move the extern C functions to the bottom of the file. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_nir: split the st_glsl_to_nir() function in twoTimothy Arceri2017-12-041-22/+34
| | | | | | | | We want to be able to generate NIR then apply NIR optimisations. Once the optimisations are done we can then apply the new post opt function which assigns uniforms etc based on the optimised IR. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_nir: create set_st_program() helperTimothy Arceri2017-12-041-34/+40
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl: move nir linking loop to new function st_link_nir()Timothy Arceri2017-12-043-17/+41
| | | | | | | This will allow us to refactor linking and include some nir link time optimisations. Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: fix support for scalar arrays in nir_lower_io_types()Timothy Arceri2017-12-041-7/+3
| | | | | | | This was just recreating the same vector type we already had and hitting an assert for scalars. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_nir: add st_nir_assign_var_locations() helperTimothy Arceri2017-12-041-9/+34
| | | | | | This avoids packed varyings being assigned different driver locations. Reviewed-by: Nicolai Hähnle <[email protected]>
* radv: enable nir component packingTimothy Arceri2017-12-041-0/+6
| | | | | | | | SaschaWillems Vulkan demo tessellation: ~4000fps -> ~4600fps Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: add varying component packing helpersTimothy Arceri2017-12-042-0/+332
| | | | | | | | | | | | v2: update shader info input/output masks when packing components v3: make sure interpolation loc matches, this is required for the radeonsi NIR backend. v4: 33dca36f4f28 fixed nir_gather_info to update outputs_read correctly, make sure we also adjust this correctly when packing components. Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1) Reviewed-by: Nicolai Hähnle <[email protected]> (v3)
* nir: add varying array splitting passTimothy Arceri2017-12-044-0/+386
| | | | | | | | | | | | | V2: - fix matrix support, non-array matrices were being skipped in v1 v3: - handle lowering of tcs output loads correctly - correctly mark indirect locations for either in or out not both when processing a stage. - use nir_src_copy() when lowering stores. Reviewed-by: Nicolai Hähnle <[email protected]>
* freedreno/ir3: relax barriersRob Clark2017-12-031-2/+2
| | | | | | Instructions with no barrier_class can move wrt. an EVERYTHING barrier. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: all mem instructions have WAR hazardRob Clark2017-12-031-1/+1
| | | | | | | | It isn't just load instructions that have a write-after-read hazard. Fixes stk gaussian blur compute shaders. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add debug option to force emulated indirectRob Clark2017-12-033-0/+12
| | | | | | Useful mostly for debugging indirect draw. Signed-off-by: Rob Clark <[email protected]>
* freedreno: also mark draw-indirect buffer as readRob Clark2017-12-031-0/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: small cleanupsRob Clark2017-12-031-17/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: avoid unnecessary batch flushRob Clark2017-12-031-0/+2
| | | | | | | | | In some cases we can end up trying to add a write dependency on ourself, which shouldn't trigger a flush. Avoids an extra couple of flushes per frame in stk. Signed-off-by: Rob Clark <[email protected]>
* freedreno: avoid mem2gmem for invalidated buffersRob Clark2017-12-033-2/+17
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: deferred flush supportRob Clark2017-12-035-4/+32
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: rework fence trackingRob Clark2017-12-0312-61/+109
| | | | | | | | | ctx->last_fence isn't such a terribly clever idea, if batches can be flushed out of order. Instead, each batch now holds a fence, which is created before the batch is flushed (useful for next patch), that later gets populated after the batch is actually flushed. Signed-off-by: Rob Clark <[email protected]>
* freedreno: proper locking for iterating dependent batchesRob Clark2017-12-032-8/+20
| | | | | | | | | In transfer_map(), when we need to flush batches that read from a resource, we should be holding screen->lock to guard against race conditions. Somehow deferred flush seems to make this existing race more obvious. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: correct max_indicies for indirect drawsRob Clark2017-12-031-1/+2
| | | | Signed-off-by: Rob Clark <[email protected]>
* spirv: Convert the supported_extensions struct to spirv_optionsJason Ekstrand2017-12-025-37/+44
| | | | | | | | This is a bit more general and lets us pass additional options into the spirv_to_nir pass beyond what capabilities we support. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* spirv: Only emit functions which are actually usedJason Ekstrand2017-12-023-8/+26
| | | | | | | | | | Instead of emitting absolutely everything, just emit the few functions that are actually referenced in some way by the entrypoint. This should save us quite a bit of time when handed large shader modules containing many entrypoints. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
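The idea of emitting only what the entrypoint references can be pictured as a reachability walk over the call graph. The following is a minimal sketch under stated assumptions: every name (`sketch_module`, `sketch_mark_referenced`, `sketch_count_emitted`) and the fixed-size adjacency matrix are hypothetical illustrations, not the actual spirv_to_nir data structures or code.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_FUNCS 8

/* Hypothetical stand-in for a shader module: a tiny call graph. */
struct sketch_module {
    int num_funcs;
    /* calls[a][b] is true when function a calls function b. */
    bool calls[SKETCH_MAX_FUNCS][SKETCH_MAX_FUNCS];
    /* Set for every function reachable from the chosen entrypoint. */
    bool referenced[SKETCH_MAX_FUNCS];
};

/* Depth-first walk: mark func and everything it (transitively) calls. */
static void sketch_mark_referenced(struct sketch_module *m, int func)
{
    assert(func >= 0 && func < m->num_funcs);
    if (m->referenced[func])
        return; /* already visited; also terminates recursive call cycles */
    m->referenced[func] = true;
    for (int callee = 0; callee < m->num_funcs; callee++) {
        if (m->calls[func][callee])
            sketch_mark_referenced(m, callee);
    }
}

/* Only functions marked as referenced would be emitted. */
static int sketch_count_emitted(const struct sketch_module *m)
{
    int n = 0;
    for (int i = 0; i < m->num_funcs; i++) {
        if (m->referenced[i])
            n++;
    }
    return n;
}
```

With a module that holds two entrypoints, calling `sketch_mark_referenced()` on one of them leaves the other entrypoint and its helpers unmarked, which is where the time saving on large multi-entrypoint modules would come from.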
* spirv: Drop the impl field from vtn_builderJason Ekstrand2017-12-024-8/+6
| | | | | | | We have a nir_builder and it has an impl field. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* i965: Serialize nir later in the linking processJordan Justen2017-12-011-9/+16
| | | | | | | | | | | Fixes MESA_GLSL=cache_fb with piglit tests/spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out.shader_test Fixes: 0610a624a12 i965/link: Serialize program to nir after linking for shader cache Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988 Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* configure: avoid testing for negative compiler optionsMarc Dietrich2017-12-012-10/+19
| | | | | | | | | | | | | | | gcc seems to always accept unsupported negative compiler warning options: echo "int i;" | gcc -c -xc -Wno-bob - # no error echo "int i;" | gcc -c -xc -Walice - # unsupported compiler option Inverting the options fixes the tests. V2: fix options in meson build Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Signed-off-by: Marc Dietrich <[email protected]>
* broadcom/vc4: Use a single-entry cached last_hindex value.Eric Anholt2017-12-012-2/+20
| | | | | | | | | Since almost all BOs will be in one CL at a time, this cache will almost always hit except for the first usage of the BO in each CL. This didn't show up as statistically significant on the minetest trace (n=340), but if I lop off the throttled lobe of the bimodal distribution, it very clearly does (0.74731% +/- 0.162093%, n=269).
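A single-entry cache of this kind can be sketched as follows. This is an illustration only, assuming a per-CL handle array and a per-BO remembered index; `sketch_bo`, `sketch_cl`, `sketch_get_hindex` and the `slow_lookups` counter are hypothetical names, not the vc4 driver's real structures.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_HANDLES 64

/* Hypothetical buffer object: remembers where it was last found. */
struct sketch_bo {
    uint32_t handle;      /* kernel handle of the buffer */
    uint32_t last_hindex; /* single-entry cache of the last index */
};

/* Hypothetical command list with its table of referenced handles. */
struct sketch_cl {
    uint32_t handles[SKETCH_MAX_HANDLES];
    uint32_t count;
    uint32_t slow_lookups; /* counts cache misses, for illustration */
};

static uint32_t sketch_get_hindex(struct sketch_cl *cl, struct sketch_bo *bo)
{
    /* Fast path: the cached index is still valid for this CL. */
    if (bo->last_hindex < cl->count &&
        cl->handles[bo->last_hindex] == bo->handle)
        return bo->last_hindex;

    /* Slow path: linear search, then append if the BO is new here. */
    cl->slow_lookups++;
    for (uint32_t i = 0; i < cl->count; i++) {
        if (cl->handles[i] == bo->handle) {
            bo->last_hindex = i;
            return i;
        }
    }
    assert(cl->count < SKETCH_MAX_HANDLES);
    cl->handles[cl->count] = bo->handle;
    bo->last_hindex = cl->count;
    return cl->count++;
}
```

In this sketch the slow path runs once per BO per CL, matching the observation that the cache misses only on the first usage of the BO in each CL.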
* broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.Eric Anholt2017-12-011-5/+14
| | | | | | | | No significant difference in the minetest replay, but it should reduce overhead by not requiring that we write quad indices to index buffers that we repeatedly re-upload (and making the draw packet smaller, as well). Over the course of the series the actual game seems to be up by 1-2 fps.
* broadcom/vc4: Use the new enum functionality of the XML to decode better.Eric Anholt2017-12-011-20/+25
|
* broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.Eric Anholt2017-12-013-3/+12
| | | | | | | | | Now that there's only one user of it, it's pretty obvious how to avoid emitting redundant ones. This should save a bunch of kernel validation overhead. No statistically significant difference on the minetest trace I was looking at (n=169), but the maximum FPS is up by 0.3%
* broadcom/vc4: Simplify the relocation handling for index buffers.Eric Anholt2017-12-012-17/+17
| | | | | | Originally there was CL code for handling various relocations back when I had relocs for the TSDA/TA buffers. Now that the kernel handles those entirely on its own, I can inline that code into the one place using it.
* broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.Eric Anholt2017-12-011-16/+27
| | | | | | | | | | | | | We failed to take the start into account for how many vertices to draw in this round, so we would end up decrementing count below 0, which as an unsigned number meant we would loop until the CLs soon ran out of space. When I wrote the code I was thinking about how to use the previously emitted shader state (no index bias baked into the elements) by emitting up to 65535 and then only re-emitting with bias for the second round, but that doesn't work if the start is over 65535. Instead, just delay emitting shader state until we get into the drawarrays GFXH-515 loop and always bake the bias in when we're doing the workaround.
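The underflow hazard described here is generic to any loop that splits a large draw into rounds with an unsigned counter. Below is a minimal sketch, not the vc4 code: `sketch_round`, `sketch_split_draw` and `SKETCH_MAX_VERTS` are hypothetical, and the sketch assumes primitives that share no vertices between rounds (strips and fans would need a smaller step).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-round vertex limit used for illustration. */
#define SKETCH_MAX_VERTS 65535u

struct sketch_round {
    uint32_t start; /* first vertex of this round */
    uint32_t count; /* number of vertices in this round */
};

/*
 * Split a draw of `count` vertices beginning at `start` into rounds of
 * at most SKETCH_MAX_VERTS. Each round is clamped to the remaining
 * count, so the unsigned subtraction below can never wrap around.
 * Returns the number of rounds written (at most max_rounds).
 */
static uint32_t sketch_split_draw(uint32_t start, uint32_t count,
                                  struct sketch_round *rounds,
                                  uint32_t max_rounds)
{
    uint32_t n = 0;
    while (count > 0 && n < max_rounds) {
        uint32_t this_count =
            count < SKETCH_MAX_VERTS ? count : SKETCH_MAX_VERTS;
        rounds[n].start = start;
        rounds[n].count = this_count;
        n++;
        start += this_count;
        count -= this_count; /* this_count <= count, so no underflow */
    }
    return n;
}
```

The size of each round depends only on the remaining count and never on `start`, and the start is carried along per round, so a start above 65535 is handled the same way as a small one.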
* broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround.Eric Anholt2017-12-011-1/+1
| | | | For triangle strips, we step by max_verts - 2.
* meson: use dep_thread instead of dependency('threads') in freedrenoDylan Baker2017-12-011-1/+1
| | | | | | | | They are the same thing, but this is more consistent with the rest of the project. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: Add lmsensors supportDylan Baker2017-12-017-4/+25
| | | | | | | | v2: - Make -Dlmsensors=false work - Simplify auto and true cases Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: Add support for gallium extra hudDylan Baker2017-12-012-0/+10
| | | | | Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* glx: Prepare driFetchDrawable for no-config contextsAdam Jackson2017-12-013-8/+30
| | | | | | | | | When we look up the DRI drawable state we need to associate an fbconfig with the drawable. With GLX_EXT_no_config_context we can no longer infer that from the context and must instead query the server. Signed-off-by: Adam Jackson <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glx: Use __glXSendError instead of open-coding itAdam Jackson2017-12-012-26/+4
| | | | | | | | This also fixes a bug: the error path through MakeCurrent didn't translate the error code by the extension's error base. Signed-off-by: Adam Jackson <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glx: Simplify some dummy vtable interactionsAdam Jackson2017-12-011-5/+5
| | | | | | | | The dummy vtable has these slots as NULL already, no need to check for the dummy context explicitly. Signed-off-by: Adam Jackson <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* docs/release-calendar: update and extendEmil Velikov2017-12-011-16/+15
| | | | | | | | | v2: Missing td tag, add Andres + Juan for 17.2.8 and 17.3.3 Signed-off-by: Emil Velikov <[email protected]> Acked-by: Nicolai Hähnle <[email protected]> (v1) Reviewed-by: Andres Gomez <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* docs/specs: annotate MESA_set_3dfx_mode as obsoleteEmil Velikov2017-12-012-2/+2
| | | | | | | | | | Aimed to work with Glide, which hasn't been a thing in over 10 years. There are no drivers that implement it, so annotate it as obsolete. v2: Move the extension to OLD/ Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]> (v1) Reviewed-by: Adam Jackson <[email protected]> (v1) Reviewed-by: Ian Romanick <[email protected]>
* xlib: remove dummy GLX_MESA_set_3dfx_mode implementationEmil Velikov2017-12-016-66/+1
| | | | | | | | | | | | | The implementation is a simple 'return EGL_FALSE'. Stop pretending and simply remove it. Note: the removal of XMesa API is fine, since there haven't been any users for it in years. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Ian Romanick <[email protected]>