path: root/src/gallium
Commit message | Author | Age | Files | Lines
* v3d: Add support for flushing dirty TMU data at job end. | Eric Anholt | 2019-01-14 | 2 | -0/+20
| | | | This will be needed for SSBOs and image_load_store.
* st/dri: fix dri2_format_table for argb1555 and rgb565 | Marek Olšák | 2019-01-14 | 1 | -1/+1
| | | | | | | | | The bug caused rgb565 framebuffers to use argb1555. Fixes: 433ca3127a3b94bfe9a513e7c7ce594e09e1359f Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* docs: fix gallium screen cap docs | Ilia Mirkin | 2019-01-10 | 1 | -11/+11
| | | | | | | | | Make sure that the next line starts with spaces so that bullets are maintained throughout, add `` around a few more special tokens, and fix SAMPLE_COUNT_TEXTURE -> SAMPLE_COUNT. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* freedreno/a6xx: fix 3d+tiled layout | Rob Clark | 2019-01-10 | 1 | -34/+52
| | | | | | | | The last round of fixing 3d layer+level layout skipped the tiled case, since tiled texture support was not in place yet. This finishes the job. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: move tile_mode to sampler-view CSO | Rob Clark | 2019-01-10 | 2 | -7/+7
| | | | | | | | | | | This is known when the CSO is created, so there is no need to patch it in later. Also, for smaller textures where the first level is small enough to be linear, it seems like we should set linear tile mode. See: dEQP-GLES3.functional.texture.format.unsized.rgb_unsigned_byte_3d_pot Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: separate stencil restore/resolve fixes | Rob Clark | 2019-01-10 | 1 | -14/+21
| | | | | | | | | | | | | Previously we'd use format/etc from the primary (z32) buffer for the stencil (s8), due to confusion about rsc vs psurf. Rework this to drop extra arg and push down handling of separate stencil case (and make sure we take the fmt from the right place). This doesn't completely fix separate-stencil, but at least it avoids the GPU scribbling over random other cmdstream buffers and causing a bunch of bogus fails in dEQP. Signed-off-by: Rob Clark <[email protected]>
* etnaviv: fix typo in cflush_all description | Guido Günther | 2019-01-10 | 1 | -1/+1
| | | | | Signed-off-by: Guido Günther <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* st/va: Return correct status from vlVaQuerySurfaceStatus | Indrajit Das | 2019-01-09 | 1 | -0/+31
| | | | | | | | | This ensures that during encoding, applications can get the correct status of the surface before submitting more operations on the same. Reviewed-by: Leo Liu <[email protected]> Signed-off-by: Indrajit Das <[email protected]>
* Revert "llvmpipe: Always return some fence in flush (v2)" | Roland Scheidegger | 2019-01-09 | 1 | -2/+0
| | | | | | | | | This reverts commit f6a6da8131383d8eeee07cd59326a70f4b15866b. With this commit we see massive amounts of asserts triggering in lp_fence_wait(), assert(f->issued), for instance with libgl_xlib state tracker and piglit. Not entirely sure if the assert could just be removed.
* st/mesa: don't leak pipe_surface if pipe_context is not current | Marek Olšák | 2019-01-09 | 1 | -0/+19
| | | | | | | | | | We have found some pipe_surface leaks internally. This is the same code as surface_destroy in radeonsi. Ideally, surface_destroy would be in pipe_screen. Cc: 18.3 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* radeonsi: Fix use of 1- or 2- component GL_DOUBLE vbo's. | Mario Kleiner | 2019-01-09 | 1 | -0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With Mesa 18.1, commit be973ed21f6e, si_llvm_load_input_vs() changed the number of source 32-bit wide dword components used for fetching vertex attributes into the vertex shader from a constant 4 to a variable num_channels number, depending on input data format, with some special case handling for input data formats like 64-Bit doubles. In the case of a GL_DOUBLE input data format with one or two components though, e.g, submitted via ... a) glTexCoordPointer(1, GL_DOUBLE, 0, buffer); b) glTexCoordPointer(2, GL_DOUBLE, 0, buffer); ... the input format would be SI_FIX_FETCH_RG_64_FLOAT, but no special case handling was implemented for that case, so in the default path the number of 32-bit dwords would be set to the number of float input components derived from info->input_usage_mask. This ends with corrupted input to the vertex shader, because fetching a 64-bit double from the vbo requires fetching two 32-bit dwords instead of 1, and fetching a two double input requires 4 dword fetches instead of 2, so in these cases the vertex shader receives incomplete/truncated input data: a) float v = gl_MultiTexCoord0.x; -> v.x is corrupted. b) vec2 v = gl_MultiTexCoord0.xy; -> v.x is assigned correctly, but v.y is corrupted. This happens with the standard TGSI IR compiled shaders. Under NIR with R600_DEBUG=nir, we got correct behavior because the current radeonsi nir code always assigns info->input_usage_mask = TGSI_WRITEMASK_XYZW, thereby always fetches 4 dwords regardless of what the shader actually needs. Fix this by properly assigning 2 or 4 dword fetches for one or two component GL_DOUBLE input. Fixes: be973ed21f6e ("radeonsi: load the right number of components for VS inputs and TBOs") Signed-off-by: Mario Kleiner <[email protected]> Cc: [email protected] Cc: Marek Olšák <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
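The dword arithmetic described in this commit message can be sketched as follows. This is only an illustration of the counting rule (two 32-bit dwords per 64-bit double), not the radeonsi code; the helper name `dwords_to_fetch` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not part of radeonsi): number of 32-bit dwords that
 * must be fetched for a vertex attribute with `components` channels. */
static unsigned dwords_to_fetch(unsigned components, bool is_double)
{
    assert(components >= 1 && components <= 4);

    /* A 64-bit double occupies two 32-bit dwords, so a one-component
     * GL_DOUBLE input needs 2 dwords and a two-component input needs 4. */
    return is_double ? components * 2 : components;
}
```

With this rule, cases a) and b) from the message need 2 and 4 dword fetches respectively, rather than the 1 and 2 that the default path would derive from the float component count.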
* ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics | Rhys Perry | 2019-01-09 | 1 | -0/+3
| | | | | | | | | | | | | | Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is enabled with DXVK. It also fixes artifacts with Fallout 4's god rays with DXVK. Various piglit interpolateAt*() tests under NIR are also fixed. v2: formatting fix update commit message to include Fallout 4 and the Fixes tag Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595 Signed-off-by: Rhys Perry <[email protected]>
* llvmpipe: Always return some fence in flush (v2) | Tomasz Figa | 2019-01-09 | 1 | -0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is no last fence, due to no rendering happening yet, just create a new signaled fence and return it, to match the expectations of the EGL sync fence API. Fixes random "Could not create sync fence 0x3003" assertion failures from Skia on Android, coming from the following code: https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427 Reproducible especially with thread count >= 4. One could make the driver always keep the reference to the last fence, but: - the driver seems to explicitly destroy the fence whenever a rendering pass completes and changing that would require a significant functional change to the code. (Specifically, in lp_scene_end_rasterization().) - it still wouldn't solve the problem of an EGL sync fence being created and waited on without any rendering happening at all, which is also likely to happen with Android code pointed to in the commit. Therefore, the simple approach of always creating a fence is taken, similarly to other drivers, such as radeonsi. Tested with piglit llvmpipe suite with no regressions and following tests fixed: egl_khr_fence_sync conformance eglclientwaitsynckhr_flag_sync_flush eglclientwaitsynckhr_nonzero_timeout eglclientwaitsynckhr_zero_timeout eglcreatesynckhr_default_attributes eglgetsyncattribkhr_invalid_attrib eglgetsyncattribkhr_sync_status v2: - remove the useless lp_fence_reference() dance (Nicolai), - explain why creating the dummy fence is the right approach. Signed-off-by: Tomasz Figa <[email protected]>
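The "always return some fence" approach from this message could look roughly like the sketch below. The `struct fence` type and both function names are hypothetical stand-ins, not the llvmpipe API; the point is only the fallback to a fresh, already-signaled fence when no rendering has produced one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical fence object, for illustration only. */
struct fence {
    int refcount;
    bool signaled;
};

/* Create a fence that is already signaled (nothing to wait for). */
static struct fence *fence_create_signaled(void)
{
    struct fence *f = calloc(1, sizeof(*f));
    assert(f != NULL); /* this sketch does not handle allocation failure */
    f->refcount = 1;
    f->signaled = true;
    return f;
}

/* Return a fence for a flush: reuse the last one if rendering created it,
 * otherwise hand back a new signaled fence instead of NULL. */
static struct fence *flush_get_fence(struct fence *last_fence)
{
    if (last_fence) {
        last_fence->refcount++;
        return last_fence;
    }
    return fence_create_signaled();
}
```

A caller that waits on the returned fence then never has to special-case the "no rendering happened yet" situation.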
* v3d: Enable GL_ARB_texture_gather on V3D 4.x. | Eric Anholt | 2019-01-08 | 1 | -0/+5
| | | | | This is part of GLES 3.1, and with the NIR lowering we're now passing the GLES31 testcases.
* freedreno: Move register constant files to src/freedreno. | Bas Nieuwenhuizen | 2019-01-08 | 11 | -22475/+2
| | | | | | | | This way they can be shared. Build-tested with meson, but not too sure about the autotools side. Reviewed-by: Dylan Baker <[email protected]> Acked-by: Rob Clark <[email protected]>
* nir: rename global/local to private/function memory | Karol Herbst | 2019-01-08 | 3 | -4/+4
| | | | | | | | | | | | | | | | | | The naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all threads of a work group, a solution everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* autotools: Remove tegra vdpau driver | Dylan Baker | 2019-01-08 | 1 | -2/+0
| | | | | | | | | | | This has never functioned and probably won't ever function, due to the way gallium media state trackers are architected and the tegra video decoder is architected. Cc: Thierry Reding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Fixes: 1755f608f5201e0a23f00cc3ea1b01edd07eb6ef ("tegra: Initial support")
* clover/meson: Ignore 'svn' suffix when computing CLANG_RESOURCE_DIR | Pierre Moreau | 2019-01-08 | 1 | -1/+1
| | | | | | | | | | | | | | | The version exported by LLVM in its CMake configuration files can include the “svn” suffix when building a development version (for example “8.0.0svn”). However the exported clang headers are still found under “lib/clang/8.0.0/”, without the “svn” suffix. Meson takes care of removing the “svn” suffix from the version when using the dependency’s `version()` method. This processing is already performed in “configure.ac” when using autotools. Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
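The suffix handling described here amounts to dropping a trailing "svn" from the version string before building the resource path. The sketch below shows that string operation in C; it is not the Meson or configure.ac logic, and `strip_svn_suffix` is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy `version` into `out`, dropping a trailing
 * "svn" suffix if present ("8.0.0svn" becomes "8.0.0"). */
static void strip_svn_suffix(const char *version, char *out, size_t out_size)
{
    static const char suffix[] = "svn";
    const size_t suffix_len = sizeof(suffix) - 1;
    size_t len = strlen(version);

    assert(out_size > 0);

    /* Only strip when the string actually ends in the suffix. */
    if (len >= suffix_len && strcmp(version + len - suffix_len, suffix) == 0)
        len -= suffix_len;

    /* Truncate if the destination buffer is too small. */
    if (len >= out_size)
        len = out_size - 1;

    memcpy(out, version, len);
    out[len] = '\0';
}
```

The stripped version can then be appended to a "lib/clang/" prefix to match the directory the clang headers are installed under.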
* virgl: use primconvert provoking vertex properly | Dave Airlie | 2019-01-08 | 2 | -8/+24
| | | | | | | | This stores the raster state and calls the correct primconvert interface using the currently bound raster state. Reviewed-By: Gert Wollny <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* spirv: Add support for using derefs for UBO/SSBO access | Jason Ekstrand | 2019-01-08 | 1 | -0/+1
| | | | | | | | | For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* glsl_type: Add support for explicitly laid out matrices and arrays | Jason Ekstrand | 2019-01-08 | 1 | -2/+2
| | | | | | | | | | | | | | | | | | | | | | SPIR-V allows for matrix and array types to be decorated with explicit byte stride decorations and matrix types to be decorated row- or column-major. This commit adds support to glsl_type to encode this information. Because this doesn't work nicely with std430 and std140 alignments, we add asserts to ensure that we don't use any of the std430 or std140 layout functions with explicitly laid out types. In SPIR-V, the layout information for matrices is applied to the parent struct member instead of to the matrix type itself. However, this gets rather clumsy when you're walking derefs trying to compute offsets because, the moment you hit a matrix, you have to crawl back the deref chain and find the struct. Instead, we take the same path here as we've taken in spirv_to_nir and put the decorations on the matrix type itself. This also subtly adds support for strided vector types. These don't come up in SPIR-V directly but you can get one as the result of taking a column from a row-major matrix or a row from a column-major matrix. Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: Distinguish between normal uniforms and UBOs | Jason Ekstrand | 2019-01-08 | 1 | -2/+3
| | | | | | | | | | | | | | | | Previously, NIR had a single nir_var_uniform mode used for atomic counters, UBOs, samplers, images, and normal uniforms. This commit splits this into nir_var_uniform and nir_var_ubo where nir_var_uniform is still a bit of a catch-all but the nir_var_ubo is specific to UBOs. While we're at it, we also rename shader_storage to ssbo to follow the convention. We need this so that we can distinguish between normal uniforms and UBO access at the deref level without going all the way back to the variable and seeing if it has an interface type. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* etnaviv: annotate variables only used in debug build | Lucas Stach | 2019-01-07 | 1 | -7/+4
| | | | | | | | | Some of the status variables in the compiler are only used in asserts and thus may be unused in release builds. Annotate them accordingly to avoid 'unused but set' warnings from the compiler. Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: enable full overwrite in a few more cases | Lucas Stach | 2019-01-07 | 1 | -4/+7
| | | | | | | | | Take into account the render target format when checking if the color mask affects all channels of the RT. This allows enabling full overwrite in a few cases where a non-alpha format is used. Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
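The format-aware mask check can be illustrated with a small sketch. The channel bit layout and the function name below are assumptions made for the example, not the etnaviv definitions: the idea is that the alpha bit only has to be set in the color mask when the render target format actually has an alpha channel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical channel bits for a 4-bit color write mask. */
enum {
    CH_R = 1 << 0,
    CH_G = 1 << 1,
    CH_B = 1 << 2,
    CH_A = 1 << 3,
};

/* True if `colormask` writes every channel that exists in the render
 * target format, which is the precondition for a full overwrite. */
static bool mask_covers_all_channels(unsigned colormask, bool format_has_alpha)
{
    unsigned needed = CH_R | CH_G | CH_B;

    assert((colormask & ~0xfu) == 0); /* only four mask bits are defined */

    if (format_has_alpha)
        needed |= CH_A;

    return (colormask & needed) == needed;
}
```

Under this check an RGB-only mask qualifies for full overwrite on a format without alpha, but not on a format with alpha.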
* v3d: Fix up VS output setup during precompiles. | Eric Anholt | 2019-01-04 | 1 | -6/+10
| | | | | | | | | I noticed that a VS I was debugging was missing all of its output stores -- outputs_written was for POS, VAR0, VAR3, while the shader's variables were POS, VAR9, and VAR12. I'm not sure what outputs_written is supposed to be doing here, but we can just walk the declared variables and avoid both this bug and the emission of extra stvpms for less-than-vec4 varyings.
* virgl: remove empty file | Gurchetan Singh | 2019-01-03 | 1 | -0/+0
| | | | | Fixes: 174f53 ("virgl: consolidate transfer code") Reviewed-by: Erik Faye-Lund <[email protected]>
* virgl: don't flush an empty range | Gurchetan Singh | 2019-01-03 | 1 | -0/+4
| | | | | | | | | | | | | | Otherwise, the gl-1.0-long-dlist Piglit test crashes. Fixes: db7757 ("virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT") Reported by airlied@ v2: Exit on any invalid range (Erik) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109190 Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]> Tested-by: Jakob Bornecrantz <[email protected]>
* virgl/vtest: Use default socket name from protocol header | Jakob Bornecrantz | 2019-01-03 | 1 | -3/+1
| | | | | | | | No functional change as the socket name is the same, just removing the double definition of the path. Reviewed-by: Gurchetan Singh <[email protected]> Signed-off-by: Jakob Bornecrantz <[email protected]>
* freedreno: fix staging resource size for arrays | Rob Clark | 2019-01-03 | 1 | -2/+10
| | | | | | | | | | | A 2d-array texture (for example), should get the # of array elements from box->depth, rather than depth0 which is minified. Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray_bias_float_fragment with tiled textures. Reported-by: Kristian H. Kristensen <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno: remove blit_via_copy_region() | Rob Clark | 2019-01-03 | 1 | -4/+0
| | | | | | | | | | | | | If we hit the memcpy() path for copy_region(), that will try to do a transfer_map(), which goes badly for blits to/from staging triggered by transfer_map() or transfer_unmap(). We could possibly add fd_blit2() which has allow_transfer_map param, and call that for staging blits. But I'm not really sure if trying the blit via copy_region() is very useful. At least for newer gens that implement fd_context::blit(), it probably isn't. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: rework blitter API | Rob Clark | 2019-01-03 | 1 | -54/+8
| | | | | | | | Switch over to using fd_context::blit(), in the same way that a5xx does. The previous patch wires fd_resource_copy_region() up to the blitter so a6xx no longer needs to bypass the core layer to accelerate this. Signed-off-by: Rob Clark <[email protected]>
* freedreno: try blitter for fd_resource_copy_region() | Rob Clark | 2019-01-03 | 1 | -0/+27
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: rework blit API | Rob Clark | 2019-01-03 | 8 | -27/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | First step to unify the way fd5 and fd6 blitter works. Currently a6xx bypasses the blit API in order to also accelerate resource_copy_region() But this approach can lead to infinite recursion: #0 fd_alloc_staging (ctx=0x5555936480, rsc=0x7fac485f90, level=0, box=0x7fbab29220) at ../src/gallium/drivers/freedreno/freedreno_resource.c:291 #1 0x0000007fbdebed04 in fd_resource_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/drivers/freedreno/freedreno_resource.c:479 #2 0x0000007fbe5c5068 in u_transfer_helper_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/auxiliary/util/u_transfer_helper.c:243 #3 0x0000007fbde2dcb8 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47c780, src_level=0, src_box_in=0x7fbab2945c) at ../src/gallium/auxiliary/util/u_surface.c:350 #4 0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173 #5 0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587 #6 0x0000007fbde2f3d0 in util_try_blit_via_copy_region (ctx=0x5555936480, blit=0x7fbab29430) at ../src/gallium/auxiliary/util/u_surface.c:864 #7 0x0000007fbdec02c4 in fd_blit (pctx=0x5555936480, blit_info=0x7fbab29588) at ../src/gallium/drivers/freedreno/freedreno_resource.c:993 #8 0x0000007fbdf08408 in fd6_blit (pctx=0x5555936480, info=0x7fbab29588) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:546 #9 0x0000007fbdebdc74 in do_blit (ctx=0x5555936480, 
blit=0x7fbab29588, fallback=false) at ../src/gallium/drivers/freedreno/freedreno_resource.c:129 #10 0x0000007fbdebe58c in fd_blit_from_staging (ctx=0x5555936480, trans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:326 #11 0x0000007fbdebea38 in fd_resource_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:416 #12 0x0000007fbe5c5c68 in u_transfer_helper_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/auxiliary/util/u_transfer_helper.c:516 #13 0x0000007fbde2de24 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47b8e0, src_level=0, src_box_in=0x7fbab2997c) at ../src/gallium/auxiliary/util/u_surface.c:376 #14 0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173 #15 0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587 ... Instead rework the API to push the fallback back to core code, so that we can rework resource_copy_region() to have its own fallback path, and then finally convert fd6 over to work in the same way. This also makes ctx->blit() optional, and cleans up some unnecessary callers. Signed-off-by: Rob Clark <[email protected]>
* freedreno: skip depth resolve if not written | Rob Clark | 2019-01-03 | 3 | -4/+14
| | | | | | | | | | | | For multi-pass rendering, it is common to keep the same depth buffer from previous pass, to discard geometry that would be hidden by later draws. In the later passes with depth-test enabled, but depth-write disabled, there is no reason to do gmem2mem resolve. TODO probably do something similar for stencil.. although stencil buffer isn't used as commonly these days Signed-off-by: Rob Clark <[email protected]>
* v3d: Refactor compiler entrypoints. | Eric Anholt | 2019-01-02 | 1 | -26/+6
| | | | | | Before, I had per-stage entrypoints with some helpers shared between them. As I extended for compute shaders and shader-db, it turned out that the other common code in the middle wanted to be shared too.
* v3d: Don't forget to include RT writes in precompiles. | Eric Anholt | 2019-01-02 | 1 | -0/+10
| | | | | Looking at some assembly dumps for an optimization, we were clearly missing important parts of the shader!
* v3d: Fix segfault when failing to compile a program. | Eric Anholt | 2019-01-02 | 1 | -2/+4
| | | | | | | We'll still fail at draw time, but this avoids a regression in shader-db execution once I enable TLB writes in precompiles. Fixes: b38e4d313fc2 ("v3d: Create a state uploader for packing our shaders together.")
* radeonsi: always unmap texture CPU mappings on 32-bit CPU architectures | Marek Olšák | 2019-01-02 | 1 | -0/+16
| | | | | | Team Fortress 2 32-bit version runs out of the CPU address space. Tested-by: Dieter Nützel <[email protected]>
* radeonsi: remove unused variables in si_insert_input_ptr | Marek Olšák | 2019-01-02 | 1 | -3/+1
| | | | Tested-by: Dieter Nützel <[email protected]>
* radeonsi: use u_decomposed_prims_for_vertices instead of u_prims_for_vertices | Marek Olšák | 2019-01-02 | 1 | -1/+3
| | | | | | | It seems to be the same, but this doesn't use integer division with a variable divisor. Tested-by: Dieter Nützel <[email protected]>
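The remark about avoiding "integer division with a variable divisor" can be illustrated with a per-primitive switch, where every divisor is a compile-time constant that the compiler can turn into multiplies and shifts. This is a simplified sketch with a hypothetical enum and function name; it does not reproduce the Mesa helper and covers only a few primitive types.

```c
#include <assert.h>

/* Hypothetical subset of primitive types, for illustration only. */
enum prim {
    PRIM_POINTS,
    PRIM_LINES,
    PRIM_LINE_STRIP,
    PRIM_TRIANGLES,
    PRIM_TRIANGLE_STRIP,
};

/* Number of decomposed primitives produced by `n` vertices. Each case
 * divides by a constant (or not at all), never by a value loaded from a
 * table at run time. */
static unsigned decomposed_prims(enum prim p, unsigned n)
{
    switch (p) {
    case PRIM_POINTS:
        return n;
    case PRIM_LINES:
        return n / 2;
    case PRIM_LINE_STRIP:
        return n >= 2 ? n - 1 : 0;
    case PRIM_TRIANGLES:
        return n / 3;
    case PRIM_TRIANGLE_STRIP:
        return n >= 3 ? n - 2 : 0;
    }
    assert(!"unhandled primitive type");
    return 0;
}
```

A table-driven variant would instead load a minimum and an increment per primitive type and divide by the increment, which forces a real division instruction.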
* radeonsi: make si_cp_wait_mem more configurable | Marek Olšák | 2019-01-02 | 5 | -8/+8
| | | | Tested-by: Dieter Nützel <[email protected]>
* radeonsi: call si_fix_resource_usage for the GS copy shader as well | Marek Olšák | 2019-01-02 | 1 | -0/+4
| | | | Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't emit redundant PKT3_NUM_INSTANCES packets | Marek Olšák | 2019-01-02 | 2 | -2/+10
| | | | Tested-by: Dieter Nützel <[email protected]>
* st/glsl_to_nir: call nir_lower_load_const_to_scalar() in the st | Timothy Arceri | 2019-01-02 | 1 | -2/+0
| | | | | | | | | This will help the new opt introduced in the following patches allowing us to remove extra duplicate varyings. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* radeonsi: make use of ac_are_tessfactors_def_in_all_invocs() | Timothy Arceri | 2019-01-02 | 1 | -8/+2
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove unrequired param in si_nir_scan_tess_ctrl() | Timothy Arceri | 2019-01-02 | 3 | -3/+1
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: correctly walk instructions in tgsi_scan_tess_ctrl() | Timothy Arceri | 2019-01-02 | 1 | -29/+43
| | | | | | | | | | | | The previous code used a do while loop and continues after walking a nested loop/if-statement. This means we end up evaluating the last instruction from the nested block against the while condition and potentially exit early if it matches the exit condition of the outer block. Fixes: 386d165d8d09 ("tgsi/scan: add a new pass that analyzes tess factor writes") Reviewed-by: Marek Olšák <[email protected]>
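The walking problem described here can be sketched with a toy instruction stream. The opcodes and the `walk_block` function are hypothetical and much simpler than the TGSI scanner; the sketch only shows the corrected control flow, where each fetched instruction is tested against the exit opcode exactly once, and a nested block's terminator is consumed by the nested call so the outer block never sees it.

```c
#include <assert.h>

/* Hypothetical opcodes, for illustration only. */
enum op { OP_IF, OP_ENDIF, OP_BGNLOOP, OP_ENDLOOP, OP_OTHER };

/* Walk instructions starting at index `i` (just past a block opener) until
 * the matching `end_op` is found. Returns the index just past `end_op`,
 * or `count` if the stream ends first. */
static unsigned walk_block(const enum op *ops, unsigned count, unsigned i,
                           enum op end_op)
{
    assert(i <= count);

    while (i < count) {
        enum op cur = ops[i++];

        /* Test only the instruction fetched in this iteration. */
        if (cur == end_op)
            return i;

        /* Nested blocks consume their own terminator, so the next
         * iteration starts on a fresh instruction. */
        if (cur == OP_IF)
            i = walk_block(ops, count, i, OP_ENDIF);
        else if (cur == OP_BGNLOOP)
            i = walk_block(ops, count, i, OP_ENDLOOP);
    }
    return i;
}
```

For an `IF` nested inside another `IF`, the inner `ENDIF` ends only the inner walk; a do-while that re-tested the last nested instruction against the outer exit condition would stop there instead.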
* tgsi/scan: fix loop exit point in tgsi_scan_tess_ctrl() | Timothy Arceri | 2019-01-02 | 1 | -1/+1
| | | | | | | | | | | | | This just happened not to crash/assert because all loops have at least 1 if-statement and due to a second bug we end up matching the same ENDIF to exit both the iteration over the if-statement and the loop. The second bug is fixed in the following patch. Fixes: 386d165d8d09 ("tgsi/scan: add a new pass that analyzes tess factor writes") Reviewed-by: Marek Olšák <[email protected]>
* nv30: disable rendering to 3D textures | Ilia Mirkin | 2019-01-01 | 1 | -0/+6
| | | | | | | | | | | There's no way to tell the 3D engine about swizzling on such textures. While rendering to NPOT ones may be possible, there's no great way to expose that in gallium, nor would there be any practical benefit. Fixes the non-compressed-format "copyteximage 3D" failures. Something odd is going on with the compressed formats. Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: fix some s3tc layout issues | Ilia Mirkin | 2018-12-30 | 2 | -7/+26
| | | | | | | | | | | | | | s3tc layouts are a bit finicky - they're packed, but not swizzled. Adjust logic to allow for that case: - Don't set a uniform pitch for POT-sized compressed textures - Adjust define_rect API to be less confused about block sizes - Only mark a texture as linear if it has a uniform pitch set This has been tested to fix xonotic (as well as the s3tc-* piglits) on nv3x and keeps it working on nv4x. Signed-off-by: Ilia Mirkin <[email protected]>