path: root/src
Commit message | Author | Age | Files | Lines
* intel/compiler: add scale_factors to sampler_prog_key_dataTapani Pälli2019-02-123-0/+8
| | | | | | | | Patch propagates given scale_factors to lowering options. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* dri: add P010, P012, P016 for 10bit/12bit/16bit YUV420 formatsTapani Pälli2019-02-121-0/+17
| | | | | | Signed-off-by: Tapani Pälli <[email protected]> Signed-off-by: Lin Johnson <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: add option to use scaling factor when sampling planes YUV loweringTapani Pälli2019-02-122-21/+35
| | | | | | | | | Patch adds nir_lower_tex_options as parameter to sample_plane so that we don't need to extend nir_tex_instr for this. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use info->textures_used instead of prog->SamplersUsed.Kenneth Graunke2019-02-112-7/+7
| | | | | | | | | | prog->SamplersUsed is set by the linker when validating resource limits, while info->textures_used is gathered after NIR optimizations, which may have eliminated some unused surfaces. This may let us skip some work. Reviewed-by: Eric Anholt <[email protected]>
* i965: Drop unnecessary 'and' with prog->SamplerUnitsKenneth Graunke2019-02-111-1/+1
| | | | | | | textures_used_by_txf is a subset of textures_used which is a subset of prog->SamplerUnits. This should do nothing. Reviewed-by: Eric Anholt <[email protected]>
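A minimal sketch of why the extra 'and' is a no-op, assuming the subset relationship stated above holds. The mask values are made up for illustration (one bit per sampler unit) and are not taken from any real shader.

```python
def is_subset(inner: int, outer: int) -> bool:
    """True when every bit set in `inner` is also set in `outer`."""
    return inner & ~outer == 0


# Hypothetical example masks, one bit per sampler unit.
sampler_units = 0b1111_0110
textures_used = 0b0011_0110
textures_used_by_txf = 0b0001_0010

assert is_subset(textures_used_by_txf, textures_used)
assert is_subset(textures_used, sampler_units)

# Because of the subset chain, masking with the outermost set changes nothing.
masked = textures_used_by_txf & sampler_units
```

For any A that is a subset of B, `A & B == A`, which is the property the commit relies on.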
* nir: Gather texture bitmasks in gl_nir_lower_samplers_as_deref.Kenneth Graunke2019-02-117-11/+40
| | | | | | | | | | | | | | | | | | | | | | | | Eric and I would like a bitmask of which samplers are used, similar to prog->SamplersUsed, but available in NIR. The linker uses SamplersUsed for resource limit checking, but later optimizations may eliminate more samplers. So instead of propagating it through, we gather a new one. While there, we also gather the existing textures_used_by_txf bitmask. Gathering these bitfields in nir_shader_gather_info is awkward at best. The main reason is that it introduces an ordering dependency between the two passes. If gathering runs before lower_samplers_as_deref, it can't look at var->data.binding. If the driver doesn't use the full lowering to texture_index/texture_array_size (like radeonsi), then the gathering can't use those fields. Gathering might be run early /and/ late, first to get varying info, and later to update it after variant lowering. At this point, should gathering work on pre-lowered or post-lowered code? Pre-lowered is also harder due to the presence of structure types. Just doing the gathering when we do the lowering alleviates these ordering problems. This fixes ordering issues in i965 and makes the txf info gathering work for radeonsi (though they don't use it). Reviewed-by: Eric Anholt <[email protected]>
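The idea of gathering the bitmasks at lowering time can be sketched roughly as follows. This is an illustrative model only, not the NIR pass: `TexInstr`, its fields, and `gather_texture_masks` are hypothetical stand-ins for a texture instruction whose binding is already known at the point of lowering.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class TexInstr:
    op: str              # e.g. "tex", "txl", "txf" (hypothetical op names)
    texture_index: int   # first binding used by the instruction
    array_size: int = 1  # number of consecutive bindings it may touch


def gather_texture_masks(instrs: Iterable[TexInstr]) -> Tuple[int, int]:
    """Build (textures_used, textures_used_by_txf) in a single walk."""
    textures_used = 0
    textures_used_by_txf = 0
    for instr in instrs:
        # One bit per binding the instruction can reach.
        mask = ((1 << instr.array_size) - 1) << instr.texture_index
        textures_used |= mask
        if instr.op == "txf":
            textures_used_by_txf |= mask
    return textures_used, textures_used_by_txf


used, used_by_txf = gather_texture_masks(
    [TexInstr("tex", 0), TexInstr("txf", 2, array_size=2)]
)
```

Doing this in the same walk that resolves the bindings avoids the ordering question of whether a separate gathering pass sees pre-lowered or post-lowered code.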
* nir: Use sampler derefs in drawpixels and bitmap lowering.Kenneth Graunke2019-02-112-13/+34
| | | | Reviewed-by: Eric Anholt <[email protected]>
* program: Make prog_to_nir create texture/sampler derefs.Kenneth Graunke2019-02-111-5/+16
| | | | | | | | | | | | | Until now, prog_to_nir has been setting texture_index and sampler_index directly. This is different than GLSL shaders, which create variable dereferences and rely on lowering passes to reach this final form. radeonsi uses variable dereferences for samplers rather than texture_index and sampler_index, so it doesn't even make sense to set them there. By moving to derefs, we ensure that both GLSL and ARB programs produce the same final form that the driver desires. Reviewed-by: Eric Anholt <[email protected]>
* st/nir: Use sampler derefs in built-in shaders.Kenneth Graunke2019-02-112-8/+24
| | | | Reviewed-by: Eric Anholt <[email protected]>
* st/nir: Lower sampler derefs for builtin shaders.Kenneth Graunke2019-02-111-0/+2
| | | | Reviewed-by: Eric Anholt <[email protected]>
* st/nir: Pull sampler lowering into a helper function.Kenneth Graunke2019-02-112-4/+14
| | | | | | This will make it easier to reuse across GLSL / ARB / built-ins. Reviewed-by: Eric Anholt <[email protected]>
* i965: Call nir_lower_samplers for ARB programs.Kenneth Graunke2019-02-111-0/+2
| | | | | | | | | An upcoming patch will start building derefs in prog_to_nir, at which point we'll need to lower them to indexes. This gets both GLSL and non-GLSL shaders using the same paths. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Don't look at sampler uniform storage for internal varsKenneth Graunke2019-02-111-3/+5
| | | | | | | | | | Passes like nir_lower_drawpixels add additional sampler variables, and set an explicit binding which never changes. These extra samplers don't have proper uniform storage associated with them, and there is no way to update bindings via the API. So, for any 'hidden' variables, just trust that there's an explicit binding set. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Allow gl_nir_lower_samplers*() without a gl_shader_programKenneth Graunke2019-02-111-3/+11
| | | | | | | | | | | | | | | | I would like to be able to run gl_nir_lower_samplers() to turn texture and sampler variable dereferences into indexes and offsets, even for ARB programs, and built-in shaders. This would make sampler handling more consistent across the various types of shaders. For GLSL programs, the gl_nir_lower_samplers_as_deref() pass looks up the variable bindings in the shader program's uniform storage. But ARB programs and built-in shaders don't have a gl_shader_program, and uniform storage doesn't exist. In this case, we simply skip that lookup, and trust var->data.binding to be set correctly by whoever created the shader. Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048Kenneth Graunke2019-02-111-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Piglit's vp-max-array test creates a vertex program containing a uniform array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB. Mesa will then add additional state-var parameters for things like the MVP matrix. radeonsi currently exposes a value of 4096, derived from constant buffer upload size. This means the array will have 4096 elements, and the extra MVP state-vars would get a prog_src_register::Index of 4096 or more. Unfortunately, prog_src_register::Index is a signed 13-bit integer, so values of 4096 and above end up turning into negative numbers. Negative source indexes are only valid for relative addressing, so this ends up generating illegal IR. In prog_to_nir, this would cause an out of bounds array access. st_mesa_to_tgsi checks for a negative value, assumes it's bogus, and remaps it to parameter 0 in order to get something in-range. This isn't right - instead of reading the MVP matrix, it would read the first element of the vertex program's large array. But the test only checks that the program compiles, so we never noticed that it was broken. This patch lowers the advertised program parameter limits, with the understanding that we may need to generate additional state-vars internally. i965 has exposed 1024 for this limit for years, so I don't expect lowering it to 2048 will cause any practical problems for radeonsi or other drivers. Fixes vp-max-array with prog_to_nir.c. Cc: "19.0" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
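The wraparound described above can be reproduced with a small sketch. It assumes the 13-bit field behaves as a two's-complement value, which is how such bitfields typically behave in practice; `as_signed_bitfield` is a hypothetical helper, not a Mesa function.

```python
INDEX_BITS = 13  # width of the signed index field, per the commit message


def as_signed_bitfield(value: int, bits: int = INDEX_BITS) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


last_valid = as_signed_bitfield(4095)      # largest representable index
first_wrapped = as_signed_bitfield(4096)   # first state-var after a 4096-entry array
with_new_limit = as_signed_bitfield(2048 + 16)  # 2048-entry array plus some state-vars
```

With a limit of 2048, a full-size array still leaves room for internally generated state-vars before the index reaches the sign bit.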
* intel/dump_gpu: Disambiguate between BOs from different GEM handle spaces.Francisco Jerez2019-02-111-18/+23
| | | | | | | | | | | | | | | | This fixes a rather astonishing problem that came up while debugging an issue in the Vulkan CTS. Apparently the Vulkan CTS framework has the tendency to create multiple VkDevices, each one with a separate DRM device FD and therefore a disjoint GEM buffer object handle space. Because the intel_dump_gpu tool wasn't making any distinction between buffers from the different handle spaces, it was confusing the instruction state pools from both devices, which happened to have the exact same GEM handle and PPGTT virtual address, but completely different shader contents. This was causing the simulator to believe that the vertex pipeline was executing a fragment shader, which didn't end up well. Reviewed-by: Lionel Landwerlin <[email protected]>
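A rough illustration of the collision, assuming buffers are tracked in a lookup table. The tuples and both helper functions are hypothetical and only model the bookkeeping, not the actual intel_dump_gpu data structures.

```python
def track_by_handle(bos):
    """Key only on the GEM handle: buffers from different FDs collide."""
    table = {}
    for fd, handle, contents in bos:
        table[handle] = contents
    return table


def track_by_fd_and_handle(bos):
    """Key on (fd, handle): each handle space stays separate."""
    table = {}
    for fd, handle, contents in bos:
        table[(fd, handle)] = contents
    return table


# Two devices, each with its own DRM FD, both handing out GEM handle 1.
bos = [
    (3, 1, "vertex shader pool"),
    (4, 1, "fragment shader pool"),
]

collided = track_by_handle(bos)
separate = track_by_fd_and_handle(bos)
```

In the handle-only table the second device's buffer silently replaces the first, which matches the symptom of one pipeline seeing another device's shader contents.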
* freedreno/a6xx: Fall back to masked RGBA blits for depth/stencilKristian H. Kristensen2019-02-111-5/+44
| | | | | | | | | | | | | | | | | | | The blitter doesn't seem to have a write mask, so for depth only and stencil only blits to Z24S8 we cast the Z24S8 buffer to an RGBA UNORM8 buffer and fall back to pipeline blits with corresponding write mask. Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_stencil_only dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_depth dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_depth dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8 dEQP-GLES3.functional.fbo.msaa.4_samples.stencil_index8 Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
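A sketch of the masked-write idea, under the assumption that a Z24S8 texel keeps depth in the low 24 bits and stencil in the high 8 bits, so that viewing it as little-endian RGBA8 puts depth in R, G, B and stencil in A. That layout and the helper below are assumptions for illustration; the commit message does not spell them out.

```python
# Byte occupied by each channel when a 32-bit texel is viewed as RGBA UNORM8
# (assumed little-endian layout).
CHANNEL_BYTES = {
    "R": 0x000000FF,
    "G": 0x0000FF00,
    "B": 0x00FF0000,
    "A": 0xFF000000,
}


def masked_write(dst: int, src: int, channels: str) -> int:
    """Copy only the selected channels of `src` into `dst`."""
    keep = 0
    for channel in channels:
        keep |= CHANNEL_BYTES[channel]
    return (dst & ~keep & 0xFFFFFFFF) | (src & keep)


dst_texel = 0xAA123456  # stencil 0xAA, depth 0x123456
src_texel = 0xBB654321  # stencil 0xBB, depth 0x654321

stencil_only = masked_write(dst_texel, src_texel, "A")
depth_only = masked_write(dst_texel, src_texel, "RGB")
```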
* freedreno/a6xx: Add format argument to fd6_tex_swiz()Kristian H. Kristensen2019-02-114-8/+10
| | | | | | | | We need to allow overriding the format with that of the image or sampler view, so we can't take it from the resource in fd6_tex_swiz(). Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Support y-inverted blitsKristian H. Kristensen2019-02-111-5/+2
| | | | | | | | | | | | | | | | | The src coordinates are s24.8. For an inverted blit that ends at y=0 we need to program -1 for sy2, so we need to handle negative values correctly. Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_color dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_color Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
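To make the negative-coordinate case concrete, here is a small sketch of an s24.8 encoding. It assumes a 32-bit two's-complement field with 8 fractional bits; the function names are hypothetical and the real register packing may differ.

```python
FRAC_BITS = 8  # s24.8: 8 fractional bits


def to_s24_8(value: float) -> int:
    """Encode `value` as a 32-bit two's-complement fixed-point word."""
    fixed = int(round(value * (1 << FRAC_BITS)))
    if not -(1 << 31) <= fixed < (1 << 31):
        raise ValueError("value out of range for s24.8")
    return fixed & 0xFFFFFFFF


def from_s24_8(raw: int) -> float:
    """Decode a 32-bit s24.8 word back to a float."""
    if raw & 0x80000000:
        raw -= 1 << 32
    return raw / (1 << FRAC_BITS)


sy2_inverted = to_s24_8(-1)  # end coordinate of a y-inverted blit ending at y=0
```

Treating the field as unsigned would turn -1 into a very large positive coordinate, which is why the sign has to be handled explicitly.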
* freedreno/a6xx: Support some depth/stencil blits on blitterKristian H. Kristensen2019-02-111-1/+84
| | | | | | | | | | | | | | | | | | We can rewrite almost all depth stencil blits to various red-only blits. The exception is depth-only or stencil-only blits into z24s8 combined depth stencil buffer. We can fall back for depth-only, but stencil-only remains broken. Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Move blit check so as to restore commentKristian H. Kristensen2019-02-111-4/+4
| | | | | | | | | | | | | | The explanation for the compressed format check is broken across two comments: /* We can blit if both or neither formats are compressed formats... */ /* ... but only if they're the same compression format. */ but the ok_format() checks were inserted between, breaking up the flow of the sentence. Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno: Don't tell the blitter what it can't doKristian H. Kristensen2019-02-111-2/+4
| | | | | | | | Call ctx->blit() and let it reject blits it can't do instead of giving up on stencil blits and blits u_blitter can't do. Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno: Consolidate u_blitter functions in freedreno_blitter.cKristian H. Kristensen2019-02-115-149/+153
| | | | | Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Combine emit_blit and fd6_blitKristian H. Kristensen2019-02-111-12/+5
| | | | | Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Use the right resource for separate stencil strideKristian H. Kristensen2019-02-111-1/+1
| | | | | Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno: Log number of draw for sysmem passesKristian H. Kristensen2019-02-111-2/+3
| | | | | Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Drop render condition check in blitterKristian H. Kristensen2019-02-111-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | We already check earlier in the call chain in fd_blit(). glBlitFramebuffer always sets render_condition_enable and thus we would never try the blitter path for that. Now that we get all of dEQP-GLES3.functional.fbo.blit.conversion.* down this path, it turns out that the fail_if(info->mask != util_format_get_mask(info->src.format)); fail_if(info->mask != util_format_get_mask(info->dst.format)); conditions weren't accurate. util_format_get_mask() returns PIPE_MASK_RGBA for any format with any color channels, while info->mask is the exact set of channels to blit. So we reject things we could blit - for example, PIPE_FORMAT_R16G16_FLOAT where info->mask is RG while util_format_get_mask() returns RGBA - and accept things we can't. It turns out that the blitter is happy to blit different numbers of channels, but fails to blit formats with different numerical formats and srgb formats. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
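The mismatch between the two masks can be shown with a small model. The PIPE_MASK_* bit values below are assumed to match Gallium's definitions, and `coarse_format_mask`, `exact_channel_mask`, and `old_check_rejects` are hypothetical helpers that only mimic the behaviour described in the message.

```python
# Assumed to mirror Gallium's PIPE_MASK_* bits.
PIPE_MASK_R = 0x1
PIPE_MASK_G = 0x2
PIPE_MASK_B = 0x4
PIPE_MASK_A = 0x8
PIPE_MASK_RGBA = 0xF

CHANNEL_BITS = {
    "r": PIPE_MASK_R,
    "g": PIPE_MASK_G,
    "b": PIPE_MASK_B,
    "a": PIPE_MASK_A,
}


def coarse_format_mask(channels: str) -> int:
    """Model of the described behaviour: any color channel yields RGBA."""
    return PIPE_MASK_RGBA if channels else 0


def exact_channel_mask(channels: str) -> int:
    """The exact set of channels, as info->mask would describe them."""
    mask = 0
    for channel in channels:
        mask |= CHANNEL_BITS[channel]
    return mask


def old_check_rejects(blit_mask: int, channels: str) -> bool:
    """The removed condition: fail when the masks differ."""
    return blit_mask != coarse_format_mask(channels)


rg_mask = exact_channel_mask("rg")           # e.g. an R16G16 format
rejected = old_check_rejects(rg_mask, "rg")  # a blit that could have worked
```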
* freedreno/a6xx: regen headersKristian H. Kristensen2019-02-111-25/+56
| | | | | | | | Update for a6xx.xml.h to incorporate a few new bits and changes to blit src rect coordinate types. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* st/va/vp9: set max reference as default of VP9 reference numberLeo Liu2019-02-111-1/+6
| | | | | | | | If there is no information about number of render targets Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]> Cc: 19.0 <[email protected]>
* st/va: fix the incorrect max profiles reportLeo Liu2019-02-112-2/+3
| | | | | | | | | Add "PIPE_VIDEO_PROFILE_MAX" to the enum, so that the reported maximum number of profiles stays correct when more profiles are added in the future. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109107 Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]> Cc: 19.0 <[email protected]>
* st/va:Add support for indirect manner by returning ↵Guttula, Suresh2019-02-111-2/+5
| | | | | | | | | | | | | | | | | | VA_STATUS_ERROR_OPERATION_FAILED Based on the VA spec, vaDeriveImage() returns VA_STATUS_ERROR_OPERATION_FAILED if the driver doesn't support the internal surface format. Currently vaDeriveImage() fails for non-contiguous planes, and the operation-failed status is required to support the indirect manner, i.e. vaCreateImage()+vaPutImage(), in case vaDeriveImage() fails with VA_STATUS_ERROR_OPERATION_FAILED. This patch notifies the client that the operation failed with the proper error string, so that the client will fall back to vaCreateImage()+vaPutImage(). v2: updated commit message based on VA spec. Signed-off-by: suresh guttula <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* winsys/amdgpu: cs_check_space sets the minimum IB size for future IBsMarek Olšák2019-02-112-2/+23
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: clean up IB buffer size computationMarek Olšák2019-02-111-8/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: remove occurrence of INDIRECT_BUFFER_CONSTMarek Olšák2019-02-111-2/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: use a separate fence list for syncobjsMarek Olšák2019-02-112-17/+15
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: unify fence list codeMarek Olšák2019-02-112-59/+42
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: don't drop manually added fence dependenciesMarek Olšák2019-02-111-2/+0
| | | | | | | wow, it's hard to believe that fence and syncobjs dependencies were ignored. Cc: 18.3 19.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0Marek Olšák2019-02-111-2/+5
| | | | | Cc: 18.3 19.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0Marek Olšák2019-02-111-1/+2
| | | | | Cc: 18.3 19.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocksJason Ekstrand2019-02-111-3/+2
| | | | | | | | | | | | | | | | | | | | | | When nir_rematerialize_derefs_in_use_blocks_impl was first written, I attempted to optimize things a bit by not bothering to re-materialize the sources of deref instructions figuring that the final caller would take care of that. However, in the case of more complex deref chains where the first link or two lives in block A and then another link and the load/store_deref intrinsic live in block B it doesn't work. The code in rematerialize_deref_in_block looks at the tail of the chain, sees that it's already in block B and skips it, not realizing that part of the chain also lives in block A. The easy solution here is to just rematerialize deref sources of deref instructions as well. This may potentially lead to a few more deref instructions being created but the conditions required for that to actually happen are fairly unlikely and, thanks to the caching, it's all linear time regardless. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603 Fixes: 7d1d1208c2b "nir: Add a small pass to rematerialize derefs per-block" Reviewed-by: Alejandro Piñeiro <[email protected]>
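A toy model of the fixed behaviour, assuming a deref chain can be represented as linked nodes that each record the block they live in. `Deref`, `rematerialize`, and `chain_blocks` are hypothetical and far simpler than the NIR code; the point is only that parents are checked too, with a cache keeping the walk linear.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(eq=False)  # identity-based hashing so derefs can be cache keys
class Deref:
    name: str
    block: str
    parent: Optional["Deref"] = None


def rematerialize(deref: Deref, block: str, cache: Dict[Deref, Deref]) -> Deref:
    """Return a chain equivalent to `deref` whose links all live in `block`."""
    if deref in cache:
        return cache[deref]
    parent = deref.parent
    new_parent = rematerialize(parent, block, cache) if parent is not None else None
    if deref.block == block and new_parent is parent:
        result = deref  # this link and everything above it is already local
    else:
        result = Deref(deref.name, block, new_parent)
    cache[deref] = result
    return result


def chain_blocks(deref: Optional[Deref]) -> List[str]:
    """List the block of every link, from the tail up to the variable."""
    blocks = []
    while deref is not None:
        blocks.append(deref.block)
        deref = deref.parent
    return blocks


# First two links live in block A, the tail lives in block B.
var = Deref("var", "A")
field = Deref(".field", "A", var)
elem = Deref("[i]", "B", field)

local = rematerialize(elem, "B", {})
```

Checking only the tail would have returned `elem` unchanged here, leaving two links of the chain in block A.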
* intel/fs: Use enumerated array assignments in fb read TXF setupJason Ekstrand2019-02-111-5/+9
| | | | | | | It's more clear and means we don't have to update the array every time we add an optional texture instruction argument Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nvc0: we have 16k-sized framebuffers, fix default scissorsIlia Mirkin2019-02-101-2/+2
| | | | | | | | | For some reason we don't use view volume clipping by default, and use scissors instead. These scissors were set to an 8k max fb size, while the driver advertises 16k-sized framebuffers. Signed-off-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
* panfrost: Specify supported draw modes per-contextAlyssa Rosenzweig2019-02-112-12/+11
| | | | | | | | | | | | Midgard has native support for QUADS and POLYGONS; Bifrost seemingly does not. Thus, Midgard generally skips prim_convert whereas Bifrost needs the pass; this patch allows the setting of allowed primitives to occur on a per-context basis (for runtime hardware selection). v2: Use (POLYGONS + 1) instead of LINES_ADJACENCY. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Robert Foss <[email protected]>
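A sketch of a per-context primitive mask built with the (POLYGONS + 1) form mentioned in the v2 note. The enum values are assumed to follow Gallium's PIPE_PRIM_* ordering, and the Bifrost mask shown is a hypothetical illustration, not taken from the patch.

```python
# Assumed to follow Gallium's PIPE_PRIM_* numbering.
PIPE_PRIM_POINTS = 0
PIPE_PRIM_TRIANGLES = 4
PIPE_PRIM_QUADS = 7
PIPE_PRIM_QUAD_STRIP = 8
PIPE_PRIM_POLYGON = 9


def prim_mask_up_to(last_prim: int) -> int:
    """One bit per primitive type from 0 through `last_prim` inclusive."""
    return (1 << (last_prim + 1)) - 1


def supports(draw_modes: int, prim: int) -> bool:
    return bool(draw_modes & (1 << prim))


# Midgard: everything up to and including POLYGON is native.
midgard_modes = prim_mask_up_to(PIPE_PRIM_POLYGON)

# Hypothetical Bifrost mask: same range minus the quad and polygon types,
# which would then go through primitive conversion.
bifrost_modes = midgard_modes & ~(
    (1 << PIPE_PRIM_QUADS) | (1 << PIPE_PRIM_QUAD_STRIP) | (1 << PIPE_PRIM_POLYGON)
)
```

Writing the bound as POLYGON + 1 keeps the mask correct without relying on whichever primitive type happens to follow POLYGON in the enum.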
* radv: remove alloc parameter from pipeline initDave Airlie2019-02-111-5/+2
| | | | | | clang points out this isn't used. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/llvm: initialise passes member.Dave Airlie2019-02-111-1/+1
| | | | | | Fixes coverity warning Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl: glsl to nir fix uninit class member.Dave Airlie2019-02-111-0/+1
| | | | | | The constructor should init this to NULL Reviewed-by: Alejandro Piñeiro <[email protected]>
* panfrost: Elucidate texture op scheduling commentAlyssa Rosenzweig2019-02-101-8/+1
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove speculative if 0'd format bit codeAlyssa Rosenzweig2019-02-101-6/+0
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove if 0'd dead codeAlyssa Rosenzweig2019-02-105-83/+0
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add kernel-agnostic resource managementAlyssa Rosenzweig2019-02-102-15/+172
| | | | | | | | | | Various methods relating to resource management were previously marked as kernel-specific, forcing them to stay downstream in the vendor overlay and eventually be duplicated for DRM code. This patch adds back this code in kernel-neutral space, allowing for code sharing and minimising the diff to downstream. Signed-off-by: Alyssa Rosenzweig <[email protected]>