mesa.git

	Commit message	Author	Age	Files	Lines
*	nir, glsl: move pixel_center_integer/origin_upper_left to shader_info.fs	Alejandro Piñeiro	2019-02-21	7	-32/+9
	On GLSL that info is set as a layout qualifier when redeclaring gl_FragCoord, so it is nominally tied to a specific variable. In practice, though, the qualifiers behave as globals of the shader. On ARB programs they are set using a global OPTION (defined in ARB_fragment_coord_conventions), and on SPIR-V using ExecutionModes, which are also not tied specifically to the builtin.

	This patch moves that info from the nir variable and ir variable to the nir shader and gl_program shader_info respectively, so the mapping is closer to SPIR-V and ARB programs than to GLSL.

	FWIW, shader_info.fs already had pixel_center_integer, so this change also removes some redundancy. Also, as struct gl_program includes a shader_info, we removed gl_program::OriginUpperLeft and PixelCenterInteger, as they would be superfluous.

	This change was needed because spirv_to_nir recently changed the order in which execution modes and variables are handled, so the variables didn't get the correct values. Now the info is set on the shader itself, and we don't need to go back to the builtin variable to set it.

	Fixes: e68871f6a ("spirv: Handle constants and types before execution modes")

	v2: (Jason)
	  * glsl_to_nir: get the info before glsl_to_nir, while all the rest of the info gathering is happening
	  * prog_to_nir: gather the info on a general info-gathering pass, not on variable setup.
	v3: (Jason)
	  * Squash with the patch that removes that info from ir variable
	  * anv: assert that OriginUpperLeft is true. It should be already set by spirv_to_nir.
	  * blorp: set origin_upper_left on its core "compile fragment shader", not just on some specific places (for this we added a helper in a previous patch).
	  * prog_to_nir: no need to gather specifically these fragcoord modes as the full gl_program shader_info is copied.
	  * spirv_to_nir: assert that we are a fragment shader when handling these execution modes.
	v4: (reported by failing gitlab pipeline #18750)
	  * state_tracker: update too, due to changes in ir.h/gl_program
	v5:
	  * blorp: minor change after change on previous patch
	  * radeonsi: update due to this change.
	v6: (Timothy Arceri)
	  * prog_to_nir: remove extra whitespace
	  * shader_info: don't use :1 on origin_upper_left
	  * glsl: program.fs.origin_upper_left/pixel_center_integer can be moved out of the shader list loop
*	st/mesa: always unmap the uploader in st_atom_array.c	Marek Olšák	2019-02-20	1	-8/+6
	This is a no-op for drivers supporting persistent mappings. Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Dieter Nützel <[email protected]>
*	i965: re-emit index buffer state on a reset option change.	Andrii Simiklit	2019-02-20	3	-1/+13
	It seems we forget to update the index buffer (ib) state, so the IndexedDrawCutIndexEnable or CutIndexEnable flag is left unchanged, which leads to glEnable/glDisable of GL_PRIMITIVE_RESTART being ignored in some cases. The index buffer (ib) state should be re-emitted after the reset option changes to avoid unexpected behavior. Reviewed-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109451 Cc: <[email protected]> Signed-off-by: Andrii Simiklit <[email protected]>
*	st/nir: use NIR for asm programs	Timothy Arceri	2019-02-19	2	-1/+61
	This uses prog_to_nir to translate ARB assembly programs to NIR.

	Co-authored by Tim Arceri, Dave Airlie, and Ken Graunke:
	- [Tim Arceri]: original patch
	- [Dave Airlie]: fix crashes with parameter names
	- [Ken Graunke]:
	  - Rebase on SCALAR_ISA cap, lower wpos_ytransform too.
	  - Rebase on streamout fixes.
	  - Lower system values for fragcoord support.
	  - Don't try to use prog_to_nir for ATI_fragment_shader programs.
	  - Create TGSI for fixed-function or ARB vertex shaders even if the driver prefers NIR, so we can create draw module shaders for feedback/select emulation, which rely on TGSI.

	Tested on:
	- iris (Intel Skylake/Kabylake): Piglit & GL CTS - Ken Graunke
	- radeonsi (AMD Vega 64): Piglit - Ken Graunke
	- vc4/v3d - Piglit - Eric Anholt
	- freedreno - dEQP - Kristian Høgsberg

	Fixes lit_degenerate_case on vc4 and v3d, and vp-address-01, vp-arl-constant-array-huge-offset-neg, and vp-arl-neg-array on v3d. No Piglit regressions on radeonsi; no dEQP regressions on freedreno.

	Acked-by: Eric Anholt <[email protected]> Tested-by: Eric Anholt <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.	Kenneth Graunke	2019-02-19	1	-2/+8
	Even if the driver wants to use NIR shaders, we may need to have TGSI tokens for creating draw module vertex shaders for the feedback/select render modes. So...if the st_vertex_program has any TGSI...copy it to the variant. Acked-by: Eric Anholt <[email protected]> Tested-by: Eric Anholt <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	mesa: Align doubles to a 64-bit starting boundary, even if packing.	Kenneth Graunke	2019-02-19	1	-2/+6
	In the new Intel Iris driver, I am using Tim's new packed uniform storage system. It works great, with one caveat: our scalar compiler backend assumes that uniform offsets will be aligned to the underlying data type. For example, doubles must be 64-bit aligned, floats 32-bit, half-floats 16-bit, and so on. It does not need any other padding. Currently, _mesa_add_parameter aligns everything to 32-bit offsets, creating doubles that have an unaligned offset. This patch alters that code to align doubles to 64-bit offsets. This may be slightly less optimal for drivers which can support full packing, and allow reads from unaligned offsets at no penalty. We could make this extra alignment optional. However, it only comes into play when intermixing double and single precision uniforms. Doubles are already not too common, and intermixed values (floats then doubles) are probably even less common. At most, we burn a single 32-bit slot to the alignment, which is not that expensive. So, it doesn't seem worthwhile to add the extra complexity. Eventually, we'll likely want to update this code to allow half-float values to be packed tighter than 32-bit offsets. At that point, we'll probably want to revisit what drivers ultimately want, and add options. Acked-by: Timothy Arceri <[email protected]>
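The alignment rule described in this message can be sketched as follows. This is only an illustration, not the code in _mesa_add_parameter: the helper name place_uniform and the idea of counting storage in 32-bit slots are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not Mesa's API): given the next free 32-bit slot in
 * packed uniform storage, return the slot where a new value may start.
 * Floats can start at any slot; doubles must start on an even slot, i.e. on
 * a 64-bit boundary. */
static unsigned
place_uniform(unsigned next_slot, bool is_double)
{
   unsigned align_slots = is_double ? 2 : 1;

   /* The rounding below only works for power-of-two alignments. */
   assert(align_slots == 1 || align_slots == 2);

   /* Round next_slot up to a multiple of align_slots. */
   return (next_slot + align_slots - 1) & ~(align_slots - 1);
}
```

With this model, a float at slot 0 followed by a double places the double at slot 2, so exactly one 32-bit slot is lost to padding, which matches the "at most a single 32-bit slot" cost stated in the message.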
*	compiler: Make is_64bit(GL_*) helper more broadly available	Kenneth Graunke	2019-02-19	1	-0/+31
	I'd like to use this in the prog_parameter.c code, so I need to move it into C, make it non-static, and so on. This probably isn't the ideal place for it, but I couldn't think of a better one. Acked-by: Timothy Arceri <[email protected]>
*	i965: always enable EXT_float_blend	Ilia Mirkin	2019-02-18	1	-0/+1
	From the table in isl_format.c, it appears that all generations support blending on 32-bit float surfaces. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	st/mesa: enable GL_EXT_float_blend when possible	Ilia Mirkin	2019-02-18	1	-0/+10
	If the driver supports PIPE_BIND_BLENDABLE on RGBA32F, flip EXT_float_blend on (which will affect ES3 contexts). Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]>
*	mesa: add explicit enable for EXT_float_blend, and error condition	Ilia Mirkin	2019-02-18	4	-1/+26
	If EXT_float_blend is not supported, error out on blending of FP32 attachments in an ES2 context. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: scale factor changes should trigger recompile	Lionel Landwerlin	2019-02-18	2	-1/+16
	Found by inspection. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 3da858a6b990c5 ("intel/compiler: add scale_factors to sampler_prog_key_data") Reviewed-by: Tapani Pälli <[email protected]>
*	mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachment	Tapani Pälli	2019-02-18	1	-2/+6
	This fixes an invalid access to the Attachment array, which would occur if the caller exceeded MaxColorAttachments. In practice this should never happen, because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be valid and InvalidateFramebuffer will error out earlier, but this should make Coverity happy. v2: const, remove _EXT (Ian) CID: 1442559 Fixes: 0c42b5f3cb9 "mesa: wire up InvalidateFramebuffer" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
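The kind of guard this message describes might look like the sketch below. The struct layout, the SKETCH_MAX_ATTACHMENTS constant, and the function name are hypothetical stand-ins chosen for the example; they are not the types or the signature used by get_fb_attachment in Mesa.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_ATTACHMENTS 8   /* assumed array size for this sketch */

struct attachment_sketch {
   int type;
};

struct framebuffer_sketch {
   struct attachment_sketch color[SKETCH_MAX_ATTACHMENTS];
};

/* Return the color attachment at `index`, or NULL when the index is at or
 * beyond the context's limit, so the caller never reads past the array. */
static struct attachment_sketch *
get_color_attachment(struct framebuffer_sketch *fb, unsigned index,
                     unsigned max_color_attachments)
{
   assert(max_color_attachments <= SKETCH_MAX_ATTACHMENTS);

   if (index >= max_color_attachments)
      return NULL;

   return &fb->color[index];
}
```

Returning NULL pushes the decision to the caller, which can then raise the appropriate GL error instead of dereferencing an out-of-range slot.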
*	i965: Assert the execobject handles match for this device	Chris Wilson	2019-02-16	1	-0/+2
	Object handles are local to the device fd, so double check we are not mixing together objects from multiple screens on execbuf submission. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Removed the field etc_format from the struct intel_mipmap_tree	Eleni Maria Stea	2019-02-15	3	-18/+1
	After the previous changes to emulate the ETC/EAC formats using the secondary shadow miptree, the etc_format field of the intel_mipmap_tree struct became redundant and the remaining check that used it has been replaced. (Nanley Chery) Reviewed-by: Nanley Chery <[email protected]>
*	i965: Enabled the OES_copy_image extension on Gen 7 GPUs	Eleni Maria Stea	2019-02-15	1	-4/+12
	The OES_copy_image extension was disabled on Gen7 due to the lack of support for ETC2 images. Re-enabled it. (Kenneth Graunke) v2: - Removed the blank lines in the comments above OES_copy_image and OES_texture_view extensions in intel_extensions.c (Nanley Chery) Reviewed-by: Nanley Chery <[email protected]>
*	i965: Fixed the CopyImageSubData for ETC2 on Gen < 8	Eleni Maria Stea	2019-02-15	3	-18/+6
	For CopyImageSubData to copy the data during the 1st draw call, we need to update the shadow tree right before the rendering.

	v2:
	- Added assertion that the miptree doesn't need update at the time we update the texture surface. (Nanley Chery)
	v3:
	- As we now update the tree before the rendering we don't need to copy the data during the unmap anymore. Removed the unnecessary update from the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery)
	v4:
	- Fixed unrelated empty line removal (Nanley Chery)
	- As now the intel_upate_etc_shadow of intel_mipmap_tree.c is only called inside its following function, we don't need to declare it at the top of the file anymore. (Nanley Chery)

	Reviewed-by: Nanley Chery <[email protected]>
*	i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.	Eleni Maria Stea	2019-02-15	3	-69/+134
	GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the compressed EAC/ETC2 images to non-compressed RGBA images. When GetCompressed* functions were called, the pixels were returned in this RGBA format and not the compressed format that was expected.

	Trying to fix this problem, we use a secondary shadow miptree to store the decompressed data for the rendering and the main miptree to store the compressed data for the Get functions to work. Each time that the main miptree is written with compressed data, we decompress them to RGB and update the shadow. Then we use the shadow for rendering.

	v2:
	- Fixes in the commit message (Nanley Chery)
	- Reversed the changes in brw_get_texture_swizzle and swapped the b, g values at the time that we decompress the data in the function: intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
	- Simplified the format checks in the miptree_create function of the intel_mipmap_tree.c and reserved the call of the intel_lower_compressed_format for the case that we are faking the ETC support (Nanley Chery)
	- Removed the check for the auxiliary usage for the shadow miptree at creation (miptree_create of intel_mipmap_tree.c) as we won't use auxiliary buffers with these types of trees (Nanley Chery)
	- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and removed the unnecessary checks (Nanley Chery)
	- Fixed an unrelated indentation change (Nanley Chery)
	- Modified the function intel_miptree_finish_write to set the mt->shadow_needs_update to true to catch all the cases when we need to update the miptree (Nanley Chery)
	- In order to update the shadow miptree during the unmap of the main and always map the main (Nanley Chery) the following change was necessary: Split the previous update function that was updating all the mipmap levels and use two functions instead: one that updates one level and one that updates all of them. Used the first during unmap and the second before the rendering.
	- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which miptree should be mapped each time and reversed all the changes in the higher level texture functions that upload data to textures as they aren't needed anymore.
	- Replaced the boolean needs_fake_etc with an inline function that checks when we need to fake the ETC compression (Nanley Chery)
	- Removed the initialization of the strides in the update function as the values will be overwritten by the intel_miptree_map call (Nanley Chery)
	- Used minify instead of division in the new update function intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley Chery)
	- Removed the depth from the calculation of the number of slices in the new update function (intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c) as we don't need to support 3D ETC images. (Nanley Chery)

	v3:
	- Renamed the rgba_fmt in function miptree_create (intel_mipmap_tree.c) to decomp_format as the format is not always in rgba order. (Nanley Chery)
	- Documented the new usage for the shadow miptree in the comment above the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley Chery)
	- Removed the redundant flags from the mapping of the miptrees in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
	- Fixed the switch from surface's logical level to physical level in the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery)
	- Excluded the Baytrail GPUs from the check for the ETC emulation as they support the ETC formats natively. (Nanley Chery)
	- Simplified the check if the format is BGRA in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)

	v4:
	- Removed the functions intel_miptree_(map\|unmap)_etc and the check if we need to call them as with the new changes, they became unreachable. (Nanley Chery)
	- We'd rather calculate the level width and height using the shadow miptree instead of the main in intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery)
	- Fixed the format in the mt_surface_usage, set at the miptree creation, in miptree_create of intel_mipmap_tree.c (Nanley Chery)

	v5:
	- Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
	- Update the flag shadow_needs_update outside the function intel_miptree_update_etc_shadow (Nanley Chery)
	- Fixed indentation error (Nanley Chery)

	v6:
	- Fixed typo in commit message (Nanley Chery)
	- Simplified the assignment of the mt_fmt in the miptree_create of the intel_mipmap_tree.c (Nanley Chery)
	- Combined declarations and assignments where it was possible in the intel_miptree_update_etc_shadow and intel_miptree_update_etc_shadow_levels of the intel_mipmap_tree.c (Nanley Chery)

	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272
	Reviewed-by: Nanley Chery <[email protected]>
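The v2 notes mention using minify instead of division when computing per-level dimensions. The sketch below shows why that matters; minify_sketch is a local stand-in written for this example (modelled on the usual "shift right, but never below 1" rule for mip levels), not a copy of the helper the patch calls.

```c
#include <assert.h>

/* Size of one dimension at mip level `level`, given its size at level 0.
 * Each level halves the size, rounding down, and the result never drops
 * below 1 texel. */
static unsigned
minify_sketch(unsigned size0, unsigned level)
{
   assert(level < 32);   /* larger shifts are undefined for 32-bit values */

   unsigned size = size0 >> level;
   return size > 0 ? size : 1;
}
```

A plain division such as size0 / (1 << level) yields 0 once the level exceeds log2(size0), for example at level 3 of a 4-texel-wide image, whereas the clamped shift keeps the dimension at 1.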
*	i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*	Nanley Chery	2019-02-15	3	-19/+19
	Use more generic field names. We'll reuse these fields for a workaround with ASTC miptrees. Reviewed-by: Eleni Maria Stea <[email protected]>
*	mesa: INVALID_VALUE for wrong type or format in ClearBufferData	Andres Gomez	2019-02-15	1	-4/+6
	Instead of generating a GL_INVALID_ENUM error when the type or format is incorrect while using glClear{Named}Buffer{Sub}Data, generate GL_INVALID_VALUE. From page 72 (page 94 of the PDF) of the OpenGL 4.6 spec: "An INVALID_VALUE error is generated if type is not one of the types in table 8.2. An INVALID_VALUE error is generated if format is not one of the formats in table 8.3." Fixes the following test: KHR-GL45.direct_state_access.buffers_errors v2: correct the doxygen documentation. Cc: Pi Tabred <[email protected]> Cc: Brian Paul <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
*	drirc/i965: add option to disable 565 configs and visuals	Tapani Pälli	2019-02-15	1	-0/+13
	We have cases where we would not like to expose these. v2: call the option allow_rgb565_configs for consistency with existing allow_rgb10_configs (Eric, Jason) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	drm-uapi: use local files, not system libdrm	Eric Engestrom	2019-02-14	17	-20/+20
	There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
*	mesa: Advertise EXT_float_blend in ES 3.0+ contexts.	Kenneth Graunke	2019-02-12	1	-0/+1
	This extension simply drops a draw time restriction: "Furthermore, an INVALID_OPERATION error is generated by DrawArrays and the other drawing commands defined in section 2.8.3 (10.5 in ES 3.1) if blending is enabled (see below) and any draw buffer has 32-bit floating-point format components." We never correctly enforced this restriction anyway, so we were basically already implementing it. We just need to advertise it for our behavior to be correct. The extension requires EXT_color_buffer_float, but we already enable that via dummy_true. So we can dummy_true this one as well. Found while debugging WebGL conformance tests. Does not fix any. Reviewed-by: Tapani Pälli <[email protected]>
*	i965: add P0x formats and propagate required scaling factors	Tapani Pälli	2019-02-12	3	-0/+17
	Signed-off-by: Tapani Pälli <[email protected]> Signed-off-by: Lin Johnson <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/compiler: add scale_factors to sampler_prog_key_data	Tapani Pälli	2019-02-12	1	-0/+1
	This patch propagates the given scale_factors to the lowering options. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Use info->textures_used instead of prog->SamplersUsed.	Kenneth Graunke	2019-02-11	2	-7/+7
	prog->SamplersUsed is set by the linker when validating resource limits, while info->textures_used is gathered after NIR optimizations, which may have eliminated some unused surfaces. This may let us skip some work. Reviewed-by: Eric Anholt <[email protected]>
*	i965: Drop unnecessary 'and' with prog->SamplerUnits	Kenneth Graunke	2019-02-11	1	-1/+1
	textures_used_by_txf is a subset of textures_used which is a subset of prog->SamplerUnits. This should do nothing. Reviewed-by: Eric Anholt <[email protected]>
*	nir: Gather texture bitmasks in gl_nir_lower_samplers_as_deref.	Kenneth Graunke	2019-02-11	4	-4/+13
	Eric and I would like a bitmask of which samplers are used, similar to prog->SamplersUsed, but available in NIR. The linker uses SamplersUsed for resource limit checking, but later optimizations may eliminate more samplers. So instead of propagating it through, we gather a new one. While there, we also gather the existing textures_used_by_txf bitmask. Gathering these bitfields in nir_shader_gather_info is awkward at best. The main reason is that it introduces an ordering dependency between the two passes. If gathering runs before lower_samplers_as_deref, it can't look at var->data.binding. If the driver doesn't use the full lowering to texture_index/texture_array_size (like radeonsi), then the gathering can't use those fields. Gathering might be run early /and/ late, first to get varying info, and later to update it after variant lowering. At this point, should gathering work on pre-lowered or post-lowered code? Pre-lowered is also harder due to the presence of structure types. Just doing the gathering when we do the lowering alleviates these ordering problems. This fixes ordering issues in i965 and makes the txf info gathering work for radeonsi (though they don't use it). Reviewed-by: Eric Anholt <[email protected]>
*	program: Make prog_to_nir create texture/sampler derefs.	Kenneth Graunke	2019-02-11	1	-5/+16
	Until now, prog_to_nir has been setting texture_index and sampler_index directly. This is different than GLSL shaders, which create variable dereferences and rely on lowering passes to reach this final form. radeonsi uses variable dereferences for samplers rather than texture_index and sampler_index, so it doesn't even make sense to set them there. By moving to derefs, we ensure that both GLSL and ARB programs produce the same final form that the driver desires. Reviewed-by: Eric Anholt <[email protected]>
*	st/nir: Use sampler derefs in built-in shaders.	Kenneth Graunke	2019-02-11	2	-8/+24
	Reviewed-by: Eric Anholt <[email protected]>
*	st/nir: Lower sampler derefs for builtin shaders.	Kenneth Graunke	2019-02-11	1	-0/+2
	Reviewed-by: Eric Anholt <[email protected]>
*	st/nir: Pull sampler lowering into a helper function.	Kenneth Graunke	2019-02-11	2	-4/+14
	This will make it easier to reuse across GLSL / ARB / built-ins. Reviewed-by: Eric Anholt <[email protected]>
*	i965: Call nir_lower_samplers for ARB programs.	Kenneth Graunke	2019-02-11	1	-0/+2
	An upcoming patch will start building derefs in prog_to_nir, at which point we'll need to lower them to indexes. This gets both GLSL and non-GLSL shaders using the same paths. Reviewed-by: Eric Anholt <[email protected]>
*	st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048	Kenneth Graunke	2019-02-11	1	-1/+6
	Piglit's vp-max-array test creates a vertex program containing a uniform array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB. Mesa will then add additional state-var parameters for things like the MVP matrix.

	radeonsi currently exposes a value of 4096, derived from constant buffer upload size. This means the array will have 4096 elements, and the extra MVP state-vars would get a prog_src_register::Index of 4096 or more. Unfortunately, prog_src_register::Index is a signed 13-bit integer, so values of 4096 and above end up turning into negative numbers. Negative source indexes are only valid for relative addressing, so this ends up generating illegal IR.

	In prog_to_nir, this would cause an out of bounds array access. st_mesa_to_tgsi checks for a negative value, assumes it's bogus, and remaps it to parameter 0 in order to get something in-range. This isn't right - instead of reading the MVP matrix, it would read the first element of the vertex program's large array. But the test only checks that the program compiles, so we never noticed that it was broken.

	This patch lowers the program limits, with the understanding that we may need to generate additional state-vars internally. i965 has exposed 1024 for this limit for years, so I don't expect lowering it to 2048 will cause any practical problems for radeonsi or other drivers.

	Fixes vp-max-array with prog_to_nir.c.

	Cc: "19.0" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
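The wraparound described in this message can be reproduced with a small model of a signed 13-bit field. The function sign_extend_13 is written for this illustration only; it mimics what storing a value into such a field typically yields on two's-complement targets and is not code from Mesa.

```c
#include <assert.h>

/* Model of a signed 13-bit field: keep the low 13 bits of `value` and
 * interpret bit 12 as the sign bit. The representable range is
 * -4096..4095. */
static int
sign_extend_13(unsigned value)
{
   unsigned low = value & 0x1fffu;

   assert(low <= 0x1fffu);

   return (low & 0x1000u) ? (int)low - 0x2000 : (int)low;
}
```

Index 4095, the last element of a 4096-entry array, survives intact, but the first state-var placed after the array at index 4096 comes back as -4096. Capping the advertised limit at 2048 leaves the indices from 2048 to 4095 free for internally generated parameters.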
*	i965: consider a 'base level' when calculating width0, height0, depth0	Andrii Simiklit	2019-02-07	1	-1/+25
	I guess that when we are calculating the width0, height0, depth0 to use for the function 'intel_miptree_create', we need to consider the 'base level', as is done in the 'intel_miptree_create_for_teximage' function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107987 Signed-off-by: Andrii Simiklit <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	st/glsl_to_nir: call nir_remove_dead_variables() after lowering local indirects	Timothy Arceri	2019-02-08	1	-0/+7
	Reviewed-by: Jason Ekstrand <[email protected]>
*	util: move BITFIELD macros to util/macros.h	Timothy Arceri	2019-02-08	1	-24/+0
	Reviewed-by: Jason Ekstrand <[email protected]>
*	st/mesa: require RGBA2, RGB4, and RGBA4 to be renderable	Karol Herbst	2019-02-07	1	-0/+2
	If the driver does not support rendering to these formats but does support texturing, we can end up with incompatibilities between textures and the renderbuffers that are then copied to. Fixes KHR-GL45.copy_image.functional on nvc0 Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: 19.0 <[email protected]>
*	gallium: add PIPE_CAP_MAX_VARYINGS	Karol Herbst	2019-02-07	1	-4/+1
	Some NVIDIA hardware can accept 128 fragment shader input components, but only has up to 124 varying-interpolated input components. We add a new cap to express this cleanly. For most drivers, this will have the same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader. Fixes KHR-GL45.limits.max_fragment_input_components Signed-off-by: Karol Herbst <[email protected]> [imirkin: rebased, improved docs/commit message] Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: 19.0 <[email protected]>
*	st/nir: Use src/ relative include path for autotools	Kristian H. Kristensen	2019-02-05	2	-2/+4
	Fixes: cdc53fa81cbeb80373eac33ef7695d9025caf14b Acked-by: Kenneth Graunke <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
*	gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit.	Kenneth Graunke	2019-02-05	1	-2/+5
	Iris would like to use compact arrays for tesslevels and clip/cull distances. radeonsi will likely want to switch to these at some point, since it'll be necessary for GL_ARB_gl_spirv support, but it's not ready for them just yet. Reviewed-by: Timothy Arceri <[email protected]>
*	st/nir: Call nir_lower_clip_cull_distance_arrays().	Kenneth Graunke	2019-02-05	1	-0/+1
	Today, st always sets LowerCombinedClipCullDistance, causing the GLSL IR lowering to run, giving us vec4[2] arrays. I would like to disable this and instead run the NIR lowering so that we get compact float[] arrays instead. Calling the new pass is a noop if the GLSL IR pass has already run, so it's safe to call the pass unconditionally. Reviewed-by: Timothy Arceri <[email protected]>
*	program: Extend prog_to_nir to handle system values.	Kenneth Graunke	2019-02-05	1	-0/+30
	Some drivers, such as radeonsi, use a system value for gl_FragCoord rather than an input variable. In this case, our Mesa IR will have a PROGRAM_SYSTEM_VALUE register, which we need to translate. This makes prog_to_nir work for Gallium drivers which expose the PIPE_CAP_TGSI_FS_POSITION_IS_SYSVAL capability bit. Reviewed-by: Eric Anholt <[email protected]>
*	program: Use u_bit_scan64 in prog_to_nir.	Kenneth Graunke	2019-02-05	1	-7/+6
	We can simply iterate the bits rather than using util_last_bit and checking each one up until that point. Reviewed-by: Eric Anholt <[email protected]>
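The iteration pattern this message refers to can be sketched in portable C as below. bit_scan64_sketch is a stand-in written for this example with the same "return the lowest set bit and clear it" behaviour the message relies on; it is not Mesa's u_bit_scan64 implementation, and collect_set_bits is a hypothetical caller.

```c
#include <assert.h>
#include <stdint.h>

/* Return the index of the lowest set bit in *mask and clear that bit.
 * The mask must be non-zero. */
static int
bit_scan64_sketch(uint64_t *mask)
{
   assert(*mask != 0);

   int i = 0;
   while (((*mask >> i) & 1) == 0)
      i++;

   *mask &= ~(UINT64_C(1) << i);
   return i;
}

/* Visit only the set bits of `mask`, lowest first, recording each index in
 * `out`. Returns how many bits were visited. */
static unsigned
collect_set_bits(uint64_t mask, int out[64])
{
   unsigned count = 0;

   while (mask)
      out[count++] = bit_scan64_sketch(&mask);

   return count;
}
```

Compared with looping from 0 to the last set bit and testing each position, the loop body here runs once per set bit, which is the simplification the commit describes.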
*	st/mesa: Add NIR versions of the PBO upload/download shaders.	Kenneth Graunke	2019-02-05	1	-2/+188
	Acked-by: Marek Olšák <[email protected]> Tested-by: Rob Clark <[email protected]> Tested-by: Eric Anholt <[email protected]>
*	st/mesa: Add a NIR version of the OES_draw_texture built-in shaders.	Kenneth Graunke	2019-02-05	1	-7/+62
	Reviewed-by: Marek Olšák <[email protected]> Tested-by: Rob Clark <[email protected]> Tested-by: Eric Anholt <[email protected]>
*	st/mesa: Add NIR versions of the clear shaders.	Kenneth Graunke	2019-02-05	1	-13/+67
	We implement the basic VS and FS, as well as the VS that does layered clears by writing gl_Layer from the vertex shader. Drivers which need a geometry shader for writing layer continue falling back to TGSI, as I didn't need this and so didn't bother implementing it. (We certainly could, however, if people want to add it in the future.) Reviewed-by: Marek Olšák <[email protected]> Tested-by: Rob Clark <[email protected]> Tested-by: Eric Anholt <[email protected]>
*	st/mesa: Add NIR versions of the drawpixels Z/stencil fragment shaders.	Kenneth Graunke	2019-02-05	1	-21/+119
	Reviewed-by: Marek Olšák <[email protected]> Tested-by: Rob Clark <[email protected]> Tested-by: Eric Anholt <[email protected]>
*	st/mesa: Add a NIR version of the drawpixels/bitmap VS copy shader.	Kenneth Graunke	2019-02-05	1	-8/+29
	This provides a native NIR version of the DrawPixels/Bitmap passthrough vertex shader. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Rob Clark <[email protected]> Tested-by: Eric Anholt <[email protected]>
*	st/nir: Make new helpers for constructing built-in NIR shaders.	Kenneth Graunke	2019-02-05	4	-0/+155
	The state tracker generates several built-in shaders in order to perform scissored clears, upload/download PBOs, and so on. These are currently constructed using TGSI, using ureg and u_simple_shader. I want to have NIR versions of these shaders, for my Gallium driver that has a NIR backend but no TGSI support. To that end, we'll want a few helpers to help construct simple shaders.

	This patch adds two new helpers:
	- st_nir_finish_builtin_shader() takes a manually constructed NIR shader, applies lowering passes (like st_link_nir would do for GLSL), and constructs the pipe_shader_state.
	- st_nir_make_passthrough_shader() makes a simple passthrough shader, which copies inputs to outputs. This is similar to u_simple_shaders.

	v2: Set info->fs.untyped_color_outputs for vc4/v3d (thanks Eric!).

	Reviewed-by: Marek Olšák <[email protected]> Tested-by: Rob Clark <[email protected]> Tested-by: Eric Anholt <[email protected]>
*	st/nir: Move varying setup code to a helper function.	Kenneth Graunke	2019-02-05	2	-20/+29
	I want to reuse this for built-in shaders. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Rob Clark <[email protected]> Tested-by: Eric Anholt <[email protected]>