Commit message / Author / Age / Files / Lines
* anv/formats: Use isl_format_supports* for format introspectionJason Ekstrand2016-05-231-22/+19
* isl: Add per-gen format introspectionJason Ekstrand2016-05-232-0/+399
| | | | | | This is just a copy-and-paste from brw_surface_formats.c. For the supports_vertex_fetch function, we do a bit more work so that it properly handles Bay Trail.
* isl: Add the ISL_FORMAT_R32G32_FLOAT_LD formatJason Ekstrand2016-05-232-0/+2
* isl: Add support for querying the string name of a formatJason Ekstrand2016-05-232-1/+9
* i965: Enable ARB/KHR_robust_buffer_access_behavior on BYT and HSW+Jason Ekstrand2016-05-232-2/+7
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* main: Add extension enable bits for KHR_robust_buffer_access_behaviorJason Ekstrand2016-05-232-0/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_samplers: Protect against sampler index overflowJason Ekstrand2016-05-231-3/+6
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add an option to clamp block indices when lowering UBO/SSBOsJason Ekstrand2016-05-235-6/+39
| | | | | | | | This prevents array overflow when the block is actually an array of UBOs or SSBOs. On some hardware such as i965, such overflows can cause GPU hangs. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
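The clamping described above can be sketched as follows. This is only an illustration of the idea, not the GLSL lowering pass itself; `clamp_block_index` is a hypothetical helper name, and the real option operates on IR rather than on plain integers.

```c
#include <assert.h>

/* Hypothetical helper, not Mesa code: clamp a dynamically computed block
 * index into [0, array_size - 1] so that an out-of-range index can never
 * address a binding past the end of the UBO/SSBO array. */
static unsigned
clamp_block_index(unsigned index, unsigned array_size)
{
   assert(array_size > 0);   /* a block array has at least one element */
   return index < array_size ? index : array_size - 1;
}
```

With a four-element block array, indices 0 through 3 pass through unchanged and anything larger resolves to element 3, which trades a wrong-but-valid read for the GPU hang mentioned in the commit message.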
* glsl/linker: Add a helper variable for compiler optionsJason Ekstrand2016-05-231-2/+5
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Use the real size for index buffersJason Ekstrand2016-05-233-3/+8
| | | | | | | Previously, we were using the size of the whole BO which may be substantially larger than the actual index buffer size. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Use the real size for vertex buffersJason Ekstrand2016-05-233-2/+17
| | | | | | | Previously, we were using the size of the BO which may be substantially larger than the actual vertex buffer size. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Use 3-channel formats for vertex fetch when possible.Jason Ekstrand2016-05-231-11/+37
| | | | | | | | | For a long time, several of the 3-channel vertex formats didn't exist so we faked them with 4-channel versions. Starting with Sandy Bridge, we can use R16G16B16_FLOAT and 8 and 16-bit integer formats become available on Haswell and Bay Trail. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/surface_formats: Update the VB column for new formats added on BYTJason Ekstrand2016-05-231-20/+20
| | | | | | | Bay Trail and Haswell added a bunch of new vertex formats. There was also the addition of 64-bit passthrough formats for BDW+. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Properly handle rounding when dividing by InstanceDivisorJason Ekstrand2016-05-231-2/+2
| | | | | | | | | The old code always divided rounded down and then subtracted 1. What we wanted was to divide rounded up and then subtract 1 which is equivalent to subtracting 1 and then dividing rounded down. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
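The rounding argument can be checked with a small sketch. The function names are hypothetical and the code assumes at least one instance and a non-zero divisor; it only illustrates the arithmetic, not the i965 draw code.

```c
#include <assert.h>

/* Old behaviour as described above: divide rounding down, then subtract 1.
 * Returned as int because the result can go negative. */
static int
max_element_old(unsigned num_instances, unsigned divisor)
{
   assert(divisor > 0);
   return (int)(num_instances / divisor) - 1;
}

/* Intended behaviour: divide rounding up, then subtract 1. */
static unsigned
max_element_ceil(unsigned num_instances, unsigned divisor)
{
   assert(num_instances > 0 && divisor > 0);
   return (num_instances + divisor - 1) / divisor - 1;
}

/* Equivalent form: subtract 1, then divide rounding down. */
static unsigned
max_element_floor(unsigned num_instances, unsigned divisor)
{
   assert(num_instances > 0 && divisor > 0);
   return (num_instances - 1) / divisor;
}
```

For 4 instances and a divisor of 3, instances 0..2 read element 0 and instance 3 reads element 1, so the highest element index is 1. Both corrected forms give 1, while the old form gives 0 and therefore under-reports the range of the buffer that is read.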
* i965/draw: Account for BaseInstance in VBO boundsJason Ekstrand2016-05-233-2/+5
| | | | | Cc: "11.1 11.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Use worst-case VBO bounds if brw->num_instances == 0Jason Ekstrand2016-05-231-9/+10
| | | | | | | | | | | Previously, we only handled the "I don't know what's going on" case for things with InstanceDivisor == 0. However, in the DrawIndirect case we can get num_instances == 0 and we don't know what's going on with the instanced ones either. This commit makes the worst-case bound the default and then conservatively tightens the bound. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Delay when we get the bo for vertex buffersJason Ekstrand2016-05-231-22/+49
| | | | | | | | | | | | The previous code got the BO the first time we encountered it. However, this can potentially lead to problems if the BO is used for multiple arrays with the same buffer object because the range we declare as busy may not be quite right. By delaying the call to intel_bufferobj_buffer, we can ensure that we have the full range for the given buffer. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Stop relying on min_index == -1 for invalid index boundsJason Ekstrand2016-05-233-3/+7
| | | | | | | | | | | The vbo layer passes an index_bounds_valid flag that we should be using instead. This also fixes a bug when min_index == -1 and basevertex != 0 where we were actually comparing min_index + basevertex == -1 which was false and we were getting the wrong buffer-sizing path. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
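The failure mode is easy to reproduce in isolation. The sketch below uses hypothetical function names and reduces the check to its essentials; it is not the driver's actual buffer-sizing code.

```c
#include <assert.h>
#include <stdbool.h>

/* Fragile: tests for the -1 sentinel after basevertex has already been
 * added, so the sentinel is lost whenever basevertex != 0. */
static bool
bounds_unknown_by_sentinel(int min_index, int basevertex)
{
   int first_vertex = min_index + basevertex;
   return first_vertex == -1;
}

/* Robust: rely on the explicit flag passed down by the vbo layer. */
static bool
bounds_unknown_by_flag(bool index_bounds_valid)
{
   return !index_bounds_valid;
}
```

With `min_index == -1` and `basevertex == 5`, the sentinel test sees 4 and wrongly concludes the bounds are known, whereas the flag-based test is unaffected by the offset.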
* vbo: Declare the index range invalid for DrawTransformFeedbackJason Ekstrand2016-05-231-1/+1
| | | | | | | | | | Right now, we're setting the range to [0, 0] which is obviously bogus. Instead, we should set it to be invalid like we do for DrawIndirect. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* vbo: Declare the index range invalid for DrawIndirectJason Ekstrand2016-05-231-1/+1
| | | | | | | | | | | | | | Right now, we're just setting the range to [0, MAX_UINT32] which, while correct, isn't helpful. With DrawIndirect, you can't really know what the actual range is, so we may as well flag it as being an invalid range. This is what we do for draws with an index buffer, which is similar (the indices aren't statically known) if a bit simpler. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/teximage: fix GL_FLOAT in commentIlia Mirkin2016-05-231-1/+1
| | | | | | Noticed by Brian. Trivial. Signed-off-by: Ilia Mirkin <[email protected]>
* glsl: fix explicit location validation for doublesTimothy Arceri2016-05-241-1/+3
| | | | | | | | Previously we would fail to find a match for the second half of a dvec4 as 'i' would get incremented to 1 before we added the var to the array at component 0. Reviewed-by: Anuj Phogat <[email protected]>
* docs: update ARB_cull_distance status.Dave Airlie2016-05-242-2/+2
| | | | Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: reenable cullingDave Airlie2016-05-241-1/+1
| | | | | | Now the lowering pass is fixed, reenable ARB_cull_distance. Signed-off-by: Dave Airlie <[email protected]>
* i965: reenable ARB_cull_distance.Dave Airlie2016-05-241-1/+1
| | | | | | Now the lowering pass is fixed we can reenable culling. Signed-off-by: Dave Airlie <[email protected]>
* glsl: rewrite clip/cull distance lowering passDave Airlie2016-05-243-63/+170
| | | | | | | | | | | | | | | | | | | | | | | | | The last version of this broke clipping, and I had to spend some time getting this working properly. I had to introduce a third pass to count the clip/cull totals, all due to one messy corner case. We have a piglit test, tes-input-gl_ClipDistance.shader_test, that doesn't actually output the clip distances; it just passes them like a varying from TCS->TES. The older lowering pass worked, but to lower clip/cull we need to know the total number of clip+culls used to define the new variable correctly, and to offset culls properly. This adds an extra pass that works out the sizes for clip/cull, then lowers gl_ClipDistance then gl_CullDistance into the new gl_ClipDistanceMESA. The pass checks, using the fixed array sizes code, whether the array has been referenced or is actually never used, and ignores it in the latter case. Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
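The reason the totals must be known first can be shown with a small sketch. It assumes the combined variable is laid out as an array of four-component vectors with clip distances first and cull distances directly after them; that layout, the struct, and the function names are assumptions made for illustration and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Position of one scalar distance inside the combined array (assumed to be
 * an array of 4-component vectors). */
struct packed_slot {
   unsigned vec;    /* which vector of the combined array */
   unsigned comp;   /* which component (0..3) of that vector */
};

/* Number of vectors needed once both totals are known. */
static unsigned
combined_size_in_vec4s(unsigned num_clip, unsigned num_cull)
{
   return (num_clip + num_cull + 3) / 4;
}

/* Cull distances start right after the clip distances, so their offset
 * depends on the number of clip distances actually in use. */
static struct packed_slot
lower_distance(unsigned num_clip, unsigned num_cull, bool is_cull,
               unsigned index)
{
   assert(index < (is_cull ? num_cull : num_clip));
   unsigned flat = is_cull ? num_clip + index : index;
   struct packed_slot slot = { flat / 4, flat % 4 };
   return slot;
}
```

With three clip distances and two cull distances, the first cull distance lands in component 3 of vector 0 and the second in component 0 of vector 1. If the clip count were off by one, every cull distance would shift, which is why a separate counting pass runs before the lowering.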
* glsl: make max array trackers ints and use -1 as base. (v2)Dave Airlie2016-05-2410-35/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a bug that breaks cull distances. The problem is that the max array accessors can't tell the difference between a never-accessed unsized array and an unsized array accessed at location 0. This leads to converting an undeclared, unused gl_ClipDistance inside or outside gl_PerVertex to a size 1 array. However, we need the number of active clip distances to work out the starting point for the cull distances, and this offset by one when it's not being used isn't possible to distinguish from the case where only the first element is accessed. I tried to use ->used for this, but that doesn't work when gl_ClipDistance is part of an interface block. So this changes things so that max_array_access is an int and initialised to -1. This also allows unsized arrays to proceed further than they could before, but we really shouldn't mind as they will get eliminated if nothing uses them later. For initialised uniforms we no longer change their array size at runtime; if these are unused they will get eliminated eventually. v2: use ralloc_array (Ilia) Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
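A minimal sketch of why -1 is the useful base value follows. The struct and helper names are hypothetical; only the field name `max_array_access` comes from the commit message.

```c
#include <assert.h>

/* Hypothetical tracker for the highest index used on an unsized array. */
struct array_tracker {
   int max_array_access;   /* -1 means "never accessed" */
};

static void
tracker_init(struct array_tracker *t)
{
   t->max_array_access = -1;
}

static void
tracker_record_access(struct array_tracker *t, int index)
{
   assert(index >= 0);
   if (index > t->max_array_access)
      t->max_array_access = index;
}

/* Implied array size: 0 when never accessed, otherwise highest index + 1. */
static unsigned
tracker_implied_size(const struct array_tracker *t)
{
   return (unsigned)(t->max_array_access + 1);
}
```

With an unsigned tracker starting at 0, both "never accessed" and "accessed only at element 0" would yield an implied size of 1. Starting at -1 separates the two cases: the implied size is 0 before any access and 1 after an access to element 0.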
* anv/formats: Make alpha blending a property of render targetsNanley Chery2016-05-231-4/+2
| | | | | | | | In agreement with the SNB PRM, alpha blending is a property that render targets may or may not support. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Unset alpha blend for R10G10B10_SNORM_A2_UNORMNanley Chery2016-05-231-1/+1
| | | | | | | This format does not support alpha blending, according to the SNB PRM. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: deindent blorp code.Dave Airlie2016-05-241-9/+9
| | | | | | | gcc6 warns about this. Acked-by: Matt Turner <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* glsl: reindent line in ast_function.cppDave Airlie2016-05-241-1/+1
| | | | | | | This fixes a warning with gcc -Wmisleading-indentation. Acked-by: Matt Turner <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: allow GL_FRAMEBUFFER_DEFAULT_LAYERS to be queried with ES geometryIlia Mirkin2016-05-231-2/+2
| | | | | | | When we have the geometry extensions, enable querying of the new param. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* mesa: allow xfb to be active in GLES when geometry shader is enabled.Ilia Mirkin2016-05-231-2/+4
| | | | | | | | OES_geometry_shader has wording to allow xfb when using Draw*Indirect and DrawElements. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* main: check driver float texture support before upgrading to 16F/32FIlia Mirkin2016-05-231-28/+33
| | | | | | | | | | When passing in GL_RGBA or other base formats, we will try to upgrade the format to whatever the passed in type was. However not all drivers (notably nv30) support 32F textures, and so this would lead to crashes down the line. Only upgrade when the relevant extensions are available. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: update inst->info along with inst->opIlia Mirkin2016-05-231-0/+1
| | | | | | | | | | Otherwise we still have TGSI_OPCODE_CMP's info, which causes some later logic to go wrong. This fixes dEQP-GLES2.functional.shaders.functions.control_flow.return_in_if_vertex on nv30. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* glsl: Use correct mode for split components.Bas Nieuwenhuizen2016-05-241-1/+1
| | | | | | | | The mode should stay the same as the original struct. In particular, shared should not be changed to temporary. Reviewed-by: Jordan Justen <[email protected]> Signed-off-by: Bas Nieuwenhuizen <[email protected]>
* mesa: Implement glGet*(GL_PRIMITIVE_RESTART_FOR_PATCHES_SUPPORTED).Kenneth Graunke2016-05-236-0/+7
| | | | | | | | | | | | | | | | | | | | | Technically, this was introduced with GL 4.4. However, I believe it was intended to be retroactive. As far as I know, AMD has never supported primitive restart with patches, while NVidia and Intel do. This created the need for a query which would allow applications to figure out whether this was usable or not. I decided to expose it everywhere ARB_tessellation_shader is exposed. (It's also in both OES and EXT_tessellation_shader.) Enable this for i965 and Gallium drivers which expose the capability. v2: Fix a bug in the state_tracker code (caught by Ilia Mirkin). Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10364 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* gallium: Add a pipe cap for whether primitive restart works for patches.Kenneth Graunke2016-05-2317-0/+18
| | | | | | | | | | | | | | | Some hardware supports primitive restart on patch primitives, and other hardware does not. Modern GL and ES include a query for this feature; adding a capability bit will allow us to answer it. As far as I know, AMD hardware does not support this feature, while NVIDIA and Intel hardware does. However, most Gallium drivers do not appear to support tessellation shaders yet. So, I've enabled it for nvc0 and disabled it everywhere else. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965/fs: Mark UBO uniform pull constant loads as force_writemask_all.Francisco Jerez2016-05-232-1/+4
| | | | | | | | | | This lets the rest of the backend know that the uniform pull constant load opcodes don't respect channel enables -- Without this the register allocator has no way to know that the return payload of a pull constant load is not per-channel and spills of the destination will be broken under non-uniform control flow. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Allow spilling of non-contiguous registers.Francisco Jerez2016-05-231-19/+2
| | | | | | | This should be working fine now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94997 Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Calculate the (un)spill block size correctly.Francisco Jerez2016-05-231-3/+17
| | | | | | | | | | | | | | | | | | | | | Currently the spilling code attempts to guess the scratch message block size from the dispatch width of the shader, which is plain wrong for SIMD-lowered instructions (frequently but not exclusively encountered in SIMD32 shaders) or for instructions with register region data types of size other than 32 bit. Instead try to use the SIMD component size of the instruction which in some cases will allow the dataport to apply the correct channel mask to the scratch data read or written. In the spill case the block size needs to be clamped to the number of MRF registers reserved for spilling. In the unspill case I didn't even bother because we currently have no 100% accurate way to determine whether a source region is per-channel or whether it contains things like headers that don't respect channel boundaries -- That's fine, because the unspill is marked force_writemask_all we can just use the largest allowable scratch message size. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Set exec_all on spills not matching the channel layout of the instruction.Francisco Jerez2016-05-231-3/+16
| | | | | | | | | This prevents the application of an incorrect channel mask by the scratch write instruction for spilled variables that don't have an exact one-to-one correspondence between channels of the variable and 32-bit components of the scratch write instruction. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Set exec_all on unspills.Francisco Jerez2016-05-231-2/+10
| | | | | | | | | | This makes sure that unspills restore the exact contents of the variable in scratch space into the GRF without applying channel masking, which is incorrect under control flow for things like message headers or vectors of heterogeneous types that don't properly respect channel boundaries. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Move scratch block size calculation into the caller of emit_(un)spill.Francisco Jerez2016-05-231-21/+23
| | | | | | | | | | | | | This makes emit_(un)spill even more stupid by removing the logic that decides what execution size each scratch read or write send message should have and instead relying on the caller to specify an appropriate execution size via the builder argument. This makes sense because the caller will need to act differently based on the scratch message width (e.g. emit an additional unspill before the instruction if the execution width and channel layout of the spill doesn't match the instruction's). Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Make emit_spill/unspill static functions taking builder as argument.Francisco Jerez2016-05-232-24/+21
| | | | | | | | | This seems cleaner than exposing an implementation detail of brw_fs_reg_allocate.cpp to the world, and will give the caller control over the instruction execution flags (e.g. force_writemask_all) that are applied to the scratch read and write instructions. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Apply execution controls from the instruction to scratch messages.Francisco Jerez2016-05-231-6/+5
| | | | | | | | | | | | | Until now the execution controls (e.g. channel group, force_writemask_all, exec_size) of the instruction had been completely ignored by spilling, even though that can lead to a mismatch between the channel mask applied to the contents of the (un)spilled memory and the GRF source or destination of the instruction. In some cases we'll actually want the (un)spill messages to be marked force_writemask_all regardless of whether the instruction has it set, but that will have to be handled specially by the caller. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Fix signedness of local variables and arguments of emit_(un)spill.Francisco Jerez2016-05-232-8/+8
| | | | | | | To avoid some spurious warnings about comparison signedness in the following commits. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Factor out calculation of the block of MRFs reserved for spilling.Francisco Jerez2016-05-231-9/+45
| | | | | | | While we're at it, fix the calculation to allocate a larger block of registers for 32-wide dispatch. Reviewed-by: Jason Ekstrand <[email protected]>
* egl: Add OpenGL_ES to API string regardless of GLES versionPlamena Manolova2016-05-231-7/+4
| | | | | | | | | | | | | | | | | According to the EGL specifications eglQueryString(EGL_CLIENT_APIS) should return a string containing a combination of "OpenGL", "OpenGL_ES" and "OpenVG", any other values would be considered invalid. Due to this when the API string is constructed, the version of GLES should be disregarded and "OpenGL_ES" should be attached once instead of "OpenGL_ES2" and "OpenGL_ES3". Fixes: dEQP-EGL.functional.negative_api* and dEQP-EGL.functional.query_context.simple.query_api Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
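The string construction can be sketched as below. The `API_*` flags and `build_client_apis` are hypothetical names for illustration, not the EGL implementation's own; the point is only that every GLES version contributes the single token "OpenGL_ES".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bitmask of client APIs supported by a display. */
enum {
   API_OPENGL = 1 << 0,
   API_GLES1  = 1 << 1,
   API_GLES2  = 1 << 2,
   API_GLES3  = 1 << 3,
   API_OPENVG = 1 << 4
};

/* Build the space-separated EGL_CLIENT_APIS string. All GLES versions map
 * to one "OpenGL_ES" token, emitted at most once. */
static void
build_client_apis(unsigned api_mask, char *out, size_t out_size)
{
   assert(out_size >= 32);   /* longest possible result is 24 chars + NUL */
   out[0] = '\0';

   if (api_mask & API_OPENGL)
      strcat(out, "OpenGL ");
   if (api_mask & (API_GLES1 | API_GLES2 | API_GLES3))
      strcat(out, "OpenGL_ES ");
   if (api_mask & API_OPENVG)
      strcat(out, "OpenVG ");

   /* Drop the trailing separator, if any token was written. */
   size_t len = strlen(out);
   if (len > 0)
      out[len - 1] = '\0';
}
```

A display supporting desktop GL, GLES2 and GLES3 then reports "OpenGL OpenGL_ES" instead of a string containing the invalid tokens "OpenGL_ES2" and "OpenGL_ES3".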
* freedreno/ir3: disable cp for indirect src'sRob Clark2016-05-231-0/+9
| | | | | | | | | | | | The variable-indexing tests always had a few random fails, which I usually couldn't reproduce when running tests manually. Somehow recently this got a lot worse. I ported a couple of the shaders to GLES to see what blob does, and it also seems to avoid cp'ing indirect srcs. So I guess indirect w/ instructions other than cat1 (mov) are not totally reliable. Let's just switch that off until this is better understood. Signed-off-by: Rob Clark <[email protected]>