summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* i965/gen9: Remove the halign/valign field setup code in fast copy blitAnuj Phogat2016-05-261-65/+0
| | | | | | | | | | | | | Experimentation with different values of src/dst horizontal/vertical alignment showed that these fileds are not used on gen9 hardware. A recent update in graphics specs has removed these fields from XY_FAST_COPY_BLT command. Cc: Ben Widawsky <[email protected]> Cc: Chad Versace <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* i965: Enable OES_copy_image (and EXT) on Gen8+ and Baytrail.Kenneth Graunke2016-05-251-0/+8
| | | | | | | | | | | | | | | | | | | | | | | For now, only enable it on platforms that actually support ETC2. At this point, Broadwell is only failing 5 (out of 8358) dEQP tests: dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits. srgb8_alpha8_r11f_g11f_b10f.renderbuffer_to_texture3d srgb8_alpha8_rgb10_a2ui.renderbuffer_to_cubemap srgb8_alpha8_rgb10_a2ui.renderbuffer_to_renderbuffer srgb8_alpha8_rgb10_a2.renderbuffer_to_texture2d srgb8_alpha8_rgb9_e5.renderbuffer_to_texture3d These fail with all methods (meta, blorp, blitter, memcpy). All are blacklisted from the Android mustpass list, which makes me wonder whether there's an issue with the tests. The formats in question work with other targets, and the targets in question work with other formats... Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Implement a BLORP path for CopyImage and prefer it over Meta.Kenneth Graunke2016-05-251-6/+28
| | | | | | | | | | | | We're dropping Meta in favor of BLORP everywhere we can. This also fixes bugs when copying cubemaps to 2D, which is currently broken in the meta pass. BLORP just works. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94198 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Make the CopyImage BLT path bail for stencil images.Kenneth Graunke2016-05-251-0/+3
| | | | | | | | | | The BLT can't handle S8 because it's W-tiled (at least without additional funny business, and I'm not sure we care). Disallow it so it falls back to the CPU path, which works. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Also copy stencil miptree data.Kenneth Graunke2016-05-251-0/+15
| | | | | | | | The Meta path handles this, but the CPU/BLT fallbacks did not. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Make a helper function for CopyImage of a miptree.Kenneth Graunke2016-05-251-41/+54
| | | | | | | | | | | | Currently, it only contains the BLT/CPU fallbacks, so the name is a bit too generic. But eventually this will use BLORP as well, at which point the name will make more sense. The next patch will introduce a second call. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Combine src/dest tex vs. rb checks in intel_copy_image_sub_data.Kenneth Graunke2016-05-251-20/+13
| | | | | | | | | This simplifies things a little - now we only have one (tex or rb?) if-ladder for src, and a second for dst, rather than four. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Account for MinLayer in CopyImageSubData's blitter/CPU paths.Kenneth Graunke2016-05-251-0/+4
| | | | | | | | | | | Fixes Piglit's arb_copy_image-texview test with the Meta path disabled (so we hit the blitter/CPU fallback paths). v2: Add MinLayer even for cube maps (suggested by Ilia). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Mark fallthrough in switch statement.Matt Turner2016-05-251-0/+1
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* i965: Assert that a depth_mt exists when using HiZ.Matt Turner2016-05-254-0/+4
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* i965/fs: take into account doubles when emitting system valuesAlejandro Piñeiro2016-05-251-1/+2
| | | | | | | Fixes the following cts test: GL42-CTS.vertex_attrib_64bit.limits_test Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix shadowing of 'height' parameterKristian Høgsberg Kristensen2016-05-251-2/+2
| | | | | | | | | | | | | The nested declaration of 'height' shadows a parameter and uses uninitialized memory. Fix by renaming to 'plane_height' which also makes the code clearer. This would typically break the bo size computation, but we don't use that except when mmaping, and we don't mmap YUV buffers much. Signed-off-by: Kristian Høgsberg Kristensen <[email protected]> Reported-by: Mathias Fröhlich <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add .gitignore entries for make check binariesKristian Høgsberg Kristensen2016-05-251-0/+3
| | | | | Signed-off-by: Kristian Høgsberg Kristensen <[email protected]> Acked-by: Matt Turner <[email protected]>
* i965: Enable GL_KHR_robustnessKristian Høgsberg Kristensen2016-05-254-0/+26
| | | | | | | | | | | | | | | GL_KHR_robustness adds the GL_CONTEXT_LOST error and five new entry points that we already implement. This patch adds a new dispatch table that returns GL_CONTEXT_LOST from all entry points and implements the GL_LOSE_CONTEXT_ON_RESET strategy by setting that table when we learn that we've lost the context. With the GL_CONTEXT_LOST reporting in place and dispatch for the new entry points we can turn on GL_KHR_robustness. Signed-off-by: Kristian Høgsberg Kristensen <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* i965/draw: Use the correct buffer index for interleaved VBO sizesJason Ekstrand2016-05-241-2/+4
| | | | | | The buffer_range_* arrays are indexed by buffer index not element index. Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/gen7: Fix gl_HelperInvocationJordan Justen2016-05-241-1/+1
| | | | | | | | | It appears that UV immediates aren't working on Ivy Bridge. In this case, a signed version will work, and this fixes the piglit tests/spec/glsl-4.50/execution/helper-invocation.shader_test test. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* dri: Add YVU formatsKristian Høgsberg Kristensen2016-05-241-0/+25
| | | | Reviewed-by: Jordan Justen <[email protected]>
* i965: Allow creating planar YUV __DRIimagesKristian Høgsberg Kristensen2016-05-241-10/+20
| | | | | | | | | Lift the resctriction we had before and allow creation of images with multiple planes. We still require all the planes to be within the same bo. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Invoke lowering pass for YUV texturesKristian Høgsberg Kristensen2016-05-245-0/+44
| | | | Reviewed-by: Jordan Justen <[email protected]>
* i965: Support textures with multiple planesKristian Høgsberg Kristensen2016-05-247-19/+59
| | | | Reviewed-by: Jordan Justen <[email protected]>
* i965: Create multiple miptrees for planar YUV imagesKristian Høgsberg Kristensen2016-05-243-1/+53
| | | | Reviewed-by: Jordan Justen <[email protected]>
* i965: Refactor intel_set_texture_image_bo() to create_mt_for_dri_image()Kristian Høgsberg Kristensen2016-05-241-39/+30
| | | | | | | | | This function now only creates the mt and we then call intel_set_texture_image_mt() in intel_image_target_texture_2d() to set it for the texture image. Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Use intel_set_texture_image_mt() in intelSetTexBuffer2()Kristian Høgsberg Kristensen2016-05-241-12/+15
| | | | | | | | Create the mt for the drawable bo directly and call our new intel_miptree_create_for_bo() helper instead. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Add new intel_set_texture_image_mt() helperKristian Høgsberg Kristensen2016-05-241-27/+42
| | | | | | | | This factors out the work of setting up a miptree as the backing for a texture image into a new helper. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: fix double-precision vertex inputs measurementJuan A. Suarez Romero2016-05-243-16/+53
| | | | | | | | | | | | | | | | | | | For double-precision vertex inputs we need to measure them in dvec4 terms, and for single-precision vertex inputs we need to measure them in vec4 terms. For the later case, we use type_size_vec4() function. For the former case, we had a wrong implementation based on type_size_vec4(). This commit introduces a proper type_size_dvec4() function, that we use to measure vertex inputs. Measuring double-precision vertex inputs as dvec4 is required because ARB_vertex_attrib_64bit states that these uses the same number of locations than the single-precision version. That is, two consecutives dvec4 would be located in location "x" and location "x+1", not "x+2". Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: remove separate enable for KHR_robust_buffer_access_behaviorIlia Mirkin2016-05-231-1/+0
| | | | | | | | | This extension appears to be a strict subset of the ARB version. Also remove it from GL3.txt since it doesn't seem relevant. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Use ISL for surface format introspectionJason Ekstrand2016-05-237-387/+19
| | | | | With this, we can delete the surface format table in brw_surface_formats.c because all of the information we need is now in ISL.
* i965: Enable ARB/KHR_robust_buffer_access_behavior on BYT and HSW+Jason Ekstrand2016-05-231-0/+5
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add an option to clamp block indices when lowering UBO/SSBOsJason Ekstrand2016-05-231-0/+1
| | | | | | | | This prevents array overflow when the block is actually an array of UBOs or SSBOs. On some hardware such as i965, such overflows can cause GPU hangs. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Use the real size for index buffersJason Ekstrand2016-05-233-3/+8
| | | | | | | Previously, we were using the size of the whole BO which may be substantially larger than the actual index buffer size. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Use the real size for vertex buffersJason Ekstrand2016-05-233-2/+17
| | | | | | | Previously, we were using the size of the BO which may be substantially larger than the actual vertex buffer size. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Use 3-channel formats for vertex fetch when possible.Jason Ekstrand2016-05-231-11/+37
| | | | | | | | | For a long time, several of the 3-channel vertex formats didn't exist so we faked them with 4-channel versions. Starting with Sandy Bridge, we can use R16G16B16_FLOAT and 8 and 16-bit integer formats become available on Haswell and Bay Trail. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/surface_formats: Update the VB column for new formats added on BYTJason Ekstrand2016-05-231-20/+20
| | | | | | | Bay Trail and Haswell added a bunch of new vertex formats. There was also the addition of 64-bit passthrough formats for BDW+. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Properly handle rounding when dividing by InstanceDivisorJason Ekstrand2016-05-231-2/+2
| | | | | | | | | The old code always divided rounded down and then subtracted 1. What we wanted was to divide rounded up and then subtract 1 which is equivalent to subtracting 1 and then dividing rounded down. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Account for BaseInstance in VBO boundsJason Ekstrand2016-05-233-2/+5
| | | | | Cc: "11.1 11.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Use worst-case VBO bounds if brw->num_instances == 0Jason Ekstrand2016-05-231-9/+10
| | | | | | | | | | | Previously, we only handled the "I don't know what's going on" case for things with InstanceDivisor == 0. However, in the DrawIndirect case we can get num_instances == 0 and we don't know what's going on with the instanced ones either. This commit makes the worst-case bound the default and then conservatively tightens the bound. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Delay when we get the bo for vertex buffersJason Ekstrand2016-05-231-22/+49
| | | | | | | | | | | | The previous code got the BO the first time we encountered it. However, this can potentially lead to problems if the BO is used for multiple arrays with the same buffer object because the range we declare as busy may not be quite right. By delaying the call to intel_bufferobj_buffer, we can ensure that we have the full range for the given buffer. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/draw: Stop relying on min_index == -1 for invalid index boundsJason Ekstrand2016-05-233-3/+7
| | | | | | | | | | | The vbo layer passes an index_bounds_valid flag that we should be using instead. This also fixes a bug when min_index == -1 and basevertex != 0 where we were actually comparing min_index + basevertex == -1 which was false and we were getting the wrong buffer-sizing path. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: reenable ARB_cull_distance.Dave Airlie2016-05-241-1/+1
| | | | | | Now the lowering pass is fixed we can reenable culling. Signed-off-by: Dave Airlie <[email protected]>
* i965: Unset alpha blend for R10G10B10_SNORM_A2_UNORMNanley Chery2016-05-231-1/+1
| | | | | | | This format does not support alpha blending, according to the SNB PRM. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: deindent blorp code.Dave Airlie2016-05-241-9/+9
| | | | | | | gcc6 warns about this. Acked-by: Matt Turner <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: Implement glGet*(GL_PRIMITIVE_RESTART_FOR_PATCHES_SUPPORTED).Kenneth Graunke2016-05-231-0/+1
| | | | | | | | | | | | | | | | | | | | | Technically, this was introduced with GL 4.4. However, I believe it was intended to be retroactive. As far as I know, AMD has never supported primitive restart with patches, while NVidia and Intel do. This necessitated the need for a query which would allow applications to figure out whether this was usable or not. I decided to expose it everywhere ARB_tessellation_shader is exposed. (It's also in both OES and EXT_tessellation_shader.) Enable this for i965 and Gallium drivers which expose the capability. v2: Fix a bug in the state_tracker code (caught by Ilia Mirkin). Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10364 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Mark UBO uniform pull constant loads as force_writemask_all.Francisco Jerez2016-05-232-1/+4
| | | | | | | | | | This lets the rest of the backend know that the uniform pull constant load opcodes don't respect channel enables -- Without this the register allocator has no way to know that the return payload of a pull constant load is not per-channel and spills of the destination will be broken under non-uniform control flow. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Allow spilling of non-contiguous registers.Francisco Jerez2016-05-231-19/+2
| | | | | | | This should be working fine now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94997 Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Calculate the (un)spill block size correctly.Francisco Jerez2016-05-231-3/+17
| | | | | | | | | | | | | | | | | | | | | Currently the spilling code attempts to guess the scratch message block size from the dispatch width of the shader, which is plain wrong for SIMD-lowered instructions (frequently but not exclusively encountered in SIMD32 shaders) or for instructions with register region data types of size other than 32 bit. Instead try to use the SIMD component size of the instruction which in some cases will allow the dataport to apply the correct channel mask to the scratch data read or written. In the spill case the block size needs to be clamped to the number of MRF registers reserved for spilling. In the unspill case I didn't even bother because we currently have no 100% accurate way to determine whether a source region is per-channel or whether it contains things like headers that don't respect channel boundaries -- That's fine, because the unspill is marked force_writemask_all we can just use the largest allowable scratch message size. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Set exec_all on spills not matching the channel layout of the ↵Francisco Jerez2016-05-231-3/+16
| | | | | | | | | | | instruction. This prevents the application of an incorrect channel mask by the scratch write instruction for spilled variables that don't have an exact one-to-one correspondence between channels of the variable and 32-bit components of the scratch write instruction. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Set exec_all on unspills.Francisco Jerez2016-05-231-2/+10
| | | | | | | | | | This makes sure that unspills restore the exact contents of the variable in scratch space into the GRF without applying channel masking, which is incorrect under control flow for things like message headers or vectors of heterogeneous types that don't properly respect channel boundaries. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Move scratch block size calculation into the caller of emit_(un)spill.Francisco Jerez2016-05-231-21/+23
| | | | | | | | | | | | | This makes emit_(un)spill even more stupid by removing the logic that decides what execution size each scratch read or write send message should have and instead relying on the caller to specify an appropriate execution size via the builder argument. This makes sense because the caller will need to act differently based on the scratch message width (e.g. emit an additional unspill before the instruction if the execution width and channel layout of the spill doesn't match the instruction's). Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Make emit_spill/unspill static functions taking builder as argument.Francisco Jerez2016-05-232-24/+21
| | | | | | | | | This seems cleaner than exposing an implementation detail of brw_fs_reg_allocate.cpp to the world, and will give the caller control over the instruction execution flags (e.g. force_writemask_all) that are applied to the scratch read and write instructions. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Apply execution controls from the instruction to scratch messages.Francisco Jerez2016-05-231-6/+5
| | | | | | | | | | | | | Until now the execution controls (e.g. channel group, force_writemask_all, exec_size) of the instruction had been completely ignored by spilling, even though that can lead to a mismatch between the channel mask applied to the contents of the (un)spilled memory and the GRF source or destination of the instruction. In some cases we'll actually want the (un)spill messages to be marked force_writemask_all regardless of whether the instruction has it set, but that will have to be handled specially by the caller. Reviewed-by: Jason Ekstrand <[email protected]>