path: root/src/mesa/drivers/dri
Commit message (Author, Age, Files, Lines)
* i965/blorp: Stop using the miptree in state setup for tex/rt surfaces (Jason Ekstrand, 2016-08-17, 6 files, -50/+45)
| | | | | | | | | | This commit moves us from a miptree model to a surf+bo+offset model. In the GL driver, miptrees are almost always at the start of the bo so the offset is zero but we don't want to always make that assumption. In the short term, gen6 stencil and HiZ will be at an offset but, in the long term, any Vulkan surface is liable to be at a non-zero offset. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp/blit: Move format work-arounds before surface_info_init (Jason Ekstrand, 2016-08-17, 1 file, -11/+12)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Add real support for HiZ (Jason Ekstrand, 2016-08-17, 1 file, -13/+28)
| | | | | | | | The previous HiZ support was bogus because all of get_aux_isl_surf looked at mt->mcs_mt directly. For HiZ buffers, you need to look at either mt->hiz_buf or mt->hiz_buf->mt. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Use the isl helpers for creating aux surfaces (Jason Ekstrand, 2016-08-17, 1 file, -46/+9)
| | | | | | | | | | | In order for the calculations of things such as fast clear rectangles to work, we need more details of the auxiliary surface to be correct. In particular, we need to be able to trust the width and height fields. (These are not necessarily what you want coming out of the miptree.) The only values state setup really cares about are the row and array pitch and those we can safely stomp from the miptree. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Use mcs_mt->qpitch for aux surfaces (Jason Ekstrand, 2016-08-17, 1 file, -1/+2)
| | | | | | | At one point, we were doing this correctly. It must have gotten lost in one of the many rebases. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Allow get_aux_isl_surf when there is no aux surface (Jason Ekstrand, 2016-08-17, 1 file, -1/+2)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Support depth in get_isl_clear_color (Jason Ekstrand, 2016-08-17, 1 file, -1/+6)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Remove unused fields from blorp_surface_info (Jason Ekstrand, 2016-08-17, 2 files, -19/+0)
| | | | | | | The only reason why we need layer or level is that we need the z-offset for 3-D surfaces. Let's just have the one field for that. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Simplify depth buffer state setup a bit (Jason Ekstrand, 2016-08-17, 2 files, -55/+17)
| | | | | | | The data comes in via ISL in a format that's almost directly usable by the hardware so we can avoid some of the conversion headache. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Use the generic surface state path for gen8 textures (Jason Ekstrand, 2016-08-17, 4 files, -48/+8)
| | | | | | | Now that the generic blorp path uses base level/layer, there's no need to make gen8 special. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Only do offset hacks for fake W-tiling and IMS (Jason Ekstrand, 2016-08-17, 3 files, -114/+153)
| | | | | | | | | | | | | Since the dawn of time, blorp has used offsets directly to get at different mip levels and array slices of surfaces. This isn't really necessary since we can just use the base level/layer provided in the surface state. While it may have simplified blorp's original design, we haven't been using the blorp path for surface state on gen8 thanks to render compression and there's really no good need for it most of the time. This commit restricts such surface munging to the cases of fake W-tiling and fake interleaved multisampling. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Add a z_offset field to blorp_surface_info (Jason Ekstrand, 2016-08-17, 3 files, -9/+14)
| | | | | | | The layer field is in terms of physical layers which isn't quite what the sampler will want for 2-D MS array textures. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Pass the Z component into all texture operations (Jason Ekstrand, 2016-08-17, 1 file, -42/+35)
| | | | | | | | Multisample array surfaces on IVB don't support the minimum array element surface attribute so it needs to come through the sampler message. We may as well just pass it through everything. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Rework hiz rect alignment calculations (Jason Ekstrand, 2016-08-17, 1 file, -8/+15)
| | | | | | | | | | | | At the moment, the minify operation does nothing because params.depth.view.base_level is always zero. However, as soon as we start using actual base miplevels and array slices, we are going to need the minification. Also, we only need to align the surface dimensions in the case where we are operating on miplevel 0. Previously, it didn't matter because it aligned on miplevel 0 and, for all other miplevels, the miptree code guaranteed that the level was already aligned. Reviewed-by: Topi Pohjolainen <[email protected]>
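The two operations this message describes, minifying a level-0 dimension and aligning it only when operating on miplevel 0, can be sketched roughly as below. This is a minimal illustration rather than the driver's code: the helper names (`example_minify`, `example_align`, `example_hiz_extent`) are hypothetical, and the alignment value is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size of a dimension at a given miplevel.
 * Each level halves the size, but it never drops below 1. */
static uint32_t example_minify(uint32_t base, uint32_t level)
{
    assert(level < 32);
    uint32_t size = base >> level;
    return size != 0 ? size : 1;
}

/* Hypothetical helper: round value up to a power-of-two alignment. */
static uint32_t example_align(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Minify first, then align the surface dimension only for miplevel 0. */
static uint32_t example_hiz_extent(uint32_t base, uint32_t level,
                                   uint32_t alignment)
{
    uint32_t extent = example_minify(base, level);
    return level == 0 ? example_align(extent, alignment) : extent;
}
```

With a width of 100 and an assumed alignment of 8, level 0 gives 104, while level 2 gives the plain minified value 25.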
* i965/blorp: Map 1-D render targets with DIM_LAYOUT_GEN4_2D as 2D on gen9 (Jason Ekstrand, 2016-08-17, 1 file, -0/+6)
| | | | | | | | | The sampling hardware can handle them ok. It just looks at the tiling to determine whether it's the new gen9 1-D layout or the old one. The render hardware isn't so smart. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Fill out the isl_surf::usage field (Jason Ekstrand, 2016-08-17, 1 file, -1/+24)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Use the isl_view from the blorp_surface_info (Jason Ekstrand, 2016-08-17, 1 file, -17/+1)
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Get rid of brw_blorp_surface_info::width/height (Jason Ekstrand, 2016-08-17, 5 files, -44/+25)
| | | | | | Instead, we manually mutate the surface size as needed. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Move surface offset calculations into a helper (Jason Ekstrand, 2016-08-17, 1 file, -32/+43)
| | | | | | | | The helper does a full transformation on the surface to turn it into a new 2-D single-layer single-level surface representing the original layer and level in memory. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Use ISL to compute image offsets (Jason Ekstrand, 2016-08-17, 1 file, -3/+91)
| | | | | | | For the moment, we still call the old miptree function; we just assert that the two are equal. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Add an isl_view to blorp_surface_info (Jason Ekstrand, 2016-08-17, 5 files, -53/+60)
| | | | | | | | | Eventually, this will be the actual view that gets passed into isl to create the surface state. For now, we just use it for the format and the swizzle. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Move intratile offset calculations out of surface state setup (Jason Ekstrand, 2016-08-17, 3 files, -29/+18)
| | | | | | | | | Previously we multiplied full x/y offsets, resolved tile aligned buffer offset and intra tile offset based on that. Now we let ISL take into account the msaa setting and we only multiply the resolved intra tile offsets. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Refactor interleaved multisample destination handling (Jason Ekstrand, 2016-08-17, 1 file, -37/+34)
| | | | | | | | We put all of the code for fake IMS together. This requires moving a bit of the program key setup code further down so that it gets the right values out of the final surface. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Get rid of brw_blorp_surface_info::array_layout (Jason Ekstrand, 2016-08-17, 2 files, -10/+0)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Use isl_msaa_layout instead of intel_msaa_layout (Jason Ekstrand, 2016-08-17, 3 files, -104/+39)
| | | | | | We also remove brw_blorp_surface_info::msaa_layout. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Use the ISL aux_layout for deciding whether to do an MCS fetch (Jason Ekstrand, 2016-08-17, 2 files, -7/+11)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Get rid of brw_blorp_surface_info::num_samples (Jason Ekstrand, 2016-08-17, 6 files, -35/+31)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Make sample count asserts a bit more lazy (Jason Ekstrand, 2016-08-17, 1 file, -5/+5)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Get rid of brw_blorp_surface_info::map_stencil_as_y_tiled (Jason Ekstrand, 2016-08-17, 3 files, -39/+26)
| | | | | | | Now that we're carrying around the isl_surf, we can just modify it directly instead of passing an extra bit around. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Remove compute_tile_offsets (Jason Ekstrand, 2016-08-17, 2 files, -34/+5)
| | | | | | We have a handy little function in ISL that does exactly the same thing. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Create the isl_surf up-front (Jason Ekstrand, 2016-08-17, 2 files, -11/+19)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp/clear: Initialize surface info after allocating an MCS (Jason Ekstrand, 2016-08-17, 1 file, -6/+6)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Remove the stencil_as_y_tiled parameter from get_tile_masks (Jason Ekstrand, 2016-08-17, 4 files, -10/+8)
| | | | | | | It's only used to stomp the tiling to Y and it's only used by blorp so there's no reason why blorp can't do it itself. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/fs: Estimate maximum sampler message execution size more accurately. (Francisco Jerez, 2016-08-16, 1 file, -37/+72)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current logic used to determine the execution size of sampler messages was based on special-casing several argument and opcode combinations, which unsurprisingly missed the possibility that some messages could exceed the payload size limit or not depending on the number of coordinate components present. In particular: - The TXL, TXB and TEX messages (the latter on non-FS stages only) would attempt to use SIMD16 on Gen7+ hardware even if a shadow reference was present and the texture was a cubemap array, causing it to overflow the maximum supported sampler payload size and crash. - The TG4_OFFSET message with shadow comparison was falling back to SIMD8 regardless of the number of coordinate components, which is unnecessary when two coordinates or less are present. Both cases have been handled incorrectly ever since cubemap arrays and texture gather were respectively enabled (the current logic used by the SIMD lowering pass is almost unchanged from the previous no16 fall-back logic used pre-SIMD lowering times). Fixes the following GL4.5 conformance test on Gen7-8 (the bug also affects Gen9+ in principle, but SKL passes the test by luck because it manages to use the TXL_LZ message instead of TXL): GL45-CTS.texture_cube_map_array.sampling Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97267 Reviewed-by: Kenneth Graunke <[email protected]>
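The idea of deriving the execution size from the payload size, instead of special-casing opcodes, can be sketched as follows. This is a simplified illustration and not the logic in the patch: the helper name, the 11-register payload limit, and the cost model (one register per component at SIMD8, two at SIMD16, no message header) are all assumptions.

```c
#include <assert.h>

/* Assumed limit on the sampler message payload, in registers. */
#define EXAMPLE_MAX_SAMPLER_PAYLOAD_REGS 11u

/* Hypothetical helper: pick the widest execution size whose payload fits.
 * Assumes each payload component costs one register at SIMD8 and two
 * registers at SIMD16, and ignores any message header. */
static unsigned example_max_sampler_exec_size(unsigned payload_components)
{
    /* The SIMD8 payload itself must fit, otherwise nothing can be sent. */
    assert(payload_components <= EXAMPLE_MAX_SAMPLER_PAYLOAD_REGS);

    const unsigned simd16_regs = 2u * payload_components;
    return simd16_regs <= EXAMPLE_MAX_SAMPLER_PAYLOAD_REGS ? 16u : 8u;
}
```

Under these assumptions, a message carrying six components (for instance four cubemap array coordinates, a shadow reference and an LOD) needs 12 registers at SIMD16 and has to drop to SIMD8, while a message carrying five components fits in 10 registers and can stay at SIMD16.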
* i965/fs: Return zero from fs_inst::components_read for non-present sources. (Francisco Jerez, 2016-08-16, 1 file, -2/+5)
| | | | | | | | | This makes it easier for the caller to find out how many scalar components are actually read by the instruction. As a bonus we no longer need to special-case BAD_FILE in the implementation of fs_inst::regs_read. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Lower TEX to TXL during NIR translation. (Francisco Jerez, 2016-08-16, 2 files, -14/+6)
| | | | | | | | This simplifies the code slightly and will allow the SIMD lowering pass to find out easily what the actual texturing opcode is in order to determine the maximum execution size of texturing instructions. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Change 8X MSAA sample mapping (Anuj Phogat, 2016-08-12, 2 files, -6/+6)
| | | | | | | | This is required following the change in 8X sample positions. Fixes the recently modified multisample-scaled-blit piglit tests. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Change 8x multisample positions (Anuj Phogat, 2016-08-12, 1 file, -23/+20)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no standard sample positions defined in OpenGL and OpenGL ES specs. Implementations have the freedom to pick the positions which give plausible results. But the Vulkan 1.0 spec does define standard sample positions for different sample counts. Defined positions in Vulkan for all the sample counts except 8X match with the positions we set in i965. We have an upcoming plan to share the blorp code between the OpenGL and Vulkan drivers in the near future. Keeping the 8X sample positions the same on both drivers will help us move in that direction. Here is an argument by Neil Roberts (from commit 20250e85) against any advantage of current 8X sample positions over the new ones: "The comment above for the 8x sample positions says that the hardware implements centroid interpolation by picking the centre-most sample that is inside the primitive. That implies that it might be worthwhile to pick a pattern that includes 0.5,0.5. However by experimentation this doesn't seem to actually be the case. With the sample positions in this patch, if I modify the piglit test below so that it instead reports the centroid position, it reports 0.492188,0.421875 which doesn't match any of the positions. If I modify the sample positions so that they include one at exactly 0.5,0.5 it doesn't help and it reports another position which is even further from the center for some reason. arb_gpu_shader5-interpolateAtSample-different Kenneth Graunke experimented with some other patterns that have a higher standard deviation but I think after some discussion it was decided that it would be better to pick the same pattern as the other graphics API in case there are games that rely on this pattern." Observed no regressions in jenkins testing. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Make opt_vector_float reset at the top of each block (Jason Ekstrand, 2016-08-10, 1 file, -80/+82)
| | | | | | | | | | | The pass isn't really control-flow aware and you can get into a case where it tries to combine instructions from different blocks. This can actually lead to an assertion failure when removing unneeded instructions if part of the vector is set in one block and part in another. This prevents regressions in the next commit. Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* util: Move _mesa_fsl/util_last_bit into util/bitscan.h (Mathias Fröhlich, 2016-08-09, 7 files, -13/+13)
| | | | | | | | | | | As requested with the initial creation of util/bitscan.h now move other bitscan related functions into util. v2: Split into two patches. Signed-off-by: Mathias Fröhlich <[email protected]> Tested-by: Brian Paul <[email protected]> Reviewed-by: Brian Paul <[email protected]>
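For readers unfamiliar with this family of bitscan helpers, a "last bit" function is commonly defined as returning the 1-based position of the most significant set bit, or 0 when no bit is set. The portable sketch below assumes those semantics; `example_last_bit` is a hypothetical stand-in, not the function from util/bitscan.h.

```c
/* Hypothetical stand-in for a "last bit" helper: returns the 1-based
 * position of the most significant set bit, or 0 if value is zero. */
static unsigned example_last_bit(unsigned value)
{
    unsigned position = 0;

    /* Shift right until nothing is left, counting the shifts. */
    while (value != 0) {
        position++;
        value >>= 1;
    }
    return position;
}
```

A loop is used here only for clarity and portability; real implementations typically rely on a compiler builtin or a hardware bit-scan instruction.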
* i965: Make BLORP's BlitFramebuffer follow the GL 4.4 sRGB rules. (Kenneth Graunke, 2016-08-08, 1 file, -2/+5)
| | | | | | | | | | | | | | | | | | | | | | OpenGL 4.4 specifies that BlitFramebuffer should perform sRGB encode and decode like ES 3.x does, but only when GL_FRAMEBUFFER_SRGB is enabled. This is technically incompatible in certain cases, but is more consistent across GL, ES, and WebGL, and more flexible. The NVIDIA 367.35 drivers appear to follow this behavior. For the awful spec analysis, please read Piglit's tests/spec/arb_framebuffer_srgb/blit.c, which explains the differences between GL 4.1, 4.2, 4.3 (2012), 4.3 (2013), and 4.4, and why this is the right rule to implement. Note that ctx->Color.sRGBEnabled is initialized to _mesa_is_gles(ctx), and ES doesn't have enable/disable flags for GL_FRAMEBUFFER_SRGB, so it's effectively on all the time. This means the ES behavior should be unchanged. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Make BLORP do sRGB encode/decode on ES 2 as well. (Kenneth Graunke, 2016-08-08, 1 file, -2/+2)
| | | | | | | | | | | | This should have no effect, as all drivers which support BLORP also support ES 3.0 - so ES 2.0 would be promoted and follow the ES 3 rules. ES 1.0 doesn't have BlitFramebuffer. This is purely to clarify the next patch a bit. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Drop the "do resolves in sRGB" hack. (Kenneth Graunke, 2016-08-08, 1 file, -24/+0)
| | | | | | | | | | | | | | | | | | | | | | | I've never quite understood the purpose of this hack - supposedly, doing resolves in the sRGB colorspace is slightly more accurate. Currently, BlitFramebuffer() ignores sRGB encoding and decoding on OpenGL, although it encodes and decodes in GLES 3.x. The updated OpenGL 4.4 rules also allow for encoding and decoding if GL_FRAMEBUFFER_SRGB is enabled, allowing the application to control what colorspace blits are done in. I don't think this hack makes any sense in such a world - the application can do what it wants, and we shouldn't second guess them. A related Piglit patch, "Make multisample accuracy test set GL_FRAMEBUFFER_SRGB when resolving." makes the Piglit MSAA accuracy test explicitly request SRGB encoding/decoding during resolves when running "srgb" subtests. Without that patch, this commit will regress those tests, but with it, they should continue to work just fine. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Bail on the BLT path if BlitFramebuffer requires sRGB conversion. (Kenneth Graunke, 2016-08-08, 2 files, -2/+10)
| | | | | | | | | | | Modern OpenGL BlitFramebuffer requires sRGB encode/decode when GL_FRAMEBUFFER_SRGB is enabled. The blitter can't handle this, so we need to bail. On Gen4-5, this means falling back to Meta, which should handle it. We allow sRGB <-> sRGB blits, as decode then encode ought to be a noop (other than potential precision loss, which nobody wants anyway). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
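The bail-out condition described in this message can be expressed as a small predicate. The sketch below is only an illustration of the rule as stated; the function name and its boolean parameters are hypothetical and do not correspond to the driver's actual interface.

```c
#include <stdbool.h>

/* Hypothetical predicate: can a raw copy engine (no colorspace conversion)
 * service this blit?
 *
 * - If GL_FRAMEBUFFER_SRGB is disabled, no encode/decode is requested.
 * - If both formats are sRGB, decode followed by encode is treated as a
 *   no-op, so a raw copy is acceptable.
 * - If exactly one side is sRGB, a real conversion is needed and the
 *   caller must fall back to another path. */
static bool example_blt_can_handle(bool framebuffer_srgb_enabled,
                                   bool src_is_srgb, bool dst_is_srgb)
{
    if (!framebuffer_srgb_enabled)
        return true;

    return src_is_srgb == dst_is_srgb;
}
```

A caller would try the next blit path whenever this returns false.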
* i965: Rework the unlit centroid workaround. (Kenneth Graunke, 2016-08-05, 2 files, -25/+33)
| | | | | | | | | | | | | | | | | | | | | | | Previously, for every input, we moved the dispatch mask to the flag register, then emitted two predicated PLN instructions, one with centroid barycentric coordinates (for normal pixels), and one with pixel barycentric coordinates (for unlit helper pixels). Instead, we can simply emit a set of predicated MOVs at the top of the program which copy the pixel barycentric coordinates over the centroid ones for unlit helper pixel channels. Then, we can just use normal PLNs. On Sandybridge: total instructions in shared programs: 7538470 -> 7534500 (-0.05%) instructions in affected programs: 101268 -> 97298 (-3.92%) helped: 705 HURT: 9 (all of which are SIMD16 programs) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
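The effect of the predicated MOVs can be modelled on the CPU as a per-channel copy controlled by the dispatch mask. The sketch below is a scalar analogy under that assumption, not generated shader code; the function name and the flat float arrays are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the workaround: for every channel that is NOT set
 * in the dispatch mask (an unlit helper pixel), overwrite the centroid
 * barycentric coordinate with the pixel barycentric one. Lit channels keep
 * their centroid value, so later interpolation needs no special cases. */
static void example_copy_unlit_barys(uint32_t dispatch_mask,
                                     const float *pixel_bary,
                                     float *centroid_bary,
                                     size_t channels)
{
    for (size_t ch = 0; ch < channels && ch < 32; ch++) {
        const int lit = (int)((dispatch_mask >> ch) & 1u);
        if (!lit)
            centroid_bary[ch] = pixel_bary[ch];
    }
}
```

Doing this once at the top of the program replaces the per-input pair of predicated interpolations with a single fix-up pass.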
* i965: Use a separate register for every access to an SSA undef. (Kenneth Graunke, 2016-08-04, 2 files, -13/+11)
| | | | | | | | | | | | | | | | | | | | | | Previously, we allocated a new VGRF for every undefined definition. Instead, this patch makes us allocate a new VGRF for every use of an undefined definition. This makes sure that undefined values are fully independent of one another, and have live ranges limited to their single use. This allows register coalescing to combine the source and destination of MOVs from undefined sources, eliminating the MOV altogether. On Broadwell: total instructions in shared programs: 11641187 -> 11640214 (-0.01%) instructions in affected programs: 70199 -> 69226 (-1.39%) helped: 213 HURT: 1 v2: Add a comment (based on Iago's suggested one). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: use mt->offset in intel_miptree_map_movntdqa() (Haixia Shi, 2016-08-03, 1 file, -0/+3)
| | | | | | | | | | | We need to include mt->offset in the calculation of src pointer because its value may be non-zero, for example in a cubemap texture. Signed-off-by: Haixia Shi <[email protected]> Cc: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Change-Id: I461ad5b204626d5a1c45611fc6b63735dcf29f63
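The address calculation this fix is concerned with can be sketched as below. The struct and function are hypothetical simplifications (the real miptree structure has many more fields); the point is only that the miptree's byte offset within the buffer object is added before the row and column terms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a miptree's layout within its buffer. */
struct example_miptree {
    size_t offset; /* byte offset of the miptree within the buffer object */
    size_t pitch;  /* bytes per row */
    size_t cpp;    /* bytes per pixel */
};

/* Compute the source pointer for pixel (x, y). Omitting mt->offset would
 * silently read from the start of the buffer whenever it is non-zero. */
static const uint8_t *example_texel_src(const uint8_t *bo_map,
                                        const struct example_miptree *mt,
                                        size_t x, size_t y)
{
    return bo_map + mt->offset + y * mt->pitch + x * mt->cpp;
}
```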
* i965: Disable the unlit centroid workaround on Gen7. (Matt Turner, 2016-08-02, 1 file, -3/+0)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once upon a time (commit 8313f44409) Paul added code for the unlit centroid workaround (WaCopyUnlitCentroidBarys). His commit message claims it fixed the EXT_framebuffer_multisample/interpolation {2,4} {centroid-deriv,centroid-deriv-disabled} piglit tests but does not say on which platform, though he cites the IVB PRM. "3DSTATE_WM [DevIVB, DevHSW]" says "[DevIVB]: Workaround: When Centroid Barycentric mode is required, HW may produce incorrect interpolation results when a 2X2 pixels have unlit pixels." I later disabled it for Haswell (commit f6db414f3c) with no known ill effects. The Sandybridge page does not have this text, but the workarounds database (see WaCopyUnlitCentroidBarys) says the issue applies *only* to Sandybridge, and in fact in commit 1a2de7dce8fc I note that disabling the workaround on Sandybridge causes the tests Paul originally mentioned to fail. So this is, and always has been, a huge confusing mess. Disabling the workaround indeed causes the tests Paul originally mentioned to fail on Sandybridge but not on Ivybridge/Baytrail. On Ivybridge: total instructions in shared programs: 6914901 -> 6909599 (-0.08%) instructions in affected programs: 106766 -> 101464 (-4.97%) helped: 884 total cycles in shared programs: 70874764 -> 70813774 (-0.09%) cycles in affected programs: 794144 -> 733154 (-7.68%) helped: 688 HURT: 186 LOST: 1 GAINED: 6 Reviewed-by: Kenneth Graunke <[email protected]>
* i915: Avoid aliasing violation. (Matt Turner, 2016-08-01, 1 file, -1/+3)
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: fix comparison warning (Timothy Arceri, 2016-08-01, 1 file, -1/+1)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>