summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* i965: Validate (and resolve) all the bound textures.Chris Forbes2014-03-112-2/+2
| | | | | | | | | | | | | | | | | | | | BRW_MAX_TEX_UNIT is the static limit on the number of textures we support per-stage, not in total. Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which is significantly larger, and across the various shader stages, up to ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually used. Fixes invisible bad behavior in piglit's max-samplers test (although this escalated to an assertion failure on HSW with texture_view, since non-immutable textures only have _Format set by validation.) Signed-off-by: Chris Forbes <[email protected]> Cc: "9.2 10.0 10.1" <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit befbda56a246f77797bdf13fc005353441db2879)
* dri/i9*5: correctly calculate the amount of system memoryEmil Velikov2014-03-112-2/+2
| | | | | | | | | | The variable name states megabytes, while we calculate the amount in kilobytes. Correct this by dividing with the correct amount. Signed-off-by: Emil Velikov <[email protected]> Cc: "10.0 10.1" <[email protected]> Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit fc25956badb8e1932cc19d8c97b4be16e92dfc65)
* i965: Fix the region's pitch condition to use blitterAnuj Phogat2014-03-101-3/+3
| | | | | | | | | | intelEmitCopyBlit uses a signed 16-bit integer to represent buffer pitch, so it can only handle buffer pitches < 32k. Cc: [email protected] Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit b3094d9927fe7aa5a84892262404aaad4d728724)
* i965: Create a hardware context before initializing state module.Kenneth Graunke2014-03-041-6/+6
| | | | | | | | | | | | | | | | | | | | | | brw_init_state() calls brw_upload_initial_gpu_state(). If hardware contexts are enabled (brw->hw_ctx != NULL), this will upload some initial invariant state for the GPU. Without hardware contexts, we rely on this state being uploaded via atoms that subscribe to the BRW_NEW_CONTEXT bit. Commit 46d3c2bf4ddd227193b98861f1e632498fe547d8 accidentally moved the call to brw_init_state() before creating a hardware context. This meant brw_upload_initial_gpu_state would always early return. Except on Gen6+, we stopped uploading the initial GPU state via state atoms, so it never happened. Fixes a regression since 46d3c2bf4ddd227193b98861f1e632498fe547d8. Cc: "10.0 10.1" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit 3663bbe773187dee341556ef29e58b1143ef2f5c)
* nouveau: fix chipset checks for nv1a by using the oclass insteadIlia Mirkin2014-03-043-7/+8
| | | | | | | | | | | | | Commit f4ebcd133b9 ("dri/nouveau: NV17_3D class is not available for NV1a chipset") fixed this partially by using the correct 3d class. However there were a lot of checks left over comparing against the chipset. Reported-and-tested-by: John F. Godfrey <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: 9.2 10.0 10.1 <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> (cherry picked from commit 0c8b165366d68291e3013c7308b8b1fdd5ade2a2)
* dri/nouveau: Pass the API into _mesa_initialize_contextEmil Velikov2014-03-046-10/+16
| | | | | | | | | | | | Currently we create a OPENGL_COMPAT context regardless of what was requested by the program. Correct that by retaining the program's request and passing it into _mesa_initialize_context. Based on a similar commit for radeon/r200 by Ian Romanick. Cc: "9.1 9.2 10.0" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 76d9f6d9729db1c999317a6b44818aa90c30a0b3)
* i965/blorp: do not use unnecessary hw-blending supportTopi Pohjolainen2014-03-041-20/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is really not needed as blorp blit programs already sample XRGB normally and get alpha channel set to 1.0 automatically by the sampler engine. This is simply copied directly to the payload of the render target write message and hence there is no need for any additional blending support from the pixel processing pipeline. The blending formula is anyway broken for color components, it multiplies the color component with itself (blend factor is the component itself). Alpha blending in turn would not fix the alpha to one independent of the source but simply used the source alpha as is instead (1.0 * src_alpha + 0.0 * dst_alpha). Quoting Eric: "If we want to actually make the no-alpha-bits-present thing work, we need to override the bits in the surface state or in the generated code. In the normal draw path, it's done for sampling by the swizzling code in brw_wm_surface_state.c, and the blending overrides is just to fix up the alpha blending stage which doesn't pay attention to that for the destination surface." If one modifies piglit test gl-3.2-layered-rendering-blit to use color component values other than zero or one, this change will kick in on IVB. No regressions on IVB. This is effectively revert of c0554141a9b831b4e614747104dcbbe0fe489b9d: i965/blorp: Support overriding destination alpha to 1.0. Currently, Blorp requires the source and destination formats to be equal. However, we'd really like to be able to blit between XRGB and ARGB formats; our BLT engine paths have supported this for a long time. For ARGB -> XRGB, nothing needs to occur: the missing alpha is already interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha channel to 1.0 when writing the destination colors. This is fairly straightforward with blending. For now, this code is never used, as the source and destination formats still must be equal. The next patch will relax that restriction. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]> (cherry picked from commit 933be19cdf97aed977cd656e5c15c99cbdb52b7f)
* meta: Consistenly use non-Apple VAO functionsIan Romanick2014-03-041-4/+4
| | | | | | | | | | | For these objects, meta was already using the non-Apple function to delete the objects. Everywhere else in the file uses _mesa_GenVertexArrays and _mesa_BindVertexArrays. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "9.1 9.2 10.0" <[email protected]> (cherry picked from commit abfa65ca811099332c8683dada9a2ee44cc01dc9)
* meta: Fallback to software for GetTexImage of compressed ↵Ian Romanick2014-03-041-1/+2
| | | | | | | | | | | | | | GL_TEXTURE_CUBE_MAP_ARRAY The hardware decompression path isn't even close to being able to handle this. This converts the crash (assertion failure) in "EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a plain old failure. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "9.1 9.2 10.0" <[email protected]> (cherry picked from commit 070f55d8935af6fee62506b54bc86c1bf5049a82)
* meta: Release resources used by _mesa_meta_DrawPixelsIan Romanick2014-03-041-0/+19
| | | | | | | | | | | _mesa_meta_DrawPixels creates a VAO and (potentially) two fragment programs, but none of them are ever released. Leaking piles of memory is generally frowned upon. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "9.1 9.2 10.0" <[email protected]> (cherry picked from commit fcb498302bff912ca4f3169d37cc04b58e77d0fa)
* meta: Release resources used by decompress_texture_imageIan Romanick2014-03-041-0/+21
| | | | | | | | | | | | decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a sampler object, but none of them are ever released. Later patches will add program objects, exacerbating the problem. Leaking piles of memory is generally frowned upon. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "9.1 9.2 10.0" <[email protected]> (cherry picked from commit 2d3f92e881dbd9d1aff17bba0d182c8ef645a2ca)
* radeon: move driContextSetFlags(ctx) call after ctx var is initializedBrian Paul2014-03-041-2/+3
| | | | | | CC: "10.0" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> (cherry picked from commit f51ca46f0c7c3b87b62f6047b266faeffbaae619)
* r200: move driContextSetFlags(ctx) call after ctx var is initializedBrian Paul2014-03-041-2/+3
| | | | | | | | Otherwise, ctx was a garbage value. CC: "10.0" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> (cherry picked from commit 2d6d69bab6c74d92514b81a68c6c8b1dc428182a)
* i965: Ignore 'centroid' interpolation qualifier in case of persample shadingAnuj Phogat2014-01-312-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch handles the use of 'centroid' qualifier with 'in' variables in a fragment shader when persample shading is enabled. Per sample shading for the whole fragment shader can be enabled by: glEnable(GL_SAMPLE_SHADING) or using {gl_SamplePosition, gl_SampleID} builtin variables in fragment shader. Explaining it below in more detail. /* Enable sample shading using OpenGL API */ glEnable(GL_SAMPLE_SHADING); glMinSampleShading(1.0); Example fragment shader: in vec4 a; centroid in vec4 b; main() { ... } Variable 'a' will be interpolated at sample location. But, what interpolation should we use for variable 'b' ? ARB_sample_shading recommends interpolation at sample position for all the variables. GLSL 400 (and earlier) spec says that: "When an interpolation qualifier is used, it overrides settings established through the OpenGL API." But, this text got deleted in later versions of GLSL. NVIDIA's and AMD's proprietary linux drivers (at OpenGL 4.3) interpolates at sample position. This convinces me to use the similar approach on intel hardware. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]> (cherry picked from commit f5cfb4ae21df8eebfc6b86c0ce858b1c0a9160dd) and i965: Ignore 'centroid' interpolation qualifier in case of persample shading I missed this change in commit f5cfb4a. It fixes the incorrect rendering caused in Dolphin Emulator. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73915 Signed-off-by: Anuj Phogat <[email protected]> Tested-by: Markus Wick <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit dc2f94bc786768329973403248820a2e5249f102)
* i965: Use sample barycentric coordinates with per sample shadingAnuj Phogat2014-01-314-6/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current implementation of arb_sample_shading doesn't set 'Barycentric Interpolation Mode' correctly. We use pixel barycentric coordinates for per sample shading. Instead we should select perspective sample or non-perspective sample barycentric coordinates. It also enables using sample barycentric coordinates in case of a fragment shader variable declared with 'sample' qualifier. e.g. sample in vec4 pos; A piglit test to verify the implementation has been posted on piglit mailing list for review. V2: Do not interpolate all the 'in' variables at sample position if fragment shader uses 'sample' qualifier with one of them. For example we have a fragment shader: #version 330 #extension ARB_gpu_shader5: require sample in vec4 a; in vec4 b; main() { ... } Only 'a' should be sampled at sample location, not 'b'. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]> (cherry picked from commit a92e5f7cf63d496ad7830b5cea4bbab287c25b8e)
* i965/gen6/blorp: Emit more flushes to workaround hangsChad Versace2014-01-312-15/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a squash of three related cherry-picks from master. [PATCH 1/3] i965/gen6/blorp: Set need_workaround_flush immediately after primitive This patch makes the workaround code in gen6 blorp follow the pattern established in the regular draw path. It shouldn't result in any behavioral change. On gen6, there are two places where we emit 3D_CMD_PRIM: brw_emit_prim() and gen6_blorp_emit_primitive(). brw_emit_prim() sets need_workaround_flush immediately after emitting the primitive, but blorp does not. Blorp sets need_workaround_flush at the bottom of brw_blorp_exec(). This patch moves the need_workaround_flush from brw_blorp_exec() to gen6_blorp_emit_primitive(). There is no need to set need_workaround_flush in gen7_blorp_emit_primitive() because the workaround applies only to gen6. Reviewed-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]> (cherry picked from commit 5e0cd58de4261e9dca7a15037192e7e9426a0207) [PATCH 2/3] i965/gen6/blorp: Set need_workaround_flush at top of blorp Unconditionally set brw->need_workaround_flush at the top of gen6 blorp state emission. The art of emitting workaround flushes on Sandybridge is mysterious and not fully understood. Ken and I believe that intel_emit_post_sync_nonzero_flush() may be required when switching from regular drawing to blorp. This is an extra safety measure to prevent undiscovered difficult-to-diagnose gpu hangs. I verified that on ChromeOS, pre-patch, need_workaround_flush was not set at the top of blorp, as Paul expected. To verify, I inserted the following debug code at the top of gen6_blorp_exec(), restarted the ui, and inspected the logs in /var/log/ui. The abort gets triggered so early that the browser never appears on the display. static void gen6_blorp_exec(...) { if (!brw->need_workaround_flush) { fprintf(stderr, "chadv: %s:%d\n", __FILE__, __LINE__); abort(); } ... } CC: Kenneth Graunke <[email protected]> CC: Stéphane Marchesin <[email protected]> Reviewed-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]> (cherry picked from commit 6a5c86f48675d2ca0975d69e0899e72afaab29e5) [PATCH 3/3] i965/gen6/blorp: Remove redundant HiZ workaround Commit 1a92881 added extra flushes to fix a HiZ hang in WebGL Google Maps. With the extra flushes emitted by the previous two patches, the flushes added by 1a92881 are redundant. Tested with the same criteria as in 1a92881: by zooming in and out continuously for 2 hours on Sandybridge Chrome OS (codename Stumpy) without a hang. CC: Kenneth Graunke <[email protected]> CC: Stéphane Marchesin <[email protected]> Reviewed-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]> (cherry picked from commit 90368875e733171350c64c8dda52f81bd0705dd0) Conflicts: src/mesa/drivers/dri/i965/gen6_blorp.cpp
* radeon / r200: Pass the API into _mesa_initialize_contextIan Romanick2014-01-284-3/+5
| | | | | | | | | | | Otherwise an application that requested an OpenGL ES 1.x context would actually get a desktop OpenGL context. Signed-off-by: Ian Romanick <[email protected]> Cc: "9.1 9.2 10.0" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit 33214679bb632a80d4339ffa0f28f7620d510658)
* i965: Ensure that all necessary state is re-emitted if we run out of aperture.Paul Berry2014-01-253-0/+21
| | | | | | | | | | | | | | | | | | | | | | Prior to this patch, if we ran out of aperture space during brw_try_draw_prims(), we would rewind the batch buffer pointer (potentially throwing some state that may have been emitted by brw_upload_state()), flush the batch, and then try again. However, we wouldn't reset the dirty bits to the state they had before the call to brw_upload_state(). As a result, when we tried again, there was a danger that we wouldn't re-emit all the necessary state. (Note: prior to the introduction of hardware contexts, this wasn't a problem because flushing the batch forced all state to be re-emitted). This patch fixes the problem by leaving the dirty bits set at the end of brw_upload_state(); we only clear them after we have determined that we don't need to rewind the batch buffer. Cc: 10.0 9.2 <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit fb6d9798a0c6eefd512f5b0f19eed34af8f4f257)
* i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps.Eric Anholt2014-01-091-1/+2
| | | | | | | | | | | | | | | | | We definitely want to fall through to the unsynchronized map case, instead of wasting bandwidth on a copy. Prevents a -43.2407% +/- 1.06113% (n=49) performance regression on aa10perf when teaching glamor to provide the GL_INVALIDATE_RANGE_BIT information. This is a performance fix, which I usually wouldn't cherry-pick to stable. But this was really was just a bug in the code, its presence would discourage developers from giving us the best information they can, and I think we've got fairly high confidence in the unsynchronized map path already. Cc: 10.0 9.2 <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit f46563fe1c8a5560e4de0adf03e3d8770b7fc734)
* i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels.Eric Anholt2014-01-091-1/+3
| | | | | | | | | | | Fixes piglit GL_MESA_pack_invert/readpixels and GPU hangs with glamor and cairo-gl. Cc: 10.0 9.2 <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit e186b927b8254ce62e0d47db90d16cd4253b3db5)
* i965: fold offset into coord for textureOffset(gsampler2DRect)Chris Forbes2014-01-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | The hardware is broken with nonzero texel offsets and unnormalized coordinates; instead of doing correct offsetting, we get garbage. This just extends the existing workaround for ir_txf and ir_tg4+gsampler2DRect to also consider ir_tex+gsampler2DRect. Fixes broken rendering in 'tesseract' when 'mesa_texrectoffset_bug' is not enabled; also fixes the new piglit test 'tests/spec/glsl-1.30/execution/fs-textureOffset-Rect'. Has been broken ~forever; suggesting including this in only 10.0 because the lowering pass doesn't exist in 9.2 or earlier so would require quite a different patch. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: Lee Salzman <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit 9e99735f301ebf85f8d0bfdce2bad441a5aac7f8)
* i965/gen6: Fix HiZ hang in WebGL Google MapsChad Versace2014-01-021-0/+15
| | | | | | | | | | | | | | | | | | | | | | Emitting flushes before depth and hiz resolves at the top of blorp's state emission fixes the hang. Marchesin and I found the fix experimentally, as opposed to adhering to a documented hardware workaround. A more minimal fix likely exists, but this gets the job done. Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS. Tested by zooming in and out continuously for 2 hours. This patch is based on https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/8bc07bb70163c3706fb4ba5f980e57dc942f56dd CC: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740 Signed-off-by: Stéphane Marchesin <[email protected]> Signed-off-by: Chad Versace <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit 1a928816a1b717201f3b3cc998a42731b280e6ba)
* i915: Add support for gl_FragData[0] reads.Henri Verbeet2014-01-021-0/+1
| | | | | | | | | | | | Similar to 556a47a2621073185be83a0a721a8ba93392bedb, without this reading from gl_FragData[0] would cause a software fallback. Bugzilla: https://bugs.winehq.org/show_bug.cgi?id=33964 Signed-off-by: Henri Verbeet <[email protected]> Cc: 10.0 9.2 9.1 <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit b094b3b9f4c7b40056c31e3480ab7dc530da56e7)
* i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation.Kenneth Graunke2014-01-021-1/+1
| | | | | | | | | | When adding geometry shader support, we accidentally reversed the size and offset parameters. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit 51c9cfc296318760aab421a79da727acd0e36311)
* dri_util: Don't assume __DRIcontext->driverPrivate is a gl_contextKristian Høgsberg2014-01-0214-10/+34
| | | | | | | | | | | | | | | | | | The driverPrivate pointer is opaque to the driver and we can't assume it's a struct gl_context in dri_util.c. Instead provide a helper function to set the struct gl_context flags from the incoming DRI context flags. v2 (idr): Modify the other classic drivers to also use driContextSetFlags. I ran all the piglit GLX_ARB_create_context tests with i965 and classic swrast without regressions. Signed-off-by: Kristian Høgsberg <[email protected]> Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Ian Romanick <[email protected]> [v1] Reviewed-by: Eric Anholt <[email protected]> Tested-by: Ilia Mirkin <[email protected]> [v1 on Gallium nouveau] Cc: "10.0" <[email protected]> (cherry picked from commit 38366c0c6e715314367b15680702e382d5c46a4a)
* swrast: fix readback regression since inversion fixDave Airlie2013-12-121-1/+1
| | | | | | | | | | | | | | | | | This readback from the frontbuffer with swrast was broken, that bug just made it more obviously broken, this fixes it by inverting the sub image gets. Also fixes a few other piglits. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72327 Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72325 (for 9.2 the patches this depends on were asked to be backported separately in an email). Cc: "9.2" "10.0" [email protected] Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit 0b16042377a6981ff9bba92387889524a3547b3f)
* dri megadriver_stub: add compatibility for older DRI loadersJordan Justen2013-12-091-0/+126
| | | | | | | | | | | | | | | | | | | To help the transition period when DRI loaders are being updated to support the newer __driDriverExtensions_foo mechanism, we populate __driDriverExtensions with the extensions returned by __driDriverExtensions_foo during a library contructor function. We find the driver foo's name by using the dladdr function which gives the path of the dynamic library's name that was being loaded. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Keith Packard <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit 4859d492b25cba61f43bb883d878d6388be742be)
* i965: Add extra-alignment for non-msrt fast color clear for all hw (v2)Chad Versace2013-12-061-24/+11
| | | | | | | | | | | | | | | | | | | | | | | The BSpec states that the aligment for the non-msrt clear rectangle must be doubled; the BSpec does not restricit the workaround to specific hardware. Commit 9a1a67b applied the workaround to Haswell GT3. Commit 8b659ce expanded the workaround to all Haswell variants. This commit expands it to all hardware. No Piglit regressions on Ivybridge 0x0166. No fixes either. I know no Ivybridge nor Baytrail bug related to this workaround. However, the BSpec says the extra alignment is required, so let's do it. v2: Apply to all hardware, not just gen7. CC: "9.2, 10.0" <[email protected]> CC: Anuj Phogat <[email protected]> Reviewed-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]> (cherry picked from commit 998018d7be1380f055fb577b0782004657cc9509)
* i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTsChad Versace2013-12-061-6/+12
| | | | | | | | | | | | | | | | | | | | | | | Pre-patch, the workaround was applied to only HSW GT3. However, the workaround also fixes render corruption on the HSW GT1 Chromebook, codenamed Falco. Also, update the BSpec quote that discusses the workaround to reflect the latest BSpec. The BSpec states that the workaround is required for Ivybridge and Baytrail as well as Haswell. But, we apply the workaround to only Haswell because (a) we suspect that is the only hardware where it is actually required and (b) we haven't yet validated the workaround for the other hardware. CC: "9.2, 10.0" <[email protected]> CC: Anuj Phogat <[email protected]> OTC-Tracker: CHRMOS-812 Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]> (cherry picked from commit 8b659cef3a244b1acdbbca0beb704a66b6bc2fbc)
* i965/gen6: Fix multisample resolve blits for luminance/intensity 32F formats.Paul Berry2013-12-061-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | On gen6, multisamble resolve blits use the SAMPLE message to blend together the 4 samples for each texel. For some reason, SAMPLE doesn't blend together the proper samples when the source format is L32_FLOAT or I32_FLOAT, resulting in blocky artifacts. To work around this problem, sample from the source surface using R32_FLOAT. This shouldn't affect rendering correctness, because when doing these resolve blits, the destination format is R32_FLOAT, so the channel replication done by L32_FLOAT and I32_FLOAT is unnecessary. Fixes piglit tests on Sandy Bridge: - spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float - spec/ARB_texture_float/multisample-formats 4 GL_ARB_texture_float No piglit regressions on Sandy Bridge. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70601 Cc: Kenneth Graunke <[email protected]> Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit c4cf487315f1f5375534f1677177983fa496d577)
* i965: Always reserve binding table space for at least one render target.Kenneth Graunke2013-11-281-1/+4
| | | | | | | | | | | | | | | | | | | | | | In brw_update_renderbuffer_surfaces(), if there are no color draw buffers, we always set up a null render target at surface index 0 so we have something to use with the FB write marking the end of thread. However, when we recently began computing surface indexes dynamically, we failed to reserve space for it. This meant that the first texture would be assigned surface index 0, and our closing FB write would clobber the texture. Fixes Piglit's EXT_packed_depth_stencil/fbo-blit-d24s8 test on Gen4-5, which regressed as of commit 4e5306453da6a1c076309e543ec92d999e02f67a ("i965/fs: Dynamically set up the WM binding table offsets.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70605 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Tested-by: lu hua <[email protected]> Cc: "10.0" [email protected] (cherry picked from commit c4815f6cd6f659acd361f1b4cf63473a46ca7de9)
* dri: Allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in driCreateContextAttribsIan Romanick2013-11-281-2/+4
| | | | | | | | | Signed-off-by: Ian Romanick <[email protected]> Reported-by: Zhenyu Wang <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit 73e9aa9e3f73d69ce4f0b68e74702d67842a230c)
* i965: Only enable __DRI2_ROBUSTNESS if kernel support is availableIan Romanick2013-11-283-19/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a squash of the following two cherry-picked patches: i965: Only enable __DRI2_ROBUSTNESS if kernel support is available Rather than always advertising the extension but failing to create a context with reset notifiction, just don't advertise it. I don't know why it didn't occur to me to do it this way in the first place. NOTE: Kristian requested that I provide a follow-up for master that dynamically generates the list of DRI extensions instead of selected between two hardcoded lists. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Kristian Høgsberg <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit 9b1c68638d8096304d3c4e0cceb97bb4dc61acc5) and i965: Properly reject __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS when __DRI2_ROBUSTNESS is not enabled Only allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in brwCreateContext if intelInitScreen2 also enabled __DRI2_ROBUSTNESS (thereby enabling GLX_ARB_create_context). This fixes a regression in the piglit test "glx/GLX_ARB_create_context/invalid flag" v2: Remove commented debug code. Noticed by Jordan. Signed-off-by: Ian Romanick <[email protected]> Reported-by: Paul Berry <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit 53a65e547c0bf769fff48b4ccb41d1477daa70de)
* i965/gs: Set GS prog_data to NULL if there is no GS program.Paul Berry2013-11-261-0/+7
| | | | | | | | | | | | | | The previous commit fixes a bug wherein we would incorrectly refer to stale geometry shader prog_data when no geometry shader was active. This patch reduces the likelihood of that sort of bug occurring in the future by setting prog_data to NULL whenever there is no GS program. Cc: [email protected] Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit 37bdde1087584f4c1839e14db75c157b83246ebd)
* i965/gs: Properly skip GS binding table upload when no GS active.Paul Berry2013-11-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, in brw_gs_upload_binding_table(), we checked whether brw->gs.prog_data was NULL in order to determine whether a geometry shader was active. This didn't work: brw->gs.prog_data starts off as NULL, but it is set to non-NULL when a geometry shader program is built, and then never set to NULL again. As a result, if we called brw_gs_upload_binding_table() while there was no geometry shader active, but a geometry shader had previously been active, it would refer to a stale (and possibly freed) prog_data structure. This patch fixes the problem by modifying brw_gs_upload_binding_table() to use the proper technique to determine whether a geometry shader is active: by checking whether brw->geometry_program is NULL. This fixes the crash reported in comment 2 of bug 71870 (the incorrect rendering remains, however). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870 Cc: [email protected] Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit 2714ca81b9bad3dec3894fac97f34502c80b1697)
* i965: Use __attribute__((flatten)) on fast tiled teximage code.Kenneth Graunke2013-11-261-2/+8
| | | | | | | | | | | | | | | | | | | | The fast tiled texture upload code does not compile with GCC 4.8's -Og optimization flag. memcpy() has the always_inline attribute set. This poses a problem, since {x,y}tile_copy_faster calls it indirectly via {x,y}tile_copy, and {x,y}tile_copy normally aren't inlined at -Og. Using __attribute__((flatten)) tells GCC to inline every function call inside the function, which I believe was the author's intent. Fix suggested by Alexander Monakov. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chad Versace <[email protected]> Cc: [email protected] (cherry picked from commit ad542a10c5f2284c05036f1df8ce5b69bea66e50)
* i965: Fix streamed state dumping/annotation after the blorp-flush change.Eric Anholt2013-11-231-1/+0
| | | | | | | | | | I think I was thinking of the batch command packet cache when I pasted this in, but this counter is only used for dumping out streamed state for INTEL_DEBUG=batch and for putting annotations in our aub files. Cc: "10.0" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit 5891f981452c1c5ed45b5a7e5fe54a9884ced2b6)
* i965: Fix fast clear of depth buffers.Paul Berry2013-11-231-2/+10
| | | | | | | | | | | | | | | | | | From section 4.4.7 (Layered Framebuffers) of the GLSL 3.2 spec: When the Clear or ClearBuffer* commands are used to clear a layered framebuffer attachment, all layers of the attachment are cleared. This patch fixes the fast depth clear path. Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-depth". Cc: "10.0" <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jordan Justen <[email protected]> (cherry picked from commit 08315233509f1fa7dc1e877aed2a8517296cf86e)
* i965: Fix blorp clear of layered framebuffers.Paul Berry2013-11-231-7/+19
| | | | | | | | | | | | | | | | | | From section 4.4.7 (Layered Framebuffers) of the GLSL 3.2 spec: When the Clear or ClearBuffer* commands are used to clear a layered framebuffer attachment, all layers of the attachment are cleared. This patch fixes the blorp clear path for color buffers. Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-color". Cc: "10.0" <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jordan Justen <[email protected]> (cherry picked from commit c1019670ea89505ea7411629c052d662c8eb6be6)
* i965: refactor blorp clear code in preparation for layered clears.Paul Berry2013-11-231-53/+66
| | | | | | | | Cc: "10.0" <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jordan Justen <[email protected]> (cherry picked from commit 1ec5365429b46a39a06186092502c8e66fb4140e)
* mesa: Track number of layers in layered framebuffers.Paul Berry2013-11-233-3/+3
| | | | | | | | | | | | | | | | | | | | | In order to properly clear layered framebuffers, we need to know how many layers they have. The easiest way to do this is to record it in the gl_framebuffer struct when we check framebuffer completeness. This patch replaces the gl_framebuffer::Layered boolean with a gl_framebuffer::NumLayers integer, which is 0 if the framebuffer is not layered, and equal to the number of layers otherwise. v2: Remove gl_framebuffer::Layered and make gl_framebuffer::NumLayers always have a defined value. Fix factor of 6 error in the number of layers in a cube map array. Cc: "10.0" <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit 95140740ad1c6cd8a34002c307556f5c49a34589)
* mesa/swrast: fix inverted front buffer rendering with old-school swrastDave Airlie2013-11-231-2/+2
| | | | | | | | | | | | | | | I've no idea when this broke, but we have some people who wanted it fixed, so here's my attempt. reproducer, run readpix with swrast hit f, or run trival tri -sb things are upside down, after this patch they aren't. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62142 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66213 Cc: <[email protected]>" Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit a43b49dfb13dc7e6d2d6d7ceee892077038d7102)
* i965: Link -ldl after libmesa.laMatt Turner2013-11-231-1/+1
| | | | | | | | | DLOPEN_LIBS is part of DRI_LIB_DEPS. Cc: "10.0" <[email protected]>" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71512 Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit 1f9092958d365c94af825c3b3b6664688c27b5a1)
* i965/vec4: Fix broken IR annotation in debug output.Paul Berry2013-11-231-1/+0
| | | | | | | | | | | | | Commit 70953b5 (i965: Initialize all member variables of vec4_instruction on construction) inadvertently added a line to the vec4_instruction constructor setting this->ir to NULL, wiping out the previously set value. As a result, ever since then, the output of INTEL_DEBUG=vs and INTEL_DEBUG=gs has been missing IR annotations. Cc: "10.0" <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit 60b1a118e123493624324ae191d05870e95968f3)
* i965/gen7: Emit workaround flush when changing GS enable state.Paul Berry2013-11-237-22/+72
| | | | | | | | | | | v2: Don't go to extra work to avoid extraneous flushes. (Previous experiments in the kernel have suggested that flushing the pipeline when it is already empty is extremely cheap). Cc: "10.0" <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit 7dfb4b2d00ddb8e5ee24d4c58eb9415dc4ccc21c)
* i965: Add missing break in SHADER_OPCODE_GEN7_SCRATCH_READ case.Vinson Lee2013-11-231-0/+2
| | | | | | | | | | Fixes "Missing break in switch" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit b570c4229fe9c621b56bb9475d77a344039444d4)
* i965: Fix vertical alignment for multisampled buffers.Paul Berry2013-11-151-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit Size): j [vertical alignment] = 4 for any render target surface is multisampled (4x) From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most messages), under the "Surface Vertical Alignment" heading: This field is intended to be set to VALIGN_4 if the surface was rendered as a depth buffer, for a multisampled (4x) render target, or for a multisampled (8x) render target, since these surfaces support only alignment of 4. Back in 2012 when we added multisampling support to the i965 driver, we forgot to update the logic for computing the vertical alignment, so we were often using a vertical alignment of 2 for multisampled buffers, leading to subtle rendering errors. Note that the specs also require a vertical alignment of 4 for all Y-tiled render target surfaces; I plan to address that in a separate patch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077 Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit b4c3b833ec8ec6787658ea90365ff565cd8846c7)
* dri: Change value param to unsignedIan Romanick2013-11-154-4/+4
| | | | | | | | | | This silences some compiler warnings in i915 and i965. See also 75982a5. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit a15a19f0d1b024f5f18f1dfe878ae8d399e38469)
* i965: Use drm_intel_get_aperture_sizes instead of hard-coded 2GiBIan Romanick2013-11-151-3/+7
| | | | | | | | | | | Systems with little physical memory installed will report less than 2GiB, and some systems may (hypothetically?) have a larger address space for the GPU. My IVB still reports 1534. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit cb6182bdfa7c8a5a773ec21112886f742df99840)
* i915: Use drm_intel_get_aperture_sizes instead of drmAgpSizeIan Romanick2013-11-151-2/+6
| | | | | | | | | Send the zombie back to the grave before it infects the townsfolk. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: "10.0" <[email protected]> (cherry picked from commit 9fe108db0942ebf41cd1cce0ca6a6444901fbb0a)