path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965/fs: don't make a fake ir_texture in the Mesa IR frontend | Connor Abbott | 2014-10-15 | 1 | -14/+5
  Now that we've made all the texture emit code mostly independent of GLSL IR, this isn't necessary any more.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Refactor the texture emission logic into a single function. | Kenneth Graunke | 2014-10-15 | 3 | -104/+144
  Before, we had 3 different emit functions for the various gens, as well as some ancillary work that was the same across all gens, which was either contained in functions or duplicated across the GLSL IR and Mesa IR backends. Now, we have a single method, emit_texture(), that takes all the information needed to make a texture instruction and handles all the setup. All we have to do to emit a texture instruction while converting from GLSL IR, Mesa IR, or any new backend is to extract the information emit_texture() needs and then call it.
  v2: Significant rebasing (by Ken).
  Signed-off-by: Connor Abbott <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Make gather_channel() not use ir_texture. | Connor Abbott | 2014-10-15 | 2 | -5/+4
  Our new IR won't have ir_texture objects.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Make swizzle_result() not use ir_texture. | Connor Abbott | 2014-10-15 | 3 | -8/+9
  Our new IR won't have ir_texture objects.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: fix integer textures with swizzles | Connor Abbott | 2014-10-15 | 1 | -0/+1
  This happened to work before, but it would convert the output to a float and then back to an integer, which seems bad.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: don't pass in ir_texture to emit_texture_* | Connor Abbott | 2014-10-15 | 3 | -24/+23
  At this point, the only thing it's used for is the opcode.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: don't use ir->type in emit_texture_gen4() | Connor Abbott | 2014-10-15 | 1 | -4/+1
  We already have the type from the original destination.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Don't use ir->lod_info.grad.dPd<x,y> in emit_texture_*. | Connor Abbott | 2014-10-15 | 3 | -18/+31
  This drops a dependency on ir_texture objects.
  v2 (Ken): Rename lod_components to grad_components, as it only has a meaningful value for ir_txd. We could set it to 1 for TXL, but there's no real need.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Don't use ir->coordinate in emit_texture_*. | Connor Abbott | 2014-10-15 | 3 | -31/+39
  This drops a dependency on ir_texture objects.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: make rescale_texcoord() not use ir_texture. | Connor Abbott | 2014-10-15 | 3 | -8/+8
  Our new IR won't have ir_texture objects, but using glsl_type is fine.
  v2 (Ken): Drop redundant ir->coordinate NULL check; rebase.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Make emit_mcs_fetch() not use ir_texture. | Connor Abbott | 2014-10-15 | 2 | -4/+4
  Our new IR won't have ir_texture objects.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Rename "length" to "components" in emit_mcs_fetch(). | Kenneth Graunke | 2014-10-15 | 1 | -6/+6
  This is slightly clearer.
  Based on a patch by Connor Abbott.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965: Make brw_texture_offset() not use ir_texture. | Connor Abbott | 2014-10-15 | 4 | -12/+15
  Our new IR won't have ir_texture objects.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: don't use ir->offset in emit_texture_gen5. | Connor Abbott | 2014-10-15 | 3 | -5/+8
  v2 (Ken): Refactor the Gen7 code separately; rebase.
  Signed-off-by: Connor Abbott <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Move texel offset handling to visit(ir_texture *). | Kenneth Graunke | 2014-10-15 | 3 | -11/+29
  This moves the handling of non-constant texel offset subexpression trees to the place where we visit other such subtrees. It also removes some uses of ir->offset in emit_texture_gen7, which will be useful when we write the backend for our new upcoming IR.
  Based on a patch by Connor Abbott.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965: Drop ir->op != ir_txf condition in offset checking. | Kenneth Graunke | 2014-10-15 | 2 | -4/+3
  brw_lower_unnormalized_offset sets ir->offset to NULL if it applies the texelFetchOffset workarounds, so there's no need to special case it here: there won't be an offset for ir_txf.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965: Restore a lost comment about TXF offset bugs. | Kenneth Graunke | 2014-10-15 | 1 | -0/+5
  Eric's original code to work around TXF offset bugs contained a comment explaining the problem, which was lost when Chris generalized it to an IR transformation (in commit 598ca510b8a118c3c7e18b5d031a2b116120e0a6). This commit adds the original comment to the newer code.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965: Allow CSE on Gen4-5 unary math. | Kenneth Graunke | 2014-10-15 | 1 | -1/+1
  Due to the implicit move-from-GRF, unary math looks a lot like the Gen6+ math instruction: it's a single instruction (SEND) with a GRF source. The difference is that it also implicitly clobbers a message register.
  The only visible effect is that CSE will remove the MRF-clobbering from later math operations. This should be fine; compute_to_mrf and remove_redundant_mrf_writes don't look at the values populated by implied writes, so they can't rely on those values being present. Less interference may actually help those passes make more progress.
  Binary math is still problematic, since it involves a separate MOV instruction to load the second operand. We continue disallowing CSE for binary math operations.
  total instructions in shared programs: 3340303 -> 3340100 (-0.01%)
  instructions in affected programs: 26927 -> 26724 (-0.75%)
  Nothing hurt, gained, or lost. ~6% reduction on a few shaders.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Use the correct regs_written on unspill instructions | Jason Ekstrand | 2014-10-14 | 1 | -0/+1
  Signed-off-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nouveau: 3d textures are unsupported, limit 3d levels to 1 | Ilia Mirkin | 2014-10-14 | 1 | -0/+3
  Ideally there would be a swrast fallback, but the driver isn't ready for that. This should avoid crashes if someone tries to use 3d textures though.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  Cc: [email protected]
* i965: Use unsynchronized maps for the program cache on LLC platforms. | Kenneth Graunke | 2014-10-13 | 1 | -7/+28
  There's no reason to stall on pwrite - the CPU always appends to the buffer and never modifies existing contents, and the GPU never writes it. Further, the CPU always appends new data before submitting a batch that requires it.
  This code predates the unsynchronized mapping feature, so we simply didn't have the option when it was written.
  Ideally, we would do this for non-LLC platforms too, but unsynchronized mapping support only exists for LLC systems.
  Saves a bunch of stall avoidance copies when uploading shaders.
  v2: Rebase on changes to previous patch.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]> [v1]
* i965: Issue performance warnings when copying the program cache BO. | Kenneth Graunke | 2014-10-13 | 1 | -0/+3
  We don't really want unnecessary buffer copying, so it'd be nice to know when it's happening.
  v2: Drop stall warnings when doing a read-only CPU mapping of the cache BO. The GPU also uses it in a read-only fashion, so there won't be any stalls, even though the buffer is busy. (Thanks to Chris Wilson for catching this mistake.)
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]> [v1]
* i965: Issue performance warnings on MapBufferRange stalls. | Kenneth Graunke | 2014-10-13 | 1 | -3/+4
  This is easy: we just need to use brw_map_bo instead of mapping it directly.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
* mesa: fix error reported on glTexSubImage2D when level not valid | Tapani Pälli | 2014-10-10 | 1 | -1/+1
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* i965: Fix register write checks. | Kenneth Graunke | 2014-10-10 | 1 | -0/+2
  When mapping the buffer a second time, we need to use the new pointer, not the one from the previous mapping. Otherwise, we will most likely crash.
  Apparently, we've just been getting lucky and getting the same bo->virtual pointer in both cases. libdrm probably has a hand in that.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
  Cc: [email protected]
* i965: Skip uploading border color when unnecessary. | Kenneth Graunke | 2014-10-09 | 1 | -2/+20
  The border color is only needed when using the GL_CLAMP_TO_BORDER or (deprecated) GL_CLAMP wrap modes; all others ignore it, including the common GL_CLAMP_TO_EDGE and GL_REPEAT wrap modes.
  In those cases, we can skip uploading it entirely, saving a bit of space in the batchbuffer. Instead, we just point it at the start of the batch (offset 0); we have to program something, and that address is safe to read.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
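The decision described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the i965 code: the `wrap_mode` enum, the `batch` struct, the 16-byte upload size, and all function names are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrap modes; real GL code would compare GLenum values. */
enum wrap_mode {
   WRAP_REPEAT,
   WRAP_CLAMP_TO_EDGE,
   WRAP_CLAMP,            /* stands in for the deprecated GL_CLAMP */
   WRAP_CLAMP_TO_BORDER,
};

/* Hypothetical batch buffer that only tracks its next free offset. */
struct batch {
   uint32_t next_offset;
};

/* Only these two modes ever sample the border color. */
static bool
wrap_mode_uses_border_color(enum wrap_mode mode)
{
   return mode == WRAP_CLAMP || mode == WRAP_CLAMP_TO_BORDER;
}

/* Stand-in for a real upload: reserve 16 bytes (an assumed size)
 * and return the offset where they start. */
static uint32_t
upload_border_color(struct batch *batch)
{
   uint32_t offset = batch->next_offset;
   batch->next_offset += 16;
   return offset;
}

/* Return the offset to program into the sampler state. If no wrap mode
 * needs the border color, skip the upload and point at offset 0. */
static uint32_t
border_color_offset(struct batch *batch, const enum wrap_mode wrap[3])
{
   assert(batch != NULL);

   for (int i = 0; i < 3; i++) {
      if (wrap_mode_uses_border_color(wrap[i]))
         return upload_border_color(batch);
   }
   return 0;
}
```

With this shape, a sampler using only repeat or clamp-to-edge modes consumes no batch space for the border color, while one border-clamped axis is enough to trigger the upload.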
* i965: Use BDW_MOCS_PTE for renderbuffers. | Kenneth Graunke | 2014-10-09 | 1 | -1/+1
  Write-back caching cannot be used for buffers being scanned out by the display engine; surfaces used for scan-out must be write-through or uncached. I originally chose WT for render targets because it works in all cases.
  However, we really want to use write-back caching where possible, as it is more efficient. Most renderbuffers are not used for scanout - off-screen FBOs certainly are fine, and non-pageflipped backbuffers should be fine as well. So in most cases WB will work.
  However, we don't know what will be used for scan-out, so we instead simply use the PTE value specified by the kernel, as it knows these things. This matches our MOCS choice on Haswell.
  Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5 in a microbenchmark (spotted by Eero Tamminen).
  Improves performance in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a Broadwell GT2.
  Improves performance in a bunch of other microbenchmarks by ~15% or so.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reported-by: Eero Tamminen <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Cc: [email protected]
* i965: Add a BRW_MOCS_PTE #define. | Kenneth Graunke | 2014-10-09 | 1 | -3/+7
  Like BDW_MOCS_WB and BDW_MOCS_WT, this specifies that we want to use all three caches (L3, LLC, and eLLC where available), but leaves the LLC caching mode up to the kernel's page table entry. This allows the kernel to pick WB/WT/UC based on whether it's using a buffer for scanout.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Cc: [email protected]
* mesa: Make _mesa_print_arrays use stderr. | Kenneth Graunke | 2014-10-09 | 1 | -3/+3
  These days, most driver debug output happens via stderr, not stdout. Some applications (such as Xephyr) also appear to close stdout, which makes these messages go nowhere.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* i965/compaction: Disable compaction on SNB temporarily. | Matt Turner | 2014-10-03 | 1 | -0/+6
  Will investigate after XDC.
* Revert "i965: Emit ELSE/ENDIF JIP with type D on Gen 7." | Matt Turner | 2014-10-03 | 1 | -2/+2
  This reverts commit 54e30dbf4db437748509d1319c3f6e4185f76c69.
  Will investigate after XDC.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84557
* i965/fs: Remove dead generate_rep_fb_write prototype. | Matt Turner | 2014-10-03 | 1 | -1/+0
  Added in commit f9dc7aab.
* mesa: fix spurious wglGetProcAddress / GL_INVALID_OPERATION error | Brian Paul | 2014-10-03 | 1 | -1/+35
  On Windows, the Piglit primitive-restart test was failing a glGetError()==0 assertion when it was run without any command line arguments. Piglit's all.py script only runs primitive-restart with arguments, so this case isn't normally hit during a full piglit run.
  The basic problem is that Microsoft's opengl32.dll calls glFlush from wglGetProcAddress(), and Piglit uses wglGetProcAddress() to resolve glPrimitiveRestartNV(), which is called inside glBegin/End. See comments in the code for more info.
  Plus, improve the comments for _mesa_alloc_dispatch_table().
  Cc: <[email protected]>
  Acked-by: Sinclair Yeh <[email protected]>
* mesa: fix GetTexImage for 1D array depth textures | Dave Airlie | 2014-10-03 | 1 | -2/+7
  While running piglit in virgl, I hit an assert in the intel driver:
  "qemu-system-x86_64: intel_tex.c:219: intel_map_texture_image: Assertion `tex_image->TexObject->Target != 0x8C18 || h == 1' failed."
  Thanks to Eric and Ken for pointing me in the right direction. Fix get_tex_depth to do the same fixup as get_tex_rgba does for 1D array textures.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Cc: [email protected]
  Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: Fix paths used in Android builds | Tomasz Figa | 2014-10-03 | 3 | -0/+6
  With current makefiles the build fails because source and build paths are generated incorrectly. With the Android build system, the top_srcdir and top_builddir variables are undefined and all paths are relative to where Android.mk is located. This ends up with paths like external/mesa/src/mesa/src/mesa/ for both source and build paths, which are obviously wrong.
  This patch fixes this by overriding the resulting SRCDIR and BUILDDIR variables with an empty string, so that paths end up being relative to the Android.mk file again. Appending the correct build path to generated files is already done in Android.gen.mk.
  Signed-off-by: Tomasz Figa <[email protected]>
  CC: <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/mesa: Generate format_info.c in Android builds | Tomasz Figa | 2014-10-03 | 1 | -0/+9
  Current Android makefiles lack generation of format_info.c, which is a dependency of main/format.c. This patch adds necessary code to Android.gen.mk.
  Signed-off-by: Tomasz Figa <[email protected]>
  CC: <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* util: Include in Android builds | Tomasz Figa | 2014-10-03 | 4 | -2/+5
  This patch fixes Android build failures by including src/util directory in compilation. Files inside of this directory are compiled into libmesa_util static library and linked with resulting libGLES_mesa.
  Signed-off-by: Tomasz Figa <[email protected]>
  CC: <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* i965/fs: Use the correct base_mrf for spilling pairs in SIMD8 | Jason Ekstrand | 2014-10-02 | 1 | -3/+4
  Before, we were hard-coding the base_mrf based on dispatch width, not the number of registers spilled at a time. This caused us to emit instructions with a base_mrf of 14 and a mlen of 3, so we used the magical nonexistent m16 register. This fixes the problem.
  Signed-off-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
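The arithmetic behind this bug can be sketched as below. This is an illustration under stated assumptions, not the driver's code: it assumes 16 message registers (m0 through m15) and a one-register message header, which is what the numbers in the commit message imply, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOTAL_MRFS        16  /* assumed: m0..m15 exist, m16 does not */
#define SPILL_HEADER_REGS 1   /* assumed: one header register per message */

/* A message occupies registers base_mrf .. base_mrf + mlen - 1. */
static bool
message_fits(int base_mrf, int mlen)
{
   return base_mrf >= 0 && mlen > 0 && base_mrf + mlen <= TOTAL_MRFS;
}

/* Pick the base so the message ends at the last MRF, based on how many
 * registers are spilled at a time rather than on the dispatch width. */
static int
spill_base_mrf(int regs_per_spill)
{
   int mlen = SPILL_HEADER_REGS + regs_per_spill;
   int base = TOTAL_MRFS - mlen;

   assert(message_fits(base, mlen));
   return base;
}
```

Under these assumptions a single-register spill starts at m14 and a pair starts at m13, whereas a base of 14 with a message length of 3 would run one register past the end of the file.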
* i965/fs: Add a MAX_GRF_SIZE define and use it various places | Jason Ekstrand | 2014-10-02 | 4 | -6/+9
  Previously, we had a MAX_SAMPLER_MESSAGE_SIZE which we used instead. However, some FB write messages can validly be longer than this so we need something different. Since MAX_SAMPLER_MESSAGE_SIZE is validly useful on its own, we leave it alone and add a new MAX_GRF_SIZE that's big enough for FB writes.
  Signed-off-by: Jason Ekstrand <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84539
  Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Use the actual register width in brw_reg_from_fs_reg | Jason Ekstrand | 2014-10-02 | 1 | -0/+13
  This fixes a bug where 1-wide operations don't properly translate down to 1-wide instructions.
  Signed-off-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/fs_fp: Use null_reg from fs_visitor instead of rolling our own | Jason Ekstrand | 2014-10-02 | 1 | -6/+4
  Signed-off-by: Jason Ekstrand <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84529
  Reviewed-by: Matt Turner <[email protected]>
* mesa: relax draw api validation on ES2 | Tapani Pälli | 2014-10-02 | 1 | -3/+2
  Patch fixes failing test in WebGL conformance test 'point-no-attributes' when running Chrome on OpenGL ES. (Shader program may draw points using constant data in shader.)
  No Piglit regressions.
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Fix make check failures in setup_glsl_msaa_blit_scaled_shader() | Anuj Phogat | 2014-10-01 | 1 | -8/+9
  introduced by commit 68ee950.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reported-by: Mark Janes <[email protected]>
* mesa: fix _mesa_alloc_dispatch_table() declaration | Brian Paul | 2014-10-01 | 1 | -1/+1
  Insert 'void' parameter to match declaration in api_exec.h. Trivial.
* meta: (trivial) remove accidental double semicolon | Roland Scheidegger | 2014-10-01 | 1 | -1/+1
* i965: Enable EXT_framebuffer_multisample_blit_scaled for gen8 | Anuj Phogat | 2014-10-01 | 1 | -2/+1
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* meta: Implement ext_framebuffer_multisample_blit_scaled extension | Anuj Phogat | 2014-10-01 | 2 | -13/+199
  The extension enables doing a multisample buffer resolve and buffer scaling using a single glBlitFramebuffer() call. Currently, we have this extension implemented in BLORP, which is only used by SNB and IVB. This patch implements the extension in the meta path, which makes it available to Broadwell.
  Implementation features:
  - Supports scaled resolves of 2X, 4X and 8X multisample buffers.
  - Avoids unnecessary shader compilations by storing the precompiled shaders for each supported sample count.
  - Uses bilinear filtering for both GL_SCALED_RESOLVE_FASTEST_EXT and GL_SCALED_RESOLVE_NICEST_EXT filter options. This is an allowed behavior in the extension's spec.
  - I tried doing bicubic filtering for the GL_SCALED_RESOLVE_NICEST_EXT filter. It made the edges in the image look a little smoother, but the image gets blurred, causing no overall quality improvement. For now I have dropped the idea of doing different filtering for the nicest filter.
  V2:
  - Minor changes to simplify the fragment shader.
  - Refactor the code to move i965 specific sample_map computation out of Meta. We now use ctx->Const.SampleMap{2,4,8}x variables initialized by the driver.
  - Use a simple msaa resolve shader for scaled resolves with scaling factor = 1.0.
  V3:
  - Make changes to create a string out of ctx->Const.SampleMap{2,4,8}x variables and use it in fragment shader.
  V4:
  - Make changes to use uint8_t type ctx->Const.SampleMap{2,4,8}x variables.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* i965: Initialize the SampleMap{2,4,8}x variables | Anuj Phogat | 2014-10-01 | 3 | -0/+55
  with values specific to Intel hardware.
  V2: Define and use gen6_get_sample_map() function to initialize the variables.
  V3: Change the function name to gen6_set_sample_maps() and use memcpy() to fill in the data.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* mesa: Add new variables in gl_context to store sample layout | Anuj Phogat | 2014-10-01 | 1 | -0/+32
  SampleMap{2,4,8}x variables are used in later patches to implement EXT_framebuffer_multisample_blit_scaled extension.
  V2: Use integer array instead of a string. Bump up the comment.
  V3: Use uint8_t type array.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* mesa: Avoid flagging _NEW_VIEWPORT on redundant viewport updates. | Kenneth Graunke | 2014-10-01 | 1 | -0/+6
  Cuts the number of i965 color calculator viewport uploads by 100x (11017983 -> 113385) in 'x11perf -gc' with Glamor in Xephyr.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
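The general pattern behind this change, returning early when the new values match the stored ones so that no dirty flag is raised, can be sketched as follows. The structs, the `DIRTY_VIEWPORT` bit, and `set_viewport()` are hypothetical stand-ins for illustration, not Mesa's actual gl_context or _NEW_VIEWPORT definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical analogue of a dirty-state bit such as _NEW_VIEWPORT. */
#define DIRTY_VIEWPORT (1u << 0)

/* Hypothetical stand-ins; these are not Mesa's real structures. */
struct viewport_state {
   float x, y, width, height;
};

struct context {
   struct viewport_state viewport;
   unsigned new_state;   /* bitmask of dirty-state flags */
};

/* Store a new viewport, flagging dirty state only when a value changes.
 * Returns true if the state was actually flagged. */
static bool
set_viewport(struct context *ctx, float x, float y, float width, float height)
{
   assert(ctx != NULL);
   struct viewport_state *vp = &ctx->viewport;

   if (vp->x == x && vp->y == y &&
       vp->width == width && vp->height == height)
      return false;   /* redundant update: leave the dirty flags alone */

   vp->x = x;
   vp->y = y;
   vp->width = width;
   vp->height = height;
   ctx->new_state |= DIRTY_VIEWPORT;
   return true;
}
```

A caller that sets the same viewport every frame then triggers a single state upload instead of one per call, which is the kind of saving the numbers in the commit message describe.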