Commit message (Author, Age, Files, Lines)
* Use -Bsymbolic when linking libEGL.so (Carl Worth, 2013-09-30, 1 file, -1/+1)
  For some reason that I don't yet fully understand, Glaze does not work with libEGL unless libEGL is linked with -Bsymbolic.[*]

  Beyond that specific reason, all of the reasons for which libGL.so is linked with -Bsymbolic (see the commit history) should also apply here.

  [*] The specific behavior I am seeing is that when Glaze calls dlopen for libEGL.so, ifunc resolvers within Glaze for EGL functions are called before the dlopen returns. These resolvers cannot succeed, as they need the return value from dlopen in order to find the functions to resolve to. I don't know what's causing these resolvers to be called, but I have verified that linking libEGL with -Bsymbolic causes this problematic behavior to stop.

  CC: "9.1 and 9.2" <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* i965/blorp: retype destination register for texture SEND instruction to UW. (Paul Berry, 2013-09-30, 1 file, -1/+1)
  From the bspec documentation of the SEND instruction:

  "destination region cannot cross the 256-bit register boundary."

  To avoid violating this restriction when executing SIMD16 texturing operations (such as those used by blorp), we need to ensure that the destination of the SEND instruction doesn't exceed 256 bits in size. An easy way to do this is to set the type of the destination register to UW (unsigned word), since 16 unsigned words can fit inside a 256-bit register. Fortunately, this has no effect on the sampling operation, since the sampler always infers the destination data type from the sampler message rather than from the type of the instruction operand.

  Previously, we did this for texturing operations issued by the vec4 and fs back-ends, but not for blorp. This patch makes blorp use the same trick.

  I haven't observed any behavioural difference on actual hardware due to this patch, but it avoids a warning from the simulator so it seems like the right thing to do.

  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Acked-by: Chad Versace <[email protected]>
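The size argument in this message can be checked with a little arithmetic. The following is only a sketch: the helper names are hypothetical and not part of the i965 driver, and it assumes 16 channels for SIMD16, 2 bytes per UW element, and 4 bytes per 32-bit element.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; only the arithmetic comes from
 * the commit message above. */
enum { REGISTER_BITS = 256 };

/* Bits covered by a destination region of `channels` elements,
 * each `type_bytes` wide. */
unsigned dst_region_bits(unsigned channels, unsigned type_bytes)
{
    return channels * type_bytes * 8;
}

/* True when the destination region stays within one 256-bit register. */
bool dst_fits_in_one_register(unsigned channels, unsigned type_bytes)
{
    return dst_region_bits(channels, type_bytes) <= REGISTER_BITS;
}
```

With these assumptions, `dst_fits_in_one_register(16, 2)` holds for a UW destination, while a 4-byte type at SIMD16 would cover 512 bits and fail the check.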
* i965: Add a real native TexStorage path. (Eric Anholt, 2013-09-30, 1 file, -0/+63)
  We originally had a path that just did the loop and called ctx->Driver.AllocTextureImageBuffer(), which I moved into Mesa core. But we can do better, avoiding incorrect miptree size guesses and later texture validations by just directly allocating the miptree and setting it to all the images.

  v2: drop debug printf.

  Reviewed-by: Chad Versace <[email protected]>
* i965: Add missing license to intel_tex_validate.c. (Eric Anholt, 2013-09-30, 1 file, -0/+23)
  I've rewritten a lot of this file.

  Reviewed-by: Chad Versace <[email protected]>
* i965: Always allocate validated miptrees from level 0. (Eric Anholt, 2013-09-30, 1 file, -6/+5)
  No change in copies during a piglit run, but it's one less first_level != 0 in our codebase.

  Reviewed-by: Chad Versace <[email protected]>
* i965: Don't relayout a texture just for baselevel changes. (Eric Anholt, 2013-09-30, 2 files, -24/+39)
  As long as the baselevel, maxlevel still sit inside the range we had previously validated, there's no need to reallocate the texture.

  I also hope this makes our texture validation logic much more obvious. It's taken me enough tries to write this change, that's for sure.

  Reduces miptree copy count on a piglit run by 1.3%, though the change in amount of data moved is much smaller.

  Reviewed-by: Chad Versace <[email protected]>
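The condition described in this message amounts to a range-containment test. A minimal sketch follows, assuming a hypothetical `validated_range` struct that records the levels the existing miptree was validated for; it is not the driver's actual data structure or function.

```c
#include <stdbool.h>

/* Hypothetical record of the level range an existing miptree covers. */
struct validated_range {
    int first_level;
    int last_level;
};

/* A relayout is only needed when the requested [base_level, max_level]
 * range falls outside the range that was previously validated. */
bool needs_relayout(const struct validated_range *mt,
                    int base_level, int max_level)
{
    return base_level < mt->first_level || max_level > mt->last_level;
}
```

Under this sketch, moving the base level from 0 to 1 inside a tree validated for levels 0 through 4 leaves the texture alone.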
* i965: Don't allocate a 1-level texture when GL_GENERATE_MIPMAP is set. (Eric Anholt, 2013-09-30, 1 file, -1/+2)
  Given that a teximage that calls us with this flag set will immediately proceed to allocate the other levels, we can probably just go ahead and allocate those levels now.

  Reduces miptree copies in piglit by about .05%.

  Reviewed-by: Chad Versace <[email protected]>
* i965: Stop allocating miptrees with first_level != 0. (Eric Anholt, 2013-09-30, 1 file, -17/+6)
  If the caller shows up with GL_BASE_LEVEL != 0, it doesn't mean that the texture will over the course of its lifetime have that nonzero baselevel, it means that the caller is filling the texture from the bottom up for some reason (one could imagine demand-loading detailed texture layers at runtime, for example).

  If we allocate from just the current baselevel, it means when they come along with the next level up, we'll have to allocate a new miptree and copy all of our bits out of the first miptree.

  Reviewed-by: Chad Versace <[email protected]>
* i965: Drop a special case for guessing small miptree levels. (Eric Anholt, 2013-09-30, 1 file, -43/+30)
  Let's say you started allocating your 2D texture with level 2 of a tree as a 1x1 image. The driver doesn't know if this means that level 0 is 4x4 or 4x1 or 1x4, so we would just allocate a single 1x1 and let it get copied in to the real location at texture validate time later.

  Since this is just a temporary allocation that *will* get copied, the extra space allocation of just taking the normal path, which will happen to produce a 4x1 level 0, 2x1 level 1, and 1x1 level 2, is the right way to go, to reduce complexity in the normal case.

  No change in miptree copies over the course of a piglit run.

  Reviewed-by: Chad Versace <[email protected]>
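The ambiguity mentioned here follows from how mip level sizes are derived: each level halves the previous one and clamps at 1. A small sketch, using a hypothetical helper name rather than anything from the driver:

```c
/* Size of one dimension at mip `level`, given its size at level 0.
 * Each level halves the dimension, clamped to a minimum of 1.
 * Hypothetical helper for illustration only. */
unsigned mip_level_size(unsigned base, unsigned level)
{
    /* Guard against shifting by the full width of the type. */
    if (level >= sizeof(unsigned) * 8)
        return 1;

    unsigned size = base >> level;
    return size > 0 ? size : 1;
}
```

A level 0 of 4x4, 4x1, or 1x4 all yield a 1x1 image at level 2, which is why a 1x1 level 2 alone cannot identify the base size. For a 4x1 base, the chain is 4x1, 2x1, 1x1.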
* i965: Totally switch around how we handle nonzero baselevel-first_level. (Eric Anholt, 2013-09-30, 4 files, -19/+12)
  This has no effect currently, because intel_finalize_mipmap_tree() always makes mt->first_level == tObj->BaseLevel. The change I made before to handle it (b1080cfbdb0a084122fcd662cd27b4748c5598fd) got very close to working, but after fixing some unrelated bugs in the series, it still left tex-miplevel-selection producing errors when testing textureLod().

  The problem is that for explicit LODs, the sampler's LOD clamping is ignored, and only the surface's MIP clamping is respected. So we need to use surface mip clamping, which applies on top of the sampler's mip clamping, so the sampler change gets backed out.

  Now actually tested with a non-regressing series producing a non-zero computed baselevel.

  Reviewed-by: Chad Versace <[email protected]>
* i965: Always look up from the object's mt when setting up texturing state. (Eric Anholt, 2013-09-30, 2 files, -5/+2)
  We know that the object's mt is equal to the firstimage's mt because it's gone through intel_finalize_mipmap_tree(). Saves a lookup of firstimage on pre-gen7.

  v2: Merge in the warning fix that appeared later in the series (noted by Chad)

  Reviewed-by: Chad Versace <[email protected]>
* r600g/sb: Move variable dereference after null check. (Vinson Lee, 2013-09-30, 1 file, -1/+2)
  Fixes "Dereference before null check" defect reported by Coverity.

  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Vadim Girlin <[email protected]>
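The general pattern behind this kind of fix can be sketched as follows. The struct and function are hypothetical and unrelated to the actual r600g/sb code; they only show the ordering that the defect class is about.

```c
#include <stddef.h>

/* Hypothetical type for illustration. */
struct node {
    int value;
};

/* Returns the node's value, or `fallback` when `n` is NULL. */
int node_value_or(const struct node *n, int fallback)
{
    /* Check the pointer first... */
    if (n == NULL)
        return fallback;

    /* ...and only dereference it once the check has passed. Reading
     * n->value before the check would defeat the check entirely. */
    int value = n->value;
    return value;
}
```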
* st/mesa: fix comment typo (Brian Paul, 2013-09-30, 1 file, -1/+1)
* r600g,radeonsi: workaround for late shared screen initialization (Marek Olšák, 2013-09-30, 2 files, -1/+5)
  Accidentally broken by the consolidation.
* r600g: Fix build failure introduced with r600_texture.c consolidation (Laurent Carlier, 2013-09-29, 1 file, -4/+4)
  It seems that the case with OpenCL enabled was forgotten.

  Signed-off-by: Marek Olšák <[email protected]>
* radeon: make texture logging more useful (Marek Olšák, 2013-09-29, 5 files, -26/+23)
  This has been very useful for tracking down bugs in libdrm.

  The *_PRINT_TEXDEPTH environment variables were probably never used, so I removed them.
* r600g,radeonsi: share r600_texture.c (Marek Olšák, 2013-09-29, 18 files, -1228/+367)
  The function r600_choose_tiling is new and needs a review.

  The only change in functionality is that it enables 2D tiling for compressed textures on SI. It was probably accidentally turned off.

  v2: don't make scanout buffers linear
* r600g: remove compute_global_transfer_* calls from texture_transfer_map/unmap (Marek Olšák, 2013-09-29, 1 file, -9/+0)
  Textures can never have target==PIPE_BUFFER.
* r600g: move the low-level buffer functions for multiple rings to drivers/radeon (Marek Olšák, 2013-09-29, 11 files, -88/+87)
  Also slightly optimize r600_buffer_map_sync_with_rings.
* r600g,radeonsi: consolidate tiling_info initialization (Marek Olšák, 2013-09-29, 12 files, -217/+148)
  and the util_format_s3tc_init calls too.
* radeonsi: implement clear_buffer using CP DMA, initialize CMASK with it (Marek Olšák, 2013-09-29, 4 files, -19/+108)
  More work needs to be done for this to be entirely shared with r600g. I'm just trying to share r600_texture.c now.

  The reason I put the implementation in si_descriptors.c is that the emit function had already been there.
* r600g: move aux_context and r600_screen_clear_buffer to drivers/radeon (Marek Olšák, 2013-09-29, 6 files, -66/+74)
  This will be used in the next commit.
* radeonsi: move debug options to R600_DEBUG (Marek Olšák, 2013-09-29, 6 files, -38/+41)
* r600g: move some debug options to drivers/radeon (Marek Olšák, 2013-09-29, 10 files, -52/+61)
* r600g,radeonsi: share the async dma interface (Marek Olšák, 2013-09-29, 8 files, -51/+61)
  r600_texture.c is one step closer to r600g.
* radeonsi: move radeonsi-specific functions out of r600_texture.c (Marek Olšák, 2013-09-29, 4 files, -46/+38)
* r600g,radeonsi: remove unused code (Marek Olšák, 2013-09-29, 2 files, -4/+0)
* r600g: move r600g-specific functions out of r600_texture.c (Marek Olšák, 2013-09-29, 4 files, -467/+461)
* r600g,radeonsi: consolidate r600_texture structures (Marek Olšák, 2013-09-29, 3 files, -42/+26)
* r600g: get rid of r600_texture::is_rat (Marek Olšák, 2013-09-29, 2 files, -8/+1)
  It's always 0.
* r600g: get rid of r600_texture::array_mode (Marek Olšák, 2013-09-29, 3 files, -25/+4)
* r600g,radeonsi: consolidate transfer, cmask, and fmask structures (Marek Olšák, 2013-09-29, 9 files, -127/+94)
* radeon drivers: handle PIPE_CAP_MAX_VIEWPORTS (Marek Olšák, 2013-09-29, 3 files, -0/+9)
* radeon/llvm: fix TGSI_OPCODE_UCMP (Marek Olšák, 2013-09-29, 1 file, -3/+7)
  This doesn't fix any known issue (I haven't run piglit with this yet), but the code was obviously completely wrong. It looks like it was copy-pasted from CMP.

  Reviewed-by: Tom Stellard <[email protected]>
* st/mesa: fix GLSL mix(.., .., bvecN) (Marek Olšák, 2013-09-29, 1 file, -1/+8)
  v2: use CMP on drivers without native integer support
* configure.ac: Add a more informative warning when libclc.pc is not found v2 (Tom Stellard, 2013-09-27, 1 file, -6/+11)
  v2:
  - Don't display an error message when the user doesn't ask for libclc.

  Reviewed-by: Matt Turner <[email protected]>
* mesa: Include stdint.h in mtypes.h for uint32_t symbol. (Vinson Lee, 2013-09-26, 1 file, -0/+2)
  This patch fixes the MSVC build error introduced with commit b2e327e08f8519da131dd382adcc99240d433404.

  api_arrayelt.c
  src\mesa\main/mtypes.h(1809) : error C2061: syntax error : identifier 'uint32_t'
  src\mesa\main/mtypes.h(1810) : error C2059: syntax error : '}'
  src\mesa\main/mtypes.h(1825) : error C2079: 'Minimum' uses undefined union 'gl_perf_monitor_counter_value'
  src\mesa\main/mtypes.h(1828) : error C2079: 'Maximum' uses undefined union 'gl_perf_monitor_counter_value'

  Signed-off-by: Vinson Lee <[email protected]>
* i965/fs: Don't double-accept operands of logical and/or/xor operations. (Kenneth Graunke, 2013-09-26, 1 file, -7/+4)
  If the argument to emit_bool_to_cond_code() is an ir_expression, we loop over the operands, calling accept() on each of them, which generates assembly code to compute that subexpression. We then emit one or two final instructions that perform the top-level operation on those operands.

  If it's not an expression (say, a boolean-valued variable), we simply call accept() on the whole value.

  In commit 80ecb8f1 (i965/fs: Avoid generating extra AND instructions on bool logic ops), Eric made logic operations jump out of the expression path to the non-expression path.

  Unfortunately, this meant that we would first accept() the two operands, skip generating any code that used them, then accept() the whole expression, generating code for the operands a second time.

  Dead code elimination would always remove the first set of redundant operand assembly, since nothing actually used them. But we shouldn't generate it in the first place.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Add #define for MI_REPORT_PERF_COUNT on Gen6+. (Kenneth Graunke, 2013-09-26, 1 file, -0/+2)
  This appears in Volume 1 Part 1 of the Sandybridge PRM on page 48.

  Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add support for GL_AMD_performance_monitor on Ironlake. (Kenneth Graunke, 2013-09-26, 6 files, -0/+420)
  Ironlake's counters are always enabled; userspace can simply send a MI_REPORT_PERF_COUNT packet to take a snapshot of them. This makes it easy to implement.

  The counters are documented in the source code for the intel-gpu-tools intel_perf_counters utility.

  v2: Adjust for core data structure changes. Add a table mapping buffer object offsets to exposed counters (which changes each generation). Finally, add report ID assertions to sanity check the BO layout (thanks to Carl Worth).

  v3: Update for core BeginPerfMonitor hook changes (requested by Brian).

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add core support for the GL_AMD_performance_monitor extension. (Kenneth Graunke, 2013-09-26, 13 files, -0/+913)
  This provides an interface for applications (and OpenGL-based tools) to access GPU performance counters. Since the exact performance counters available vary between vendors and hardware generations, the extension provides an API the application can use to get the names, types, and minimum/maximum values of all available counters. Counters are also organized into groups.

  Applications create "performance monitor" objects, select the counters they want to track, and Begin/End monitoring, much like OpenGL's query API. Multiple monitors can be in flight simultaneously.

  v2: Pass ctx to all driver hooks (suggested by Christoph), and attempt to fix overallocation of bitsets (caught by Christoph). Incomplete.

  v3: Significantly rework core data structures. Store counters in groups rather than in a global list. Use their array index in the group's counter list as the ID rather than trying to store a globally unique counter ID. Use bitsets for active counters within a group, and also track which groups are active so that's easy to query.

  v4: Remove _mesa_ prefix on static functions; detect out of memory conditions in new_performance_monitor(); make BeginPerfMonitor hook return a boolean rather than setting m->Active or raising an error. Switch to GLuint/unsigned for NumGroups, NumCounters, and MaxActiveCounters (which also means switching a bunch of temporary variable types). All suggested by Brian Paul. Also, remove commented out code at the bottom of the block. Finally, fix the dispatch sanity test (noticed by Ian Romanick).

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Brian Paul <[email protected]> [v3]
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: Create and use a has_uniform_buffer_objects() helper. (Kenneth Graunke, 2013-09-26, 3 files, -7/+8)
  This is better than overriding the extension enable based on the language version; it's robust against shaders that do:

    #version 140
    #extension GL_ARB_uniform_buffer_object : disable

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: Create and use a has_explicit_attrib_location() helper. (Kenneth Graunke, 2013-09-26, 4 files, -6/+7)
  Explicit attribute locations are supported with GLSL 3.30, GLSL ES 3.00, or "#extension GL_ARB_explicit_attrib_location: enable". Using a helper function makes it easy to check for this. This enables support in GLSL 3.30, which was previously missing.

  Previously, we overrode the extension enable flag for ES 3.00. This is not robust against a shader such as:

    #version 330
    #extension GL_ARB_explicit_attrib_location : disable

  Disabling extensions should not remove core language functionality.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
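The rule stated in this message (GLSL 3.30, GLSL ES 3.00, or the extension enabled) could be expressed roughly as below. This is a simplified sketch in C: the `parse_state` struct, its fields, and the `is_version` helper are assumptions made for illustration and do not reproduce Mesa's actual parser state.

```c
#include <stdbool.h>

/* Hypothetical, simplified parser state. */
struct parse_state {
    bool es_shader;             /* true for GLSL ES shaders */
    unsigned language_version;  /* e.g. 330 for "#version 330" */
    bool ARB_explicit_attrib_location_enable;
};

/* True when the shader's version meets the minimum for its API flavour. */
bool is_version(const struct parse_state *s,
                unsigned required_desktop, unsigned required_es)
{
    unsigned required = s->es_shader ? required_es : required_desktop;
    return s->language_version >= required;
}

/* The feature is available if the extension is enabled OR the language
 * version includes it as core functionality. */
bool has_explicit_attrib_location(const struct parse_state *s)
{
    return s->ARB_explicit_attrib_location_enable || is_version(s, 330, 300);
}
```

Because the version test is independent of the extension flag, a `#version 330` shader that disables the extension still gets the core feature, which is the robustness the message describes.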
* mesa: Remove 'invalidate_state' parameter to _mesa_dirty_texobj(). (Kenneth Graunke, 2013-09-26, 6 files, -14/+10)
  Every caller passed true.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* mesa: Remove some remaining FEATURE_* detritus. (Eric Anholt, 2013-09-26, 8 files, -47/+1)
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix cube array coordinate normalization (Chris Forbes, 2013-09-26, 1 file, -5/+11)
  Hardware requires the magnitude of the largest component to not exceed 1; brw_cubemap_normalize ensures that this is the case.

  Unfortunately, we would previously multiply the array index for cube arrays by the normalization factor. The incorrect array index would then cause the sampler to attempt to access either the wrong cube, or memory outside the cube surface entirely, resulting in garbage rendering or in the worst case, hangs.

  Alter the normalization pass to only multiply the .xyz components.

  Fixes broken rendering in the arb_texture_cube_map_array-cubemap piglit, which was recently adjusted to provoke this behavior.

  V2: Fix indent.

  Signed-off-by: Chris Forbes <[email protected]>
  Cc: "9.2" [email protected]
  Reviewed-by: Eric Anholt <[email protected]>
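The fix described here can be illustrated on a plain four-component coordinate. The sketch below assumes the normalization factor is the reciprocal of the largest absolute value among x, y, and z, and that the fourth component holds the cube array index. The function names are hypothetical; the real pass operates on shader IR, not on float arrays.

```c
/* Absolute value without pulling in the math library. */
float abs_f(float v)
{
    return v < 0.0f ? -v : v;
}

/* Scale only the direction (.xyz) so its largest magnitude is 1.
 * coord[3] is assumed to be the cube array index and is left untouched. */
void cube_array_normalize(float coord[4])
{
    float largest = abs_f(coord[0]);
    if (abs_f(coord[1]) > largest)
        largest = abs_f(coord[1]);
    if (abs_f(coord[2]) > largest)
        largest = abs_f(coord[2]);

    /* A zero direction has nothing to normalize. */
    if (largest == 0.0f)
        return;

    float scale = 1.0f / largest;
    coord[0] *= scale;
    coord[1] *= scale;
    coord[2] *= scale;
    /* Multiplying coord[3] by `scale` here would select the wrong cube. */
}
```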
* draw/clip: don't emit so many empty triangles (Zack Rusin, 2013-09-25, 1 file, -0/+39)
  Compress empty triangles (don't emit more than one in a row) and never emit empty triangles if we already generated a triangle covering a non-null area. We can't skip all null-triangles because c_primitives expects ones that were generated from vertices exactly at the clipping-plane, to be emitted.

  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
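One possible reading of the two rules in this message is sketched below as a filter over a list of triangles flagged as empty or not. This is an interpretation for illustration only: the function name, the boolean input, and the index output are all hypothetical, and the actual clipper works on vertex data rather than flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Writes the indices of the triangles that would be emitted into
 * `out_indices` (which must hold at least `count` entries) and
 * returns how many were written. */
size_t filter_triangles(const bool *is_empty, size_t count,
                        size_t *out_indices)
{
    size_t emitted = 0;
    bool seen_nonempty = false;   /* a non-null-area triangle was generated */
    bool last_was_empty = false;  /* the previous emitted triangle was empty */

    for (size_t i = 0; i < count; i++) {
        if (is_empty[i]) {
            /* Rule 2: no empty triangles after a non-empty one.
             * Rule 1: never two empty triangles in a row. */
            if (seen_nonempty || last_was_empty)
                continue;
            last_was_empty = true;
        } else {
            seen_nonempty = true;
            last_was_empty = false;
        }
        out_indices[emitted++] = i;
    }
    return emitted;
}
```

Under this reading, a run made up only of empty triangles still emits exactly one, so the primitive is counted rather than dropped outright.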
* llvmpipe: count c_primitives before discarding null prims (Zack Rusin, 2013-09-25, 1 file, -7/+6)
  We need to count the clipper primitives before the rasterizer discards the ones it considers to be null.

  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: we need to subdivide if fb is bigger in either direction (Zack Rusin, 2013-09-25, 1 file, -1/+1)
  We need to subdivide triangles if either of the dimensions is larger than the max edge length, not when both of them are larger.

  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
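The one-line nature of this fix is easiest to see as a predicate. In the sketch below the function and parameter names are hypothetical, not llvmpipe's; the point is the choice of a logical OR where the description implies a logical AND was effectively in use before.

```c
#include <stdbool.h>

/* Subdivide when EITHER framebuffer dimension exceeds the maximum
 * edge length. Requiring both to exceed it would miss wide-but-short
 * or tall-but-narrow framebuffers. */
bool needs_subdivision(unsigned fb_width, unsigned fb_height,
                       unsigned max_edge)
{
    return fb_width > max_edge || fb_height > max_edge;
}
```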
* radeon/llvm: fix shadow cube texturing for GL3.0 (Marek Olšák, 2013-09-25, 1 file, -23/+15)
  The fix is at the end (TGSI_TEXTURE_SHADOWCUBE handling), but I also restructured the code for it to be more readable.

  Fixes spec/!OpenGL 3.0/sampler-cube-shadow.

  Reviewed-by: Michel Dänzer <[email protected]>