path: root/src
Commit message (Author, Age, Files, Lines)
* glsl/linker: initialize explicit uniform locations (Tapani Pälli, 2014-06-16, 2 files, -0/+119)
    Patch initializes the UniformRemapTable for explicit locations. This needs to happen before optimizations to make sure all inactive uniforms get their explicit locations correctly.

    v2: fix initialization bug, introduce define for inactive uniforms (Ian)

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: add glsl_type::uniform_locations() helper function (Tapani Pälli, 2014-06-16, 2 files, -0/+32)
    This function calculates the number of unique values from glGetUniformLocation for the elements of the type.

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
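The counting rule described above can be sketched as follows. This is a minimal model, not Mesa's glsl_type: ToyType, make_array, and make_struct are hypothetical names, and the sketch assumes every scalar, vector, or matrix occupies exactly one location.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a GLSL type (not Mesa's glsl_type).
struct ToyType {
    enum Kind { Basic, Array, Struct };
    Kind kind = Basic;
    unsigned length = 0;            // number of elements (Array only)
    std::vector<ToyType> children;  // element type (Array) or fields (Struct)
};

ToyType make_array(const ToyType &elem, unsigned n) {
    ToyType t;
    t.kind = ToyType::Array;
    t.length = n;
    t.children.push_back(elem);
    return t;
}

ToyType make_struct(std::vector<ToyType> fields) {
    ToyType t;
    t.kind = ToyType::Struct;
    t.children = std::move(fields);
    return t;
}

// Counts the distinct locations a type occupies, assuming each basic
// (scalar, vector, or matrix) element takes exactly one location.
unsigned uniform_locations(const ToyType &t) {
    switch (t.kind) {
    case ToyType::Array:
        assert(t.children.size() == 1);
        return t.length * uniform_locations(t.children[0]);
    case ToyType::Struct: {
        unsigned n = 0;
        for (const ToyType &field : t.children)
            n += uniform_locations(field);
        return n;
    }
    default:
        return 1;
    }
}
```

Under these assumptions, a struct holding one vec4 and a three-element vec4 array counts four locations, and an array of two such structs counts eight.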
* mesa: add new enum MAX_UNIFORM_LOCATIONS (Tapani Pälli, 2014-06-16, 5 files, -0/+12)
    Patch adds new implementation dependent value required by the GL_ARB_explicit_uniform_location extension. Default value for user assignable locations is calculated as sum of MaxUniformComponents for each stage.

    v2: fix descriptor in get_hash_params.py (Petri)
    v3: simpler formula for calculating initial value (Ian)

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
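The default described above is a plain sum over the shader stages. A minimal sketch, assuming the per-stage limits arrive as a simple list; default_max_uniform_locations is a hypothetical name and the values in the usage note are made up:

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: sums the MaxUniformComponents limit of every stage
// to get the default number of user-assignable uniform locations.
unsigned default_max_uniform_locations(
    const std::vector<unsigned> &max_uniform_components_per_stage) {
    return std::accumulate(max_uniform_components_per_stage.begin(),
                           max_uniform_components_per_stage.end(), 0u);
}
```

With three stages that each allow 4096 components, this model yields 12288 locations.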
* mesa: add enable bit for ARB_explicit_uniform_location (Tapani Pälli, 2014-06-16, 2 files, -0/+2)
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glapi: add GL_ARB_explicit_uniform_location (Tapani Pälli, 2014-06-16, 1 file, -0/+6)
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4: Use the sampler for pull constant loads on Broadwell. (Kenneth Graunke, 2014-06-15, 1 file, -8/+8)
    We've used the LD sampler message for pull constant loads on earlier hardware for some time, and also were already using it for the FS on Broadwell. This patch makes us use it for Broadwell VS/GS as well.

    I believe that when I wrote this code in 2012, we still used the data port in some cases, and I somehow neglected to convert it while rebasing.

    Improves performance in GLBenchmark 2.7 Egypt by 416.978% +/- 2.25821% (n = 17). Many other applications should benefit similarly: this speeds up uniform array access in the VS, which is commonly used for skinning shaders, among other things.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Tested-by: Ben Widawsky <[email protected]>
    Cc: "10.2" <[email protected]>
* i965: Add missing newlines to a few perf_debug messages. (Kenneth Graunke, 2014-06-15, 1 file, -2/+2)
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Cc: "10.2" <[email protected]>
* i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing. (Kenneth Graunke, 2014-06-15, 2 files, -4/+0)
    I actually added MOCS support for these things, but forgot to delete the corresponding perf_debug() warnings.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Cc: "10.2" <[email protected]>
* i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell. (Kenneth Graunke, 2014-06-15, 1 file, -3/+1)
    Somehow I missed this when adding all of the other MOCS values.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Cc: "10.2" <[email protected]>
* i965/vec4: Fix dead code elimination for VGRFs of size > 1. (Kenneth Graunke, 2014-06-15, 1 file, -1/+2)
    When faced with code such as:

       mov vgrf31.0:UD, 960D
       mov vgrf31.1:UD, vgrf30.xxxx:UD

    The dead code eliminator didn't consider reg_offsets, so it decided that the second instruction was writing to the same register as the first one, and eliminated the first one. But they're actually different registers.

    This fixes INTEL_DEBUG=shader_time for vertex shaders. In the above code, vgrf31.0 represents the offset into the shader_time buffer where the data should be written, and vgrf31.1 represents the actual time data. With a completely undefined offset, results were...unexpected.

    I think this is probably one of the few cases (maybe the only case) where we generate multiple MOVs to a large VGRF. Normally, we just use them as texturing results; the other SEND-from-GRF uses a size 1 VGRF.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Cc: [email protected]
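The essence of the fix above is that two writes only target the same destination when both the register number and the reg_offset match. A toy sketch of that check, not the i965 pass itself: Dst, overwrites, and dead_writes are hypothetical names, and the model assumes no instruction reads a destination between the writes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical destination of a write: a virtual register plus an offset
// selecting one slot of a multi-slot register.
struct Dst {
    int reg;
    int reg_offset;
};

// A later write only overwrites an earlier one if BOTH fields match.
bool overwrites(const Dst &later, const Dst &earlier) {
    return later.reg == earlier.reg && later.reg_offset == earlier.reg_offset;
}

// Marks writes that are overwritten by a later write. Toy model: assumes
// nothing reads the destination in between.
std::vector<bool> dead_writes(const std::vector<Dst> &writes) {
    std::vector<bool> dead(writes.size(), false);
    for (std::size_t i = 0; i < writes.size(); ++i) {
        for (std::size_t j = i + 1; j < writes.size(); ++j) {
            if (overwrites(writes[j], writes[i])) {
                dead[i] = true;
                break;
            }
        }
    }
    return dead;
}
```

In this model, writes to vgrf31.0 and vgrf31.1 are both kept, while a second write to vgrf31.0 would make the first one dead.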
* i965: Add SHADER_OPCODE_SHADER_TIME_ADD to dump_instructions() decode. (Kenneth Graunke, 2014-06-15, 1 file, -0/+2)
    "shader_time_add" is a lot more informative than "op152".

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Fix clang mismatched-tags warnings with glsl_type. (Vinson Lee, 2014-06-15, 1 file, -1/+1)
    Fix clang mismatched-tags warnings introduced with commit 4f5445a45d3ed02e00a061b10c943c0b079c6020.

    ./glsl_symbol_table.h:37:1: warning: class 'glsl_type' was previously declared as a struct [-Wmismatched-tags]
    class glsl_type;
    ^
    ./glsl_types.h:86:8: note: previous use is here
    struct glsl_type {
           ^
    ./glsl_symbol_table.h:37:1: note: did you mean struct here?
    class glsl_type;
    ^~~~~

    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
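The warning above goes away when the forward declaration uses the same class-key as the definition. A minimal sketch of that pattern; toy_type and base_of are hypothetical names, not the Mesa declarations:

```cpp
#include <cassert>

// The forward declaration uses 'struct', matching the definition below,
// so clang's -Wmismatched-tags has nothing to report.
struct toy_type;

struct toy_type {
    int base_type;
};

// Any user of the forward declaration works the same either way; only
// the consistency of the class-key matters for the warning.
int base_of(const toy_type *t) {
    return t->base_type;
}
```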
* mesa/drivers: Fix clang constant-logical-operand warnings. (Vinson Lee, 2014-06-14, 4 files, -13/+13)
    This patch fixes several clang constant-logical-operand warnings such as the following.

    ../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: warning: use of logical '||' with constant operand [-Wconstant-logical-operand]
       if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL)
                                   ^  ~~~~~~~~~~~
    ../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: note: use '|' for a bitwise operation
       if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL)
                                   ^~
                                   |

    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Correct more typos (Chris Forbes, 2014-06-15, 2 files, -2/+2)
    Signed-off-by: Chris Forbes <[email protected]>
* radeon/compute: Always report at least 1 compute unit (Tom Stellard, 2014-06-13, 1 file, -1/+1)
    Some apps will abort if they detect 0 compute units. This fixes crashes in some OpenCV tests.
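The behaviour above amounts to clamping the reported value to a minimum of one. A minimal sketch, assuming the detected count is available as an unsigned integer; reported_compute_units is a hypothetical name, not the driver's function:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: never report fewer than one compute unit, even if
// the kernel query returned zero.
unsigned reported_compute_units(unsigned detected) {
    return std::max(detected, 1u);
}
```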
* meta_blit: properly compute texture width for the CopyTexSubImage fallback (Jason Ekstrand, 2014-06-13, 1 file, -1/+1)
    Cc: "10.2" <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* freedreno/a3xx: vtx formats (Rob Clark, 2014-06-13, 2 files, -63/+79)
    Add support for more vertex buffer formats.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers (Rob Clark, 2014-06-13, 4 files, -16/+23)
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: try for more squarish tile dimensions (Rob Clark, 2014-06-13, 1 file, -3/+9)
    Worth about ~0.5fps in xonotic, for example.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix for null textures (Rob Clark, 2014-06-13, 2 files, -6/+10)
    Some apps seem to give us a null sampler/view for texture slots which come before the last used texture slot. In particular 0ad triggers this.

    Signed-off-by: Rob Clark <[email protected]>
* llvmpipe: increase number of queries which can be binned simultaneously to 64 (Roland Scheidegger, 2014-06-13, 1 file, -1/+1)
    Gallium (but not OpenGL) does allow nesting of queries, but there's no limit specified (d3d10 has no limit either). Nevertheless, for practical purposes we need some limit in llvmpipe, otherwise we'd need more complex handling of queries as we need to keep track of all binned queries (this only affects queries which gather data past setup). A limit of 16 is too small though, while 64 would suffice.

    Reviewed-by: Jose Fonseca <[email protected]>
* radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS (Bruno Jiménez, 2014-06-13, 3 files, -0/+15)
    v2: Add RADEON_INFO_ACTIVE_CU_COUNT as a define, as suggested by Tom Stellard

    Reviewed-by: Tom Stellard <[email protected]>
* Remove _mesa_is_type_integer and _mesa_is_enum_format_or_type_integer (Neil Roberts, 2014-06-13, 2 files, -36/+0)
    The comment for _mesa_is_type_integer is confusing because it says that it returns whether the type is an “integer (non-normalized)” format. I don't think it makes sense to say whether a type is normalized or not because it depends on what format it is used with. For example, GL_RGBA+GL_UNSIGNED_BYTE is normalized but GL_RGBA_INTEGER+GL_UNSIGNED_BYTE isn't. If the normalized comment is just a mistake then it still doesn't make much sense because it is missing the packed-pixel types such as GL_UNSIGNED_INT_5_6_5. If those were added then it effectively just returns type != GL_FLOAT.

    That function was only used in _mesa_is_enum_format_or_type_integer. This function effectively checks whether the format is non-normalized or the type is an integer. I can't think of any situation where that check would make sense.

    As far as I can tell neither of these functions have ever been used anywhere so we should just remove them to avoid confusion.

    These functions were added in 9ad8f431b2a47060bf05517246ab0fa8d249c800.

    Reviewed-by: Brian Paul <[email protected]>
* clover: query driver for the max number of compute units (Bruno Jiménez, 2014-06-12, 3 files, -1/+8)
    Reviewed-by: Tom Stellard <[email protected]>
    Reviewed-by: Francisco Jerez <[email protected]>
* gallium: Add PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS (Bruno Jiménez, 2014-06-12, 2 files, -1/+4)
    Reviewed-by: Tom Stellard <[email protected]>
    Reviewed-by: Francisco Jerez <[email protected]>
* r600g/compute: solve a bug introduced by 2e01b8b440c1402c88a2755d89f40292e1f36ce5 (Bruno Jiménez, 2014-06-12, 1 file, -1/+1)
    That commit made it possible for items to sit one right after the other when their size was a multiple of ITEM_ALIGNMENT. But compute_memory_prealloc_chunk still tried to leave a gap between items, so we got an infinite loop when trying to add an item that would leave no space between itself and the next item.

    Fixes piglit test: cl-custom-r600-create-release-buffer-bug

    And the test for alignment I have just sent: http://lists.freedesktop.org/archives/piglit/2014-June/011135.html

    Sorry about this.

    Reviewed-by: Tom Stellard <[email protected]>
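The condition at stake above is whether a free gap must be strictly larger than the new item or may be exactly as large. A first-fit sketch that accepts an exact fit, written under assumptions: this is not compute_memory_prealloc_chunk, the names Item, align_up, and find_chunk are hypothetical, and the alignment value of 1024 is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::int64_t kItemAlignment = 1024;  // hypothetical value

struct Item {
    std::int64_t start;
    std::int64_t size;
};

std::int64_t align_up(std::int64_t v) {
    return (v + kItemAlignment - 1) / kItemAlignment * kItemAlignment;
}

// First-fit search over items sorted by start offset. Returns the offset
// for a new item of the given size, or -1 if the pool has no room.
std::int64_t find_chunk(const std::vector<Item> &items,
                        std::int64_t pool_size, std::int64_t size) {
    std::int64_t candidate = 0;
    for (const Item &item : items) {
        // '>=' rather than '>': the new item may end exactly where the
        // next one starts, leaving no gap between them.
        if (item.start - candidate >= size)
            return candidate;
        candidate = align_up(item.start + item.size);
    }
    return (pool_size - candidate >= size) ? candidate : -1;
}
```

With items at offsets 0 and 2048 (each 1024 long), a new 1024-sized item lands at 1024 and touches its neighbour on both sides.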
* egl/gallium: Set defines for supported APIs when using automake (Niels Ole Salscheider, 2014-06-12, 3 files, -0/+28)
    This fixes automake builds, which have been broken since b52a530ce2aada1967bc8fefa83ab53e6a737dae.

    v2: This patch also adds the FEATURE_* defines back to targets/egl-static for Android and Scons that were removed in the mentioned commit.

    Signed-off-by: Niels Ole Salscheider <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79885
    Reviewed-by: Emil Velikov <[email protected]>
* mesa: glx: Reduce error log level (Courtney Goeltzenleuchter, 2014-06-12, 1 file, -1/+1)
    The code that parses LIBGL_DRIVERS_PATH was printing an error for every attempted dlopen. It's not an error to have to check multiple items in the path, only an error if no suitable library is found. Reduced the load error to a warning to match behavior of dynamic linker.

    Signed-off-by: Courtney Goeltzenleuchter <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
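The policy described above can be sketched as a search loop that only warns on individual misses and raises an error once the whole path is exhausted. This is an illustrative model rather than the GLX loader: LoadResult, find_driver, and the try_load callback (standing in for the dlopen attempt) are all hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical summary of one search: which directory worked, and how
// many warnings and errors the search would have logged.
struct LoadResult {
    std::string found;
    int warnings = 0;
    int errors = 0;
};

// Walks a colon-separated search path. try_load stands in for the real
// load attempt and returns true on success.
LoadResult find_driver(const std::string &search_path,
                       const std::function<bool(const std::string &)> &try_load) {
    LoadResult result;
    std::istringstream dirs(search_path);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        if (try_load(dir)) {
            result.found = dir;
            return result;
        }
        ++result.warnings;  // a miss on one entry is only worth a warning
    }
    ++result.errors;  // nothing usable anywhere on the path
    return result;
}
```

Searching "/a:/b:/c" when only "/b" loads produces one warning and no error in this model.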
* cso: fix stream-out clean up in cso_release_all() (Brian Paul, 2014-06-12, 1 file, -1/+1)
    Use the has_streamout flag as we do elsewhere to check if we need to call pipe->set_stream_output_targets(). The driver might implement the set_stream_output_targets() function, but not for all hardware configurations.

    Reviewed-by: Jose Fonseca <[email protected]>
* i965: Set the fast clear color value for texture surfaces (Neil Roberts, 2014-06-12, 2 files, -2/+6)
    When a multisampled texture is used for sampling, the fast clear color value needs to be programmed into the surface state. This was being left as all zeroes so if the surface was cleared to a value other than black then it wouldn't work properly.

    This doesn't matter for single-sample textures because in that case the MCS buffer is resolved before it is used as a texture source.

    https://bugs.freedesktop.org/show_bug.cgi?id=79729

    Reviewed-by: Chris Forbes <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Cc: "10.1 10.2" <[email protected]>
* glsl: Fix typo in comment. (Chris Forbes, 2014-06-12, 1 file, -1/+1)
    Signed-off-by: Chris Forbes <[email protected]>
* i965: Fix disassembly of BLORP clear programs. (Kenneth Graunke, 2014-06-12, 1 file, -1/+1)
    Too many levels of indirection.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Move FB write default state mashing in a level. (Kenneth Graunke, 2014-06-12, 1 file, -7/+7)
    We only need to alter the default state if we're emitting MOVs for header related fields. So, we can simply move the push/pop of state in to the if (header_present) block, bypassing it in the common case.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
* i965: Fix Haswell discard regressions since Gen4-5 line AA fix. (Kenneth Graunke, 2014-06-12, 1 file, -2/+7)
    In commit dc2d3a7f5c217a7cee92380fbf503924a9591bea, Iago accidentally moved fire_fb_write() above the brw_pop_insn_state(), which caused the SEND to lose its predication and change from WE_normal to WE_all. Haswell uses predicated SENDs for discards, so this broke Piglit's tests for discards.

    We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked, but the actual FB write itself should respect those. So, pop state first, and force it again around the single MOV.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
* gbm: Remove 64x64 restriction from GBM_BO_USE_CURSOR (Michel Dänzer, 2014-06-12, 4 files, -16/+12)
    GBM_BO_USE_CURSOR_64X64 is kept so that existing users of GBM continue to build, but it no longer rejects widths or heights other than 64.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79809
    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
* i965: Use brw->gen in some generation checks. (Matt Turner, 2014-06-11, 5 files, -11/+17)
    Will simplify the automated conversion if we want to allow compiling the driver for a single generation.

    Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Clean up tabs in brw_fs_cse.cpp. (Matt Turner, 2014-06-11, 1 file, -43/+43)
    I'm adding vec4 CSE, and I want to diff the files.
* meta: save and restore swizzle for _GenerateMipmap (Robert Bragg, 2014-06-11, 1 file, -0/+12)
    This makes sure to use a no-op swizzle while iteratively rendering each level of a mipmap; otherwise we may lose components and effectively apply the swizzle twice by the time these levels are sampled.

    Signed-off-by: Robert Bragg <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4: Emit smarter code for b2f of a comparison (Ian Romanick, 2014-06-11, 2 files, -0/+48)
    Previously we would emit the comparison, emit an AND to mask off extra bits from the comparison result, then convert the result to float. Now, do the comparison, then use a cleverly constructed SEL to pick either 0.0f or 1.0f.

    No piglit regressions on Ivybridge.

    total instructions in shared programs: 1642311 -> 1639449 (-0.17%)
    instructions in affected programs:     136533 -> 133671 (-2.10%)
    GAINED:                                0
    LOST:                                  0

    Programs that are affected appear to save between 1 and 5 instructions (just by skimming the output from shader-db report.py).

    v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes). Remove extraneous fix_3src_operand (suggested by Matt). The latter change required swapping the order of the operands and using predicate_inverse.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
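The two instruction sequences described above can be compared in scalar form. This is a toy CPU model, not the generated i965 code: b2f_via_and and b2f_via_sel are hypothetical names, and representing the comparison result as an all-ones or all-zero mask is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Old sequence: compare, AND away the extra bits, convert int to float.
float b2f_via_and(float a, float b) {
    std::uint32_t mask = (a < b) ? 0xFFFFFFFFu : 0u;  // assumed mask form
    std::uint32_t bit = mask & 1u;                    // keep the low bit only
    return static_cast<float>(bit);                   // integer-to-float step
}

// New sequence: compare, then select 1.0f or 0.0f directly.
float b2f_via_sel(float a, float b) {
    return (a < b) ? 1.0f : 0.0f;
}
```

Both forms return the same value for any pair of inputs; the second simply skips the mask and the conversion.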
* i965/vec4: Silence a couple unused parameter warnings (Ian Romanick, 2014-06-11, 1 file, -2/+2)
    brw_vec4_visitor.cpp:2717:1: warning: unused parameter 'ir' [-Wunused-parameter]
    brw_vec4_visitor.cpp:2723:1: warning: unused parameter 'ir' [-Wunused-parameter]

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Store gl_uniform_driver_storage::format as the actual type (Ian Romanick, 2014-06-11, 2 files, -6/+3)
    And delete the incorrect comment.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* softpipe: fix pt->resource assert placement (Dave Airlie, 2014-06-11, 1 file, -1/+1)
    oops meant to move this.

    Signed-off-by: Dave Airlie <[email protected]>
* softpipe: enable AMD_vertex_shader_layer. (Dave Airlie, 2014-06-11, 1 file, -1/+1)
    This passes tests now on softpipe.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* softpipe: enable GLSL 3.30 support. (Dave Airlie, 2014-06-11, 1 file, -1/+1)
    This enables GL3.3 on softpipe.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* softpipe: bump the softpipe geometry limits (Dave Airlie, 2014-06-11, 1 file, -1/+1)
    This just aligns the limits with llvmpipe.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* tgsi_exec: use defines for max inputs/outputs (Dave Airlie, 2014-06-11, 2 files, -4/+4)
    This fixes the limits for GL 3.2, and subsequently fixes some segfaults in some varying packing tests and max varying tests after the limits were bumped.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* softpipe: add layered rendering support. (Dave Airlie, 2014-06-11, 7 files, -9/+55)
    This adds support for GL 3.2 layered rendering to softpipe.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* softpipe: add layering to the surface tile cache. (Dave Airlie, 2014-06-11, 5 files, -72/+112)
    This adds the layer info to the tile cache.

    This changes clear_flags to be dynamically allocated, as MAX_LAYERS seems like too big a step.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* softpipe: add depth clamping support. (v2) (Dave Airlie, 2014-06-11, 2 files, -6/+30)
    This passes the piglit depth clamp tests.

    This is required for GL 3.2.

    v2: move min/max up one level, could go further, thanks to Roland for suggestion.

    v1: Reviewed-by: Brian Paul <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* tgsi/gs: bound max output vertices in shader (Dave Airlie, 2014-06-11, 2 files, -0/+9)
    This limits the number of emitted vertices to the shader's max output vertices, and avoids writing vertices into memory that isn't big enough to hold them.

    Reviewed-by: Zack Rusin <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
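The bound described above can be sketched as an output buffer that refuses emits past its declared maximum. This is an illustrative model only: ToyGeometryOutput is a hypothetical class, not the tgsi_exec machinery, and a vertex is reduced to a single int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical output buffer sized for the shader's declared maximum
// number of output vertices.
class ToyGeometryOutput {
public:
    explicit ToyGeometryOutput(std::size_t max_output_vertices)
        : buffer_(max_output_vertices), count_(0) {}

    // Stores the vertex and returns true, or drops it and returns false
    // once the declared maximum has been reached.
    bool emit(int vertex) {
        if (count_ >= buffer_.size())
            return false;
        buffer_[count_++] = vertex;
        return true;
    }

    std::size_t count() const { return count_; }

private:
    std::vector<int> buffer_;
    std::size_t count_;
};
```

A buffer declared for two vertices accepts two emits and silently drops the third, so nothing is written past the allocation.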