aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* i965/fs: Move FB write default state mashing in a level.Kenneth Graunke2014-06-121-7/+7
| | | | | | | | | | We only need to alter the default state if we're emitting MOVs for header related fields. So, we can simply move the push/pop of state in to the if (header_present) block, bypassing it in the common case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
* i965: Fix Haswell discard regressions since Gen4-5 line AA fix.Kenneth Graunke2014-06-121-2/+7
| | | | | | | | | | | | | | | | In commit dc2d3a7f5c217a7cee92380fbf503924a9591bea, Iago accidentally moved fire_fb_write() above the brw_pop_insn_state(), which caused the SEND to lose its predication and change from WE_normal to WE_all. Haswell uses predicated SENDs for discards, so this broke Piglit's tests for discards. We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked, but the actual FB write itself should respect those. So, pop state first, and force it again around the single MOV. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
* gbm: Remove 64x64 restriction from GBM_BO_USE_CURSORMichel Dänzer2014-06-124-16/+12
| | | | | | | | | | GBM_BO_USE_CURSOR_64X64 is kept so that existing users of GBM continue to build, but it no longer rejects widths or heights other than 64. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79809 Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965: Use brw->gen in some generation checks.Matt Turner2014-06-115-11/+17
| | | | | | | Will simplify the automated conversion if we want to allow compiling the driver for a single generation. Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Clean up tabs in brw_fs_cse.cpp.Matt Turner2014-06-111-43/+43
| | | | I'm adding vec4 CSE, and I want to diff the files.
* configure.ac: Simplify DUSE_EXTERNAL_DXTN_LIB logic.Matt Turner2014-06-111-9/+3
| | | | | Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: Alphabetize AC_CONFIG_FILES.Matt Turner2014-06-111-7/+7
| | | | | | | This isn't supposed to be difficult. Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: Remove single quotes to fix syntax highlighting.Matt Turner2014-06-111-2/+2
| | | | | | | Please stop adding them. Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meta: save and restore swizzle for _GenerateMipmapRobert Bragg2014-06-111-0/+12
| | | | | | | | | | This makes sure to use a no-op swizzle while iteratively rendering each level of a mipmap otherwise we may loose components and effectively apply the swizzle twice by the time these levels are sampled. Signed-off-by: Robert Bragg <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4: Emit smarter code for b2f of a comparisonIan Romanick2014-06-112-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | Previously we would emit the comparison, emit an AND to mask off extra bits from the comparison result, then convert the result to float. Now, do the comparison, then use a cleverly constructed SEL to pick either 0.0f or 1.0f. No piglit regressions on Ivybridge. total instructions in shared programs: 1642311 -> 1639449 (-0.17%) instructions in affected programs: 136533 -> 133671 (-2.10%) GAINED: 0 LOST: 0 Programs that are affected appear to save between 1 and 5 instuctions (just by skimming the output from shader-db report.py. v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes). Remove extraneous fix_3src_operand (suggested by Matt). The latter change required swapping the order of the operands and using predicate_inverse. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Silence a couple unused parameter warningsIan Romanick2014-06-111-2/+2
| | | | | | | | brw_vec4_visitor.cpp:2717:1: warning: unused parameter 'ir' [-Wunused-parameter] brw_vec4_visitor.cpp:2723:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Store gl_uniform_driver_storage::format as the actual typeIan Romanick2014-06-112-6/+3
| | | | | | | | And delete the incorrect comment. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* softpipe: fix pt->resource assert placementDave Airlie2014-06-111-1/+1
| | | | | | oops meant to move this. Signed-off-by: Dave Airlie <[email protected]>
* softpipe: enable AMD_vertex_shader_layer.Dave Airlie2014-06-111-1/+1
| | | | | | | This passes tests now on softpipe. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* softpipe: enable GLSL 3.30 support.Dave Airlie2014-06-111-1/+1
| | | | | | | This enables GL3.3 on softpipe. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* softpipe: bump the softpipe geometry limitsDave Airlie2014-06-111-1/+1
| | | | | | | This just aligns the limits with llvmpipe. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi_exec: use defines for max inputs/outputsDave Airlie2014-06-112-4/+4
| | | | | | | | | This fixes the limits for GL 3.2, and subsequently fixes some segfaults in some varying packing tests and max varying tests after the limits bumped. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* softpipe: add layered rendering support.Dave Airlie2014-06-117-9/+55
| | | | | | | This adds support for GL 3.2 layered rendering to softpipe. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* softpipe: add layering to the surface tile cache.Dave Airlie2014-06-115-72/+112
| | | | | | | | | | This adds the layer info to the tile cache. This changes clear_flags to be dynamically allocated as MAX_LAYERS seems like a too big step. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* softpipe: add depth clamping support. (v2)Dave Airlie2014-06-112-6/+30
| | | | | | | | | | | | | This passes the piglit depth clamp tests. this is required for GL 3.2. v2: move min/max up one level, could go further, thanks to Roland for suggestion. v1: Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi/gs: bound max output vertices in shaderDave Airlie2014-06-112-0/+9
| | | | | | | | | This limits the number of emitted vertices to the shaders max output vertices, and avoids us writing things into memory that isn't big enough for it. Reviewed-by: Zack Rusin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* i965: Add GPU BLIT of texture image to PBO in Intel driverJon Ashburn2014-06-101-0/+110
| | | | | | | | | | | | | | | | | | | Add Intel driver hook for glGetTexImage to accelerate the case of reading texture image into a PBO. This case gets huge performance gains by using GPU BLIT directly to PBO rather than GPU BLIT to temporary texture followed by memcpy. No regressions on Piglit tests with Intel driver. Performance gain (1280 x 800 FBO, Ivybridge): glGetTexImage + glMapBufferRange with patch 1.45 msec glGetTexImage + glMapBufferRange without patch 4.68 msec v3: (by Kenneth Graunke) - Fix compile after Eric's change to drop the tiling argument to intel_miptree_create_for_bo. - Add GL_TEXTURE_3D to blacklisted texture targets to prevent Piglit regressions. - Squash in several whitespace and coding style fixes.
* i965: Invalidate live intervals when inserting Gen4 SEND workarounds.Kenneth Graunke2014-06-101-0/+6
| | | | | | | | | We need to invalidate the live intervals when inserting new instructions. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
* i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.Kenneth Graunke2014-06-101-1/+1
| | | | | | | | | | | When walking backwards, we want to stop at the head sentinel, which is where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL. Fixes random crashes, as well as valgrind errors. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
* meta: Label the meta GLSL clear program.Kenneth Graunke2014-06-101-0/+1
| | | | | | | | | Giving the meta clear program a meaningful name makes it easier to find in output such as INTEL_DEBUG=fs or INTEL_DEBUG=shader_time. We already did so for integer programs, but neglected to label the primary program. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Combine generate_math[12]_gen6 methods.Kenneth Graunke2014-06-102-33/+13
| | | | | | | | | | | | | | These used to call different math emitters (brw_math vs. brw_math2). Now that they both call gen6_math, they're virtually identical. When unrolling SIMD16 to multiple SIMD8 operations, we should take care not to apply sechalf to brw_null_reg for src1. Otherwise, we'd end up with BRW_ARF_NULL + 1 as the register number, and I'm not sure if that's valid. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Drop the generate_math[12]_gen7 methods.Kenneth Graunke2014-06-102-30/+5
| | | | | | | | | | These functions are basically identical, so we should combine them. However, they're so trivial, we may as well just fold them into their only call sites. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/vec4: Combine generate_math[12]_gen6 methods.Kenneth Graunke2014-06-102-28/+12
| | | | | | | | | These are trivial to combine: we should just avoid checking the second operand if it's brw_null_reg. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/vec4: Drop the generate_math2_gen7() method.Kenneth Graunke2014-06-102-14/+1
| | | | | | | | | It's now a single line of code, so we may as well fold it into the caller. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Rename brw_math to gen4_math.Kenneth Graunke2014-06-106-46/+46
| | | | | | | | | Usually, I try to use "brw" for functions that apply to all generations, and "gen4" for dead end/legacy code that is only used on Gen4-5. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Split Gen4-5 and Gen6+ MATH instruction emitters.Kenneth Graunke2014-06-104-89/+39
| | | | | | | | | | | | | | | | | | | | Our existing functions, brw_math and brw_math2, had unclear roles: Gen4-5 used brw_math for both unary and binary math functions; it never used brw_math2. Since operands are already in message registers, this is reasonable. Gen6+ used brw_math for unary math functions, and brw_math2 for binary math functions, duplicating a lot of code. The only real difference was that brw_math used brw_null_reg() for src1. This patch improves brw_math2's assertions to allow both unary and binary operations, renames it to gen6_math(), and drops the Gen6+ code out of brw_math(). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Make src_reg::equals() take a constant reference, not a pointer.Kenneth Graunke2014-06-103-14/+14
| | | | | | | This is more typical C++ style. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Don't set the "switch" flag on control flow instructions on Gen6+.Kenneth Graunke2014-06-101-6/+4
| | | | | | | | | | | | | | Thread switching on control flow instructions is a documented workaround for Gen4-5 errata. As far as I can tell, it hasn't been needed since Sandybridge. Thread switching is not free, so in theory this may help performance slightly. Flow control instructions with the "switch" flag cannot be compacted, so removing it will make these instructions compactable. (Of course, we still have to implement compaction for flow control instructions...) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Allow CSE on math opcodes on Gen6+.Kenneth Graunke2014-06-101-0/+11
| | | | | | | | | | total instructions in shared programs: 2081469 -> 2081248 (-0.01%) instructions in affected programs: 22606 -> 22385 (-0.98%) No programs were hurt by this patch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* glsl: Remove unused include in expr.flatt.Thomas Helland2014-06-101-2/+0
| | | | | | | | Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused include in ir.cppThomas Helland2014-06-101-1/+0
| | | | | | | | Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused include from ir_constant_expression.cppThomas Helland2014-06-101-1/+0
| | | | | | | | Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused include from ir_basic_block.cppThomas Helland2014-06-101-2/+0
| | | | | | | | Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused include from hir_field_selection.cppThomas Helland2014-06-101-1/+0
| | | | | | | | Found with IWYU. Compile-tested on my Ivy-bridge system Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused include from glsl_symbol_table.hThomas Helland2014-06-101-1/+1
| | | | | | | | | | | | | Only function-defs use glsl_type so forward declare instead. Compile-tested on my Ivy-bridge system. IWYU also suggests removing #include <new>, and this compiles fine. I'm not familiar enough with memory management in C/C++ that I feel comfortable removing this. Insights would be appreciated. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused include from glsl_types.cppThomas Helland2014-06-101-3/+1
| | | | | | | | | Found with IWYU. Compile-tested on my Ivy-bridge system. Added comment about core.h being used for MAX2. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused include from builtin_variables.cppThomas Helland2014-06-101-1/+0
| | | | | | | | Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused include in ast_to_hir.cppThomas Helland2014-06-101-1/+0
| | | | | | | | | | Found with IWYU. Comment says it's for struct gl_extensions. Grepping for gl_extensions shows no uses. Tested by compiling on my Ivy-bridge system. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused includes in link_uniform_block_active_visitor.hThomas Helland2014-06-101-2/+0
| | | | | | | | | Found with IWYU, compile-tested on my Ivy-bridge system. This is not used in the header, and is included in the source. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* glsl: Remove unused includes in link_uniform_init.Thomas Helland2014-06-101-2/+0
| | | | | | | | | | | | Found with IWYU, confirmed with grepping for "hash" and "symbol". No negative effects on compilation. IWYU also reported core.h and linker.h could be removed, but I'm unsure if those are false positives. Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Thomas Helland <[email protected]>
* i965: Replace open-coded linked list with exec_list.Matt Turner2014-06-104-62/+45
| | | | Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add an exec_node_init() function, usable from C.Matt Turner2014-06-101-0/+7
| | | | Reviewed-by: Ian Romanick <[email protected]>
* glsl: Make foreach macros usable from C by adding struct keyword.Matt Turner2014-06-101-11/+11
| | | | Reviewed-by: Ian Romanick <[email protected]>
* glsl: Make exec_list members just wrap the C API.Matt Turner2014-06-101-77/+13
| | | | Reviewed-by: Ian Romanick <[email protected]>
* glsl: Make exec_node members just wrap the C API.Matt Turner2014-06-101-27/+11
| | | | Reviewed-by: Ian Romanick <[email protected]>