path: root/src/mesa
Commit message (Author, Age, Files, Lines)
* i965/fs: Move FB write default state mashing in a level. (Kenneth Graunke, 2014-06-12, 1 file, -7/+7)
| | | | | | | | | | We only need to alter the default state if we're emitting MOVs for header related fields. So, we can simply move the push/pop of state in to the if (header_present) block, bypassing it in the common case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
* i965: Fix Haswell discard regressions since Gen4-5 line AA fix. (Kenneth Graunke, 2014-06-12, 1 file, -2/+7)
| | | | | | | | | | | | | | | | In commit dc2d3a7f5c217a7cee92380fbf503924a9591bea, Iago accidentally moved fire_fb_write() above the brw_pop_insn_state(), which caused the SEND to lose its predication and change from WE_normal to WE_all. Haswell uses predicated SENDs for discards, so this broke Piglit's tests for discards. We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked, but the actual FB write itself should respect those. So, pop state first, and force it again around the single MOV. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
* i965: Use brw->gen in some generation checks. (Matt Turner, 2014-06-11, 5 files, -11/+17)
| | | | | | | Will simplify the automated conversion if we want to allow compiling the driver for a single generation. Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Clean up tabs in brw_fs_cse.cpp. (Matt Turner, 2014-06-11, 1 file, -43/+43)
| | | | I'm adding vec4 CSE, and I want to diff the files.
* meta: save and restore swizzle for _GenerateMipmap (Robert Bragg, 2014-06-11, 1 file, -0/+12)
| | | | | | | | | | This makes sure to use a no-op swizzle while iteratively rendering each level of a mipmap; otherwise we may lose components and effectively apply the swizzle twice by the time these levels are sampled. Signed-off-by: Robert Bragg <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
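The double application described in the message above can be illustrated with a small model. This is a sketch only: the selector enum, the `texel` struct and both functions are hypothetical names made up for illustration, not Mesa's meta code. It assumes that rendering a mip level samples the previous level through whatever swizzle is bound and stores the result, and that the application later samples the stored level through its own swizzle.

```c
#include <assert.h>

/* Hypothetical swizzle selectors, loosely modeled on GL texture swizzles. */
enum swz { SWZ_R, SWZ_G, SWZ_B, SWZ_A, SWZ_ZERO, SWZ_ONE };

/* One RGBA texel with integer channels, to keep comparisons exact. */
struct texel { int c[4]; };

/* Apply a four-channel swizzle to one texel. */
static struct texel apply_swizzle(struct texel in, const enum swz sel[4])
{
    struct texel out;
    for (int i = 0; i < 4; i++) {
        assert(sel[i] <= SWZ_ONE);
        switch (sel[i]) {
        case SWZ_ZERO: out.c[i] = 0;            break;
        case SWZ_ONE:  out.c[i] = 255;          break;
        default:       out.c[i] = in.c[sel[i]]; break;
        }
    }
    return out;
}

/*
 * Model of generating one mip level and sampling it afterwards:
 * render_swz is the swizzle active while the level is rendered,
 * app_swz is the swizzle the application samples with later.
 */
static struct texel sample_next_level(struct texel base,
                                      const enum swz render_swz[4],
                                      const enum swz app_swz[4])
{
    struct texel stored = apply_swizzle(base, render_swz);
    return apply_swizzle(stored, app_swz);
}
```

With a channel rotation such as `{ SWZ_B, SWZ_R, SWZ_G, SWZ_A }` as the application swizzle, rendering with the identity swizzle yields the rotation applied once, while rendering with the rotation itself yields it applied twice. A swizzle containing `SWZ_ZERO` or `SWZ_ONE` would additionally discard channels in the stored level.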
* i965/vec4: Emit smarter code for b2f of a comparison (Ian Romanick, 2014-06-11, 2 files, -0/+48)
| | | | | | | | | | | | | | | | | | | | | | | | Previously we would emit the comparison, emit an AND to mask off extra bits from the comparison result, then convert the result to float. Now, do the comparison, then use a cleverly constructed SEL to pick either 0.0f or 1.0f. No piglit regressions on Ivybridge. total instructions in shared programs: 1642311 -> 1639449 (-0.17%); instructions in affected programs: 136533 -> 133671 (-2.10%); GAINED: 0; LOST: 0. Programs that are affected appear to save between 1 and 5 instructions (just by skimming the output from shader-db report.py). v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes). Remove extraneous fix_3src_operand (suggested by Matt). The latter change required swapping the order of the operands and using predicate_inverse. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
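The two code shapes described in the message above can be compared in plain C. This is only a model under stated assumptions: the function names are hypothetical, the comparison is arbitrarily chosen to be less-than, and the all-ones comparison result is an assumption standing in for "extra bits" that the AND has to mask off. It is not the vec4 backend's code.

```c
#include <assert.h>
#include <stdint.h>

/* Old shape: compare, mask the result down to one bit, convert to float. */
static float b2f_mask_convert(float a, float b)
{
    /* Assumed model: a true comparison sets more bits than just bit 0. */
    uint32_t cmp = (a < b) ? 0xFFFFFFFFu : 0u;
    uint32_t bit = cmp & 1u;   /* the AND that masks off the extra bits */
    return (float)bit;         /* the integer-to-float conversion */
}

/* New shape: compare, then select one of two float constants. */
static float b2f_select(float a, float b)
{
    int flag = (a < b);        /* the comparison sets a predicate */
    return flag ? 1.0f : 0.0f; /* the select picks 1.0f or 0.0f */
}

/* Helper that checks both shapes agree before returning the result. */
static float b2f_checked(float a, float b)
{
    float r = b2f_select(a, b);
    assert(r == b2f_mask_convert(a, b));
    return r;
}
```

The select form needs no mask and no conversion, which is consistent with the small per-program instruction savings reported in the message.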
* i965/vec4: Silence a couple unused parameter warnings (Ian Romanick, 2014-06-11, 1 file, -2/+2)
| | | | | | | | brw_vec4_visitor.cpp:2717:1: warning: unused parameter 'ir' [-Wunused-parameter] brw_vec4_visitor.cpp:2723:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Store gl_uniform_driver_storage::format as the actual type (Ian Romanick, 2014-06-11, 1 file, -1/+1)
| | | | | | | | And delete the incorrect comment. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* i965: Add GPU BLIT of texture image to PBO in Intel driver (Jon Ashburn, 2014-06-10, 1 file, -0/+110)
| | | | | | | | | | | | | | | | | | | Add Intel driver hook for glGetTexImage to accelerate the case of reading texture image into a PBO. This case gets huge performance gains by using GPU BLIT directly to PBO rather than GPU BLIT to temporary texture followed by memcpy. No regressions on Piglit tests with Intel driver. Performance gain (1280 x 800 FBO, Ivybridge), glGetTexImage + glMapBufferRange: with patch 1.45 msec; without patch 4.68 msec. v3 (by Kenneth Graunke): Fix compile after Eric's change to drop the tiling argument to intel_miptree_create_for_bo; add GL_TEXTURE_3D to blacklisted texture targets to prevent Piglit regressions; squash in several whitespace and coding style fixes.
* i965: Invalidate live intervals when inserting Gen4 SEND workarounds. (Kenneth Graunke, 2014-06-10, 1 file, -0/+6)
| | | | | | | | | We need to invalidate the live intervals when inserting new instructions. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
* i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code. (Kenneth Graunke, 2014-06-10, 1 file, -1/+1)
| | | | | | | | | | | When walking backwards, we want to stop at the head sentinel, which is where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL. Fixes random crashes, as well as valgrind errors. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
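The hazard of treating a list sentinel as a real element can be sketched with a simplified doubly linked list. Everything here is an assumption made for illustration: the `node` struct, the helper names, and the convention that the head sentinel is the only node whose `prev` is NULL. It is not Mesa's exec_list and not the workaround code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified list node; the head sentinel carries no meaningful payload. */
struct node {
    struct node *prev;
    struct node *next;
    int value;
};

/* In this sketch the head sentinel is the only node with prev == NULL. */
static int is_head_sentinel(const struct node *n)
{
    return n->prev == NULL;
}

/* Insert n directly after pos. */
static void link_after(struct node *pos, struct node *n)
{
    n->prev = pos;
    n->next = pos->next;
    if (pos->next != NULL)
        pos->next->prev = n;
    pos->next = n;
}

/*
 * Sum the payloads of all real nodes before start, walking backwards.
 * The loop stops when it reaches the sentinel, so the sentinel's
 * payload is never read as if it were a real element.
 */
static int sum_before(const struct node *start)
{
    int sum = 0;
    assert(!is_head_sentinel(start));
    for (const struct node *n = start->prev; !is_head_sentinel(n); n = n->prev)
        sum += n->value;
    return sum;
}
```

A loop that instead ran until `n == NULL` would also visit the sentinel and read its payload, which is the kind of access that shows up as random crashes or valgrind errors when the sentinel is not a full element.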
* meta: Label the meta GLSL clear program. (Kenneth Graunke, 2014-06-10, 1 file, -0/+1)
| | | | | | | | | Giving the meta clear program a meaningful name makes it easier to find in output such as INTEL_DEBUG=fs or INTEL_DEBUG=shader_time. We already did so for integer programs, but neglected to label the primary program. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Combine generate_math[12]_gen6 methods. (Kenneth Graunke, 2014-06-10, 2 files, -33/+13)
| | | | | | | | | | | | | | These used to call different math emitters (brw_math vs. brw_math2). Now that they both call gen6_math, they're virtually identical. When unrolling SIMD16 to multiple SIMD8 operations, we should take care not to apply sechalf to brw_null_reg for src1. Otherwise, we'd end up with BRW_ARF_NULL + 1 as the register number, and I'm not sure if that's valid. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Drop the generate_math[12]_gen7 methods. (Kenneth Graunke, 2014-06-10, 2 files, -30/+5)
| | | | | | | | | | These functions are basically identical, so we should combine them. However, they're so trivial, we may as well just fold them into their only call sites. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/vec4: Combine generate_math[12]_gen6 methods. (Kenneth Graunke, 2014-06-10, 2 files, -28/+12)
| | | | | | | | | These are trivial to combine: we should just avoid checking the second operand if it's brw_null_reg. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/vec4: Drop the generate_math2_gen7() method. (Kenneth Graunke, 2014-06-10, 2 files, -14/+1)
| | | | | | | | | It's now a single line of code, so we may as well fold it into the caller. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Rename brw_math to gen4_math. (Kenneth Graunke, 2014-06-10, 6 files, -46/+46)
| | | | | | | | | Usually, I try to use "brw" for functions that apply to all generations, and "gen4" for dead end/legacy code that is only used on Gen4-5. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Split Gen4-5 and Gen6+ MATH instruction emitters. (Kenneth Graunke, 2014-06-10, 4 files, -89/+39)
| | | | | | | | | | | | | | | | | | | | Our existing functions, brw_math and brw_math2, had unclear roles: Gen4-5 used brw_math for both unary and binary math functions; it never used brw_math2. Since operands are already in message registers, this is reasonable. Gen6+ used brw_math for unary math functions, and brw_math2 for binary math functions, duplicating a lot of code. The only real difference was that brw_math used brw_null_reg() for src1. This patch improves brw_math2's assertions to allow both unary and binary operations, renames it to gen6_math(), and drops the Gen6+ code out of brw_math(). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Make src_reg::equals() take a constant reference, not a pointer. (Kenneth Graunke, 2014-06-10, 3 files, -14/+14)
| | | | | | | This is more typical C++ style. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Don't set the "switch" flag on control flow instructions on Gen6+. (Kenneth Graunke, 2014-06-10, 1 file, -6/+4)
| | | | | | | | | | | | | | Thread switching on control flow instructions is a documented workaround for Gen4-5 errata. As far as I can tell, it hasn't been needed since Sandybridge. Thread switching is not free, so in theory this may help performance slightly. Flow control instructions with the "switch" flag cannot be compacted, so removing it will make these instructions compactable. (Of course, we still have to implement compaction for flow control instructions...) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Allow CSE on math opcodes on Gen6+. (Kenneth Graunke, 2014-06-10, 1 file, -0/+11)
| | | | | | | | | | total instructions in shared programs: 2081469 -> 2081248 (-0.01%) instructions in affected programs: 22606 -> 22385 (-0.98%) No programs were hurt by this patch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Replace open-coded linked list with exec_list. (Matt Turner, 2014-06-10, 4 files, -62/+45)
| | | | Reviewed-by: Ian Romanick <[email protected]>
* mesa: Fix substitution of large shaders (Cody Northrop, 2014-06-10, 1 file, -3/+14)
| | | | | Signed-off-by: Cody Northrop <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: Make gen7_pi field of brw_instruction use unsigned instead of GLuint (Kristian Høgsberg, 2014-06-09, 1 file, -12/+12)
| | | | | | | | Nothing else uses GL-types here. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Don't include mtypes.h in brw_disasm.c (Kristian Høgsberg, 2014-06-09, 1 file, -2/+0)
| | | | | | | | It's not used. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: initialize src as reg_undef for texture opcodes on Gen4. (Matt Turner, 2014-06-09, 1 file, -6/+6)
| | | | Untested.
* i965/fs: initialize src as reg_undef for texture opcodes on Gen5/6. (Tapani Pälli, 2014-06-09, 1 file, -9/+9)
| | | | | | | | | | | | Commit 07af0ab changed fs_inst to have 0 sources for texture opcodes in emit_texture_gen5 (Ironlake, Sandybridge) while fs_generator still uses a single source from brw_reg struct. Patch sets src as reg_undef which matches the behavior before the constructor got changed. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79534
* android, dricore: undefined reference to _mesa_streaming_load_memcpy (Adrian Negreanu, 2014-06-09, 1 file, -0/+5)
| | | | | | | | _mesa_streaming_load_memcpy is defined in main/streaming-load-memcpy.c; I'm adding it to the dricore lib. Cc: "10.1 10.2" <[email protected]> Signed-off-by: Adrian Negreanu <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* android, mesa_gen_matypes: pull in timespec POSIX definition (Adrian Negreanu, 2014-06-09, 2 files, -0/+2)
| | | | | | | | | | | This fixes: include/c11/threads_posix.h: In function 'cnd_timedwait': include/c11/threads_posix.h:140:21: error: storage size of 'abs_time' isn't known Cc: "10.1 10.2" <[email protected]> Signed-off-by: Adrian Negreanu <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* android: add src/gallium/auxiliary as include path for libmesa_dricore (Adrian Negreanu, 2014-06-09, 1 file, -1/+2)
| | | | | | | | | | | | | This fixes: In file included from /home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_exec_api.c:445:0: /home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_attrib_tmp.h:28:38: fatal error: util/u_format_r11g11b10f.h: No such file or directory Cc: "10.1 10.2" <[email protected]> Signed-off-by: Adrian Negreanu <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* android: adapt to the megadriver mechanism (Adrian Negreanu, 2014-06-09, 2 files, -0/+18)
| | | | | | | | | | | | | | | | | | | | | | Fixes linker error: ld: .../libmesa_dri_common_intermediates/libmesa_dri_common.a(dri_util.o): in function globalDriverAPI:dri_util.c(.data.rel+0x0): error: undefined reference to 'driDriverAPI' As an example, you can see that mesa_dri_drivers also uses common/libmegadriver_stub (src/mesa/drivers/dri/Makefile.am). The _stub part might be confusing, but it actually provides the dri-driver shared lib constructor, megadriver_stub_init, which will later on load the real platform dependent part and call __driDriverGetExtensions_<platform>. Cc: "10.1 10.2" <[email protected]> Signed-off-by: Adrian Negreanu <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* add megadriver_stub_FILES (Adrian Negreanu, 2014-06-09, 2 files, -1/+4)
| | | | | | | | | So that android part can also use $(megadriver_stub_FILES) Cc: "10.1 10.2" <[email protected]> Signed-off-by: Adrian Negreanu <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* i965/disasm: Properly debug negate source modifier for logical instructions (Abdiel Janulgue, 2014-06-09, 1 file, -3/+21)
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/vec4: skip copy-propagate for logical instructions with negated src entries (Abdiel Janulgue, 2014-06-09, 1 file, -0/+17)
| | | | | | | | | The negation source modifier on src registers has changed meaning in Broadwell when used with logical operations. Don't copy propagate when negate src modifier is set and when the destination instruction is a logical op. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/fs: skip copy-propagate for logical instructions with negated src entries (Abdiel Janulgue, 2014-06-09, 1 file, -0/+17)
| | | | | | | | | The negation source modifier on src registers has changed meaning in Broadwell when used with logical operations. Don't copy propagate when negate src modifier is set and when the destination instruction is a logical op. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/fs: Refactor check for potential copy propagated instructions. (Abdiel Janulgue, 2014-06-09, 1 file, -10/+17)
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965: Ensure that we end instruction streams properly. (Iago Toral Quiroga, 2014-06-09, 1 file, -0/+2)
| | | | | | | | | | | | | Threads must terminate with a SEND message to a particular shared function, such as a URB write or FB write, so the instruction stream really shouldn't ever end in an IF/ELSE/ENDIF or similar block structure. However, if the instruction stream (incorrectly) ends in a block structure the last block's end pointer will not be set, leading to a crash later on in fs_live_variables::setup_def_use(). It is better to detect this earlier, so assert on that. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add Gen < 6 runtime checks for line antialiasing. (Iago Toral Quiroga, 2014-06-09, 2 files, -27/+67)
| | | | | | | | | | | | | | | | | | | | | | | In Gen < 6 the hardware generates a runtime bit that indicates whether AA data has to be sent as part of the framebuffer write SEND message. This affects the specific case where we have set up antialiased line rendering and we render polygons which have one face set up in GL_LINE mode (line antialiasing will be used) and the other one in GL_FILL mode (no line antialiasing needed). Currently we are not doing this runtime test and instead we always send AA data, which produces incorrect rendering of the GL_FILL face of the polygon in the aforementioned scenario (verified on ironlake and gm45). In Gen4 this is, likely, a regression introduced with commit 098acf6c843. In Gen5 this has never worked properly. Gen > 5 are not affected by this. The patch fixes the problem by adding the appropriate runtime check and adjusting the framebuffer write message accordingly in the affected scenario. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679 Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Let the gen < 8 generator know about runtime_check_aads_emit (Iago Toral Quiroga, 2014-06-09, 4 files, -3/+7)
| | | | | | | In gen < 6 we need to produce conditional code based on this flag when doing framebuffer writes. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add extension enable for ARB_compressed_texture_pixel_storage (Chris Forbes, 2014-06-10, 1 file, -0/+1)
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add pixel storage support for GetCompressedTexImage (Chris Forbes, 2014-06-10, 1 file, -33/+40)
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Compute proper strides for compressed texture pixel storage. (Chris Forbes, 2014-06-10, 1 file, -0/+35)
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Extract computation of compressed pixel store params (Chris Forbes, 2014-06-10, 2 files, -14/+50)
| | | | | | | | | | This logic is reusable across CompressedTex*Image* and GetCompressedTexImage; the strides calculated will also be needed in the PBO validation functions to ensure that the referenced range of bytes is valid. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
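The kind of stride computation referred to in the message above can be sketched as follows. This is a simplified 2D-only illustration: the struct and function names are hypothetical and are not the ones the patch adds, and the formulas are an assumed reading of how block-based pixel store parameters translate to byte offsets (whole blocks per row, skips counted in whole blocks).

```c
#include <assert.h>

/* Hypothetical description of a block-compressed format. */
struct block_layout {
    unsigned block_w;     /* block width in texels */
    unsigned block_h;     /* block height in texels */
    unsigned block_bytes; /* bytes per compressed block */
};

/* Number of blocks needed to cover the given number of texels. */
static unsigned blocks_covering(unsigned texels, unsigned block)
{
    assert(block > 0);
    return (texels + block - 1) / block;
}

/*
 * Bytes between consecutive rows of blocks. A nonzero row_length
 * (in texels) overrides the image width, as with uncompressed data.
 */
static unsigned row_stride_bytes(const struct block_layout *fmt,
                                 unsigned width, unsigned row_length)
{
    unsigned texels = row_length ? row_length : width;
    return blocks_covering(texels, fmt->block_w) * fmt->block_bytes;
}

/* Byte offset of the first block after skipping texels and rows. */
static unsigned skip_bytes(const struct block_layout *fmt,
                           unsigned stride,
                           unsigned skip_pixels, unsigned skip_rows)
{
    return (skip_pixels / fmt->block_w) * fmt->block_bytes
         + (skip_rows / fmt->block_h) * stride;
}
```

For a format with 4x4 blocks of 8 bytes, a row length of 16 texels gives 4 blocks and therefore a 32-byte stride; skipping 4 texels and 4 rows then lands one block in and one block row down. Computing the stride once and reusing it is what lets a bounds check over the referenced byte range share the same numbers.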
* mesa: Emit errors for inconsistent compressed pixel store state (Chris Forbes, 2014-06-10, 3 files, -1/+60)
| | | | | | | V2: Use bool rather than GLboolean for internal function Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add new pixel pack/unpack state for (Chris Forbes, 2014-06-10, 3 files, -0/+78)
| | | | | | | ARB_compressed_texture_pixel_storage Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* tests: Add new enum strings for ARB_compressed_texture_pixel_storage (Chris Forbes, 2014-06-10, 1 file, -0/+8)
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Make CompressedTexSubImage errors more consistent (Chris Forbes, 2014-06-10, 1 file, -3/+3)
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Trim down PixelStorei implementation (Chris Forbes, 2014-06-10, 1 file, -119/+56)
| | | | | | | | | | Move _mesa_error call for INVALID_VALUE to one place. Remove checks for previous value matching -- this was important when we were flushing vertices before the update, but that hasn't happened for a long time now. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa/main: Prevent segfault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING). (José Fonseca, 2014-06-08, 1 file, -1/+5)
| | | | | | | | | A recent ApiTrace change that tries to dump more buffer state causes Mesa from my distro (10.1.4) to segfault here. I haven't actually confirmed this fixes it (I can't repro on master), but it seems a good idea to be defensive here anyway. Cc: "10.1 10.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* Revert "i965: Move brw_land_fwd_jump() to compilation unit of its use." (Iago Toral Quiroga, 2014-06-07, 3 files, -16/+21)
| | | | | | | | | | This reverts commit f3cb2e6ed7059b22752a6b7d7a98c07ba6b5552e. brw_land_fwd_jump() is convenient wherever we produce JMPI instructions and we will use JMPI to implement framebuffer writes that involve line antialiasing in gen < 6. Reviewed-by: Kenneth Graunke <[email protected]>