path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: Don't implicitly set predicate default state in brw_CMP. | Kenneth Graunke | 2014-05-27 | 5 | -39/+33
    Previously, brw_CMP with a null destination implicitly set the default state to make future instructions predicated. This is messy and confusing - emitting a CMP that populates the flag register and later using it to predicate instructions are logically separate. With the main compiler, we may even schedule instructions between the CMP and the user of the flag value.

    This patch simplifies brw_CMP to just emit a CMP instruction, and not mess with predication. It also updates all necessary callers. These mostly fell into two patterns:

    1. brw_CMP followed by brw_IF. We don't need to do anything special here; brw_IF already sets up predication appropriately.
    2. brw_CMP followed by a single predicated instruction. The old model was to call brw_CMP, emit the next (predicated) instruction, then disable predication for any instructions beyond that. Instead, just explicitly set predicate_control on the single instruction we want to predicate. It's no more code, and requires less cross-module knowledge.

    This drops setting flag_value to 0xff as well, which is a field only used by the SF compile. There is only one brw_CMP call in the SF code, which is in do_twoside_caller, and called at the start of brw_emit_tri_setup, where flag_value is already 0xff.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
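The second pattern described above can be sketched with a toy emitter. This is only an illustration under stated assumptions: every `toy_*` name is hypothetical and is not the i965 EU emitter API; only the idea (CMP just writes the flag, and the one instruction that needs it gets its predicate set explicitly) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not the real i965 EU emitter. */
enum toy_opcode { TOY_CMP, TOY_MOV };
enum toy_predicate { TOY_PREDICATE_NONE, TOY_PREDICATE_NORMAL };

struct toy_insn {
   enum toy_opcode opcode;
   enum toy_predicate predicate_control;
};

struct toy_compile {
   struct toy_insn store[16];
   size_t nr_insn;
};

/* Append an instruction; it always starts out unpredicated, because
 * there is no implicit "default predication" state to inherit. */
static struct toy_insn *
toy_emit(struct toy_compile *p, enum toy_opcode opcode)
{
   struct toy_insn *insn;

   assert(p->nr_insn < 16);
   insn = &p->store[p->nr_insn++];
   insn->opcode = opcode;
   insn->predicate_control = TOY_PREDICATE_NONE;
   return insn;
}

/* CMP followed by a single, explicitly predicated instruction. */
static void
toy_emit_cmp_then_predicated_mov(struct toy_compile *p)
{
   struct toy_insn *mov;

   toy_emit(p, TOY_CMP);                 /* only populates the flag */
   mov = toy_emit(p, TOY_MOV);
   mov->predicate_control = TOY_PREDICATE_NORMAL;
   toy_emit(p, TOY_MOV);                 /* later code stays unpredicated */
}
```

In this model nothing has to be switched off again after the predicated instruction, which is the cross-module knowledge the message says the old scheme required.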
* i965: Drop unnecessary predication default state resets in clip code. | Kenneth Graunke | 2014-05-27 | 1 | -6/+0
    Presumably, this was to reset the default state of predication_control from brw_CMP. But brw_CMP only sets that if dst is ARF null, which it isn't here.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Reset flag_value to 0xff before emitting SF subroutines. | Kenneth Graunke | 2014-05-27 | 2 | -15/+4
    When compiling any of the SF program variants, flag_value starts off as 0xff and will be modified when generating code. brw_emit_anyprim_setup emits several subroutines, saving and restoring flag_value across each of them. Since it starts out as 0xff, this is equivalent to simply setting it to 0xff at the start of each subroutine.

    Resetting the value makes more logical sense; each subroutine doesn't know whether one of the others even executed, much less what it did to the flag register.

    This also lets us drop the brw_set_predicate_control_flag_value call from brw_init_compile: predicate is already initialized to BRW_PREDICATE_NONE by the memset, and the value of flag_value is irrelevant (as it's only used by the SF compiler).

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/meta: Store stencil texturing mode | Topi Pohjolainen | 2014-05-27 | 1 | -0/+1
    Meta path needs to keep the current texture object's state.

    Fixes the following gles3 cts tests on bdw:

    framebuffer_blit_functionality_negative_width_blit.test: fail
    framebuffer_blit_functionality_all_buffer_blit.test: fail
    framebuffer_blit_functionality_negative_height_blit.test: fail
    framebuffer_blit_functionality_missing_buffers_blit.test: fail
    framebuffer_blit_functionality_negative_dimensions_blit.test: fail
    framebuffer_blit_functionality_minifying_blit.test: fail
    framebuffer_blit_functionality_magnifying_blit.test: fail

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Cc: "10.2" <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* meta/blit: Add stencil texturing mode save and restore | Topi Pohjolainen | 2014-05-27 | 2 | -3/+14
    v2 (Ken): Only restore the mode if it has changed.

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Cc: "10.2" <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Switch types D->UD when possible to allow compaction. | Matt Turner | 2014-05-26 | 1 | -0/+21
    Number of compacted instructions: 827404 -> 833045 (0.68%)

    Reviewed-by: Eric Anholt <[email protected]>
* Revert "i965: Don't make instructions with a null dest a barrier to scheduling." | Matt Turner | 2014-05-26 | 1 | -8/+4
    This reverts commit 42a26cb5e441a01d5288b299980f23affaad53fe.

    Cc: "10.2" <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78648
* Revert "i965/fs: Simplify interference scan in register coalescing." | Matt Turner | 2014-05-26 | 1 | -9/+13
    This reverts commit 5ff1e446d44bb9d50f84883c7058635cb070e069.

    Cc: "10.2" <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77704
* Revert "i965/fs: Give up in interference check if we see a WHILE." | Matt Turner | 2014-05-26 | 1 | -1/+1
    This reverts commit 55de1c035cbca2b7087b3aa21a8c3dfc900a4ad9.

    Cc: "10.2" <[email protected]>
* Revert "i965/fs: Reduce restrictions on interference in register coalescing." | Matt Turner | 2014-05-26 | 1 | -0/+13
    This reverts commit f770123f58b46459e8dbd27525162ee8ba89f30b.

    Cc: "10.2" <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78692
* mesa/st: fix color outputs in presence of sample mask output | Ilia Mirkin | 2014-05-26 | 1 | -13/+17
    Commit c5d822dad90 added support for sample mask incorrectly. It became treated as a color output, and messed up the color output indices. Revert the hunk that did that, and add explicit support just like for depth/stencil writes.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Acked-by: Marek Olšák <[email protected]>
* mesa/x86: Fix build with clang <= 3.3. | Vinson Lee | 2014-05-25 | 1 | -0/+2
    clang <= 3.3 cpuid.h does not define constants for feature bits.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79095
    Signed-off-by: Vinson Lee <[email protected]>
* i965: Don't treat HW_REGs as barriers if they're immediates. | Matt Turner | 2014-05-25 | 1 | -4/+12
    We had a handful of cases where we'd used brw_imm_*() to generate an immediate, rather than fs_reg(). We shouldn't do that but we shouldn't limit scheduling flexibility on account of immediate arguments either.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Don't use brw_imm_* unnecessarily. | Matt Turner | 2014-05-25 | 2 | -5/+5
    Using brw_imm_* creates a source with file=HW_REG, and the scheduler inserts barrier dependencies when it sees HW_REG. None of these are hardware-registers in the sense that they're special and scheduling shouldn't touch them.

    A few of the modified cases already have HW_REGs for other sources, so it won't allow extra flexibility in some cases.

    Reviewed-by: Kenneth Graunke <[email protected]>
* dri_util: keep __dri2ConfigOptions symbol private | Emil Velikov | 2014-05-25 | 1 | -1/+1
    The symbol was added with commit 45e2b51c853 (DRI2/GLX: check for vblank_mode in DRI2 GLX code) but was never used as such according to git log.

    Possibly it was marked as public due to confusion with __driConfigOptions which was used for dri1 drivers.

    Acked-by: Jesse Barnes <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
* dri_util: set implemented version of the DRI_CORE extension | Emil Velikov | 2014-05-25 | 1 | -1/+1
    ... rather than the one defined in our internal interface (dri_interface.h)

    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Don't modify ann_count if not debugging. | Matt Turner | 2014-05-25 | 2 | -2/+8
    If we make ann_count non-zero, annotation_finalize() won't bail. Not modifying it seems to make the code more clear than would modifying annotation_finalize().
* Revert "i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6" | Matt Turner | 2014-05-24 | 1 | -4/+7
    This reverts commit a6860100b87415ab510d0d210cabfeeccebc9a0a.

    Why this code didn't work in all circumstances is unknown and without a working Ironlake simulator (which uses a different AUB format) we'll probably never know, short of a lot of experimentation, and spending a bunch of time to try to optimize a few instructions on Ironlake is not time well spent.

    Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a dependence between the otherwise independent per-component calculations. Not using the accumulator, even if it means an extra instruction per component might be preferable. We don't know, we don't have data, and we don't have the necessary register on Ironlake for shader_time to tell us.

    Cc: "10.2" <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77707
    Acked-by: Kenneth Graunke <[email protected]>
* Revert "i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6" | Matt Turner | 2014-05-24 | 1 | -6/+10
    This reverts commit 2dfbbeca50b95ccdd714d9baa4411c779f6a20d9 with the comment about MAC and implicit accumulator removed.

    Why this code didn't work in all circumstances is unknown and without a working Ironlake simulator (which uses a different AUB format) we'll probably never know, short of a lot of experimentation, and spending a bunch of time to try to optimize a few instructions on Ironlake is not time well spent.

    Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a dependence between the otherwise independent per-component calculations. Not using the accumulator, even if it means an extra instruction per component might be preferable. We don't know, we don't have data, and we don't have the necessary register on Ironlake for shader_time to tell us.

    Cc: "10.2" <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77703
    Acked-by: Kenneth Graunke <[email protected]>
* i965: Remove useless typo'd debugging messages. | Matt Turner | 2014-05-24 | 1 | -6/+0
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move brw_land_fwd_jump() to compilation unit of its use. | Matt Turner | 2014-05-24 | 3 | -23/+16
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use next_insn_offset rather than nr_insn. | Matt Turner | 2014-05-24 | 2 | -4/+4
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Emit 0.0:F sources with type VF instead. | Matt Turner | 2014-05-24 | 1 | -0/+16
    Number of compacted instructions: 817752 -> 827404 (1.18%)

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Emit ARF:UD for non-present src1 on Gen6+. | Matt Turner | 2014-05-24 | 1 | -2/+26
    Enables the next commits to compact more instructions.

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Support compacted instructions with immediate sources. | Matt Turner | 2014-05-24 | 1 | -20/+63
    Note the weirdness with src1 subregs. The compacted immediate fields are uncompacted to bits [127:96] and the high five bits of the subreg mapping map to bits [100:96].

    Number of compacted instructions: 790085 -> 817752 (3.50%)

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Use next_offset() in instruction compaction code. | Matt Turner | 2014-05-24 | 1 | -17/+3
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Move next_offset() to brw_eu.h for use elsewhere. | Matt Turner | 2014-05-24 | 2 | -11/+12
    Also perform arithmetic on char* rather than void* since the latter is a GNU C extension not available in C++.

    Reviewed-by: Eric Anholt <[email protected]>
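A minimal sketch of the char* point made above. All `toy_*` names are hypothetical, and the instruction sizes (16 bytes full, 8 bytes compacted) and the marker bit are assumptions chosen for illustration; this is not the helper that lives in brw_eu.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes for this sketch: full instruction 16 bytes, compacted 8. */
enum { TOY_INSN_SIZE = 16, TOY_COMPACTED_SIZE = 8 };

/* Hypothetical marker bit in the first dword of an instruction. */
#define TOY_CMPT_CONTROL (1u << 29)

static bool
toy_is_compacted(const void *store, int offset)
{
   uint32_t dw0;

   /* Cast to char * before adding the byte offset: arithmetic on
    * void * is a GNU C extension and is rejected by C++ compilers. */
   memcpy(&dw0, (const char *)store + offset, sizeof(dw0));
   return (dw0 & TOY_CMPT_CONTROL) != 0;
}

/* Return the byte offset of the instruction following the one at 'offset'. */
static int
toy_next_offset(const void *store, int offset)
{
   assert(offset >= 0);
   return offset + (toy_is_compacted(store, offset) ? TOY_COMPACTED_SIZE
                                                    : TOY_INSN_SIZE);
}
```

Walking a program then becomes a loop of the form `for (off = 0; off < end; off = toy_next_offset(store, off))`, which works in terms of byte offsets rather than instruction indices.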
* i965: Rename next_ip() -> next_offset(). | Matt Turner | 2014-05-24 | 1 | -30/+33
    That we were comparing its return value with offsets should have been a clue. :)

    Make it take a void *store in preparation for making the function useful elsewhere.

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Print disassembly after compaction. | Matt Turner | 2014-05-24 | 9 | -283/+198
    Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Make patch_discard_jumps_to_fb_writes return bool. | Matt Turner | 2014-05-24 | 3 | -6/+8
    ... to tell us whether it emitted any code. Will be used to determine whether we need to skip an annotation for it.

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* i965: Add annotation data structure and support code. | Matt Turner | 2014-05-24 | 11 | -9/+183
    Will be used to print disassembly after jump targets are set and instructions are compacted, while still retaining higher-level IR annotations and basic block information.

    An array of 'struct annotation' will live alongside the generated assembly. The generators will populate the array with their IR annotations, and basic block pointers if the instructions began or ended a basic block.

    We'll then update the instruction offset when we compact instructions and then use the annotations to print the disassembly.

    Reviewed-by: Eric Anholt <[email protected]>
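The offset update described above can be sketched as follows. This is a toy model under stated assumptions: `toy_annotation`, its fields, and the 16-byte/8-byte instruction sizes are hypothetical and do not reproduce the 'struct annotation' added by the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical annotation record: a byte offset into the generated
 * assembly plus the IR text that explains the instruction there. */
struct toy_annotation {
   int offset;
   const char *text;
};

/* Rewrite annotation offsets from pre-compaction to post-compaction
 * values. compacted[i] says whether the i-th instruction shrank from
 * an assumed 16 bytes to an assumed 8 bytes. */
static void
toy_update_annotations(struct toy_annotation *ann, int ann_count,
                       const bool *compacted, int nr_insn)
{
   int a, i;

   for (a = 0; a < ann_count; a++) {
      int old_index = ann[a].offset / 16;
      int saved = 0;

      assert(old_index <= nr_insn);
      /* Every compacted instruction before this one moves it 8 bytes up. */
      for (i = 0; i < old_index; i++) {
         if (compacted[i])
            saved += 8;
      }
      ann[a].offset -= saved;
   }
}
```

Because the annotations carry offsets rather than instruction indices, the disassembler can print them alongside the final, compacted program.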
* i965/fs+blorp: Remove left over dump_file arguments. | Matt Turner | 2014-05-24 | 5 | -19/+15
    They were used by the blorp unit test programs.

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* i965/fs: Don't hardcode DEBUG_WM in generic fs code. | Matt Turner | 2014-05-24 | 6 | -27/+25
    Similar to Paul's commit e9fa3a944 except brw_fs_generator's debug_flag is for DEBUG_WM and DEBUG_BLORP.

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Pass in start_offset to brw_compact_instructions(). | Matt Turner | 2014-05-24 | 8 | -17/+17
    Lets us avoid recompacting the SIMD8 instructions when we compact the SIMD16 program.

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Delete unused brw_blorp_blit_test_compile(). | Matt Turner | 2014-05-24 | 1 | -11/+0
* i965/cfg: Make DO instruction begin a basic block. | Matt Turner | 2014-05-24 | 1 | -9/+12
    The DO instruction doesn't exist on Gen6+. Since before this commit, DO always ended a basic block, if it also happened to start one (e.g., a while loop inside an if statement) the block containing only the DO would actually contain no hardware instructions.

    Pre-Gen6's WHILE instruction jumps to the instruction following the DO, so strictly speaking we won't be modeling that properly, but I claim there is actually no functional difference.

    This will simplify an upcoming change where we want to mark the first hardware instruction in the loop as beginning a block, and the last instruction before the loop as ending one.

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Properly return *RESET* status in glGetGraphicsResetStatusARB | Pavel Popov | 2014-05-23 | 1 | -5/+13
    The glGetGraphicsResetStatusARB from the ARB_robustness extension always returns GUILTY_CONTEXT_RESET_ARB and never returns NO_ERROR for a guilty context with the LOSE_CONTEXT_ON_RESET_ARB strategy. This is because Mesa returns GUILTY_CONTEXT_RESET_ARB if batch_active != 0, whereas the kernel driver never resets batch_active, so this variable is always > 0 for a guilty context. The same behaviour can also be observed for batch_pending and INNOCENT_CONTEXT_RESET_ARB.

    But the ARB_robustness spec says:

        If a reset status other than NO_ERROR is returned and subsequent calls return NO_ERROR, the context reset was encountered and completed. If a reset status is repeatedly returned, the context may be in the process of resetting.

        8. How should the application react to a reset context event?

        RESOLVED: For this extension, the application is expected to query the reset status until NO_ERROR is returned. If a reset is encountered, at least one *RESET* status will be returned. Once NO_ERROR is encountered, the application can safely destroy the old context and create a new one.

    The main problem is that the context may be in the process of resetting, and in this case a reset status should be repeatedly returned. But it looks like the kernel driver returns nonzero active/pending only if the context reset has already been encountered and completed. For this reason the *RESET* status cannot be repeatedly returned and should be returned only once.

    The reset_count and brw->reset_count variables can be used to ensure that glGetGraphicsResetStatusARB returns a *RESET* status only once for each context. Note that i915 triggers reset_count twice, which allows the correct reset count to be returned immediately after active/pending have been incremented.

    v2 (idr): Trivial reformatting of comments.

    Signed-off-by: Pavel Popov <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Cc: "10.1 10.2" <[email protected]>
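The "report each reset only once" idea can be sketched as below. This is a simplified model, not the code from the patch: the `toy_*` names are hypothetical, and the kernel-reported values (reset count, active, pending) are passed in as plain arguments instead of being queried from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_status {
   TOY_NO_ERROR,
   TOY_GUILTY_CONTEXT_RESET,
   TOY_INNOCENT_CONTEXT_RESET
};

/* Hypothetical per-context state: the last reset count we reported on. */
struct toy_context {
   uint32_t reset_count;
};

static enum toy_status
toy_get_reset_status(struct toy_context *ctx, uint32_t kernel_reset_count,
                     uint32_t active, uint32_t pending)
{
   assert(ctx != NULL);

   /* Nothing new since the last query: the reset was already reported,
    * so answer NO_ERROR even though active/pending never drop to zero. */
   if (kernel_reset_count == ctx->reset_count)
      return TOY_NO_ERROR;

   /* Remember this reset so it is reported exactly once. */
   ctx->reset_count = kernel_reset_count;

   if (active != 0)
      return TOY_GUILTY_CONTEXT_RESET;
   if (pending != 0)
      return TOY_INNOCENT_CONTEXT_RESET;
   return TOY_NO_ERROR;
}
```

With this shape, an application that polls until NO_ERROR, as the quoted spec text expects, terminates after one *RESET* answer per reset.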
* Make DRI dependencies and build depend on the target | Jon TURNEY | 2014-05-23 | 1 | -1/+1
    - Don't require xcb-dri[23] etc. if we aren't building for a target with DRM, as we won't be using dri[23]
    - Enable a more fine-grained control of what DRI code is built, so that a libGL using direct swrast can be built on targets which don't have DRM.

    The HAVE_DRI automake conditional is retired in favour of a number of other conditionals:

    HAVE_DRI2 enables building of code using the DRI2 interface (and possibly DRI3 with HAVE_DRI3)
    HAVE_DRISW enables building of DRI swrast
    HAVE_DRICOMMON enables building of target-independent DRI code, and also enables some makefile cases where a more detailed decision is made at a lower level.
    HAVE_APPLEDRI enables building of an Apple-specific direct rendering interface, which still requires additional fixing up to build properly.

    v2: Place xfont.c and drisw_glx.c into correct categories. Update 'make check' as well

    Signed-off-by: Jon TURNEY <[email protected]>
    Reviewed-by: Jeremy Huddleston Sequoia <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* Fix build for darwin | Jon TURNEY | 2014-05-23 | 1 | -1/+1
    Fix build for darwin, when ./configured --disable-driglx-direct

    - darwin ld doesn't support -Bsymbolic or --version-script, so check if ld supports those options before using them
    - define GLX_ALIAS_UNSUPPORTED as config/darwin used to, as aliasing of non-weak symbols isn't supported
    - default to --with-dri-drivers=swrast

    v2: Use -Wl,-Bsymbolic, as before, not -Bsymbolic
    Test that ld --version-script works, rather than just looking for it in ld --help
    Don't use -Wl,--no-undefined on darwin, either

    Signed-off-by: Jon TURNEY <[email protected]>
    Reviewed-by: Jeremy Huddleston Sequoia <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* mesa: Fix unbinding GL_DEPTH_STENCIL_ATTACHMENT | James Legg | 2014-05-23 | 1 | -0/+6
    glFramebufferRenderbuffer(..., GL_DEPTH_STENCIL_ATTACHMENT, ..., 0) only detached the depth buffer and not the stencil buffer.

    Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=79115
    Reviewed-by: Brian Paul <[email protected]>
    Cc: "10.1 10.2" <[email protected]>
* mesa/x86: Fix build with clang 3.4. | José Fonseca | 2014-05-23 | 1 | -0/+4
    It defines bit_SSE41 instead of bit_SSE4_1.

    Fixes https://bugs.freedesktop.org/show_bug.cgi?id=79095

    Trivial.
* mesa: Move declaration to top of block. | José Fonseca | 2014-05-23 | 1 | -1/+3
    To fix MSVC build.

    Trivial.
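For context, MSVC releases of that period compiled C as C89, which requires declarations to precede all statements in a block. A small sketch of the accepted layout follows; the function is a hypothetical example, not the code touched by the commit.

```c
#include <assert.h>

/* Hypothetical helper written in the C89-compatible style: every
 * declaration sits at the top of its block, before any statement. */
static int
toy_sum(const int *values, int count)
{
   int total = 0;
   int i;                 /* declared before the first statement */

   assert(count >= 0);    /* first statement of the block */
   for (i = 0; i < count; i++)
      total += values[i];

   return total;
}
```

Writing `int i` inside the `for` header, or declaring `total` after the `assert`, is valid C99 but would be rejected by a C89-only compiler.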
* meta blit: Set Z texcoord during meta blit to sample the correct layer | Jordan Justen | 2014-05-23 | 1 | -1/+8
    If the source renderbuffer has a depth > 0, then send a Z texcoord which is set to the source attachment Z offset.

    This fixes piglit's gl-3.2-layered-rendering-gl-layer-render with the GL_TEXTURE_2D_MULTISAMPLE_ARRAY case test on i965/gen8.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
    Cc: "10.2" <[email protected]>
* i965: Listen to BRW_NEW_FRAGMENT_PROGRAM for 3DSTATE_PS_BLEND. | Kenneth Graunke | 2014-05-23 | 2 | -2/+3
    brw_color_buffer_write_enabled depends on brw->fragment_program, which means we have to listen to BRW_NEW_FRAGMENT_PROGRAM.

    On most generations, this was only called from a function that already subscribed. However, on Broadwell, we failed to listen to the necessary event in the atom that emits 3DSTATE_PS_BLEND.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Cc: "10.2" <[email protected]>
* i965: Use WE_all for FB write header setup on Broadwell. | Kenneth Graunke | 2014-05-23 | 1 | -6/+7
    I forgot to disable writemasking on the OR and MOV which set the render target index and "source 0 alpha present to render target" bit.

    Using get_element_ud is equivalent and avoids a line-wrap.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Cc: "10.2" <[email protected]>
* mesa/x86: fix typos in SSE4.1 detection | Tobias Klausmann | 2014-05-22 | 1 | -2/+2
    Commit a2fb71e23 introduced 32-bit code for SSE4.1. Fix compilation, and make sure to check ecx for the SSE4.1 bit.

    [imirkin: switch sse4.1 to look at ecx]

    Reviewed-by: Ilia Mirkin <[email protected]>
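For reference, CPUID leaf 1 reports SSE4.1 support in bit 19 of ECX (the older SSE and SSE2 bits are in EDX, which makes the wrong register an easy mistake). A portable sketch of the bit test follows; the `toy_*` and `TOY_*` names are hypothetical, and the register value is passed in as an argument so the example does not depend on cpuid.h or on an x86 host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 19 is the SSE4.1 feature flag. */
#define TOY_CPUID_ECX_SSE4_1 (1u << 19)

/* Return true if the ECX value from CPUID leaf 1 advertises SSE4.1. */
static bool
toy_has_sse41(uint32_t leaf1_ecx)
{
   return (leaf1_ecx & TOY_CPUID_ECX_SSE4_1) != 0;
}
```

On an x86 build the argument would come from the compiler's cpuid helper for leaf 1; defining the bit locally, as here, also sidesteps the differing constant names in cpuid.h that the neighbouring clang fixes deal with.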
* mesa: Rely on USE_X86_64_ASM. | José Fonseca | 2014-05-22 | 3 | -5/+5
    This fixes MinGW x64 builds. We don't use assembly on any of the Windows builds, to avoid divergence between MSVC and MinGW when testing.

    Reviewed-by: Matt Turner <[email protected]>
* scons: Fix x86_64 build. | José Fonseca | 2014-05-22 | 1 | -0/+1
    x86/common_x86.c is required also for x86_64 builds.

    Reviewed-by: Matt Turner <[email protected]>
* mesa/x86: Brown bag fix for undeclared variable. | Matt Turner | 2014-05-22 | 1 | -1/+1
* i965: Use SSE4.1 runtime detection for intel_miptree_map. | Matt Atwood | 2014-05-22 | 1 | -8/+3
    Previously it was a compile-time decision.

    Reviewed-by: Matt Turner <[email protected]>