path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* i965: Give dump_instruction() a FILE* argument. | Matt Turner | 2014-06-01 | 5 | -100/+115
| | | | | | | Use function overloading rather than default arguments, since gdb doesn't know about default arguments. Reviewed-by: Kenneth Graunke <[email protected]>
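The overloading approach described above can be sketched as follows. This is a minimal illustration, not the driver's code: the `instruction` struct and the free-function form are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for a backend instruction; not Mesa's real type.
struct instruction {
    int opcode;
};

// Overload that writes to an explicit stream.
// Returns the number of characters written.
int dump_instruction(const instruction &inst, FILE *file)
{
    assert(file != nullptr);
    return std::fprintf(file, "op %d\n", inst.opcode);
}

// Separate one-argument overload instead of `FILE *file = stderr`.
// Because it is a real function, a debugger can call it by name
// without having to know about default arguments.
int dump_instruction(const instruction &inst)
{
    return dump_instruction(inst, stderr);
}
```

With this shape, a call with a single argument from a debugger session resolves to the second overload, which forwards to the first.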
* i965: Add envvar to debug the optimization passes. | Matt Turner | 2014-06-01 | 2 | -0/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i915: add a missing NULL pointer check | Lubomir Rintel | 2014-05-30 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | mesaVisual can be NULL with configless context since this commit: commit 551d459af421a2eb937e9e16301bb64da4624f89 Author: Neil Roberts <[email protected]> Date: Fri Mar 7 18:05:47 2014 +0000 Add the EGL_MESA_configless_context extension ... Previously the i965 and i915 drivers were explicitly creating a zeroed visual whenever 0 is passed for the EGLConfig. We attempt to dereference the visual in i915, and now that we no longer create a zeroed-out one, it crashes, breaking at least weston on i915. There's no point in doing so as it would be zero anyway. v2: Fixed a typo in commit message. Added some tags. Signed-off-by: Lubomir Rintel <[email protected]> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1100967 Cc: "10.2" <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4: Allow writemasking on math instructions on Gen7+. | Matt Turner | 2014-05-30 | 1 | -2/+2
| | | | | | | | | | The math instruction was Align1-only on Gen6 and we never updated this to let it use Align16 features like writemasking on newer platforms. total instructions in shared programs: 1686120 -> 1685507 (-0.04%) instructions in affected programs: 48593 -> 47980 (-1.26%) Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix Line Stipple enable bit in 3DSTATE_SF for Haswell. | Pavel Popov | 2014-05-30 | 1 | -1/+1
| | | | | | Cc: "10.1 10.2" <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Pavel Popov <[email protected]>
* mesa/drivers: Add extra null check in blitframebuffer_texture() | Juha-Pekka Heikkila | 2014-05-30 | 1 | -0/+3
| | | | | | | | If texObj == NULL here, it means a GL_INVALID_VALUE or GL_OUT_OF_MEMORY error has already been set on the context. Signed-off-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Set correct number of regs_written for MCS fetches. | Matt Turner | 2014-05-29 | 1 | -3/+3
| | | | | | | regs_written is in units of virtual GRFs. Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix repeated usage of rectangle texture coordinate scaling. | Kenneth Graunke | 2014-05-28 | 1 | -7/+20
| | | | | | | | | | | | | | | | | | Previously, we set up new entries in the params[] array on every access of a rectangle texture. Unfortunately, we only reserve space for (2 * MaxTextureImageUnits) extra entries, so programs which accessed rectangle textures more times than that would write off the end of the array and likely crash. We don't really have a decent mapping between the index returned by _mesa_add_state_reference and our index into the params array, so we have to manually search for it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78691 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: [email protected]
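The manual search described above amounts to a find-or-add lookup. The sketch below illustrates the idea only; `param_table`, `find_or_add_param`, and the plain `int` state index are hypothetical names, not the driver's data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical parameter table. `capacity` mirrors the fixed amount of
// space reserved up front; each entry stands in for the index returned
// by a state-reference lookup.
struct param_table {
    std::vector<int> entries;   // state indices already registered
    std::size_t capacity;       // number of slots reserved in advance
};

// Returns the slot holding `state_index`, adding it only on first use.
// Repeated accesses of the same texture reuse the existing slot instead
// of appending a new entry each time.
std::size_t find_or_add_param(param_table &table, int state_index)
{
    for (std::size_t i = 0; i < table.entries.size(); i++) {
        if (table.entries[i] == state_index)
            return i;           // already present: no new entry
    }
    // Only genuinely new indices consume reserved space.
    assert(table.entries.size() < table.capacity);
    table.entries.push_back(state_index);
    return table.entries.size() - 1;
}
```

A linear scan is used here because, as the message notes, there is no direct mapping from the returned state index to the slot in the array.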
* meta/blit: Use gl_FragColor also in the msaa blit shader | Topi Pohjolainen | 2014-05-28 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use the meta path. No piglit regressions on IVB. Further input from Ken: "Unfortunately, this doesn't fix MRT for integer data. In the single-sampled case, since we're directly copying data, we were reading/copying/writing data as "float" values, which actually contained the integer bits. Here, we can't do that since we need to process the actual integer data. I do wonder if we could use intBitsToFloat/uintBitsToFloat to stuff the integer bits in the float gl_FragColor output. Just a crazy idea. In the long term (post 10.2), I think we should draft an extension that allows you to do "layout(location = all)" on user-defined fragment shader outputs. (Or some similar syntax.)" Signed-off-by: Topi Pohjolainen <[email protected]> Cc: "10.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965/sf: Replace push/pop in brw_emit_anyprim_setup. | Kenneth Graunke | 2014-05-27 | 1 | -15/+11
| | | | | | | | | | | | | | Each of the subroutine emitters alters the predication state, but otherwise doesn't change anything (or puts it back when it does). Resetting predication at the end makes these functions idempotent with regard to the default instruction state - which is a nice property. With that in place, push/pop is no longer necessary. v2: Improve whitespace (requested by Matt). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Drop unnecessary push/pop in copy_z_inv_w. | Kenneth Graunke | 2014-05-27 | 1 | -4/+0
| | | | | | | | brw_MOV doesn't alter the default instruction state, so this does nothing. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Drop unnecessary push/pop in flatshading code. | Kenneth Graunke | 2014-05-27 | 1 | -8/+0
| | | | | | | | brw_JMPI sets predicate_control to BRW_PREDICATE_NONE, but that's already the value coming in. Otherwise, nothing changes state. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Move brw_compile::flag_value to brw_sf_compile. | Kenneth Graunke | 2014-05-27 | 3 | -21/+24
| | | | | | | | | | | This field is only used to track the current value of the flag register during the SF compile. It has no place in the common compiler code. While we're changing every call, drop the 'brw' prefix from the function since it's static. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Move brw_set_predicate_control_flag_value to brw_sf_emit.c. | Kenneth Graunke | 2014-05-27 | 3 | -19/+14
| | | | | | | | Only the Gen4-5 SF program compiler actually uses this function; move it there. Soon the fields will be moved out of brw_compile. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Drop useless push/pop state from flag register mashing code. | Kenneth Graunke | 2014-05-27 | 1 | -2/+0
| | | | | | | | There's no point in pushing and popping the default state; the code between the two stack operations doesn't alter anything. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Drop unnecessary push/pop in do_twoside_color. | Kenneth Graunke | 2014-05-27 | 1 | -2/+0
| | | | | | | | None of the assembly emitters called between push and pop actually change the state. So, we can drop these. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Don't implicitly set predicate default state in brw_CMP. | Kenneth Graunke | 2014-05-27 | 5 | -39/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, brw_CMP with a null destination implicitly set the default state to make future instructions predicated. This is messy and confusing - emitting a CMP that populates the flag register and later using it to predicate instructions are logically separate. With the main compiler, we may even schedule instructions between the CMP and the user of the flag value. This patch simplifies brw_CMP to just emit a CMP instruction, and not mess with predication. It also updates all necessary callers. These mostly fell into two patterns: 1. brw_CMP followed by brw_IF. We don't need to do anything special here; brw_IF already sets up predication appropriately. 2. brw_CMP followed by a single predicated instruction. The old model was to call brw_CMP, emit the next (predicated) instruction, then disable predication for any instructions beyond that. Instead, just explicitly set predicate_control on the single instruction we want to predicate. It's no more code, and requires less cross-module knowledge. This drops setting flag_value to 0xff as well, which is a field only used by the SF compile. There is only one brw_CMP call in the SF code, which is in do_twoside_color, and called at the start of brw_emit_tri_setup, where flag_value is already 0xff. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Drop unnecessary predication default state resets in clip code. | Kenneth Graunke | 2014-05-27 | 1 | -6/+0
| | | | | | | | | Presumably, this was to reset the default state of predicate_control from brw_CMP. But brw_CMP only sets that if dst is ARF null, which it isn't here. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Reset flag_value to 0xff before emitting SF subroutines. | Kenneth Graunke | 2014-05-27 | 2 | -15/+4
| | | | | | | | | | | | | | | | | | | | | When compiling any of the SF program variants, flag_value starts off as 0xff and will be modified when generating code. brw_emit_anyprim_setup emits several subroutines, saving and restoring flag_value across each of them. Since it starts out as 0xff, this is equivalent to simply setting it to 0xff at the start of each subroutine. Resetting the value makes more logical sense; each subroutine doesn't know whether one of the others even executed, much less what it did to the flag register. This also lets us drop the brw_set_predicate_control_flag_value call from brw_init_compile: predicate is already initialized to BRW_PREDICATE_NONE by the memset, and the value of flag_value is irrelevant (as it's only used by the SF compiler). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/meta: Store stencil texturing mode | Topi Pohjolainen | 2014-05-27 | 1 | -0/+1
| | | | | | | | | | | | | | | | | Meta path needs to keep the current texture object's state. Fixes the following gles3 cts tests on bdw: framebuffer_blit_functionality_negative_width_blit.test: fail framebuffer_blit_functionality_all_buffer_blit.test: fail framebuffer_blit_functionality_negative_height_blit.test: fail framebuffer_blit_functionality_missing_buffers_blit.test: fail framebuffer_blit_functionality_negative_dimensions_blit.test: fail framebuffer_blit_functionality_minifying_blit.test: fail framebuffer_blit_functionality_magnifying_blit.test: fail Signed-off-by: Topi Pohjolainen <[email protected]> Cc: "10.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* meta/blit: Add stencil texturing mode save and restore | Topi Pohjolainen | 2014-05-27 | 2 | -3/+14
| | | | | | | | v2 (Ken): Only restore the mode if it has changed. Signed-off-by: Topi Pohjolainen <[email protected]> Cc: "10.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
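A save-and-restore that only writes state when it actually changed could look roughly like the sketch below. Everything here is illustrative: `texture_object`, `set_stencil_sampling`, and the integer mode values are hypothetical, and the `updates` counter exists only to make the skipped writes visible.

```cpp
#include <cassert>

// Hypothetical texture object carrying a single piece of state.
struct texture_object {
    int stencil_sampling;   // stand-in for the stencil texturing mode
    int updates;            // counts state writes, for illustration only
};

void set_stencil_sampling(texture_object &tex, int mode)
{
    tex.stencil_sampling = mode;
    tex.updates++;
}

// Runs an operation that needs mode `wanted`, then puts the caller's
// mode back only if it was actually changed along the way.
void blit_with_mode(texture_object &tex, int wanted)
{
    const int saved = tex.stencil_sampling;
    if (saved != wanted)
        set_stencil_sampling(tex, wanted);

    assert(tex.stencil_sampling == wanted);
    /* ... the blit itself would go here ... */

    if (tex.stencil_sampling != saved)
        set_stencil_sampling(tex, saved);
}
```

When the object is already in the wanted mode, neither write happens, so no state update is issued for it.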
* i965: Switch types D->UD when possible to allow compaction. | Matt Turner | 2014-05-26 | 1 | -0/+21
| | | | | | Number of compacted instructions: 827404 -> 833045 (0.68%) Reviewed-by: Eric Anholt <[email protected]>
* Revert "i965: Don't make instructions with a null dest a barrier to scheduling." | Matt Turner | 2014-05-26 | 1 | -8/+4
| | | | | | | This reverts commit 42a26cb5e441a01d5288b299980f23affaad53fe. Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78648
* Revert "i965/fs: Simplify interference scan in register coalescing." | Matt Turner | 2014-05-26 | 1 | -9/+13
| | | | | | | This reverts commit 5ff1e446d44bb9d50f84883c7058635cb070e069. Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77704
* Revert "i965/fs: Give up in interference check if we see a WHILE." | Matt Turner | 2014-05-26 | 1 | -1/+1
| | | | | | This reverts commit 55de1c035cbca2b7087b3aa21a8c3dfc900a4ad9. Cc: "10.2" <[email protected]>
* Revert "i965/fs: Reduce restrictions on interference in register coalescing." | Matt Turner | 2014-05-26 | 1 | -0/+13
| | | | | | | This reverts commit f770123f58b46459e8dbd27525162ee8ba89f30b. Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78692
* i965: Don't treat HW_REGs as barriers if they're immediates. | Matt Turner | 2014-05-25 | 1 | -4/+12
| | | | | | | | We had a handful of cases where we'd used brw_imm_*() to generate an immediate, rather than fs_reg(). We shouldn't do that but we shouldn't limit scheduling flexibility on account of immediate arguments either. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Don't use brw_imm_* unnecessarily. | Matt Turner | 2014-05-25 | 2 | -5/+5
| | | | | | | | | | Using brw_imm_* creates a source with file=HW_REG, and the scheduler inserts barrier dependencies when it sees HW_REG. None of these are hardware-registers in the sense that they're special and scheduling shouldn't touch them. A few of the modified cases already have HW_REGs for other sources, so it won't allow extra flexibility in some cases. Reviewed-by: Kenneth Graunke <[email protected]>
* dri_util: keep __dri2ConfigOptions symbol private | Emil Velikov | 2014-05-25 | 1 | -1/+1
| | | | | | | | | | | | The symbol was added with commit 45e2b51c853(DRI2/GLX: check for vblank_mode in DRI2 GLX code) but was never used as such according to git log. Possibly it was marked as public due to confusion with __driConfigOptions which was used for dri1 drivers. Acked-by: Jesse Barnes <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* dri_util: set implemented version of the DRI_CORE extension | Emil Velikov | 2014-05-25 | 1 | -1/+1
| | | | | | | ... rather than the one defined in our internal interface (dri_interface.h) Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Don't modify ann_count if not debugging. | Matt Turner | 2014-05-25 | 2 | -2/+8
| | | | | | | If we make ann_count non-zero, annotation_finalize() won't bail. Not modifying it seems to make the code more clear than would modifying annotation_finalize().
* Revert "i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6" | Matt Turner | 2014-05-24 | 1 | -4/+7
| | | | | | | | | | | | | | | | | | | | | This reverts commit a6860100b87415ab510d0d210cabfeeccebc9a0a. Why this code didn't work in all circumstances is unknown and without a working Ironlake simulator (which uses a different AUB format) we'll probably never know, short of a lot of experimentation, and spending a bunch of time to try to optimize a few instructions on Ironlake is not time well spent. Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a dependence between the otherwise independent per-component calculations. Not using the accumulator, even if it means an extra instruction per component might be preferable. We don't know, we don't have data, and we don't have the necessary register on Ironlake for shader_time to tell us. Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77707 Acked-by: Kenneth Graunke <[email protected]>
* Revert "i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6" | Matt Turner | 2014-05-24 | 1 | -6/+10
| | | | | | | | | | | | | | | | | | | | | | This reverts commit 2dfbbeca50b95ccdd714d9baa4411c779f6a20d9 with the comment about MAC and implicit accumulator removed. Why this code didn't work in all circumstances is unknown and without a working Ironlake simulator (which uses a different AUB format) we'll probably never know, short of a lot of experimentation, and spending a bunch of time to try to optimize a few instructions on Ironlake is not time well spent. Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a dependence between the otherwise independent per-component calculations. Not using the accumulator, even if it means an extra instruction per component might be preferable. We don't know, we don't have data, and we don't have the necessary register on Ironlake for shader_time to tell us. Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77703 Acked-by: Kenneth Graunke <[email protected]>
* i965: Remove useless typo'd debugging messages. | Matt Turner | 2014-05-24 | 1 | -6/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move brw_land_fwd_jump() to compilation unit of its use. | Matt Turner | 2014-05-24 | 3 | -23/+16
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use next_insn_offset rather than nr_insn. | Matt Turner | 2014-05-24 | 2 | -4/+4
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Emit 0.0:F sources with type VF instead. | Matt Turner | 2014-05-24 | 1 | -0/+16
| | | | | | Number of compacted instructions: 817752 -> 827404 (1.18%) Reviewed-by: Eric Anholt <[email protected]>
* i965: Emit ARF:UD for non-present src1 on Gen6+. | Matt Turner | 2014-05-24 | 1 | -2/+26
| | | | | | Enables the next commits to compact more instructions. Reviewed-by: Eric Anholt <[email protected]>
* i965: Support compacted instructions with immediate sources. | Matt Turner | 2014-05-24 | 1 | -20/+63
| | | | | | | | | | Note the weirdness with src1 subregs. The compacted immediate fields are uncompacted to bits [127:96] and the high five bits of the subreg mapping maps to bits [100:96]. Number of compacted instructions: 790085 -> 817752 (3.50%) Reviewed-by: Eric Anholt <[email protected]>
* i965: Use next_offset() in instruction compaction code. | Matt Turner | 2014-05-24 | 1 | -17/+3
| | | | Reviewed-by: Eric Anholt <[email protected]>
* i965: Move next_offset() to brw_eu.h for use elsewhere. | Matt Turner | 2014-05-24 | 2 | -11/+12
| | | | | | | Also perform arithmetic on char* rather than void* since the latter is a GNU C extension not available in C++. Reviewed-by: Eric Anholt <[email protected]>
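The pointer-arithmetic point above can be illustrated with a small sketch. The sizes (16 bytes for a full instruction, 8 for a compacted one) are assumed for the example, and the "first byte is a compacted flag" encoding is purely hypothetical, chosen to keep the code short; it is not the hardware's layout.

```cpp
#include <cassert>

// Assumed sizes for illustration only.
constexpr unsigned FULL_SIZE = 16;
constexpr unsigned COMPACT_SIZE = 8;

// Returns the offset of the instruction following the one at `offset`
// in the buffer `store`.
unsigned next_offset(const void *store, unsigned offset)
{
    assert(store != nullptr);
    // Cast to a character type before adding the offset: arithmetic on
    // void* is a GNU C extension and is not available in C++.
    const unsigned char *insn =
        static_cast<const unsigned char *>(store) + offset;
    // Hypothetical encoding: a nonzero first byte marks a compacted
    // instruction.
    return offset + (insn[0] != 0 ? COMPACT_SIZE : FULL_SIZE);
}
```

Keeping the parameter as `void *` lets callers pass any buffer type, while the cast confines the byte-wise arithmetic to one place.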
* i965: Rename next_ip() -> next_offset(). | Matt Turner | 2014-05-24 | 1 | -30/+33
| | | | | | | | | | That we were comparing its return value with offsets should have been a clue. :) Make it take a void *store in preparation for making the function useful elsewhere. Reviewed-by: Eric Anholt <[email protected]>
* i965: Print disassembly after compaction. | Matt Turner | 2014-05-24 | 9 | -283/+198
| | | | Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Make patch_discard_jumps_to_fb_writes return bool. | Matt Turner | 2014-05-24 | 3 | -6/+8
| | | | | | | | ... to tell us whether it emitted any code. Will be used to determine whether we need to skip an annotation for it. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* i965: Add annotation data structure and support code. | Matt Turner | 2014-05-24 | 11 | -9/+183
| | | | | | | | | | | | | | | | Will be used to print disassembly after jump targets are set and instructions are compacted, while still retaining higher-level IR annotations and basic block information. An array of 'struct annotation' will live alongside the generated assembly. The generators will populate the array with their IR annotations, and basic block pointers if the instructions began or ended a basic block. We'll then update the instruction offsets when we compact instructions, and then use the annotations to print the disassembly. Reviewed-by: Eric Anholt <[email protected]>
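The scheme described above, an annotation array kept next to the assembly and re-pointed after compaction, might be sketched as follows. The field names, the 16-byte pre-compaction stride, and the `compacted_offset` lookup table are assumptions for illustration, not the layout used by the driver.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical annotation record; field names are illustrative.
struct annotation {
    int offset;             // byte offset of the first instruction covered
    const char *ir_note;    // higher-level IR text to print alongside it
    bool block_start;       // instruction began a basic block
    bool block_end;         // instruction ended a basic block
};

// `compacted_offset[i]` is the post-compaction byte offset of the i-th
// instruction, where instructions were 16 bytes apart before compaction
// (an assumption of this sketch).
void update_offsets(std::vector<annotation> &annotations,
                    const std::vector<int> &compacted_offset)
{
    for (annotation &a : annotations) {
        assert(a.offset >= 0 && a.offset % 16 == 0);
        const std::size_t slot = static_cast<std::size_t>(a.offset) / 16;
        assert(slot < compacted_offset.size());
        a.offset = compacted_offset[slot];
    }
}
```

After the update, a disassembler can walk the annotations in order and print each IR note or block marker at the correct position in the compacted program.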
* i965/fs+blorp: Remove left over dump_file arguments. | Matt Turner | 2014-05-24 | 5 | -19/+15
| | | | | | | These were used by the blorp unit test programs. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* i965/fs: Don't hardcode DEBUG_WM in generic fs code. | Matt Turner | 2014-05-24 | 6 | -27/+25
| | | | | | | Similar to Paul's commit e9fa3a944 except brw_fs_generator's debug_flag is for DEBUG_WM and DEBUG_BLORP. Reviewed-by: Eric Anholt <[email protected]>
* i965: Pass in start_offset to brw_compact_instructions(). | Matt Turner | 2014-05-24 | 8 | -17/+17
| | | | | | | Lets us avoid recompacting the SIMD8 instructions when we compact the SIMD16 program. Reviewed-by: Eric Anholt <[email protected]>
* i965: Delete unused brw_blorp_blit_test_compile(). | Matt Turner | 2014-05-24 | 1 | -11/+0
* i965/cfg: Make DO instruction begin a basic block. | Matt Turner | 2014-05-24 | 1 | -9/+12
| | | | | | | | | | | | | | | | | The DO instruction doesn't exist on Gen6+. Since before this commit, DO always ended a basic block, if it also happened to start one (e.g., a while loop inside an if statement) the block containing only the DO would actually contain no hardware instructions. Pre-Gen6's WHILE instruction jumps to the instruction following the DO, so strictly speaking we won't be modeling that properly, but I claim there is actually no functional difference. This will simplify an upcoming change where we want to mark the first hardware instruction in the loop as beginning a block, and the last instruction before the loop as ending one. Reviewed-by: Eric Anholt <[email protected]>