path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965/fs: Finally kill struct brw_wm_compile (better known as 'c'). | Kenneth Graunke | 2014-05-18 | 2 | -16/+11
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Stop copying the program key. | Kenneth Graunke | 2014-05-18 | 1 | -6/+4
    We already have a perfectly good copy of the program key, and nobody is going to modify it. The only reason we copied it was because the brw_wm_compile structure embedded the key rather than pointing to it.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Rip struct brw_wm_compile out of the visitors and generators. | Kenneth Graunke | 2014-05-18 | 9 | -28/+33
    Instead, just pass the key and prog_data as separate parameters. This moves it up a level - one step further toward getting rid of it.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Plumb a mem_ctx all the way through the FS compile. | Kenneth Graunke | 2014-05-18 | 8 | -15/+23
    'c' is going away, but we still need a memory context that lives for the duration of the compile.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Use 'c' as the mem_ctx in fs_visitor. | Kenneth Graunke | 2014-05-18 | 1 | -2/+1
    Previously, the memory context situation was a bit of a mess: fs_visitor allocated its own memory context, and freed it in the destructor. However, some data produced by fs_visitor (such as the list of instructions) needs to live beyond when fs_visitor is "done", so the caller can pass it to fs_generator.

    Everything worked out because brw_wm_fs_emit's fs_visitor variables happen to not go out of scope until the end of the function. But that meant that moving the declaration of, say, the SIMD16 fs_visitor instance, could cause everything to explode.

    Using a memory context that exists for the duration of the compile is clearer, and should be equivalent.

    Ultimately, we don't want to use 'c', but this matches the behavior of fs_generator and gen8_fs_generator, so it'll be simple to change later.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
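The lifetime problem described in this message can be sketched in plain C. This is only an illustration under stated assumptions: `mem_ctx`, `ctx_alloc`, `ctx_free_all`, `visitor`, and `compile_sum` are hypothetical stand-ins, not the driver's ralloc API or its fs_visitor class. The point it shows is that data allocated from a context owned by the whole compile stays valid after the visitor itself has gone out of scope.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ralloc-style context: every allocation
 * made from it is released by a single call. */
struct mem_ctx {
   void *blocks[16];
   int count;
};

static void *ctx_alloc(struct mem_ctx *ctx, size_t size)
{
   assert(ctx->count < 16);
   void *p = calloc(1, size);
   assert(p != NULL);
   ctx->blocks[ctx->count++] = p;
   return p;
}

static void ctx_free_all(struct mem_ctx *ctx)
{
   for (int i = 0; i < ctx->count; i++)
      free(ctx->blocks[i]);
   ctx->count = 0;
}

/* Hypothetical visitor: it borrows the caller's context instead of
 * creating and destroying one of its own. */
struct visitor {
   struct mem_ctx *mem_ctx;
   int *instructions;
   int num_instructions;
};

static void visitor_run(struct visitor *v, struct mem_ctx *ctx, int n)
{
   v->mem_ctx = ctx;
   v->instructions = ctx_alloc(ctx, (size_t)n * sizeof(int));
   v->num_instructions = n;
   for (int i = 0; i < n; i++)
      v->instructions[i] = i * 2;
}

/* The compile owns the context, so the visitor's output outlives the
 * visitor and can still be consumed by a later "generator" step. */
static int compile_sum(int n)
{
   assert(n > 0);
   struct mem_ctx ctx = { {0}, 0 };
   int *insts;
   int count;
   {
      struct visitor v;            /* goes out of scope early */
      visitor_run(&v, &ctx, n);
      insts = v.instructions;
      count = v.num_instructions;
   }
   int sum = 0;                    /* the "generator" reads the list */
   for (int i = 0; i < count; i++)
      sum += insts[i];
   ctx_free_all(&ctx);
   return sum;
}
```

With a visitor-owned context, the inner block above would have freed the instruction list before the summing loop ran.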
* i965/fs: Actually free program data on the error path. | Kenneth Graunke | 2014-05-18 | 1 | -1/+3
    We throw away the data generated during compilation on the success path, so we really ought to on the failure path as well. The caller has no access to it anyway, so it's purely leaked.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Replace c->key with a direct reference in the generators. | Kenneth Graunke | 2014-05-18 | 3 | -15/+18
    'c' is going away. This is also a bit shorter. Marking the key pointer as const will also deter people from changing it in these classes, as that's absolutely not OK.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Replace c->key with a direct reference in fs_visitor. | Kenneth Graunke | 2014-05-18 | 5 | -47/+49
    'c' is going away. This is also shorter. Marking the key pointer as const will also deter people from changing it in fs_visitor, as it's absolutely not OK to modify it there.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Replace c->prog_data with a direct reference in the generators. | Kenneth Graunke | 2014-05-18 | 3 | -24/+28
    'c' is going away. This is also a bit shorter.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Replace c->prog_data with a direct reference in fs_visitor. | Kenneth Graunke | 2014-05-18 | 3 | -26/+28
    'c' is going away. This is also a bit shorter.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Move some flags that affect code generation to fs_visitor. | Kenneth Graunke | 2014-05-18 | 5 | -8/+10
    runtime_check_aads_emit isn't actually used currently, but I believe we should be using it on Gen4-5, so I haven't eliminated it. See https://bugs.freedesktop.org/show_bug.cgi?id=78679 for details.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Move payload register info from brw_wm_compile to fs_visitor. | Kenneth Graunke | 2014-05-18 | 6 | -45/+53
    This data is created by fs_visitor and only used when emitting code, so keeping it in fs_visitor makes sense. I decided it would be reasonable to group these all together in a struct, since they're highly related.

    v2: s/nr_payload_regs/payload.num_regs/ in some comments (chrisf).

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Simplify gl_SampleMaskIn handling. | Kenneth Graunke | 2014-05-18 | 3 | -12/+3
    As far as I can tell, there's no point in allocating an extra register and generating a MOV---we can just use the copy provided as part of our thread payload directly. It's already in the right format.

    Of course, there are zero Piglit tests for this. We don't actually ship the extension (GL_ARB_gpu_shader5) that exposes this functionality either.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Rename c->sample_mask_reg to sample_mask_in_reg. | Kenneth Graunke | 2014-05-18 | 2 | -3/+3
    This is actually for gl_SampleMaskIn, which is quite different from gl_SampleMask. Renaming should help avoid confusion.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Move c->last_scratch into fs_visitor. | Kenneth Graunke | 2014-05-18 | 5 | -6/+8
    Nothing outside of fs_visitor uses it, so we may as well keep it internal.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Move total_scratch calculation into fs_visitor::run(). | Kenneth Graunke | 2014-05-18 | 2 | -4/+5
    With this one use gone, c->last_scratch is now only used inside fs_visitor. The rest of the driver uses prog_data->total_scratch.

    We already compute similar prog_data fields in fs_visitor, so this seems reasonable.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Move perf_debug about register spilling to a more obvious spot. | Kenneth Graunke | 2014-05-18 | 2 | -4/+4
    The if (!allocated_without_spills) block is an obvious spot for this performance warning message.

    In the Vec4 backend, scratch is also used for indirect access of temporary arrays. The FS backend doesn't implement that yet, but if it did, this message would be inaccurate, since scratch access wouldn't necessarily mean spilling. Moving it preemptively fixes that.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble. | Kenneth Graunke | 2014-05-18 | 14 | -23/+24
    "Disassemble" is an accurate description of what this function does.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Rename brw_disasm/gen8_disassemble to brw/gen8_disassemble_inst. | Kenneth Graunke | 2014-05-18 | 7 | -8/+11
    We're going to use "disassemble" for the function that disassembles the whole program.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Fix dump_prog_cache to handle compacted instructions. | Kenneth Graunke | 2014-05-18 | 1 | -13/+5
    dump_prog_cache has interpreted compacted instructions as full size instructions, decoding garbage and complaining about invalid values.

    We can just use brw_dump_compile to handle this correctly in less code. The output format changes slightly, but it's still perfectly acceptable.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
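To illustrate why a fixed-stride walk decodes garbage, here is a minimal sketch of stepping through a stream that mixes full and compacted instructions. It is not the driver's code: `count_instructions` is a hypothetical helper, and the sketch assumes that full instructions are 16 bytes, compacted ones are 8 bytes, and that bit 29 of the first dword is the compaction control bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: bit 29 of the first dword marks a
 * compacted (8-byte) instruction; otherwise it is 16 bytes long. */
#define CMPT_CONTROL_BIT (1u << 29)

/* Hypothetical helper: counts instructions by advancing by each
 * instruction's own size rather than by a fixed 16 bytes. */
static int count_instructions(const void *assembly, size_t size_bytes)
{
   const uint8_t *p = assembly;
   size_t offset = 0;
   int count = 0;

   while (offset + 8 <= size_bytes) {
      uint32_t dw0;
      memcpy(&dw0, p + offset, sizeof(dw0));
      offset += (dw0 & CMPT_CONTROL_BIT) ? 8 : 16;
      count++;
   }
   return count;
}
```

For a 40-byte stream laid out as full, compacted, full, this walk visits three instructions, whereas stepping 16 bytes at a time would start its second decode in the middle of the stream's real layout.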
* i965: Use brw_dump_compile for clip, SF, and old GS programs. | Kenneth Graunke | 2014-05-18 | 3 | -13/+3
    Looping over the instructions and calling brw_disasm doesn't handle compacted instructions. In most cases, this hasn't been a problem since we don't compact prior to Sandybridge. However, Sandybridge's transform feedback GS program should already be compacted, and so this ought to fix decoding of that.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* mesa: fix double-freeing of dispatch tables inside glBegin/End. | Brian Paul | 2014-05-16 | 1 | -2/+2
    We allocate dispatch tables for BeginEnd and OutsideBeginEnd. But when we destroy the context we were freeing the BeginEnd and Exec tables. If Exec==BeginEnd we did a double-free. This would happen if the context was destroyed while inside a glBegin/End pair.

    Now free the BeginEnd and OutsideBeginEnd pointers.

    Cc: "10.1", "10.2" <[email protected]>
    Reviewed-by: Michel Dänzer <[email protected]>
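The ownership rule behind this fix can be sketched as follows. The types and function names here (`table`, `context`, `context_init`, `context_begin`, `context_destroy`) are hypothetical simplifications, not Mesa's actual context structure. The sketch assumes two owned tables plus one alias pointer, and frees only through the owning pointers so the alias can never cause a second free.

```c
#include <assert.h>
#include <stdlib.h>

struct table { int slots[4]; };

/* Hypothetical, simplified context: two owned tables and an alias. */
struct context {
   struct table *begin_end;          /* owned */
   struct table *outside_begin_end;  /* owned */
   struct table *exec;               /* alias of one of the two above */
};

static void context_init(struct context *ctx)
{
   ctx->begin_end = calloc(1, sizeof(struct table));
   ctx->outside_begin_end = calloc(1, sizeof(struct table));
   assert(ctx->begin_end && ctx->outside_begin_end);
   ctx->exec = ctx->outside_begin_end;
}

/* Entering a begin/end pair switches the alias to the other table. */
static void context_begin(struct context *ctx)
{
   ctx->exec = ctx->begin_end;
}

/* Free through the owning pointers only. Freeing begin_end and exec
 * instead would free the same block twice while inside begin/end.
 * Returns the number of tables released. */
static int context_destroy(struct context *ctx)
{
   int freed = 0;
   free(ctx->begin_end);
   freed++;
   free(ctx->outside_begin_end);
   freed++;
   ctx->begin_end = NULL;
   ctx->outside_begin_end = NULL;
   ctx->exec = NULL;
   return freed;
}
```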
* i965: Use binary literals counter select. | Matt Turner | 2014-05-15 | 1 | -2/+2
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl_to_tgsi: Make sure the 'shader' member is always initialized | Michel Dänzer | 2014-05-16 | 1 | -0/+3
    Fixes the valgrind report below and random crashes with piglit on radeonsi.

    ==30005== Conditional jump or move depends on uninitialised value(s)
    ==30005==    at 0xB13584E: st_translate_program (st_glsl_to_tgsi.cpp:5100)
    ==30005==    by 0xB14698B: st_translate_fragment_program (st_program.c:747)
    ==30005==    by 0xB14777D: st_get_fp_variant (st_program.c:824)
    ==30005==    by 0xB11219C: get_color_fp_variant (st_cb_drawpixels.c:1042)
    ==30005==    by 0xB1131AE: st_DrawPixels (st_cb_drawpixels.c:1154)
    ==30005==    by 0xAFF8806: _mesa_DrawPixels (drawpix.c:162)
    ==30005==    by 0x4EB86DB: stub_glDrawPixels (generated_dispatch.c:6640)
    ==30005==    by 0x4F1DF08: piglit_visualize_image (piglit-util-gl.c:1574)
    ==30005==    by 0x40691D: draw_image_to_window_system_fb(int, bool) (draw-buffers-common.cpp:733)
    ==30005==    by 0x406C8B: draw_reference_image(bool, bool) (draw-buffers-common.cpp:854)
    ==30005==    by 0x40722A: piglit_display (alpha-to-coverage-dual-src-blend.cpp:117)
    ==30005==    by 0x4EA7168: run_test (piglit_fbo_framework.c:52)

    Cc: "10.1 10.2" <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* i965/gen8: Make disassembly function match brw's signature. | Matt Turner | 2014-05-15 | 4 | -9/+12
    gen8_dump_compile will be called indirectly by common code used by generations before and after the gen8 instruction format change.

    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Pass brw_context and assembly separately to brw_dump_compile. | Matt Turner | 2014-05-15 | 6 | -14/+12
    brw_dump_compile will be called indirectly by common code used by generations before and after the gen8 instruction format change.

    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Pull brw_compact_instructions() out of brw_get_program(). | Matt Turner | 2014-05-15 | 7 | -9/+10
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/disasm: Align send instruction meta-information with dst. | Matt Turner | 2014-05-15 | 1 | -0/+1
    Has been misaligned since we added instruction offset prefixes.

    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/disasm: Disassemble the compaction control bit. | Matt Turner | 2014-05-15 | 9 | -10/+18
    brw_disasm doesn't disassemble compacted instructions, so we uncompact them before disassembling, which would unset the compaction control bit. Instead pass it as a separate argument.

    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cfg: Embed exec_node in bblock_link. | Matt Turner | 2014-05-15 | 5 | -23/+25
    In order to remove bblock_link's inheritance of exec_node. Also makes linked list walk code much nicer.

    Acked-by: Eric Anholt <[email protected]>
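Embedding a list node instead of inheriting from it is what lets a structure be declared in plain C. A minimal sketch of the pattern, under stated assumptions: `node`, `block_link`, `LINK_FROM_NODE`, and `sum_block_nums` are hypothetical names for illustration and do not reproduce exec_node or bblock_link.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal list node. */
struct node {
   struct node *next;
};

/* The node is a member rather than a base class, so this declaration
 * is valid C as well as C++. */
struct block_link {
   struct node link;
   int block_num;
};

/* Recover the enclosing block_link from a pointer to its member. */
#define LINK_FROM_NODE(n) \
   ((struct block_link *)((char *)(n) - offsetof(struct block_link, link)))

/* Walk the list through the embedded nodes and sum the payloads. */
static int sum_block_nums(struct node *head)
{
   int sum = 0;
   for (struct node *n = head; n != NULL; n = n->next)
      sum += LINK_FROM_NODE(n)->block_num;
   return sum;
}
```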
* i965/cfg: Make brw_cfg.h closer to C-includable. | Matt Turner | 2014-05-15 | 1 | -13/+23
    Only bblock_link's inheritance left.

    Acked-by: Eric Anholt <[email protected]>
* i965/cfg: Protect brw_cfg.h from multiple inclusion. | Matt Turner | 2014-05-15 | 1 | -0/+6
    Acked-by: Eric Anholt <[email protected]>
* i965/fb: Use meta path for stencil up/downsampling | Topi Pohjolainen | 2014-05-15 | 1 | -1/+8
    Cc: "10.2" <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/meta: Stencil blit for miptree updownsampling | Topi Pohjolainen | 2014-05-15 | 2 | -0/+38
    Cc: "10.2" <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fb: Use meta path for stencil blits | Topi Pohjolainen | 2014-05-15 | 1 | -0/+9
    This is effective only on gen8 for now as previous generations still go through blorp.

    Cc: "10.2" <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/meta: Stencil blits | Topi Pohjolainen | 2014-05-15 | 3 | -0/+497
    v2: Create the intel renderbuffer with level hardcoded to zero instead of overriding it in the surface state configuration. Also moved the dimension adjustments for tiling, mip level, msaa into the render buffer creation. Finally prepares for another blit path needed for miptree updownsampling.

    v3 (Ken): Dropped unnecessary memory context for "ralloc_asprintf()"

    Cc: "10.2" <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Extend brw_get_rb_for_first_slice() for specified level/layer | Topi Pohjolainen | 2014-05-15 | 2 | -7/+29
    v2: Configure stencil directly for final dimensions instead of adjusting bit by bit for tiling, mip level and msaa.

    v3 (Ken): Used non-static constant for horizontal alignment

    Cc: "10.2" <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen8: Surface state overriding for stencil | Topi Pohjolainen | 2014-05-15 | 1 | -13/+21
    v2: Allow hardware to offset accesses to individual layers. Also leave the mip-level overriding for the creator of the intel renderbuffer to handle. Merged with "i965/gen8: Allow stencil buffers to be configured as single sampled"

    Ken: I left the "_mesa_problem()" still in place. I think it is clearer to remove it in a separate patch.

    Cc: "10.2" <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/wm: Surface state overrides for configuring w-tiled as y-tiled | Topi Pohjolainen | 2014-05-15 | 2 | -0/+30
    v2: Use intel_mipmap_tree::total_width in order to get correct alignment automatically. Also use "mt->total_height / mt->physical_depth0" as surface height allowing hardware to offset to correct slice.

    Cc: "10.2" <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965 meta up/downsample: Fix renderbuffer _BaseFormat | Jordan Justen | 2014-05-15 | 1 | -1/+2
    mt->format is of type mesa_format, and therefore can't be used with _mesa_base_fbo_format which requires a GLenum input.

    On gen8, this fixes various piglit fbo-depthstencil tests with samples > 1.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: "10.2" <[email protected]>
* i965: Delete current_insn() function. | Matt Turner | 2014-05-15 | 2 | -7/+2
* i965: Remove blorp unit tests. | Matt Turner | 2014-05-15 | 3 | -1099/+1
    They've served their purpose (in transitioning blorp to using fs_generator) and now they just necessitate large amounts of manual labor to regenerate if the disassembler changes.

    Reviewed-by: Topi Pohjolainen <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* mesa/st: fix number of ubos being declared in a shader | Roland Scheidegger | 2014-05-15 | 1 | -3/+5
    Previously the code used the total number of ubos being declared in the linked program (so the ubos of all shaders combined). Use the number from the particular shader instead.

    This fixes an assertion failure with piglit arb_uniform_buffer_object-maxblocks seen in llvmpipe since 8a9f5ecdb116d0449d63f7b94efbfa8b205d826f as it now emits code for each declared buffer, not just the ones actually used.

    CC: "10.1 10.2" <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* mesa/st: provide native integers implementation of ir_unop_any | Ilia Mirkin | 2014-05-14 | 1 | -24/+76
    Previously, ir_unop_any was implemented via a dot-product call, which uses floating point multiplication and addition. The multiplication was completely pointless, and the addition can just as well be done with an or. Since we know that the inputs are booleans, they must already be in canonical 0/~0 format, and the final SNE can also be avoided.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
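The reasoning in this message can be checked with a small scalar sketch. It does not reproduce the TGSI code the state tracker emits: `any4_via_dot` and `any4` are hypothetical helper names, and the sketch assumes float booleans are 0.0/1.0 in the old scheme and integer booleans are canonical 0/~0 in the new one.

```c
#include <assert.h>
#include <stdint.h>

/* Old scheme, sketched: a dot product of the vector with itself,
 * followed by a "set if not equal to zero" style comparison. */
static uint32_t any4_via_dot(const float v[4])
{
   float dot = v[0] * v[0] + v[1] * v[1] + v[2] * v[2] + v[3] * v[3];
   return dot != 0.0f ? ~0u : 0u;
}

/* Native-integer scheme, sketched: with canonical 0/~0 inputs, a plain
 * bitwise OR already yields a canonical 0/~0 result, so neither the
 * multiplications nor the final comparison are needed. */
static uint32_t any4(const uint32_t v[4])
{
   return v[0] | v[1] | v[2] | v[3];
}
```

The OR form is only equivalent because the inputs are canonical; arbitrary nonzero integers would OR to a nonzero value that is not necessarily ~0.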
* i965: Reformat brw_set_src1 so it can be easily found with grep. | Matt Turner | 2014-05-13 | 1 | -3/+4
* i965: fix size assert for gen7 in brw_init_compaction_tables() | Samuel Iglesias Gonsalvez | 2014-05-13 | 1 | -4/+4
    It should compare with its own size.

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
* i965: Relax accumulator dependency scheduling on Gen < 6 | Iago Toral Quiroga | 2014-05-13 | 3 | -59/+36
    Many instructions implicitly update the accumulator on Gen < 6. The instruction scheduling code just calls add_barrier_deps() for each accumulator access on these platforms, but a large class of operations doesn't actually update the accumulator -- mostly move and logical instructions. Teaching the scheduling code about this would allow more flexibility to schedule instructions.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740
    Reviewed-by: Matt Turner <[email protected]>
* mesa: Dump ARB_vp/fp source and IR when MESA_GLSL=dump. | Kenneth Graunke | 2014-05-13 | 1 | -1/+26
    As far as I can tell, Mesa hasn't had a convenient way to dump ARB_vp/fp source until now. Using MESA_GLSL=dump is convenient, since it means you can use a single environment variable to dump a program's shaders, no matter which language they're written in.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage. | Kenneth Graunke | 2014-05-13 | 1 | -1/+1
    The point of copytexsubimage_using_blit_framebuffer is to use a hardware accelerated BlitFramebuffer path. If that fails, we shouldn't do a swrast blit---we should try our CTSI fallback code.

    This is especially important for i965 and GLES, where we don't even create a swrast context.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Cc: "10.2" <[email protected]>
* i965/gen8: Set depth extent field | Jordan Justen | 2014-05-13 | 1 | -1/+1
    The depth extent field is used to limit the allowed slice range that can be rendered to. With the previous setting, only slice 0 could be rendered.

    This fixes piglit amd_vertex_shader_layer-layered-depth-texture-render.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>