path: root/src/mesa/drivers/dri/i965
Commit message | Author | Age | Files | Lines
* i965: Remove blorp unit tests. | Matt Turner | 2014-05-15 | 3 | -1099/+1
    They've served their purpose (in transitioning blorp to using fs_generator) and now they just necessitate large amounts of manual labor to regenerate if the disassembler changes.

    Reviewed-by: Topi Pohjolainen <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* i965: Reformat brw_set_src1 so it can be easily found with grep. | Matt Turner | 2014-05-13 | 1 | -3/+4
* i965: fix size assert for gen7 in brw_init_compaction_tables() | Samuel Iglesias Gonsalvez | 2014-05-13 | 1 | -4/+4
    It should compare with its own size.

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
* i965: Relax accumulator dependency scheduling on Gen < 6 | Iago Toral Quiroga | 2014-05-13 | 3 | -59/+36
    Many instructions implicitly update the accumulator on Gen < 6. The instruction scheduling code just calls add_barrier_deps() for each accumulator access on these platforms, but a large class of operations don't actually update the accumulator -- mostly move and logical instructions. Teaching the scheduling code about this would allow more flexibility to schedule instructions.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740
    Reviewed-by: Matt Turner <[email protected]>
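The distinction this commit message draws can be sketched as a small predicate. This is only an illustration: the opcode enum and the function name below are hypothetical, the list of opcodes is not the driver's, and the rule that Gen6 and later never update the accumulator implicitly is an assumption read from the message rather than from the hardware documentation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode list for illustration only; not the driver's enum. */
enum sketch_opcode {
   SKETCH_OP_MOV,
   SKETCH_OP_AND,
   SKETCH_OP_OR,
   SKETCH_OP_XOR,
   SKETCH_OP_NOT,
   SKETCH_OP_ADD,
   SKETCH_OP_MUL,
};

/* Returns true if the scheduler should treat the instruction as an
 * implicit accumulator write and therefore add a dependency for it. */
static bool
sketch_writes_accumulator_implicitly(int gen, enum sketch_opcode op)
{
   assert(gen >= 4);

   /* Assumption: from Gen6 on, accumulator writes are explicit. */
   if (gen >= 6)
      return false;

   switch (op) {
   case SKETCH_OP_MOV:
   case SKETCH_OP_AND:
   case SKETCH_OP_OR:
   case SKETCH_OP_XOR:
   case SKETCH_OP_NOT:
      /* Move and logical instructions: no implicit update. */
      return false;
   default:
      /* Everything else is treated conservatively as a write. */
      return true;
   }
}
```

With a predicate of this shape, a scheduler would only need a barrier or dependency for instructions where it returns true, instead of for every instruction on older hardware.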
* i965/gen8: Set depth extent field | Jordan Justen | 2014-05-13 | 1 | -1/+1
    The depth extent field is used to limit the allowed slice range that can be rendered to. With the previous setting, only slice 0 could be rendered.

    This fixes piglit amd_vertex_shader_layer-layered-depth-texture-render.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/gen8 depth: Set depth size based on LOD0 for 3D textures | Jordan Justen | 2014-05-13 | 1 | -2/+2
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/gen7 depth: Set depth size based on LOD0 for 3D textures | Jordan Justen | 2014-05-13 | 1 | -2/+2
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/gen8 renderbuffer: Set depth size based on LOD0 for 3D textures | Jordan Justen | 2014-05-13 | 1 | -1/+1
    Fixes piglit's 'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped'

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965/gen7 renderbuffer: Set depth size based on LOD0 for 3D textures | Jordan Justen | 2014-05-13 | 1 | -1/+1
    If blorp is disabled for color clears, then piglit's 'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped' will fail.

    Currently, gen8 fails similarly on this test because gen8 does not use blorp.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965: Avoid redundant call to brw_merge_inputs() in brw_try_draw_prims() | Iago Toral Quiroga | 2014-05-13 | 1 | -7/+6
    We always call brw_merge_inputs() right before looping over the primitives but this can be called inside the loop for each primitive too. In the case we do it for the first primitive the call is redundant and can be skipped.

    Reviewed-by: Eric Anholt <[email protected]>
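The control flow described here can be sketched roughly as follows. The `sketch_prim` struct, its `inputs_changed` flag, and the counting function are hypothetical stand-ins; the real condition that triggers a merge inside the loop is not stated in the commit message, so the flag is only a placeholder for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-primitive record; the flag stands in for whatever
 * per-primitive state change makes the driver merge inputs again. */
struct sketch_prim {
   bool inputs_changed;
};

/* Counts how many input merges a draw would perform when the merge for
 * the first primitive is skipped inside the loop. */
static int
sketch_count_merges(const struct sketch_prim *prims, size_t nr_prims)
{
   assert(prims != NULL || nr_prims == 0);

   /* One unconditional merge happens right before the loop. */
   int merges = 1;

   for (size_t i = 0; i < nr_prims; i++) {
      /* For i == 0 the merge above already covers this primitive, so a
       * second merge would be redundant. */
      if (i > 0 && prims[i].inputs_changed)
         merges++;
   }
   return merges;
}
```

Without the `i > 0` guard, a first primitive that reports changed inputs would trigger a second merge immediately after the one done before the loop.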
* i965: Stop doing remapping of "special" regs. | Eric Anholt | 2014-05-12 | 1 | -37/+0
    Now that we aren't using pixel_[xy] in live variables, nothing is looking at these regs after the visitor stage.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Generalize the pixel_x/y workaround for all UW types. | Eric Anholt | 2014-05-12 | 1 | -4/+4
    This is the only case where a fs_reg in brw_fs_visitor is used during optimization/code generation, and it meant that optimizations had to be careful to not move pixel_x/y's register number without updating it.

    Additionally, it turns out we had a couple of other UW values that weren't getting this treatment (like gl_SampleID), so this more general fix is probably a good idea (though I wasn't able to replicate problems with either pixel_[xy]'s values or gl_SampleID, even when telling the register allocator to reuse registers immediately)

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move has_hiz from the slice to the level. | Eric Anholt | 2014-05-12 | 5 | -30/+25
    The value depends only on the level, so no need to store the bool per slice. Shrinks intel_mipmap_slice from 24 bytes to 16, while slotting into an existing hole in intel_mipmap_level.

    Reviewed-by: Chad Versace <[email protected]>
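The size effect mentioned above comes from struct padding, which can be illustrated with a sketch. The structs below are hypothetical and only loosely modeled on the names in the message; their fields are assumptions, and the sizes in the comments hold for a typical LP64 ABI (8-byte pointers), not for every platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" slice: a trailing bool forces 7 bytes of tail
 * padding after the 8-byte pointer (24 bytes on LP64). */
struct sketch_slice_before {
   uint32_t x_offset;
   uint32_t y_offset;
   void *map;
   bool has_hiz;
};

/* Hypothetical "after" slice: without the bool it packs into 16 bytes
 * on LP64. */
struct sketch_slice_after {
   uint32_t x_offset;
   uint32_t y_offset;
   void *map;
};

/* Hypothetical level: three 32-bit fields leave a 4-byte hole before
 * the pointer on LP64. */
struct sketch_level_before {
   uint32_t level_x;
   uint32_t level_y;
   uint32_t depth;
   struct sketch_slice_after *slice;
};

/* The bool lands in that hole, so the level does not grow on LP64. */
struct sketch_level_after {
   uint32_t level_x;
   uint32_t level_y;
   uint32_t depth;
   bool has_hiz;
   struct sketch_slice_after *slice;
};

/* Bytes saved per slice by moving the flag up to the level. */
static size_t
sketch_slice_bytes_saved(void)
{
   assert(sizeof(struct sketch_slice_after) <=
          sizeof(struct sketch_slice_before));
   return sizeof(struct sketch_slice_before) -
          sizeof(struct sketch_slice_after);
}
```

Since a miptree holds many slices per level, removing a padded field from the per-slice struct saves memory for every slice, while the per-level struct stays the same size under these assumptions.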
* i965/blorp: Expose coordinate scissoring and mirroring | Topi Pohjolainen | 2014-05-12 | 4 | -118/+213
    Cc: "10.2" <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen8: Use helper variables for surface parameters | Topi Pohjolainen | 2014-05-12 | 1 | -4/+8
    Cc: "10.2" <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* Revert "i965: Fix depth (array slices) computation for 1D_ARRAY render targets." | Kenneth Graunke | 2014-05-09 | 2 | -5/+0
    This reverts commit e6967270c75a5b669152127bb7a746d55f4407a6.

    Chris Forbes pointed out that this is broken for texture views which restrict the number of slices. He committed a better fix which makes this unnecessary.

    Cc: "10.2" <[email protected]>
* i965: Fix GPU hangs on Broadwell in shaders with some control flow. | Kenneth Graunke | 2014-05-09 | 1 | -7/+7
    According to the documentation, we need to set the source 0 register type to IMM for flow control instructions that have both JIP and UIP.

    Fixes GPU hangs in approximately 10 Piglit tests, 5 es3conform tests, Unigine Crypt, a WebGL raytracer demo, and several Steam titles.

    Cc: "10.2" <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75478
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75878
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76939
    Signed-off-by: Kenneth Graunke <[email protected]>
    Tested-by: Topi Pohjolainen <[email protected]>
    Tested-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/Gen8: Set up layer constraints properly for depth buffers | Chris Forbes | 2014-05-09 | 1 | -9/+6
    Same issues as the previous commit fixed for Gen7:
    - Bogus physical->logical layer conversion; depth/stencil surfaces are still IMS layout on Gen8.
    - mt_layer ignored in layered rendering case, which breaks handling of views with MinLayer.
    - Render target array extent not set correctly for arrays.

    I'm not able to test this one since I can't get a Broadwell yet, but it's the same set of fixes as for Gen7.

    V2: Restore the MAX2() to account for zero depth/layer_count.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/Gen7: Set up layer constraints properly for depth buffers | Chris Forbes | 2014-05-09 | 1 | -9/+6
    Again, a few problems:
    - Layered attachments did not honor MinLayer.
    - Non-layered MSAA attachments rendered to the wrong layer due to dividing by the layer count. All depth buffers use the IMS layout, so the physical layer count == logical layer count.
    - Layered attachments were not limited to irb->layer_count, so we could render off the end of the texture.

    V2: Restore the MAX2() to account for zero depth/layer_count.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/Gen8: Set up layer constraints properly for renderbuffers | Chris Forbes | 2014-05-09 | 1 | -10/+5
    Fixing the same issues the previous commit does for Gen7. Note that I can't test this one, since I don't have a Broadwell.

    V2: Restore the MAX2() to account for zero depth/layer_count.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/Gen7: Set up layer constraints properly for renderbuffers | Chris Forbes | 2014-05-09 | 1 | -10/+7
    There were a few problems here, which mostly just broke layered rendering into a view:
    - Render target view extent was always set to be == depth. This is benign for non-layered-rendering, but allows writes off the end of the render target for layered rendering, which ends badly.
    - Layered rendering did not honor the mt_layer setting, so would not properly handle MinLayer being set on a view.

    V2: Restore the MAX2() to account for zero depth/layer_count.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix typo in assert message | Chris Forbes | 2014-05-09 | 1 | -1/+1
    Signed-off-by: Chris Forbes <[email protected]>
* i965: Fix depth (array slices) computation for 1D_ARRAY render targets. | Kenneth Graunke | 2014-05-07 | 2 | -0/+5
    1D array targets store the number of slices in the Height field.

    Fixes Piglit's spec/!OpenGL 3.2/layered-rendering/clear-color-all-types 1d_array single_level, at least when used with Meta clears.

    Cc: "10.2 10.1 10.0" <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* i965: Enable GL_ARB_texture_view on Broadwell. | Kenneth Graunke | 2014-05-07 | 2 | -12/+21
    This is a port of commit c9c08867ed07ceb10b67ffac5f0a33812710a5e8. A tiny bit of extra work was necessary to not break stencil texturing.

    Cc: "10.2" <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965: Always intel_prepare_render() after invalidating front buffers. | Kenneth Graunke | 2014-05-06 | 1 | -0/+2
    Fixes glean/texture_srgb, which hit recursive-flush prevention assertions in vbo_exec_FlushVertices.

    This probably hurts the performance of front buffer rendering, but very few people in their right mind do front buffer rendering.

    Fixes Glean's texture_srgb test.

    Cc: "10.2" <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
    Acked-by: Anuj Phogat <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Set miptree target field when creating from a BO. | Kenneth Graunke | 2014-05-02 | 1 | -0/+1
    Prior to commit 8435b60a3577d2d905eae189cd7e770500177e99, the region equivalent of this function called intel_miptree_create_layout, which set mt->target to target. With that commit, it no longer copied target.

    Piglit's ext_image_dma_buf_import-sample_[xa]rgb8888 tests would then hit an assertion failure, where image->TexObject->Target was GL_TEXTURE_EXTERNAL_OES, and mt->target was GL_TEXTURE_2D.

    Copying the target fixes this assertion failure.

    Cc: "10.2" <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Move push constant state packets to push constant update time. | Eric Anholt | 2014-05-02 | 8 | -46/+42
    -0.553779% +/- 0.423394% effect on cairo-perf-trace runtime on glamor (n=612)

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Merge gen8_upload_constant_state into gen7_upload_constant_state. | Eric Anholt | 2014-05-02 | 5 | -34/+16
    The two paths are really similar, and the extra conditionals will be dwarfed by the cost of the actual upload.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Refactor gen7_upload_constant_state to look more like gen8. | Eric Anholt | 2014-05-02 | 1 | -25/+15
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop unnecessary state flag for units on NEW_BINDING_TABLE. | Eric Anholt | 2014-05-02 | 6 | -6/+0
    Commit 30259856a8a82a55c030df1ad052e505c61144bc moved the state packets to table generation time, but forgot to make this change. Apparently the performance win there was about not reemitting the table pointers on unrelated state changes.

    No performance difference on cairo on glamor (n=118).

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen7+: Move sampler state packets to the stage sampler state table update. | Eric Anholt | 2014-05-02 | 9 | -44/+24
    Now that we have the stage state coming into our setup of sampler states, it's easy to drop an identifier into it of which stage the stage_state is, and then look up which packet to emit in a little table.

    No performance difference on cairo on glamor (n=492).

    v2: Don't forget to do the workaround flush on IVB.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6: Don't update unit state when samplers change. | Eric Anholt | 2014-05-02 | 2 | -3/+2
    There's no remaining dependency between these two packets that I can find.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop a NEW_SAMPLER annotation for use of sampler_count. | Eric Anholt | 2014-05-02 | 3 | -3/+0
    The sampler count is set up from the gl_program at draw time, not at sampler change time.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Simplify sampler setup by passing the stage state. | Eric Anholt | 2014-05-02 | 3 | -29/+13
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make batch dumping go to stderr, too. | Eric Anholt | 2014-05-02 | 1 | -0/+1
    All our other debug goes there.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix a stale comment reference | Eric Anholt | 2014-05-02 | 1 | -1/+1
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Enable INTEL_performance_query for Gen5+. | Petri Latvala | 2014-05-02 | 1 | -1/+3
    Signed-off-by: Petri Latvala <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: Remove support for desktop OpenGL GL_EXT_separate_shader_objects | Ian Romanick | 2014-05-02 | 1 | -1/+0
    I don't know of any applications that actually use it. Now that Mesa supports GL_ARB_separate_shader_objects in all drivers, this extension is just cruft.

    The entrypoints for the extension remain in the XML. This is done so that a new libGL will continue to provide dispatch support for old drivers that try to expose this extension.

    Future patches will add OpenGL ES GL_EXT_separate_shader_objects, but that's a different thing.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix the file comment for intel_image.h | Eric Anholt | 2014-05-01 | 1 | -5/+8
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* i965: Rename intel_regions.h to something more appropriate now. | Eric Anholt | 2014-05-01 | 6 | -6/+6
    We had the EGLimage structure lying around in intel_regions.h, but now it's the only thing left in the file.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* i965: Delete the intel_regions.c code. | Eric Anholt | 2014-05-01 | 22 | -256/+4
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* i965: Drop region usage from DRI2 winsys-allocated buffers. | Eric Anholt | 2014-05-01 | 1 | -13/+17
    v2: Fix bad pointer on unreference (caught by Chad)

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
* i965: Drop a funny assert about mt pitch. | Eric Anholt | 2014-05-01 | 1 | -1/+0
    I slipped this in in the region->pitch change from pixels to bytes, but I don't see any reason for it any more -- the libdrm code doesn't appear to divide pitch by a cpp.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* i965: Fix intel_bufferobj_buffer range for blit drawpixels. | Eric Anholt | 2014-05-01 | 1 | -3/+2
    If the stride wasn't width*cpp, we wouldn't track how much of the src is busy, and allow a subdata into the end to proceed unsynchronized.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
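Why the stride matters for the tracked range can be shown with a little arithmetic. The helper below is a hypothetical sketch, not the driver's computation: it returns the number of bytes from the first to the last byte touched when reading a rectangle out of a buffer whose rows are `stride` bytes apart.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes spanned by a width x height rectangle of cpp-byte pixels stored
 * with the given row stride. Hypothetical helper for illustration. */
static size_t
sketch_rect_span_bytes(size_t width, size_t height, size_t cpp, size_t stride)
{
   assert(stride >= width * cpp);

   if (width == 0 || height == 0)
      return 0;

   /* Every row but the last contributes a full stride; the last row
    * only contributes the pixels actually read. */
   return (height - 1) * stride + width * cpp;
}
```

For a 4x3 rectangle of 4-byte pixels with a 32-byte stride, the span is 80 bytes, while width * height * cpp is only 48. Marking just 48 bytes as busy would leave the tail of the source untracked, which matches the unsynchronized subdata case the message describes.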
* i965: Drop use of intel_region from miptrees. | Eric Anholt | 2014-05-01 | 20 | -256/+242
    Note: region->width/height used to reflect the total_width/height padding of separate stencil, though mt->total_width didn't. region->width/height was being used in EGL images, where the padded value would have been the wrong one, so I converted them to use rb->Width/Height.

    v2: Drop debug printf that slipped in (caught by Ken)

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* i965: Replace the region in DRIimage with just a BO pointer and stride. | Eric Anholt | 2014-05-01 | 6 | -162/+83
    Regions aren't refcounted safely for multithreaded applications, and they're not terribly useful wrappers of a BO, so I'm trying to remove them.

    Even the stride I added here could probably be reduced to use of an existing field in the __DRIimageRec, but I want this to be as mechanical of a change as possible.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* i965: Make intel_set_texture_region just take a BO and pitch. | Eric Anholt | 2014-05-01 | 1 | -29/+27
    I want to do this to get the region removed from DRI images. However, it does mean that we won't share the intel_region between the rb and the texture for texture_from_pixmap. I think that's fine.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* i965: Stop making a pointless region for DRI2 to just throw it away. | Eric Anholt | 2014-05-01 | 3 | -37/+43
    I noticed that we were doing this while changing the DRI3 path to not use regions, which involved changing the signature of intel_update_winsys_renderbuffer_miptree() this way.

    v2: Replace my comment with Chad's version.

    Reviewed-by: Kenneth Graunke <[email protected]> (v1)
    Reviewed-by: Kristian Høgsberg <[email protected]> (v1)
    Reviewed-by: Chad Versace <[email protected]>
* i965: Drop the global GEM name from regions. | Eric Anholt | 2014-05-01 | 5 | -25/+12
    Once a buffer has been named, drm_intel_bo_flink() is just a getter.

    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* i965: Drop the tiling argument to intel_miptree_create_for_bo. | Eric Anholt | 2014-05-01 | 6 | -11/+10
    The drm function to get the tiling is just a getter storing the two pointers, so we don't need to go out of our way to avoid it.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>