path: root/src/mesa/drivers/dri/i965
Commit message | Author | Age | Files | Lines
* i965/dri: Combine declaration and assignment in intelCreateBuffer | Chad Versace | 2017-05-30 | 1 | -2/+1
| | | | | | Trivial cleanup. Reviewed-by: Tapani Pälli <[email protected]>
* i965/dri: Rewrite comment for intelCreateBuffer | Chad Versace | 2017-05-30 | 1 | -1/+5
| | | | | | | The old comment pinned this function to X11 windows. In reality, this function serves more than X11 and more than just windows. Reviewed-by: Tapani Pälli <[email protected]>
* i965: Always scissor on Gen4-5 instead of disabling guardband. | Kenneth Graunke | 2017-05-29 | 2 | -28/+13
| | | | | | | | See commit ece0e535a44c228dd994861592deb155c14740d8. This makes Gen4-5 follow the behavior we use on Gen6+. It seems to have worked out there. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Unify Gen4-5 and Gen6 SF_VIEWPORT/CLIP_VIEWPORT code. | Kenneth Graunke | 2017-05-29 | 3 | -114/+9
| | | | | | | This brings the improved guardbanding we implemented on Gen6+ back to the older Gen4-5 code. It also deletes piles of code. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make a set_scissor_bits helper function. | Kenneth Graunke | 2017-05-29 | 1 | -33/+40
| | | | | | | | Gen4-5 include a single SCISSOR_RECT in SF_VIEWPORT. Making a helper function will allow us to reuse this code for Gen4-5. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use GENX(packet_length) rather than hardcoded dword counts. | Kenneth Graunke | 2017-05-29 | 1 | -9/+12
| | | | | | This is clearer and less likely to break in the future. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move the scissoring code up near the viewport code. | Kenneth Graunke | 2017-05-29 | 1 | -86/+86
| | | | | | | | These are fairly related. Gen4-5 combine the scissor rectangle and SF_VIEWPORT. Co-locating them will allow me to avoid forward declarations of helper functions in a few patches. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Replace brw->gen and devinfo->gen with GEN_GEN. | Kenneth Graunke | 2017-05-29 | 1 | -6/+4
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rework Sandybridge 3DSTATE_VIEWPORT_STATE_POINTERS. | Kenneth Graunke | 2017-05-29 | 1 | -33/+15
| | | | | | | | | | | | | On Gen7+ we emit 3DSTATE_VIEWPORT_STATE_POINTERS_{SF_CL,CC} when emitting a new viewport. This patch makes us take the same approach on Sandybridge - but because we have a combined command, we just set the appropriate "change" bits. This eliminates an atom, some dirty flagging, and some brw->*.vp_offset writes. It does mean we'll emit two 3DSTATE_VIEWPORT_STATE_POINTERS instead of one if both change, but that's probably fine. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Port CC_VIEWPORT to genxml. | Kenneth Graunke | 2017-05-29 | 3 | -52/+55
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/copy_image: Use the blitter on gen5 | Jason Ekstrand | 2017-05-26 | 1 | -1/+1
| | | | | | | This was just an accidental typo in the refactoring. The intention was to try the blitter on gen4-5, not just gen4. Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Support copyteximage on gen4-5 | Jason Ekstrand | 2017-05-26 | 1 | -4/+7
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use blorp for CopyImageSubData on gen4-5 | Jason Ekstrand | 2017-05-26 | 1 | -123/+17
| | | | | | | | We keep the blit path because it's probably faster when it works. However, now that we can use blorp, we can delete that nasty CPU fall-back path. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Round copy size to the nearest block in intel_miptree_copy | Jason Ekstrand | 2017-05-26 | 1 | -2/+2
| | | | | | | | | | | The width and height of the copy don't have to be aligned to the block size if they specify the right or bottom edges of the image. (See also the comment and asserts right above). We need to round them up when we do the division in order to get it 100% right. Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "17.0 17.1" <[email protected]>
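The rounding described in the commit above can be sketched as a round-up division. This is a minimal illustration, not the code from intel_miptree_copy; the helper name blocks_covering is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of compression blocks needed to cover
 * `pixels` texels when each block spans `block_dim` texels. Rounding up
 * keeps the partial block at the right or bottom edge of the image. */
static uint32_t
blocks_covering(uint32_t pixels, uint32_t block_dim)
{
   assert(block_dim > 0);
   return (pixels + block_dim - 1) / block_dim;
}
```

With 4-texel blocks, a 10-texel-wide copy touches three blocks; plain truncating division would report two and drop the edge block.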
* i965: Use BLORP for color clears on gen4-5 | Jason Ekstrand | 2017-05-26 | 1 | -2/+1
| | | | | | We don't support replicated data clears yet. Those take a bit more work and enabling replicated data clears in its own commit is probably better for bisectability anyway. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use blorp for color blits on gen4-5 | Jason Ekstrand | 2017-05-26 | 2 | -53/+30
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add blorp support for gen4-5 | Jason Ekstrand | 2017-05-26 | 6 | -3/+227
| | | | | | | | | | Due to complications with things such as URB setup on gen4-5, it's easier to keep gen4 support in blorp completely internal to i965. This makes things a bit awkward because that means there's a file in i965 that includes blorp_priv.h but it's either that or have a file in blorp that includes brw_context.h. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/gen4: Expose the guts of URB recalculation as a helper | Jason Ekstrand | 2017-05-26 | 2 | -5/+12
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Add support for gen4-5 SF programs | Jason Ekstrand | 2017-05-26 | 1 | -1/+2
| | | | | | | | As part of enabling support for SF programs, we plumb the SF URB size through to emit_urb_config. For now, it's always zero but, on gen4, it may be something larger. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move clip program compilation to the compiler | Jason Ekstrand | 2017-05-26 | 10 | -2339/+22
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move SF compilation to the compiler | Jason Ekstrand | 2017-05-26 | 7 | -981/+12
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/clip: Make brw_clip_prog_key::interp_mode an array | Jason Ekstrand | 2017-05-26 | 2 | -2/+6
| | | | | | | | | Having it be a pointer means that we end up caching clip programs based on a pointer to wm_prog_data rather than the actual interpolation modes. We've been caching one clip program per FS ever since 91d61fbf7cb61a44a where Timothy rewrote brw_setup_vue_interpolation(). Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/sf: make brw_sf_prog_key::interp_mode an array | Jason Ekstrand | 2017-05-26 | 2 | -2/+6
| | | | | | | | | Having it be a pointer means that we end up caching SF programs based on a pointer to wm_prog_data rather than the actual interpolation modes. We've been caching one SF program per FS ever since 91d61fbf7cb61a44a where Timothy rewrote brw_setup_vue_interpolation(). Reviewed-by: Topi Pohjolainen <[email protected]>
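The caching problem fixed by the two interp_mode commits above can be sketched as follows. This is a simplified illustration under stated assumptions: the struct names, NUM_SLOTS, and the memcmp-based comparison are hypothetical stand-ins, not the driver's actual key layout or cache lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_SLOTS 4 /* hypothetical slot count, for illustration only */

/* Key that stores a pointer: two keys match only if they point at the
 * same storage, regardless of the modes stored there. */
struct key_with_pointer {
   const unsigned char *interp_mode;
};

/* Key that stores the modes by value: two keys match whenever the
 * interpolation modes themselves are identical. */
struct key_with_array {
   unsigned char interp_mode[NUM_SLOTS];
};

static bool
pointer_keys_match(const struct key_with_pointer *a,
                   const struct key_with_pointer *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}

static bool
array_keys_match(const struct key_with_array *a,
                 const struct key_with_array *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}

/* Build a by-value key by copying the modes out of the source array. */
static struct key_with_array
make_array_key(const unsigned char *modes)
{
   struct key_with_array key;

   assert(modes != NULL);
   memcpy(key.interp_mode, modes, sizeof(key.interp_mode));
   return key;
}
```

Two fragment shaders with identical interpolation modes produce equal array keys and can share one cached program, whereas pointer keys differ for every shader.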
* intel/genxml: Sampler state is a pointer on gen4-5 | Jason Ekstrand | 2017-05-26 | 1 | -1/+1
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Properly handle mt->first_level | Jason Ekstrand | 2017-05-26 | 1 | -0/+3
| | | | | | | | The guts of blorp and ISL don't understand i965's partial miptrees. Instead, we need to subtract off first_level before we hand anything off to blorp. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Take first_level into account when converting to ISL | Jason Ekstrand | 2017-05-26 | 1 | -1/+1
| | | | | | | ISL doesn't have a concept of a partial miptree. Instead, we need to subtract off first_level. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use blorp_copy for doing r8 stencil updates on HSW | Jason Ekstrand | 2017-05-26 | 1 | -15/+4
| | | | | | | | | | The blorp_copy entrypoint is designed for doing memcpy-like operations, which is what we need to do here, while blorp_blit is for handling format conversion and scaling. Using blorp_copy is much simpler and prevents us from getting formats wrong. While we're here, we get rid of the layers_per_blit thing since stencil always uses interleaved MSAA. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Do an end-of-pipe sync on both sides of fast-clear ops | Jason Ekstrand | 2017-05-26 | 1 | -18/+38
| | | | | | | | | | | | | | | | We've discovered in the Vulkan driver that simply doing the end-of-pipe sync afterwards is insufficient. The specific requirement stated in the PRM is that you have to do one every time you transition between the three modes of "clear", "render", and "resolve". This is GL, so we could track it but any attempt to do so would most likely get it wrong. For now, it's easier to just assume that every fast-clear op is an island and do the sync both before and after. This also removes the unneeded flush and stall after slow-clear operations. Reviewed-by: Topi Pohjolainen <[email protected]> Cc: "17.0 17.1" <[email protected]>
* i965: use mmap64 for Android | Rob Herring | 2017-05-25 | 1 | -16/+3
| | | | | | | | | Simplify the handling of mmap for Android by using mmap64 instead. mmap64 may not have existed for Android when this was written, but it's been around since 2013. Reviewed-by: Emil Velikov <[email protected]> Signed-off-by: Rob Herring <[email protected]>
* i965: Enable ASTC HDR for Broxton | Nanley Chery | 2017-05-22 | 1 | -0/+3
| | | | | | | | This platform passes the following GLES3 tests: ES3-CTS.functional.texture.compressed.astc.endpoint_value_hdr_cem_* Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa: GL_ARB_shader_subroutine is not optional in core profile | Ian Romanick | 2017-05-22 | 1 | -1/+0
|    text    data   bss     dec    hex filename
| 7038459  235248 37280 7310987 6f8e8b 32-bit i965_dri.so before
| 7038227  235248 37280 7310755 6f8da3 32-bit i965_dri.so after
| 6681438  303400 50608 7035446 6b5a36 64-bit i965_dri.so before
| 6681254  303400 50608 7035262 6b597e 64-bit i965_dri.so after
| Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* driconf: Add allow_glsl_builtin_variable_redeclaration option | John Brooks | 2017-05-20 | 2 | -0/+4
| | | | | | | | | | | | | | | This option will allow GLSL builtins to be redeclared verbatim (e.g. redeclaring "in int gl_VertexID" in a vertex shader). This is not strictly valid and would normally fail to compile, but some applications (such as newer Techland ports) do it and need more leniency. v2 (Samuel Pitoiset): - Rename allow_glsl_builtin_redeclaration -> allow_glsl_builtin_variable_redeclaration Signed-off-by: John Brooks <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* i965: Use the upload BO for push constants on Gen7.5-Gen8. | Kenneth Graunke | 2017-05-20 | 2 | -2/+2
| | | | | | | | | | | | | | | | | | | | We can easily use the upload BO for push constants on Gen7.5/Gen8 too, at the cost of a relocation when emitting 3DSTATE_CONSTANT_XS. We can simply switch to using constant buffer pointer 2 instead of pointer 0, like we do on Gen9+. Ivybridge and Baytrail can't do this trick because they require the constant buffers to be enabled in order, starting with 0. We'd have to set the INSTPM bit to make the constant buffer pointer not relative to dynamic state base address, which would need kernel command parser support. Improves performance in GLBenchmark 2.7/TRex Offscreen by: - Broadwell GT2: 0.305608% +/- 0.19877% (n = 68) - Braswell: No difference proven (n = 742) - Haswell GT3e: 0.180755% +/- 0.0237505% (n = 30) Reviewed-by: Chris Forbes <[email protected]>
* i965: Use the upload BO for push constants on Gen9+. | Kenneth Graunke | 2017-05-20 | 2 | -6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Shaders can use quite a bit of uniform data. Better to put it in the upload buffers, like we do for client vertex data, rather than the batch buffer state area, which is primarily used for indirect state. This should free up batch space, allowing us to emit more commands in a batch before flushing. Because BRW_NEW_BATCH also causes a lot of state to be re-emitted, it may also reduce CPU overhead a little bit. We took this approach on Gen4-5, but switched to using the batch area on Gen6+ because buffer 0 is relative to Dynamic State Base Address by default, which is set to the start of the batch. On Gen9+, we already use a relocation due to a workaround, so this is trivial to change and has basically no downside. Unfortunately we can't change compute shader push constants because MEDIA_CURBE_LOAD always uses an offset from dynamic state base address. Improves performance in GLBenchmark 2.7/TRex Offscreen by: - Skylake GT4e: 0.52821% +/- 0.113402% (n = 190) - Apollolake: 0.510225% +/- 0.273064% (n = 70) Reviewed-by: Chris Forbes <[email protected]>
* i965: Drop BRW_NEW_PUSH_CONSTANT_ALLOCATION from CS packets. | Kenneth Graunke | 2017-05-20 | 2 | -3/+1
| | | | | | | | | | | I don't think CS push constant uploading uses the section of L3 controlled by 3DSTATE_PUSH_CONSTANT_ALLOC_XS. So I don't think it needs to be re-emitted when that space is reallocated. The programming note in gen7_allocate_push_constants doesn't indicate this is necessary, at least. Reviewed-by: Chris Forbes <[email protected]>
* i965/formats: Update the three-channel DXT1 mappings | Nanley Chery | 2017-05-18 | 2 | -14/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The procedure for decompressing an opaque DXT1 OpenGL format is dependent on the comparison of two colors stored in the first 32 bits of the compressed block. Here's the specified OpenGL behavior for reference: The RGB color for a texel at location (x,y) in the block is given by: RGB0, if color0 > color1 and code(x,y) == 0 RGB1, if color0 > color1 and code(x,y) == 1 (2*RGB0+RGB1)/3, if color0 > color1 and code(x,y) == 2 (RGB0+2*RGB1)/3, if color0 > color1 and code(x,y) == 3 RGB0, if color0 <= color1 and code(x,y) == 0 RGB1, if color0 <= color1 and code(x,y) == 1 (RGB0+RGB1)/2, if color0 <= color1 and code(x,y) == 2 BLACK, if color0 <= color1 and code(x,y) == 3 The sampling operation performed on an opaque DXT1 Intel format essentially hard-codes the comparison result of the two colors as color0 > color1. This means that the behavior is incompatible with OpenGL. This is stated in the SKL PRM, Vol 5: Memory Views: Opaque Textures (DXT1_RGB) Texture format DXT1_RGB is identical to DXT1, with the exception that the One-bit Alpha encoding is removed. Color 0 and Color 1 are not compared, and the resulting texel color is derived strictly from the Opaque Color Encoding. The alpha channel defaults to 1.0. Programming Note Context: Opaque Textures (DXT1_RGB) The behavior of this format is not compliant with the OGL spec. The opaque and non-opaque DXT1 OpenGL formats are specified to be decoded in exactly the same way except the BLACK value must have a transparent alpha channel in the latter. Use the four-channel BC1 Intel formats with the alpha set to 1 to provide the behavior required by the spec. Note that the alpha is already set to 1 for RGB formats in brw_get_texture_swizzle(). v2: Provide a more detailed commit message (Kenneth Graunke). v3: Ensure the alpha channel is set to 1 for DXT1 formats.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925 Cc: <[email protected]> Acked-by: Tapani Pälli <[email protected]> (v1) Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
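The OpenGL decoding rules quoted in the commit message above can be sketched in C as follows. This is an illustrative software decoder, not driver or hardware code. The names rgb8, unpack_rgb565, mix, and decode_dxt1_rgb_texel are hypothetical, and the bit-replication expansion and truncating integer division are assumptions about precision that the quoted rules do not pin down.

```c
#include <assert.h>
#include <stdint.h>

struct rgb8 {
   uint8_t r, g, b;
};

/* Expand a packed 5:6:5 color to 8 bits per channel by bit replication
 * (an assumption; the quoted rules do not specify the expansion). */
static struct rgb8
unpack_rgb565(uint16_t c)
{
   unsigned r5 = (c >> 11) & 0x1f;
   unsigned g6 = (c >> 5) & 0x3f;
   unsigned b5 = c & 0x1f;
   struct rgb8 out = {
      (uint8_t)((r5 << 3) | (r5 >> 2)),
      (uint8_t)((g6 << 2) | (g6 >> 4)),
      (uint8_t)((b5 << 3) | (b5 >> 2)),
   };
   return out;
}

/* Blend two channels as (wa * a + wb * b) / (wa + wb), truncating. */
static uint8_t
mix(uint8_t a, uint8_t b, unsigned wa, unsigned wb)
{
   return (uint8_t)((wa * a + wb * b) / (wa + wb));
}

/* Decode one texel of an opaque DXT1 block following the quoted rules:
 * the result depends on the 2-bit code and on whether color0 > color1. */
static struct rgb8
decode_dxt1_rgb_texel(uint16_t color0, uint16_t color1, unsigned code)
{
   const struct rgb8 c0 = unpack_rgb565(color0);
   const struct rgb8 c1 = unpack_rgb565(color1);
   const struct rgb8 black = { 0, 0, 0 };
   unsigned w0, w1;
   struct rgb8 out;

   assert(code < 4);
   if (code == 0)
      return c0;
   if (code == 1)
      return c1;

   if (color0 > color1) {
      /* code 2: (2*RGB0 + RGB1) / 3, code 3: (RGB0 + 2*RGB1) / 3 */
      w0 = (code == 2) ? 2 : 1;
      w1 = 3 - w0;
   } else {
      /* code 2: (RGB0 + RGB1) / 2, code 3: BLACK */
      if (code == 3)
         return black;
      w0 = 1;
      w1 = 1;
   }

   out.r = mix(c0.r, c1.r, w0, w1);
   out.g = mix(c0.g, c1.g, w0, w1);
   out.b = mix(c0.b, c1.b, w0, w1);
   return out;
}
```

A decoder that always takes the color0 > color1 branch, as the commit says the opaque Intel format effectively does, would return a one-third blend instead of BLACK for code 3 whenever color0 <= color1.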
* i965: Mark shader programs for capture in the error state. | Matt Turner | 2017-05-15 | 6 | -1/+26
| | | | | | | | | | | | | When the GPU hangs, the kernel saves some state for us. Until now it has not included the shader programs, which are very often the reason the GPU hang occurred. With the programs saved in the error state, we should be more capable of debugging hangs. Thanks to Chris Wilson and Ben Widawsky who provided the kernel support for this feature ("drm/i915: Copy user requested buffers into the error state"), which will be in kernel v4.13. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: perf: fix pointer to integer cast | Lionel Landwerlin | 2017-05-15 | 1 | -1/+1
| | | | | | | | v2: Just use cast to uintptr_t (Chris) Reported-by: Mauro Rossi <[email protected]> Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
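The kind of cast mentioned in the commit above can be sketched as below. The commit message does not show the changed line, so this only illustrates the general idiom; the helper name ptr_to_u64 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen a pointer to a 64-bit integer. Casting
 * through uintptr_t first avoids a "cast from pointer to integer of
 * different size" warning on 32-bit builds. */
static uint64_t
ptr_to_u64(const void *p)
{
   return (uint64_t)(uintptr_t)p;
}
```

Casting a pointer directly to uint64_t compiles cleanly on 64-bit targets but warns where pointers are 32 bits wide, which is why the intermediate uintptr_t step is commonly used.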
* i965: Port 3DSTATE_VF_TOPOLOGY on gen8+ to genxml. | Rafael Antognolli | 2017-05-11 | 4 | -56/+21
| | | | | | | With this last state ported, we can get rid of gen8_draw_upload.c. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Port 3DSTATE_INDEX_BUFFER to genxml. | Rafael Antognolli | 2017-05-11 | 5 | -74/+40
| | | | | | | | Also make the brw_get_index_type() function not shift its return, since that is genxml's job now. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Port brw_cs_state tracked state to genxml. | Rafael Antognolli | 2017-05-11 | 3 | -164/+145
| | | | | | | Emit the respective commands using genxml code. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/genxml: Mostly style fixes for emit_vertices code. | Rafael Antognolli | 2017-05-11 | 1 | -25/+17
| | | | | | | | | | | | | | Several issues were caught on review after the original patch landed. This commit fixes them. v2: - Fix padding (Topi) - Remove .DestinationElementOffset change from this patch (Topi) Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop brw_context::viewport_transform_enable. | Kenneth Graunke | 2017-05-11 | 3 | -3/+1
| | | | | | | | | | This was used by the meta fast clear code. Now that we've switched back to BLORP, it's always true. We might want it back when we add a RECTLIST extension to GL, but that's someday in the future... Reviewed-by: Kristian H. Kristensen <[email protected]>
* i965: Port Gen4-5 VS_STATE to genxml. | Kenneth Graunke | 2017-05-11 | 5 | -235/+68
| | | | | | It's actually not that much code. Reviewed-by: Rafael Antognolli <[email protected]>
* i965: Change GEN_GEN < 7 to GEN_GEN == 6 in 3DSTATE_VS code. | Kenneth Graunke | 2017-05-11 | 1 | -5/+4
| | | | | | | | This whole code is surrounded in #if GEN_GEN >= 6, and this code only applies on Sandybridge. So, use GEN_GEN == 6 to reduce the delta in the next patch, when we add Gen4-5 support. Reviewed-by: Rafael Antognolli <[email protected]>
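The equivalence this commit relies on can be checked with a small sketch. The driver evaluates GEN_GEN in the preprocessor; the runtime function below, with the hypothetical name gen6_checks_agree, only illustrates the logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Inside a region already limited to gen >= 6, the tests "gen < 7" and
 * "gen == 6" select exactly the same generations. */
static bool
gen6_checks_agree(int gen)
{
   assert(gen >= 6);
   return (gen < 7) == (gen == 6);
}
```

Once Gen4-5 support is added and the outer guard goes away, only the GEN_GEN == 6 form still selects Sandybridge alone, which is the reason given for switching ahead of the next patch.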
* mesa: remove _CurrentFragmentProgram from gl_pipeline_object | Timothy Arceri | 2017-05-11 | 1 | -1/+1
| | | | | | | | | | | | | This was added in b527dd65c830a as a workaround because fixed function fragment shaders were tracked in ctx->FragmentProgram._Current as a gl_program rather than gl_shader_program. However, after my refactoring of the program and shader structs at the end of 2016, which culminated in c505d6d85222, we no longer need gl_shader_program to track the current program, making _CurrentFragmentProgram obsolete. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make INTEL_DEBUG=bat decode VS/CLIP/GS/SF/WM/CC_STATE on Gen4-5. | Kenneth Graunke | 2017-05-10 | 1 | -1/+21
| | | | | | | | | | | | | | This is something the original decoder did, but I didn't bother with until now. I recently had to debug an Ironlake issue, and wanted to inspect VS_STATE. So, now it's back. The other packets in the switch statement are all Gen6/7+, where we use offsets from dynamic state base address, so we don't need the gtt_offset subtraction introduced here. We might want to make a helper for this hack at some point - perhaps when we introduce the next occurrence. Acked-by: Jason Ekstrand <[email protected]>
* i965: Switch BRW_NEW_CURBE_OFFSETS to BRW_NEW_PUSH_CONSTANT_ALLOCATION. | Kenneth Graunke | 2017-05-10 | 8 | -18/+14
| | | | | | | | | | | | The BRW_NEW_CURBE_OFFSETS dirty bit is signalled when changing the partitioning of the Constant Buffer URB section between the various shader stages, on Gen4-5. BRW_NEW_PUSH_CONSTANT_ALLOCATION is basically the same thing on Gen7+. So, save a bit, and use the new name. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop BRW_NEW_PUSH_CONSTANT_ALLOCATION from Gen6 code. | Kenneth Graunke | 2017-05-10 | 1 | -9/+3
| | | | | | | Gen6 doesn't have a configurable push constant region. This is only used on Gen7+. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Only #if...#endif a single function or related section at a time. | Kenneth Graunke | 2017-05-10 | 1 | -3/+38
| | | | | | | | | | | Previously we guarded large swathes of code with #if GEN ... #endif blocks. This made it difficult to see which generations include what. This patch splits up the #if..#endif sections so they surround a small section of code - usually a single function/atom, or sometimes a group of related functions. It should make the code easier to work on. Reviewed-by: Rafael Antognolli <[email protected]>