path: root/src/mesa/drivers/dri
Commit message  Author  Age  Files  Lines
* i965/gen7: Disallow Y tiling of renderable surfaces with valign of 2.Paul Berry2013-11-191-0/+17
| | | | | | | | Gen7 does not allow render targets to have a vertical alignment of 2. So, when creating a surface, if its format is renderable, and its vertical alignment is 2, force it to use X tiling. Reviewed-by: Eric Anholt <[email protected]>
* i965/gen7: Prefer vertical alignment of 4 when possible.Paul Berry2013-11-191-3/+22
| | | | | | | | | | | | | Gen6+ allows for color buffers to use a vertical alignment of either 4 or 2. Previously we defaulted to 2. This may have caused problems on Gen7 because Y-tiled render targets are not allowed to use a vertical alignment of 2. This patch changes the vertical alignment to 4 on Gen7, except for the few formats where a vertical alignment of 2 is required. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
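The alignment and tiling rules described in the two Gen7 commits above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: struct surface_desc, choose_valign() and choose_tiling() are hypothetical names, and the sketch ignores multisampled and depth surfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum tiling { TILING_X, TILING_Y };

/* Hypothetical description of a surface being created. */
struct surface_desc {
    int gen;              /* hardware generation, e.g. 6 or 7 */
    bool needs_valign_2;  /* format is one of the few that require valign 2 */
    bool renderable;      /* format can be used as a render target */
};

/* Prefer a vertical alignment of 4 on Gen7 unless the format requires 2. */
static int choose_valign(const struct surface_desc *s)
{
    assert(s != NULL);
    if (s->gen == 7 && !s->needs_valign_2)
        return 4;
    return 2; /* the earlier default */
}

/* On Gen7, a renderable surface with valign 2 is forced to X tiling. */
static enum tiling choose_tiling(const struct surface_desc *s,
                                 enum tiling requested)
{
    if (s->gen == 7 && s->renderable && choose_valign(s) == 2)
        return TILING_X;
    return requested;
}
```

With these definitions, a renderable Gen7 surface whose format requires valign 2 ends up X-tiled even if Y tiling was requested, while every other surface keeps the requested tiling.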
* i965/vec4: Fix broken IR annotation in debug output.Paul Berry2013-11-191-1/+0
| | | | | | | | | | | | Commit 70953b5 (i965: Initialize all member variables of vec4_instruction on construction) inadvertently added a line to the vec4_instruction constructor setting this->ir to NULL, wiping out the previously set value. As a result, ever since then, the output of INTEL_DEBUG=vs and INTEL_DEBUG=gs has been missing IR annotations. Cc: "10.0" <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/gen7: Emit workaround flush when changing GS enable state.Paul Berry2013-11-187-22/+72
| | | | | | | | | | v2: Don't go to extra work to avoid extraneous flushes. (Previous experiments in the kernel have suggested that flushing the pipeline when it is already empty is extremely cheap). Cc: "10.0" <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix broken assertsChris Forbes2013-11-172-2/+2
| | | | | | | These would never fire. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make swizzle_to_scs non-static.Kenneth Graunke2013-11-162-6/+7
| | | | | | | | | | | | | | We'll need this for Broadwell code as well. Normally, when we make things public, we add the "brw" prefix. I'm not crazy about that in this case, since it deals with prog_instruction.h's SWIZZLE_XYZW values, rather than the BRW_SWIZZLE_XYZW enums. However, I can't think of a better name, and at least the comments and code make it clear. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* i965: Move enum brw_urb_write_flags from brw_eu.h to brw_defines.h.Kenneth Graunke2013-11-162-71/+71
| | | | | | | | | Broadwell code should not include brw_eu.h (since it is for Gen4-7 assembly encoding), but needs the URB write flags enum. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* i965/fs: Remove force_sechalf stackKenneth Graunke2013-11-163-22/+1
| | | | | | | | | | | Only Gen4 color write setup uses the force_sechalf flag, and it only sets it on a single instruction. It also already has to get a pointer to the instruction and manually set the saturate flag, so we may as well just set force_sechalf the same way and avoid the complexity of a stack. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* dri/common: move source file lists to Makefile.sourcesEmil Velikov2013-11-163-11/+9
  * Allow the lists to be shared among build systems.
  * Update automake and Android build systems.
Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* scons: move SConscript from gallium/targets/ to mesa/drivers/dri/common/Emil Velikov2013-11-161-0/+83
| | | | | | | | Store scons side by side with the other build systems. v2: cleanup after a failed rebase Signed-off-by: Emil Velikov <[email protected]>
* i965: Assert that IF with cmod is Gen6 only.Matt Turner2013-11-152-4/+4
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Add missing break in SHADER_OPCODE_GEN7_SCRATCH_READ case.Vinson Lee2013-11-151-0/+2
| | | | | | | | | Fixes "Missing break in switch" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "10.0" <[email protected]>
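For readers unfamiliar with this class of defect, the sketch below shows how a missing break lets one case fall through into the next. The opcodes and functions are invented for illustration and are not taken from the driver.

```c
#include <assert.h>

enum toy_opcode { TOY_SCRATCH_READ, TOY_SCRATCH_WRITE };

#define DID_READ  0x1
#define DID_WRITE 0x2

/* Buggy version: the read case falls through into the write case. */
static int run_buggy(enum toy_opcode op)
{
    int done = 0;
    switch (op) {
    case TOY_SCRATCH_READ:
        done |= DID_READ;
        /* BUG: no break here, so execution continues into the next case. */
    case TOY_SCRATCH_WRITE:
        done |= DID_WRITE;
        break;
    default:
        assert(0 && "unknown opcode");
    }
    return done;
}

/* Fixed version: each case ends with its own break. */
static int run_fixed(enum toy_opcode op)
{
    int done = 0;
    switch (op) {
    case TOY_SCRATCH_READ:
        done |= DID_READ;
        break;
    case TOY_SCRATCH_WRITE:
        done |= DID_WRITE;
        break;
    default:
        assert(0 && "unknown opcode");
    }
    return done;
}
```

In the buggy version a read opcode also performs the write handling; in the fixed version each opcode performs only its own.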
* mesa: Remove PROGRAM_ENV_PARAM enum.Eric Anholt2013-11-152-9/+0
| | | | | | | | This has been replaced with referring to env parameters using PROGRAM_STATE_VAR and _mesa_load_state_parameters. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Remove PROGRAM_LOCAL_PARAM enum.Eric Anholt2013-11-152-4/+0
| | | | | | | | This has been replaced with referring to local parameters using PROGRAM_STATE_VAR and _mesa_load_state_parameters. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Rework brw_new_batch to actually start a new batch.Kenneth Graunke2013-11-151-4/+5
| | | | | | | | | | | | | | | Previously, brw_new_batch was called just after execbuf, but before intel_batchbuffer_reset. Essentially, it prepared for the creation of a new batch, that wasn't yet available, and which it didn't create. This was a bit awkward. This patch makes brw_new_batch call intel_batchbuffer_reset as the very first operation. This means that brw_new_batch actually creates a new batchbuffer, and thus has it available. It brings the creation of the new batchbuffer and BRW_NEW_BATCH flagging together into one place. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Move cache_used_by_gpu flag setting to brw_finish_batch.Kenneth Graunke2013-11-151-6/+6
| | | | | | | It really makes more sense here. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i915: Actually enable __DRI2rendererQueryExtensionRecIan Romanick2013-11-151-0/+1
| | | | | | | | | | | | | | | | | | More rebase fail. This code was written long before i915 and i965 were split, so most of the code in i9[16]5/intel_screen.c only needed to exist in one place. It looks like I fixed n-1 of those places after rebasing on the split. I only found this from the defined-but-not-used warning for intelRendererQueryExtension. I noticed this while fixing the other, related warnings. (Note: During review, we decided to *not* pick this back to 10.0.) Signed-off-by: Ian Romanick <[email protected]> Cc: Daniel Vetter <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* i965: Fix vertical alignment for multisampled buffers.Paul Berry2013-11-151-4/+7
From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit Size):

    j [vertical alignment] = 4 for any render target surface is multisampled (4x)

From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most messages), under the "Surface Vertical Alignment" heading:

    This field is intended to be set to VALIGN_4 if the surface was rendered as a depth buffer, for a multisampled (4x) render target, or for a multisampled (8x) render target, since these surfaces support only alignment of 4.

Back in 2012 when we added multisampling support to the i965 driver, we forgot to update the logic for computing the vertical alignment, so we were often using a vertical alignment of 2 for multisampled buffers, leading to subtle rendering errors.

Note that the specs also require a vertical alignment of 4 for all Y-tiled render target surfaces; I plan to address that in a separate patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
* i965: Initialize schedule_node::delay.Vinson Lee2013-11-141-0/+1
| | | | | | | Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* dri: Change value param to unsignedIan Romanick2013-11-134-4/+4
| | | | | | | | | This silences some compiler warnings in i915 and i965. See also 75982a5. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "10.0" <[email protected]>
* i965: Use drm_intel_get_aperture_sizes instead of hard-coded 2GiBIan Romanick2013-11-131-3/+7
| | | | | | | | | | Systems with little physical memory installed will report less than 2GiB, and some systems may (hypothetically?) have a larger address space for the GPU. My IVB still reports 1534. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: "10.0" <[email protected]>
* i915: Use drm_intel_get_aperture_sizes instead of drmAgpSizeIan Romanick2013-11-131-2/+6
| | | | | | | | Send the zombie back to the grave before it infects the townsfolk. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: "10.0" <[email protected]>
* i965: implement blit path for PBO glDrawPixelsAlexander Monakov2013-11-132-0/+120
This patch implements an accelerated path for glDrawPixels from a PBO in i965. The code follows what intel_pixel_read, intel_pixel_copy, intel_pixel_bitmap and intel_tex_image are doing. Piglit quick.tests show no regressions. In my testing on IVB, the performance improvement is huge (about 30x, didn't measure exactly) since the generic path goes via _mesa_unpack_color_span_float, memcpy, extract_float_rgba. Signed-off-by: Alexander Monakov <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* dri: Remove redundant createNewContext function from __DRIimageDriverExtensionKristian Høgsberg2013-11-121-1/+0
| | | | | | | | | | createContextAttribs is a superset of what createNewContext provides. Also remove the function typedef, since createNewContext is deprecated and no longer used in multiple interfaces. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "10.0" <[email protected]>
* dri/i915, dri/i965: Fix support for planar imagesAnder Conselvan de Oliveira2013-11-122-2/+4
Planar images have format __DRI_IMAGE_FORMAT_NONE, but the patch that moved the conversion from dri_format to the mesa format made it impossible to allocate an image with that format. Signed-off-by: Ander Conselvan de Oliveira <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "10.0" <[email protected]>
* i965/fs: Try a different pre-scheduling heuristic if the first spills.Eric Anholt2013-11-125-54/+76
Since LIFO fails on some shaders in one particular way, and non-LIFO systematically fails in another way on different kinds of shaders, try them both, and pick whichever one successfully register allocates first. Slightly prefer non-LIFO in case we produce extra dependencies in register allocation, since it should start out with fewer stalls than LIFO.

This is madness, but I haven't come up with another way to get unigine tropics to not spill while keeping other programs from spilling and retaining the non-unigine performance wins from texture-grf.

total instructions in shared programs: 1626728 -> 1626288 (-0.03%)
instructions in affected programs: 1015 -> 575 (-43.35%)
GAINED: 50
LOST: 0

Improves Unigine Tropics performance by 14.5257% +/- 0.241838% (n=38)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70445
Cc: "10.0" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
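The "try both and keep the first that allocates" idea can be sketched as below. This is a toy model under stated assumptions, not the i965 backend: enum sched_mode, pick_schedule(), struct toy_shader and toy_allocates() are all hypothetical, and register allocation is reduced to comparing a pressure number against a register budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Non-LIFO comes first, so it is preferred when both modes would succeed. */
enum sched_mode { SCHED_PRE_NON_LIFO, SCHED_PRE_LIFO, SCHED_MODE_COUNT };

/* Callback: returns true if the shader allocates without spilling in 'mode'. */
typedef bool (*allocates_fn)(enum sched_mode mode, void *ctx);

/* Try each heuristic in order; return the first that succeeds, or -1. */
static int pick_schedule(allocates_fn allocates, void *ctx)
{
    assert(allocates != NULL);
    for (int m = 0; m < SCHED_MODE_COUNT; m++) {
        if (allocates((enum sched_mode)m, ctx))
            return m;
    }
    return -1; /* neither heuristic avoided spilling */
}

/* Toy stand-in for a shader: register pressure under each heuristic. */
struct toy_shader {
    int pressure[SCHED_MODE_COUNT];
    int num_regs;
};

static bool toy_allocates(enum sched_mode mode, void *ctx)
{
    const struct toy_shader *s = ctx;
    return s->pressure[mode] <= s->num_regs;
}
```

A result of -1 models the case where both heuristics fail and the compiler has to fall back to spilling.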
* i965/fs: Do instruction pre-scheduling just before register allocation.Eric Anholt2013-11-121-2/+2
Long ago, the HW_REG uses in assign_curb/urb_setup() were scheduling barriers, so we had to run the scheduler before them for it to be able to do basically anything. Now that that's fixed, we can delay the scheduling until we go to allocate (which will make the next change less scary). Cc: "10.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Ignore actual latency pre-reg-alloc.Eric Anholt2013-11-121-21/+29
We care about depth-until-program-end, as a proxy for "make sure I schedule those early instructions that open up the other things that can make progress while keeping register pressure low", not actual latency (since we're relying on the post-register-alloc scheduling to actually schedule for the hardware).

total instructions in shared programs: 1609931 -> 1609931 (0.00%)
instructions in affected programs: 0 -> 0
GAINED: 55
LOST: 43

Cc: "10.0" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Fix message setup for SIMD8 spills.Eric Anholt2013-11-121-1/+1
| | | | | | | | | | | | | | In the SIMD16 spilling changes, I replaced a "1" in the spill path with "mlen", but obviously it wasn't mlen before because spills have the g0 header along with the payload. The interface I was trying to use was asking for how many physical regs we're writing, so we're looking for "1" or "2". I'm guessing this actually passed piglit because the high 8 bits of the execution mask in SIMD8 mode are all 0s. Cc: "10.0" <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Prefer things we know reduce reg pressure when pre-scheduling.Eric Anholt2013-11-121-0/+144
Previously, the best thing we had was to schedule the things unblocked by the last chosen instruction, on the hope that it would be consuming two values at the end of their live intervals while only producing one new value. But that's just a guess, and we can do counting of usage of registers to know when an instruction would (almost surely) reduce register pressure.

The only failure mode I know of in this new dominant heuristic is that inside of a loop when scheduling the iterator (for example), choosing the last use of the iterator doesn't actually reduce the live interval of the iterator. But it doesn't seem to matter in shader-db:

total instructions in shared programs: 1618700 -> 1618700 (0.00%)
instructions in affected programs: 0 -> 0
GAINED: 13
LOST: 0

Note: The new functions are made virtual because I expect we'll soon lift the pre-regalloc scheduling heuristic over to the vec4 backend.

Cc: "10.0" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
* i965: Fix undefined value usage in ABO setup.Eric Anholt2013-11-121-1/+1
| | | | | | | Fixes a compiler warning. Cc: "10.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add a warning if something ever hits a bug I noticed.Eric Anholt2013-11-121-0/+4
| | | | | | We'd have to map the VBO and rewrite things to a lower stride to fix it. Reviewed-by: Matt Turner <[email protected]>
* i965: Move #define's inside function as local variablesAnuj Phogat2013-11-111-8/+5
The X_f, Y_f, Xp_f and Yp_f variables are used only inside translate_dst_to_src(), so they can be defined as local variables. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i915, i965: Fix memory leak in intel_miptree_create_for_bo.Vinson Lee2013-11-112-2/+6
| | | | | | | Fixes "Resource leak" defects reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: convert brw_lower_offset_array_visitor to ir_rvalue_visitorChris Forbes2013-11-101-7/+11
| | | | | | | | | | Previously, we would bogusly replace the entire statement containing the ir_texture node with an ir_dereference_variable. Correct this to just replace the ir_texture node itself as intended. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the driver compile until a proper libdrm can be released.Eric Anholt2013-11-092-5/+10
| | | | No depending on unreleased code.
* i965/fs: Don't perform CSE on inst HW_REG dests (unless it's null)Matt Turner2013-11-091-1/+2
| | | | | | | | | | | | | | | Commit b16b3c87 began performing CSE on CMP instructions with null destinations. I relaxed the restrictions a bit too much, thereby allowing CSE to be performed on instructions with, for instance, an explicit accumulator destination. This broke the arb_gpu_shader5/fs-imulExtended shader tests because they emit MUL instructions with the accumulator as the destination. CSE would instead cause the MUL to write to a GRF, which is lower precision than the accumulator. Reviewed-by: Eric Anholt <[email protected]> Cc: 10.0 <[email protected]>
* i965: Remove some tiny dead code from intel_miptree_map_movntdqaChad Versace2013-11-081-3/+0
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* swrast: add missing notify_reset parameter to dri_create_context()Brian Paul2013-11-081-0/+1
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* dri: add __DRIimageLoaderExtension and __DRIimageDriverExtensionKeith Packard2013-11-0711-11/+352
These provide an interface between the driver and the loader to allocate color buffers through the DRIimage extension interface rather than through a loader-specific extension (as is used by DRI2, for instance).

The driver uses the loader 'getBuffers' interface to allocate color buffers.

The loader uses the createNewScreen2, createNewDrawable, createNewContext, getAPIMask and createContextAttribs APIs (mostly shared with DRI2).

This interface will work with the DRI3 loader, and should also work with GBM and other loaders so that drivers need not be customized for each new loader interface, as long as they provide this image interface.

v2: Fix build of i915 and i965 together (by anholt)

Signed-off-by: Keith Packard <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
* dri/i915,dri/i965: Use driGLFormatToImageFormat and driImageFormatToGLFormatKeith Packard2013-11-072-108/+8
| | | | | | | | Remove private versions of these functions Signed-off-by: Keith Packard <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* dri/common: Add functions mapping MESA_FORMAT_* <-> __DRI_IMAGE_FORMAT_*Keith Packard2013-11-072-0/+68
| | | | | | | | | | The __DRI_IMAGE_FORMAT codes are used by the image extension, drivers need to be able to translate between them. Instead of duplicating this translation in each driver, create a shared version. Signed-off-by: Keith Packard <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* dri/intel: Add explicit size parameter to intel_region_alloc_for_fdKeith Packard2013-11-076-9/+10
| | | | | | | | | Instead of assuming that the size will be height * pitch, have the caller pass in the size explicitly. Signed-off-by: Keith Packard <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* dri/intel: Split out DRI2 buffer update code to separate functionKeith Packard2013-11-072-44/+68
| | | | | | | | Make an easy place to splice in a DRI3 version of this function Signed-off-by: Keith Packard <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* drivers/dri/common: A few dri2 functions are not actually DRI2 specificKeith Packard2013-11-071-37/+37
| | | | | | | | | This just renames them so that they can be used with the DRI3 extension without causing too much confusion. Signed-off-by: Keith Packard <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Wire up initial support for DRI_RENDERER_QUERY extensionIan Romanick2013-11-071-0/+83
| | | | | | | | v2: Use sysconf instead of sysinfo for improved portability. Suggested by Ken. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
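As a rough illustration of the sysconf approach mentioned in v2, the sketch below computes installed system memory in MiB. It is an assumption about how such a query could look, not the driver's code: system_memory_mib() is a hypothetical helper, and _SC_PHYS_PAGES is a widely available extension rather than something POSIX requires.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Returns the amount of physical memory in MiB, or 0 if it cannot be queried. */
static uint64_t system_memory_mib(void)
{
    const long pages = sysconf(_SC_PHYS_PAGES);    /* number of physical pages */
    const long page_size = sysconf(_SC_PAGE_SIZE); /* bytes per page */

    if (pages <= 0 || page_size <= 0)
        return 0;

    /* Widen before multiplying so the product cannot overflow a 32-bit long. */
    return ((uint64_t)pages * (uint64_t)page_size) / (1024 * 1024);
}
```

Unlike sysinfo, which is Linux-specific, sysconf is part of the standard C library interface on most Unix-like systems, which is presumably the portability gain the commit refers to.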
* i915: Wire up initial support for DRI_RENDERER_QUERY extensionIan Romanick2013-11-071-0/+81
| | | | | | | | v2: Use sysconf instead of sysinfo for improved portability. Suggested by Ken. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* dri: Add function to implement queries common to all Mesa driversIan Romanick2013-11-072-0/+67
v2: Add assertions that the version string has the expected format. This will catch build errors (or changes to the version string format) in debug builds without exposing release builds to buffer over-runs. Suggested by Ken. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
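The pattern described in v2, asserting on the expected format in debug builds while still failing safely in release builds, might look something like the sketch below. parse_version() is a hypothetical helper and the "major.minor.patch" format is an assumption made for illustration; only strtol and assert are real library facilities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parses "major.minor.patch" into out[0..2]. Returns false on a bad format. */
static bool parse_version(const char *ver, int out[3])
{
    char *end;

    out[0] = (int)strtol(ver, &end, 10);
    assert(end[0] == '.');   /* debug builds stop here on a malformed string */
    if (end[0] != '.')       /* release builds bail out instead of reading on */
        return false;

    out[1] = (int)strtol(end + 1, &end, 10);
    assert(end[0] == '.');
    if (end[0] != '.')
        return false;

    out[2] = (int)strtol(end + 1, &end, 10);
    return true;
}
```

The explicit checks after each assert matter because assert compiles to nothing when NDEBUG is defined, so a release build must not rely on it to stay inside the string.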
* i965: Refactor the renderer string creation out of intelGetStringIan Romanick2013-11-072-13/+23
| | | | | | | | | | | This will soon be used in intel_screen.c from a function that doesn't have a gl_context. v2: Delete local variables that are now unused. This matches v1 of the changes to the i915 driver. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i915: Refactor the renderer string creation out of intelGetStringIan Romanick2013-11-072-13/+23
| | | | | | | | This will soon be used in intel_screen.c from a function that doesn't have a gl_context. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>