path: root/src/gallium/drivers/iris/iris_screen.c
Commit message | Author | Age | Files | Lines
* iris/perf: implement routines to return counter info | Mark Janes | 2019-08-09 | 1 | -0/+3
| With this commit, Iris will report that AMD_performance_monitor is supported, and will allow the caller to query the available metrics.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: bump compat profile support to 4.6 | Timothy Arceri | 2019-08-02 | 1 | -2/+1
| All of the current piglit compat profile tests pass.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* gallium: Implement GL_EXT_shader_samples_identical via a new capability | Kenneth Graunke | 2019-08-01 | 1 | -0/+1
| This exposes the textureSamplesIdenticalEXT function in GLSL. We enable it for iris and radeonsi, because their compilers already have support for this.
|
| Tested on Intel Kabylake and AMD Vega 64.
|
| Reviewed-by: Marek Olšák <[email protected]>
* iris/screen: use initialization routine for gen_device_info | Mark Janes | 2019-08-01 | 1 | -5/+3
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/common: provide common ioctl routine | Mark Janes | 2019-08-01 | 1 | -1/+2
| i965 links against libdrm for drmIoctl, but anv and iris both re-implement this routine to avoid the dependency. intel/dev also needs an ioctl wrapper, so let's share the same implementation everywhere.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* iris: Enable EXT_texture_shadow_lod | Sagar Ghuge | 2019-07-30 | 1 | -0/+1
| Signed-off-by: Sagar Ghuge <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Use a system value for gl_FragCoord | Jason Ekstrand | 2019-07-29 | 1 | -0/+1
| It's kind-of an anomaly that the Intel drivers are still treating gl_FragCoord as an input. It also makes zero sense because we have to special-case it in the back-end. Because ANV is the only user of nir_lower_wpos_center, we go ahead and just update it to look for nir_intrinsic_load_frag_coord as part of this patch.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Drop copy and pasted iris_timebase_scale | Kenneth Graunke | 2019-07-16 | 1 | -1/+1
| Lionel moved brw_timebase_scale to gen_device_info_timebase_scale a few months ago, so we should just use that, and not our own copy in iris.
* gallium: get rid of PIPE_CAP_SM3 | Erik Faye-Lund | 2019-07-10 | 1 | -1/+3
| PIPE_CAP_SM3 has always been an odd one out of all our caps. While most other caps are fine-grained and single-purpose, this cap encodes several features in one. And since OpenGL cares more about single features, it'd be nice to get rid of this one.
|
| As it turns out, this is now relatively simple. We only really care about three features using this cap, and those already got their own caps. So we can remove it, and make sure all current drivers just give the same response to all of them.
|
| The only place we *really* care about SM3 is in nine, and there we can instead just re-construct the information based on the finer-grained caps. This avoids DX9 semantics from needlessly leaking into all of the drivers, most of which don't care a whole lot about DX9 specifically.
|
| Signed-off-by: Erik Faye-Lund <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Acked-by: Alyssa Rosenzweig <[email protected]>
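The re-construction described in the PIPE_CAP_SM3 commit can be sketched roughly as follows. This is only an illustration, not the code used by nine: the `fine_caps` struct and its three field names are hypothetical stand-ins, because the commit message does not name the three finer-grained caps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three fine-grained caps; the real cap
 * names are not given in the commit message. */
struct fine_caps {
   bool feature_a;
   bool feature_b;
   bool feature_c;
};

/* A frontend that still needs the coarse "SM3" answer can derive it by
 * requiring every fine-grained feature at once. */
static bool
has_sm3(const struct fine_caps *caps)
{
   assert(caps != NULL);
   return caps->feature_a && caps->feature_b && caps->feature_c;
}
```

Under this assumption, a driver that reports all three features is treated as SM3-capable, and a driver missing any one of them is not.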
* iris: Minor tidying | Kenneth Graunke | 2019-07-03 | 1 | -1/+1
|
* iris: Disable loop unrolling in GLSL IR. | Kenneth Graunke | 2019-06-26 | 1 | -2/+1
| Leave it to NIR instead, like i965 does. Thanks to Tim Arceri for noticing that I'd left this enabled by accident.
|
| shader-db results on Skylake:
|
| total instructions in shared programs: 15522628 -> 15521642 (<.01%)
| instructions in affected programs: 94008 -> 93022 (-1.05%)
| helped: 34
| HURT: 33
| helped stats (abs) min: 12 max: 48 x̄: 33.82 x̃: 42
| helped stats (rel) min: 0.06% max: 22.14% x̄: 9.86% x̃: 10.89%
| HURT stats (abs) min: 1 max: 16 x̄: 4.97 x̃: 3t
| HURT stats (rel) min: 0.82% max: 3.77% x̄: 1.73% x̃: 1.53%
| 95% mean confidence interval for instructions value: -20.08 -9.35
| 95% mean confidence interval for instructions %-change: -5.95% -2.36%
| Instructions are helped.
|
| total cycles in shared programs: 367105221 -> 367074230 (<.01%)
| cycles in affected programs: 10017660 -> 9986669 (-0.31%)
| helped: 266
| HURT: 184
| helped stats (abs) min: 1 max: 9556 x̄: 151.35 x̃: 12
| helped stats (rel) min: 0.08% max: 59.91% x̄: 4.66% x̃: 1.67%
| HURT stats (abs) min: 1 max: 1716 x̄: 50.37 x̃: 6
| HURT stats (rel) min: <.01% max: 24.40% x̄: 2.42% x̃: 0.85%
| 95% mean confidence interval for cycles value: -133.90 -3.84
| 95% mean confidence interval for cycles %-change: -2.44% -1.10%
| Cycles are helped.
|
| Reviewed-by: Eric Anholt <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* iris: Enable INTEL_shader_atomic_float_minmax | Caio Marcelo de Oliveira Filho | 2019-06-13 | 1 | -0/+1
| Supported only for gen >= 9.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* iris: Enable PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED | Caio Marcelo de Oliveira Filho | 2019-06-11 | 1 | -0/+1
| This avoids lowering of CS system values by GLSL (configured by state tracker). In i965 we don't use that lowering, and we also shouldn't need that in Iris.
|
| Using it causes some unnecessary round trips between values, e.g. the shader uses gl_LocalInvocationIndex, GLSL rewrites it in terms of gl_LocalInvocationID, then the driver rewrites those in terms of gl_LocalInvocationIndex again. Copy propagation can make some of those go away, but not all, as seen below.
|
| Intel SKL shader-db results:
|
| total instructions in shared programs: 15595189 -> 15594556 (<.01%)
| instructions in affected programs: 74880 -> 74247 (-0.85%)
| helped: 81
| HURT: 4
| helped stats (abs) min: 2 max: 172 x̄: 7.88 x̃: 4
| helped stats (rel) min: 0.19% max: 5.66% x̄: 1.71% x̃: 1.23%
| HURT stats (abs) min: 1 max: 2 x̄: 1.25 x̃: 1
| HURT stats (rel) min: 0.45% max: 1.65% x̄: 0.76% x̃: 0.46%
| 95% mean confidence interval for instructions value: -11.56 -3.34
| 95% mean confidence interval for instructions %-change: -1.91% -1.28%
| Instructions are helped.
|
| total loops in shared programs: 4831 -> 4831 (0.00%)
| loops in affected programs: 0 -> 0
| helped: 0
| HURT: 0
|
| total cycles in shared programs: 372136618 -> 372145628 (<.01%)
| cycles in affected programs: 9218230 -> 9227240 (0.10%)
| helped: 131
| HURT: 86
| helped stats (abs) min: 1 max: 798 x̄: 39.79 x̃: 12
| helped stats (rel) min: <.01% max: 6.75% x̄: 0.42% x̃: 0.13%
| HURT stats (abs) min: 2 max: 2442 x̄: 165.38 x̃: 6
| HURT stats (rel) min: <.01% max: 20.83% x̄: 0.74% x̃: 0.12%
| 95% mean confidence interval for cycles value: -2.07 85.11
| 95% mean confidence interval for cycles %-change: -0.22% 0.30%
| Inconclusive result (value mean confidence interval includes 0).
|
| total spills in shared programs: 11956 -> 11950 (-0.05%)
| spills in affected programs: 77 -> 71 (-7.79%)
| helped: 3
| HURT: 0
|
| total fills in shared programs: 25619 -> 25549 (-0.27%)
| fills in affected programs: 593 -> 523 (-11.80%)
| helped: 4
| HURT: 0
|
| LOST: 0
| GAINED: 0
|
| Total CPU time (seconds): 1695.69 -> 1706.03 (0.61%)
|
| Reviewed-by: Eric Anholt <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
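The round trip between gl_LocalInvocationIndex and gl_LocalInvocationID mentioned in the commit above comes down to the following arithmetic. This is a small sketch using the GLSL definition of the flat index; the `uvec3` struct and both helper names are hypothetical and are not taken from Mesa.

```c
#include <assert.h>

struct uvec3 { unsigned x, y, z; };

/* GLSL definition: index = z * size.x * size.y + y * size.x + x */
static unsigned
local_index_from_id(struct uvec3 id, struct uvec3 size)
{
   assert(id.x < size.x && id.y < size.y && id.z < size.z);
   return id.z * size.x * size.y + id.y * size.x + id.x;
}

/* Inverse mapping: recover the 3D ID from the flat index. */
static struct uvec3
local_id_from_index(unsigned index, struct uvec3 size)
{
   struct uvec3 id;
   id.x = index % size.x;
   id.y = (index / size.x) % size.y;
   id.z = index / (size.x * size.y);
   return id;
}
```

Applying one mapping and then the other returns the starting value, so each rewrite that is not folded away costs multiplies, divides and modulos that compute nothing new.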
* iris: Bypass half-float pack/unpack lowering. | Kenneth Graunke | 2019-06-10 | 1 | -0/+1
| This skips GLSL IR lowering of pack/unpackHalf operations, allowing the NIR optimizer to see them.
|
| Improves performance in Synmark2's OglCSDof by about 2x, by cutting about 90% of the cycles from one of the compute shaders.
|
| shader-db statistics on Skylake:
|
| 4 compute shaders went from SIMD8 to SIMD16.
|
| total instructions in shared programs: 15598871 -> 15542568 (-0.36%)
| instructions in affected programs: 143016 -> 86713 (-39.37%)
| helped: 144
| HURT: 0
| helped stats (abs) min: 17 max: 4669 x̄: 390.99 x̃: 164
| helped stats (rel) min: 7.48% max: 85.28% x̄: 30.17% x̃: 24.22%
| 95% mean confidence interval for instructions value: -510.50 -271.49
| 95% mean confidence interval for instructions %-change: -32.70% -27.65%
| Instructions are helped.
|
| total cycles in shared programs: 371973958 -> 368902103 (-0.83%)
| cycles in affected programs: 5557722 -> 2485867 (-55.27%)
| helped: 144
| HURT: 0
| helped stats (abs) min: 106 max: 1026600 x̄: 21332.33 x̃: 1697
| helped stats (rel) min: 0.53% max: 88.98% x̄: 36.12% x̃: 34.67%
| 95% mean confidence interval for cycles value: -41570.02 -1094.64
| 95% mean confidence interval for cycles %-change: -38.44% -33.80%
| Cycles are helped.
|
| total spills in shared programs: 11936 -> 11903 (-0.28%)
| spills in affected programs: 110 -> 77 (-30.00%)
| helped: 3
| HURT: 2
|
| total fills in shared programs: 25644 -> 25178 (-1.82%)
| fills in affected programs: 677 -> 211 (-68.83%)
| helped: 5
| HURT: 0
|
| total loops in shared programs: 4830 -> 4829 (-0.02%)
| loops in affected programs: 1 -> 0
| helped: 1
| HURT: 0
* iris: Enable nir_opt_large_constants | Jason Ekstrand | 2019-05-29 | 1 | -0/+1
| Shader-db results on Kaby Lake:
|
| total instructions in shared programs: 15306230 -> 15304726 (<.01%)
| instructions in affected programs: 4570 -> 3066 (-32.91%)
| helped: 16
| HURT: 0
|
| total cycles in shared programs: 361703436 -> 361680041 (<.01%)
| cycles in affected programs: 129388 -> 105993 (-18.08%)
| helped: 16
| HURT: 0
|
| LOST: 0
| GAINED: 2
|
| The helped programs were in XCom 2, Deus Ex: Mankind Divided, and Kerbal Space Program
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Advertise coherent framebuffer fetches | Kenneth Graunke | 2019-05-23 | 1 | -0/+2
| This lets us advertise GL_EXT_shader_framebuffer_fetch and GL_KHR_blend_equation_advanced_coherent support.
* gallium: Change PIPE_CAP_TGSI_FS_FBFETCH bool to PIPE_CAP_FBFETCH count | Kenneth Graunke | 2019-05-23 | 1 | -1/+2
| TGSI's FBFETCH instruction currently only supports reading from a single render target, but NIR intrinsics can support multiple render targets. radeonsi can only support fetching from RT 0, but other drivers may be able to support fetching from any render target.
|
| To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?" to an integer number of render targets which can be fetched.
|
| Reviewed-by: Marek Olšák <[email protected]>
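The change from a boolean to a count in the PIPE_CAP_FBFETCH commit can be pictured with a small sketch. The `driver_desc` struct and `fbfetch_count` helper below are hypothetical and exist only for illustration; they assume that a driver limited to RT 0 would report 1, and that a driver able to fetch from any render target would report its render target count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver description, for illustration only. */
struct driver_desc {
   int max_render_targets;   /* total render targets the driver exposes */
   bool fetch_rt0_only;      /* true if only RT 0 can be fetched */
   bool supports_fetch;      /* false if fetch is unsupported */
};

/* Integer-valued cap: number of render targets that can be fetched. */
static int
fbfetch_count(const struct driver_desc *drv)
{
   assert(drv != NULL);
   if (!drv->supports_fetch)
      return 0;
   return drv->fetch_rt0_only ? 1 : drv->max_render_targets;
}
```

A caller that only needs the old yes/no answer can still test whether the count is nonzero.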
* iris: Expose the disk cache to the state tracker as well. | Kenneth Graunke | 2019-05-21 | 1 | -0/+8
| This lets st/nir cache the NIR for shaders, based on the shader source string hash, allowing us to skip initial compiles altogether, and also letting us start from there should we need to recompile for NOS.
|
| Reviewed-by: Dylan Baker <[email protected]>
* iris: Start wiring up on-disk shader cache | Dylan Baker | 2019-05-21 | 1 | -0/+3
| This creates the on-disk shader cache data structure, and handles the build-id keying aspects. The next commits will fill it out so it's actually used.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Dodge more GLSL IR lowering | Kenneth Graunke | 2019-05-15 | 1 | -2/+3
| This avoids some lower_instructions bits in st.
* iris: Enable fragment shader interlock on Gen9+. | Kenneth Graunke | 2019-05-14 | 1 | -0/+1
| There's some debate about whether we should support this on older hardware as well. Currently i965 turns it off on Gen8- though, so we follow suit. If this changes, we can update this as well.
|
| Reviewed-by: Marek Olšák <[email protected]>
* gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE. | Eric Anholt | 2019-05-13 | 1 | -1/+2
| The _LEVELS assumes that the max is always power of two. For V3D 4.2, we can support up to 7680 non-power-of-two MSAA textures, which will let X11 support dual 4k displays on newer hardware.
|
| Reviewed-by: Marek Olšák <[email protected]>
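The power-of-two limitation of the _LEVELS cap follows from simple arithmetic, sketched below. Both helper names are hypothetical; the only assumption is the usual mipmap convention that a full chain of N levels starts at a base size of 2^(N-1) texels.

```c
#include <assert.h>

/* Largest 2D texture size implied by a mip level count: a full chain of
 * N levels starts at 2^(N-1) texels. */
static unsigned
max_size_from_levels(unsigned levels)
{
   assert(levels >= 1 && levels <= 32);
   return 1u << (levels - 1);
}

/* Largest level count whose implied size does not exceed max_size. */
static unsigned
levels_from_max_size(unsigned max_size)
{
   assert(max_size >= 1);
   unsigned levels = 1;
   while ((max_size >>= 1) != 0)
      levels++;
   return levels;
}
```

With these definitions a 7680-texel limit maps to 13 levels, which only implies 4096 texels, while 14 levels would claim 8192. No level count describes 7680 exactly, which is the case a _SIZE cap can express directly.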
* iris: Implement ARB_indirect_parameters | Illia Iorin | 2019-05-11 | 1 | -0/+2
| iris_draw_vbo is divided into two functions to remove unnecessary operations from the loop. This implementation of ARB_indirect_parameters takes into account NV_conditional_render by saving MI_PREDICATE_RESULT at the start of a draw call and restoring it at the end. The result of NV_conditional_render is also taken into account when computing predicates that limit draw calls for ARB_indirect_parameters, in a similar way to 1952fd8d in ANV.
|
| v2: Optimize indirect draws (suggested by Kenneth Graunke)
| v3: (by Kenneth Graunke)
|  - Fix an issue where indirect draws wouldn't set patch information before updating the compiled TCS.
|  - Move some code back to iris_draw_vbo to avoid duplicating it.
|  - Fix minor indentation issues.
|
| Signed-off-by: Illia Iorin <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY | Kenneth Graunke | 2019-05-09 | 1 | -0/+1
| This provides a way for the application to query whether any resets have happened, which lets us expose "robust" contexts. This also enables the KHR_robust_buffer_access_behavior tests.
* iris: Report the same video memory settings as i965. | Kenneth Graunke | 2019-05-08 | 1 | -2/+32
| This just copy and pastes Ian's code from i965.
* iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS | Kenneth Graunke | 2019-05-06 | 1 | -0/+1
| This makes CompressedTexSubImage from a PBO source do proper GPU rendering to upload instead of stalling to map the PBO source on the CPU (then copying it on the CPU).
|
| Thanks Bas Nieuwenhuizen for pointing out that Vulkan includes this functionality, and to Jason Ekstrand for writing the code I adapted. Vulkan only supports a single layer, however, and this code tries to support multiple layers as long as it's miplevel 0.
|
| Improves performance in Sid Meier's Civilization VI:
|
| Average frame time (ms): -3.67423% +/- 1.46201% (n=5)
| 99th percentile frame time (ms): -5.09910% +/- 3.87874% (n=5)
* iris: Only enable GL_AMD_depth_clamp_separate on Gen9+ | Kenneth Graunke | 2019-04-29 | 1 | -1/+1
| The hardware feature is new as of Gen9+. I accidentally enabled it on Gen8.
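The kind of generation check this fix implies can be sketched as below. The `device_info` struct and the helper name are hypothetical simplifications, not the actual iris code, which is not shown on this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down device description. */
struct device_info { int gen; };

/* Report the feature only where the hardware has it (Gen9 and later). */
static bool
supports_depth_clamp_separate(const struct device_info *devinfo)
{
   assert(devinfo != NULL);
   return devinfo->gen >= 9;
}
```

Returning an unconditional `true` here would reproduce the bug the commit describes, since Gen8 would then advertise the feature as well.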
* iris: Enable GL_AMD_depth_clamp_separate | Kenneth Graunke | 2019-04-24 | 1 | -0/+1
| We support this, we just forgot to turn it on.
* iris: Actually put Mesa in GL_RENDERER string | Kenneth Graunke | 2019-04-24 | 1 | -1/+1
| I constructed the right thing and then returned the other one.
* iris: add support for INTEL_conservative_rasterization | Mike Blumenkrantz | 2019-04-23 | 1 | -0/+1
| this hooks up the iris gallium driver to existing mesa bits which handle the implementation
|
| resolves kwg/mesa#8
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Replace buffer backing storage and rebind to update addresses. | Kenneth Graunke | 2019-04-23 | 1 | -0/+1
| This implements PIPE_CAP_INVALIDATE_BUFFER and invalidate_resource(), as well as the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag. When either of these happen, we swap out the backing storage of the buffer for a new idle BO, allowing us to write to it immediately without stalling or queueing a blit.
|
| On my Skylake GT4e at 1920x1080, this improves performance in games:
|
|    -----------------------------------------------
|    | DiRT Rally        | +25% (avg) | +17% (max) |
|    | Bioshock Infinite | +22% (avg) | +11% (max) |
|    | Shadow of Mordor  | +27% (avg) | +83% (max) |
|    -----------------------------------------------
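The storage swap described in the commit above can be sketched in a few lines. Everything here is a hypothetical simplification: the `bo` and `buffer_resource` structs, the `busy` flag, and the early return for idle buffers are assumptions made for illustration. Reference counting and the rebinding of existing references to the new address are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical buffer object: 'busy' means the GPU may still be using it. */
struct bo { size_t size; bool busy; };

/* Hypothetical resource that points at its current backing storage. */
struct buffer_resource { struct bo *bo; };

/* Swap the resource's storage for a fresh idle BO of the same size.
 * Returns false (leaving the resource untouched) if allocation fails. */
static bool
invalidate_buffer(struct buffer_resource *res)
{
   assert(res != NULL && res->bo != NULL);
   if (!res->bo->busy)
      return true;   /* already idle: nothing to replace */

   struct bo *fresh = calloc(1, sizeof(*fresh));
   if (fresh == NULL)
      return false;
   fresh->size = res->bo->size;
   fresh->busy = false;

   /* A real driver would drop a reference here and let the old BO be
    * freed once the GPU is done; this sketch just forgets it. */
   res->bo = fresh;
   return true;
}
```

After the swap, the CPU can write to the new storage right away while the GPU finishes with the old one, which is where the avoided stall comes from.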
* iris: Enable the dual_color_blend_by_location driconf option. | Kenneth Graunke | 2019-04-22 | 1 | -0/+4
| This fixes rendering in Unigine Valley 1.0 and Heaven 4.0.
* iris: Add mechanism for iris-specific driconf options | Kenneth Graunke | 2019-04-22 | 1 | -1/+1
| Based on Nicolai's 0f8c5de8690e7c87aa2e24383065efaca7e6fe78.
|
| Reviewed-by: Dylan Baker <[email protected]>
* iris: Change vendor and renderer strings | Kenneth Graunke | 2019-04-16 | 1 | -1/+4
| This patch changes the GL_VENDOR string from "Mesa Project" to "Intel". This makes GLX_MESA_query_renderer report "Vendor: Intel (0x8086)" instead of "Vendor: Mesa Project (0x8086)" which is arguably wrong. We now also use a consistent vendor string across Windows and Linux.
|
| It also prepends "Mesa" to the GL_RENDERER string, both to credit the community and have a distinguishing mark between the two drivers. We drop "DRI" compared to i965, as it's not really that important.
|
| Improves performance in Portal by 1.8x. Iris is now 3.86% faster than i965 at the portal-d1.dem timedemo on my Kabylake laptop. One change is that Portal selects the MapBufferRange path based on the vendor string, and iris's BufferSubData path is still missing the storage invalidation optimization.
* iris: Make shader_perf_log print to stderr if INTEL_DEBUG=perf is set | Kenneth Graunke | 2019-04-15 | 1 | -4/+11
| This matches i965's behavior, and makes sure that shader compiler messages are visible when setting INTEL_DEBUG=perf.
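The behavior described for INTEL_DEBUG=perf can be approximated by the sketch below. It is not the Mesa implementation: `debug_list_has` and `perf_log` are hypothetical helpers, and the sketch assumes the variable holds a comma-separated list of flag names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return true if 'flag' appears as a whole entry in a comma-separated list. */
static bool
debug_list_has(const char *list, const char *flag)
{
   assert(flag != NULL && flag[0] != '\0');
   if (list == NULL)
      return false;

   size_t flag_len = strlen(flag);
   while (*list != '\0') {
      size_t len = strcspn(list, ",");
      if (len == flag_len && strncmp(list, flag, len) == 0)
         return true;
      list += len;
      if (*list == ',')
         list++;
   }
   return false;
}

/* Hypothetical logging hook: echo the message to stderr when requested. */
static void
perf_log(const char *msg)
{
   if (debug_list_has(getenv("INTEL_DEBUG"), "perf"))
      fprintf(stderr, "%s\n", msg);
}
```

Matching whole entries rather than substrings keeps an unrelated flag that merely starts with "perf" from turning the log on.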
* iris: support INTEL_NO_HW environment variable | Mike Blumenkrantz | 2019-04-10 | 1 | -0/+3
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Enable NV_compute_shader_derivatives | Caio Marcelo de Oliveira Filho | 2019-04-08 | 1 | -0/+1
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* iris: Clean up compiler warnings about unused | Caio Marcelo de Oliveira Filho | 2019-03-29 | 1 | -10/+0
| Removed a few unused variables and iris_getparam_boolean(). Kept 'name' around since there's a commented-out debug statement that makes use of it.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Face should be a system value. | Timur Kristóf | 2019-03-11 | 1 | -0/+1
| This patch adds PIPE_CAP_TGSI_FS_FACE_IS_INTEGER_SYSVAL which despite its name is not a TGSI-specific capability, just lets the state tracker know that it should generate a system value for FACE. This is needed if we want to run tgsi_to_nir on iris.
|
| Signed-off-by: Timur Kristóf <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Use copy_region and staging resources to avoid transfer stalls | Kenneth Graunke | 2019-03-08 | 1 | -0/+2
| This is similar to intel_miptree_map_blit and intel_buffer_object.c's temporary blits in i965.
|
| Improves performance of DiRT Rally by 20-25% by eliminating stalls.
|
| Breaks piglit's spec/arb_shader_image_load_store/host-mem-barrier, by using the GPU to do uploads, exposing a st/mesa issue where it doesn't give us memory_barrier() calls. This is a pre-existing issue and will be fixed by a later patch (currently out for review).
* iris: Wire up EGL_IMG_context_priority | Chris Wilson | 2019-03-07 | 1 | -0/+5
| Add the missing PIPE_CAP_CONTEXT_PRIORITY_MASK and parsing of the context construction flags.
|
| Testcase: piglit/egl-context-priority
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Drop PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY | Kenneth Graunke | 2019-03-07 | 1 | -1/+0
| This cap is mainly for working around a r600 texture swizzle issue, but it also controls whether ARB_texture_buffer_object (with legacy formats) is enabled. I suspect the missing I/L/A/LA faking is why I had it set in the first place.
|
| Thanks to Ilia for pointing out that I shouldn't be setting this.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
* iris: Enable ARB_shader_draw_parameters support | Jose Maria Casanova Crespo | 2019-02-26 | 1 | -0/+1
| Additional VERTEX_ELEMENT_STATE are used to store basevertex and baseinstance and drawid updating the DWordLength of the 3DSTATE_VERTEX_ELEMENTS command.
|
| This passes all piglit tests for spec.*draw_parameters.* tests and VK-GL-CTS KHR-GL45.shader_draw_parameters_tests.* tests.
|
| Now we only mark a dirty_update when parameters are changed or when we have an indirect draw.
|
| We enable PIPE_CAP_DRAW_PARAMETERS on Iris.
|
| There is no edge flag support in the Vertex Elements setup.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Make an IRIS_MAX_MIPLEVELS define | Kenneth Graunke | 2019-02-21 | 1 | -1/+1
|
* iris: Drop XXX about checking for swizzling | Kenneth Graunke | 2019-02-21 | 1 | -2/+1
| Caio noted that this is not necessary on Gen8+:
|
| "Before Gen8, there was a historical configuration control field to swizzle address bit[6] for in X/Y tiling modes. This was set in three different places: TILECTL[1:0], ARB_MODE[5:4], and DISP_ARB_CTL[14:13].
|
| For Gen8 and subsequent generations, the swizzle fields are all reserved, and the CPU's memory controller performs all address swizzling modifications."
|
| Since we don't support earlier hardware, we can skip it entirely.
* iris: improve PIPE_CAP_VIDEO_MEMORY bogus value | Andre Heider | 2019-02-21 | 1 | -1/+1
| -1 is a little too bogus for most games ;)
|
| Signed-off-by: Andre Heider <[email protected]>
* iris: Stop chopping off the first nine characters of the renderer string | Kenneth Graunke | 2019-02-21 | 1 | -1/+1
|
* iris: Add PIPE_CAP_MAX_VARYINGS | Kenneth Graunke | 2019-02-21 | 1 | -0/+1
|
* iris: minor tidying | Kenneth Graunke | 2019-02-21 | 1 | -2/+0
|
* iris: Enable PIPE_CAP_COMPACT_ARRAYS | Kenneth Graunke | 2019-02-21 | 1 | -0/+1
|