summaryrefslogtreecommitdiffstats
path: root/src/intel/blorp
Commit message (Collapse)AuthorAgeFilesLines
* intel/blorp: Add a blorp_hiz_clear_depth_stencil helperJason Ekstrand2018-02-202-0/+64
| | | | | | | This is similar to blorp_gen8_hiz_clear_attachments except that it takes actual images instead of trusting in the already set depth state. Reviewed-by: Nanley Chery <[email protected]>
* i965/icl: Add assertions to check dispatch mode is SIMD8Anuj Phogat2018-02-151-0/+2
| | | | | | | | | | SIMD4x2 dispatch mode has been removed in GEN11. We're not using it anyways in Mesa. Adding few asserts to make it explicit. Use GEN_GEN macro in place of devinfo->gen (Ken) Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/icl: Update the comment for maximum number of threads per PSDAnuj Phogat2018-02-151-4/+5
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/icl: Do StateCacheInvalidation for indirect clear colorAnuj Phogat2018-02-151-1/+1
| | | | | | | | | | StateCacheInvalidation is required on all gen7+ platforms. We don't need to update this check for every new gen h/w unless this requirement is changed. So, dropping the check for latest gen h/w. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Use isl_aux_op instead of blorp_hiz_opJason Ekstrand2018-02-085-39/+25
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Use isl_aux_op instead of blorp_fast_clear_opJason Ekstrand2018-02-084-22/+15
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Add a CCS ambiguation passJason Ekstrand2018-02-082-0/+158
| | | | | | | | | | | | This pass performs an "ambiguate" operation on a CCS-compressed surface by manually writing zeros into the CCS. On gen8+, ISL gives us a fairly detailed notion of how the CCS is laid out so this is fairly simple to do. On gen7, the CCS tiling is quite crazy but that isn't an issue because we can only do CCS on single-slice images so we can just blast over the entire CCS buffer if we want to. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* i965: Drop render_target_start from binding table struct.Kenneth Graunke2018-01-221-2/+4
| | | | | | | | | We have to start render targets at binding table index 0 in order to use headerless FB write messages, and in fact already assume this in a bunch of places in the code. Let's finish that off, and not bother storing 0 in a struct to pretend to add it in a few places. Reviewed-by: Iago Toral Quiroga <[email protected]>
* meson: Use dependencies for nirDylan Baker2018-01-111-1/+2
| | | | | | | | | | | | | | | | | This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is build. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and it's headers are quite high. Acked-by: Eric Engestrom <[email protected]> Signed-off-by: Dylan Baker <[email protected]>
* i965: Drop support for the legacy SNORM -> Float equation.Kenneth Graunke2018-01-021-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Older OpenGL defines two equations for converting from signed-normalized to floating point data. These are: f = (2c + 1)/(2^b - 1) (equation 2.2) f = max{c/2^(b-1) - 1), -1.0} (equation 2.3) Both OpenGL 4.2+ and OpenGL ES 3.0+ mandate that equation 2.3 is to be used in all scenarios, and remove equation 2.2. DirectX uses equation 2.3 as well. Intel hardware only supports equation 2.3, so Gen7.5+ systems that use the vertex fetcher hardware to do the conversions always get formula 2.3. This can make a big difference for 10-10-10-2 formats - the 2-bit value can represent 0 with equation 2.3, and cannot with equation 2.2. Ivybridge and older were using equation 2.2 for OpenGL, and 2.3 for ES. Now that Ivybridge supports OpenGL 4.2, this is wrong - we need to use the new rules, at least in core profile. That would leave Gen4-6 doing something different than all other hardware, which seems...lame. With context version promotion, applications that requested a pre-4.2 context may get promoted to 4.2, and thus get the new rules. Zero cases have been reported of this being a problem. However, we've received a report that following the old rules breaks expectations. SuperTuxKart apparently renders the cars red when following equation 2.2, and works correctly when following equation 2.3: https://github.com/supertuxkart/stk-code/issues/2885#issuecomment-353858405 So, this patch deletes the legacy equation 2.2 support entirely, making all hardware and APIs consistently use the new equation 2.3 rules. If we ever find an application that truly requires the old formula, then we'd likely want that application to work on modern hardware, too. We'd likely restore this support as a driconf option. Until then, drop it. This commit will regress Piglit's draw-vertices-2101010 test on pre-Haswell without the corresponding Piglit patch to accept either formula (commit 35daaa1695ea01eb85bc02f9be9b6ebd1a7113a1): draw-vertices-2101010: Accept either SNORM conversion formula. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* intel/blorp: Fix possible NULL pointer dereferencingVadym Shovkoplias2017-11-301-2/+2
| | | | | | | | | | | | Fix incomplete check of input params in blorp_surf_convert_to_uncompressed() which can lead to NULL pointer dereferencing. Fixes: 5ae8043fed2 ("intel/blorp: Add an entrypoint for doing bit-for-bit copies") Fixes: f395d0abc83 ("intel/blorp: Internally expose surf_convert_to_uncompressed") Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Andres Gomez <[email protected]>
* intel/blorp: Drop blorp_resolve_ccs_attachmentJason Ekstrand2017-11-272-61/+20
| | | | | | | | | | The only reason why we needed that version was because the Vulkan driver needed to be able to create the surface states so it could handle indirect clear colors. Now that blorp handles them natively, there's no need for the extra entrypoint. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Take a range of layers in blorp_ccs_resolveJason Ekstrand2017-11-272-3/+7
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Add initial support for indirect clear colorsJason Ekstrand2017-11-274-0/+86
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/blorp: Add fast-clear to the special case in MSAA resolvesJason Ekstrand2017-11-271-2/+9
| | | | | | | | | | | | This doesn't go all the way of avoiding the txf_ms if it's fast-cleared, however it does at least make us only do it once. This should improve performance of MSAA resolves in the presence of lots of clear color. Without the patch, enabling fast-clears in the multisampling Sascha demo drops the framerate by about 10%. With this patch, enabling fast-clears increases the demo's framerate by 25%. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp/blit: Rename blorp_nir_txf_ms_mcsJason Ekstrand2017-11-271-4/+5
| | | | | | | | | That name is already taken by one of the helpers in blorp_nir_builder.h and, while we haven't moved the guts of blorp_blit.c there yet, we'd like to start using some things from that header. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Make the MOCS setting part of blorp_addressJason Ekstrand2017-11-132-15/+8
| | | | | | | | This makes our MOCS settings significantly more flexible. Cc: "17.3" <[email protected]> Tested-by: Lyude Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Use mocs.tex for depth stencilJason Ekstrand2017-11-131-5/+1
| | | | | | Cc: "17.3" <[email protected]> Tested-by: Lyude Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* meson: Don't build intel shared components by defaultDylan Baker2017-11-131-1/+0
| | | | | | | | It's a neat idea, and still useful in some cases, but the intel common code is used by i965 and anvil only, this is a little clearer. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel/compiler: Remove final_program_size from brw_compile_*Jordan Justen2017-10-314-22/+14
| | | | | | | | | The caller can now use brw_stage_prog_data::program_size which is set by the brw_compile_* functions. Cc: Jason Ekstrand <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* blorp: enable R32G32B32X32 blorp ccs copiesLionel Landwerlin2017-10-211-0/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Get rid of nir_shader::stageJason Ekstrand2017-10-201-1/+1
| | | | | | | | It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* intel: Rewrite the world of push/pull paramsJason Ekstrand2017-10-121-1/+1
| | | | | | | | | | | | | | | | | This moves us away to the array of pointers model and onto a model where each param is represented by a generic uint32_t handle. We reserve 2^16 of these handles for builtins that get generated by somewhere inside the compiler and have well-defined meanings. Generic params have handles whose meanings are defined by the driver. The primary downside to this new approach is that it moves a little bit of the work that we would normally do at compile time to draw time. On my laptop this hurts OglBatch6 by no more than 1% and doesn't seem to have any measurable affect on OglBatch7. So, while this may come back to bite us, it doesn't look too bad. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* meson: Add build Intel "anv" vulkan driverDylan Baker2017-09-271-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows building and installing the Intel "anv" Vulkan driver using meson and ninja, the driver has been tested against the CTS and has seems to pass the same series of tests (they both segfault when the CTS tries to run wayland wsi tests). There are still a mess of TODO, XXX, and FIXME comments in here. Those are mostly for meson bugs I'm trying to fix, or for additional things to implement for other drivers/features. I have configured all intermediate libraries and optional tools to not build by default, meaning they will only be built if they're pulled in as a dependency of a target that will actually be installed) this allows us to avoid massive if chains, while ensuring that only the bits that need to be built are. v2: - enable anv, x11, and wayland by default - add configure option to disable valgrind v3: - fix typo in meson_options (Nicholas) v4: - Remove dead code (Eric) - Remove change to generator that was from v0 (Eric) - replace if chain with loop (Eric) - Fix typos (Eric) - define HAVE_DLOPEN for both libdl and builtin dl cases (Eric) v5: - rebase on util string buffer implementation Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (v4)
* intel/blorp/hiz: Always set sample numberTopi Pohjolainen2017-09-211-0/+11
| | | | | | Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Handle clearing compressed surfacesJason Ekstrand2017-09-201-7/+17
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/blorp: Internally expose surf_convert_to_uncompressedJason Ekstrand2017-09-202-13/+21
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/blorp: Support clearing L8_UNORM_SRGB surfacesJason Ekstrand2017-09-191-0/+4
| | | | | | | | Vulkan needs to be able to clear any texture you can create. We want to add support for VK_FORMAT_R8_SRGB and we need to use L8_UNORM_SRGB to do that so we need to be able to clear it. Reviewed-by: Kenneth Graunke <[email protected]>
* blorp: Make blorp_buffer_copy work on Gen4-6.Kenneth Graunke2017-08-301-9/+10
| | | | | | Gen4-6 can only handle surfaces up to 8192. Only Gen7+ can do 16384. Reviewed-by: Jason Ekstrand <[email protected]>
* blorp: Turn anv_CmdCopyBuffer into a blorp_buffer_copy() helper.Kenneth Graunke2017-08-302-0/+125
| | | | | | | | | | | I want to be able to copy between buffer objects using BLORP in the i965 driver. Anvil already had code to do this, in a reasonably efficient manner - first using large bpp copies, then smaller bpp copies. This patch moves that logic into BLORP as blorp_buffer_copy(), so we can use it in both drivers. Reviewed-by: Jason Ekstrand <[email protected]>
* blorp: Explicitly cast between different enumsMatt Turner2017-08-291-5/+5
| | | | | | | | | | | | | Fixes warnings like warning: implicit conversion from enumeration type 'enum isl_format' to different enumeration type 'enum GEN10_SURFACE_FORMAT' [-Wenum-conversion] .SourceElementFormat = ISL_FORMAT_R32_UINT, ^~~~~~~~~~~~~~~~~~~ Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* i965: Do not store SRC after 0 on component control.Rafael Antognolli2017-08-251-2/+2
| | | | | | | | | | | | | | | | | | The PRM SKL-Vol 2b-05.16 says: "Within a VERTEX_ELEMENT_STATE structure, if a Component Control field is set to something other than VFCOMP_STORE_SRC, no higher-numbered Component Control fields may be set to VFCOMP_STORE_SRC. In other words, only trailing components can be set to something other than VFCOMP_STORE_SRC." Since we set the component 1 to VFCOMP_STORE_0 on gen8+, and VFCOMP_STORE_IID on gen5+, and we are not using components 2 and 3, let's also set them to VFCOMP_STORE_0. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel/blorp: Adjust intra-tile x when faking rgb with red-onlyTopi Pohjolainen2017-08-211-0/+1
| | | | | | | | | | v2 (Jason): Adjust directly in surf_fake_rgb_with_red() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910 CC: [email protected] Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Reduce passing 2x32b of reloc_domains to 2 bitsChris Wilson2017-08-041-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel only cares about whether the object is to be written to or not, only reduces (reloc.read_domains, reloc.write_domain) down to just !!reloc.write_domain. When we use NO_RELOC, the kernel doesn't even read those relocs and instead userspace has to pass that information in the execobject.flags. We can simplify our reloc api by also removing the unused read/write domains and only pass the resultant flags. The caveat to the above are when we need to make the kernel aware that certain objects need to take into account different work arounds. Previously, this was done using the magic (INSTRUCTION, INSTRUCTION) reloc domains. NO_RELOC requires this to be passed in the execobject flags as well, and now we push that up the callstack. The API is more compact, more expressive of what happens underneath, but unfortunately requires more knowledge of the system at the point of use. Conversely it also means that knowledge is specific and not generally applied and so not overused. text data bss dec hex filename 8502991 356912 424944 9284847 8dacef lib/i965_dri.so (before) 8500455 356912 424944 9282311 8da307 lib/i965_dri.so (after) v2: (by Ken) Rebase. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/isl: Add a helper to get a subimage surfaceJason Ekstrand2017-07-221-30/+12
| | | | | | | We already have a helper for doing this in BLORP, this just moves the logic into ISL where we can share it with other components. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/blorp: Allow blorp_copy on sRGB formatsJason Ekstrand2017-07-221-2/+16
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Add a partial resolve pass for MCSJason Ekstrand2017-07-224-1/+213
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Allow BLORP calls to be predicatedNanley Chery2017-07-222-0/+6
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp/gen4: Drop cube map flag for single face copyTopi Pohjolainen2017-07-181-1/+7
| | | | | | | | This will falsely trigger an assert on number of layers once isl is used for 3D layouts of Gen4 cube maps. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* blorp: Use normalized coordinates on Gen6Ian Romanick2017-06-262-5/+8
| | | | | | | | | | | | | | Apparently, the sampler has some sort of precision issues for non-normalized texture coordinates with linear filtering. This caused some small precision issues in scaled blits. Work around this by using normalized coordinates. There is some extra work necessary because Gen6 uses TEX (instead of TXF) for some multisample resolve blits. Fixes piglit.spec.arb_framebuffer_object.fbo-blit-stretch on SNB. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365 Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* blorp/clear: Add a binding-table-based CCS resolve functionNanley Chery2017-06-262-17/+57
| | | | | | | | | | | | | v2: - Do layered resolves. (Jason Ekstrand): - Replace "bt" suffix with "attachment". - Rename helper function to prepare_ccs_resolve. - Move blorp_params_init() into helper function. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Check for layer fast-clear restrictionNanley Chery2017-06-261-0/+5
| | | | | | | | v2: Update commit title (Jason Ekstrand) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Assert levels and layers are in rangeNanley Chery2017-06-262-4/+7
| | | | | | | | | | | | v2 (Jason Ekstrand): - Update commit title. - Check aux level and layer as well. v3 (Jason Ekstrand): - Move the non-aux layer check. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Apply source offset in the TEX caseIan Romanick2017-06-201-0/+3
| | | | | | | | Previously the offset was only applied in the TXF case. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Jason Ekstrand <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Apply Gen4 coord. normalization after cubemap sizes are adjustedIan Romanick2017-06-201-9/+11
| | | | | | | | | Otherwise the values used for coordinate normalization use the wrong sizes. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Jason Ekstrand <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Set needs_(dst|src)_offset for Gen4 cubemapsJason Ekstrand2017-06-201-2/+6
| | | | | | | | | | | | | | | We call convert_to_single_slice so they may end up with a non-trivial offset that needs to be taken into account. v2 (idr): Also set needs_src_offset. Suggested by Jason. Fixes ES2-CTS.functional.texture.specification.basic_copyteximage2d.cube_rgba and ES2-CTS.functional.texture.specification.basic_copytexsubimage2d.cube_rgba on G45. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101284 Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Work around Sandy Bridge occlusion query issueJason Ekstrand2017-06-141-0/+10
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* blorp: Use FullSurfaceDepthandStencilClear for blorp_hiz_opJason Ekstrand2017-06-073-0/+5
| | | | | | | The blorp_hiz_op entrypoint always acts on a full subresource of a HiZ buffer so we can just set the flag unconditionally. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Plumb through access to the workaround BOJason Ekstrand2017-06-071-2/+7
| | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101283 Reviewed-by: Topi Pohjolainen <[email protected]>
* anv/blorp: Move the depth cache flush outside of BLORPNanley Chery2017-06-071-8/+0
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>