aboutsummaryrefslogtreecommitdiffstats
path: root/src/intel/blorp/blorp.c
Commit message (Collapse)AuthorAgeFilesLines
* nir: add callback to nir_remove_dead_variables()Timothy Arceri2020-06-031-1/+1
| | | | | | | | | | | | This allows us to do API specific checks before removing variable without filling nir_remove_dead_variables() with API specific code. In the following patches we will use this to support the removal of dead uniforms in GLSL. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
* intel/fs: Allow multiple slots for positionCaio Marcelo de Oliveira Filho2020-04-071-2/+3
| | | | | | | | | | | | | | | | | | | | | | Change brw_compute_vue_map() to also take the number of pos slots. If more than one slot is used, the VARYING_SLOT_POS is treated as an array. When using Primitive Replication, instead of a single position, the VUE must contain an array of positions. Padding might be necessary (after clip distance) to ensure rest of attributes start aligned. v2: Add note about array in the commit message and assert that pos_slots >= 1 to make clear 0 is invalid. (Jason) Move padding to be after the clip distance. v3: Apply the correct offset when gathering the sources from outputs. Reviewed-by: Jason Ekstrand <[email protected]> [v2] Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
* intel/blorp: Plumb the stage through blorp upload_shaderCaio Marcelo de Oliveira Filho2020-03-171-1/+2
| | | | | | | | | | Vulkan uses that for its own upload function -- even though for BLORP it doesn't really currently care. Neither Iris and i965 makes use of it at the moment. Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170>
* intel/blorp: Add helper function for stencil buffer resolveSagar Ghuge2019-10-291-0/+30
| | | | | | | | On Gen12+, Stencil buffer's lossless compression should be resolved with WM_HZ_OP packet. Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Don't assert aux slices match main slicesNanley Chery2019-10-281-3/+0
| | | | | | | | | This isn't accurate enough for HiZ which can have a discontiguous range of supported aux slices. This also won't work with the plan to represent Gen12 CCS as a single slice surface. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* blorp: Memset surface info to zero when initializing itJason Ekstrand2019-09-061-0/+1
| | | | | | | | This isn't known to fix any current bugs but it does prevent a regression in a subsequent commit. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/fs: Drop the gl_program from fs_visitorJason Ekstrand2019-08-251-1/+1
| | | | | | | | | It's not used by anything anymore now that so much lowering has been moved into NIR. Sadly, we still need on in brw_compile_gs() for geometry shaders on Sandy Bridge. Short of a lot of pointless work, that one's probably not going away. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Fill a compiler statistics structJason Ekstrand2019-08-121-2/+2
| | | | | | | | | This commit is all annoying plumbing work which just adds support for a new brw_compile_stats struct. This struct provides a binary driver readable form of the same statistics we dump out to stderr when we INTEL_DEBUG is set with a shader stage. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/compiler: Add a "base class" for program keysJason Ekstrand2019-07-101-1/+1
| | | | | | | | | Right now, all keys have two things in common: a program string ID and a sampler_prog_key_data. I'd like to add another thing or two and need a place to put it. This commit adds a new brw_base_prog_key struct which contains those two common bits. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/nir: Stop returning the shader from helpersJason Ekstrand2019-06-051-2/+2
| | | | | | | | Now that NIR_TEST_* doesn't swap the shader out from under us, it's sufficient to just modify the shader rather than having to return in case we're testing serialization or cloning. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_doubles: Inline functions directly in lower_doublesJason Ekstrand2019-03-061-2/+2
| | | | | | | | | | | | Instead of trusting the caller to already have created a softfp64 function shader and added all its functions to our shader, we simply take the softfp64 shader as an argument and do the function inlining ouselves. This means that there's no more nasty functions lying around that the caller needs to worry about cleaning up. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* blorp: Pass the batch to lookup/upload_shader instead of contextKenneth Graunke2019-01-101-3/+4
| | | | | | | | | This will allow drivers to pin shader buffers if necessary. i965 and anv do not need to do this today, but iris will. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* blorp: Properly handle Z24X8 blits.Kenneth Graunke2018-08-111-12/+0
| | | | | | | | | | | | | | | | | | | One of the reasons we didn't notice that R24_UNORM_X8_TYPELESS destinations were broken was that an earlier layer was swapping it out for B8G8R8A8_UNORM. That made Z24X8 -> Z24X8 blits work. However, R32_FLOAT -> R24_UNORM_X8_TYPELESS was still totally broken. The old code only considered one format at a time, without thinking that format conversion may need to occur. This patch moves the translation out to a place where it can consider both formats. If both are Z24X8, we continue using B8G8R8A8_UNORM to avoid having to do shader math workarounds. If we have a Z24X8 destination, but a non-matching source, we use our shader hacks to actually render to it properly. Fixes: 804856fa5735164cc0733ad0ea62adad39b00ae2 (intel/blorp: Handle more exotic destination formats) Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add plumbing for shader time in 32-wide FS dispatch mode.Francisco Jerez2018-06-281-1/+1
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Move nir_lower_deref_instrs to right before locals_to_regsJason Ekstrand2018-06-221-2/+0
| | | | | | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv,i965,radv,st,ir3: Call nir_lower_deref_instrsJason Ekstrand2018-06-221-0/+2
| | | | | | | | | | | This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Support blits and clears on surfaces with offsetsJason Ekstrand2018-05-251-0/+22
| | | | | | | | | | | | | For certain EGLImage cases, we represent a single slice or LOD of an image with a byte offset to a tile and X/Y intratile offsets to the given slice. Most of i965 is fine with this but it breaks blorp. This is a terrible way to represent slices of a surface in EGL and we should stop some day but that's a very scary and thorny path. This gets blorp to start working with those surfaces and fixes some dEQP EGL test bugs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Use isl_aux_op instead of blorp_hiz_opJason Ekstrand2018-02-081-1/+1
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* i965: Drop render_target_start from binding table struct.Kenneth Graunke2018-01-221-2/+4
| | | | | | | | | We have to start render targets at binding table index 0 in order to use headerless FB write messages, and in fact already assume this in a bunch of places in the code. Let's finish that off, and not bother storing 0 in a struct to pretend to add it in a few places. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Drop support for the legacy SNORM -> Float equation.Kenneth Graunke2018-01-021-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Older OpenGL defines two equations for converting from signed-normalized to floating point data. These are: f = (2c + 1)/(2^b - 1) (equation 2.2) f = max{c/2^(b-1) - 1), -1.0} (equation 2.3) Both OpenGL 4.2+ and OpenGL ES 3.0+ mandate that equation 2.3 is to be used in all scenarios, and remove equation 2.2. DirectX uses equation 2.3 as well. Intel hardware only supports equation 2.3, so Gen7.5+ systems that use the vertex fetcher hardware to do the conversions always get formula 2.3. This can make a big difference for 10-10-10-2 formats - the 2-bit value can represent 0 with equation 2.3, and cannot with equation 2.2. Ivybridge and older were using equation 2.2 for OpenGL, and 2.3 for ES. Now that Ivybridge supports OpenGL 4.2, this is wrong - we need to use the new rules, at least in core profile. That would leave Gen4-6 doing something different than all other hardware, which seems...lame. With context version promotion, applications that requested a pre-4.2 context may get promoted to 4.2, and thus get the new rules. Zero cases have been reported of this being a problem. However, we've received a report that following the old rules breaks expectations. SuperTuxKart apparently renders the cars red when following equation 2.2, and works correctly when following equation 2.3: https://github.com/supertuxkart/stk-code/issues/2885#issuecomment-353858405 So, this patch deletes the legacy equation 2.2 support entirely, making all hardware and APIs consistently use the new equation 2.3 rules. If we ever find an application that truly requires the old formula, then we'd likely want that application to work on modern hardware, too. We'd likely restore this support as a driconf option. Until then, drop it. This commit will regress Piglit's draw-vertices-2101010 test on pre-Haswell without the corresponding Piglit patch to accept either formula (commit 35daaa1695ea01eb85bc02f9be9b6ebd1a7113a1): draw-vertices-2101010: Accept either SNORM conversion formula. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* intel/blorp: Add initial support for indirect clear colorsJason Ekstrand2017-11-271-0/+1
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/compiler: Remove final_program_size from brw_compile_*Jordan Justen2017-10-311-6/+4
| | | | | | | | | The caller can now use brw_stage_prog_data::program_size which is set by the brw_compile_* functions. Cc: Jason Ekstrand <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel: Rewrite the world of push/pull paramsJason Ekstrand2017-10-121-1/+1
| | | | | | | | | | | | | | | | | This moves us away to the array of pointers model and onto a model where each param is represented by a generic uint32_t handle. We reserve 2^16 of these handles for builtins that get generated by somewhere inside the compiler and have well-defined meanings. Generic params have handles whose meanings are defined by the driver. The primary downside to this new approach is that it moves a little bit of the work that we would normally do at compile time to draw time. On my laptop this hurts OglBatch6 by no more than 1% and doesn't seem to have any measurable affect on OglBatch7. So, while this may come back to bite us, it doesn't look too bad. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Assert levels and layers are in rangeNanley Chery2017-06-261-0/+7
| | | | | | | | | | | | v2 (Jason Ekstrand): - Update commit title. - Check aux level and layer as well. v3 (Jason Ekstrand): - Move the non-aux layer check. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
* blorp: Use FullSurfaceDepthandStencilClear for blorp_hiz_opJason Ekstrand2017-06-071-0/+1
| | | | | | | The blorp_hiz_op entrypoint always acts on a full subresource of a HiZ buffer so we can just set the flag unconditionally. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Refactor the HiZ op interfaceJason Ekstrand2017-06-071-49/+55
| | | | | | | | | | This commit does a few things: 1) Now that BLORP can do HiZ ops on gen8+, drop the gen6 prefix. 2) Switch parameters to uint32_t to match the rest of blorp. 3) Take a range of layers and loop internally. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add blorp support for gen4-5Jason Ekstrand2017-05-261-3/+3
| | | | | | | | | | Due to complications with things such as URB setup on gen4-5, it's easier to keep gen4 support in blorp completely internal to i965. This makes things a bit awkward because that means there's a file in i965 that includes blorp_priv.h but it's either that or have a file in blorp that includes brw_context.h. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Set additional brw_wm_prog_key fields on gen4-5Jason Ekstrand2017-05-261-1/+8
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Add support for gen4-5 SF programsJason Ekstrand2017-05-261-0/+62
| | | | | | | | As part of enabling support for SF programs, we plumb the SF URB size through to emit_urb_config. For now, it's always zero but, on gen4, it may be something larger. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Move the gen7 stencil format workaround to blorp_blitJason Ekstrand2017-05-261-5/+0
| | | | | | | | It's not needed for blorp_copy because it already overrides formats. It's also not needed for blorp_clear because it clears stencil as stencil. Reviewed-by: Topi Pohjolainen <[email protected]>
* nir: Embed the shader_info in the nir_shader againJason Ekstrand2017-05-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info from being embedded into being just a pointer. The idea was that sharing the shader_info between NIR and GLSL would be easier if it were a pointer pointing to the same shader_info struct. This, however, has caused a few problems: 1) There are many things which generate NIR without GLSL. This means we have to support both NIR shaders which come from GLSL and ones that don't and need to have an info elsewhere. 2) The solution to (1) raises all sorts of ownership issues which have to be resolved with ralloc_parent checks. 3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been using nir_gather_info to fill out the final shader_info. Thanks to cloning and the above ownership issues, the nir_shader::info may not point back to the gl_shader anymore and so we have to do a copy of the shader_info from NIR back to GLSL anyway. All of these issues go away if we just embed the shader_info in the nir_shader. There's a little downside of having to copy it back after calling nir_gather_info but, as explained above, we have to do that anyway. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move the back-end compiler to src/intel/compilerJason Ekstrand2017-03-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | Mostly a dummy git mv with a couple of noticable parts: - With the earlier header cleanups, nothing in src/intel depends files from src/mesa/drivers/dri/i965/ - Both Autoconf and Android builds are addressed. Thanks to Mauro and Tapani for the fixups in the latter - brw_util.[ch] is not really compiler specific, so it's moved to i965. v2: - move brw_eu_defines.h instead of brw_defines.h - remove no-longer applicable includes - add missing vulkan/ prefix in the Android build (thanks Tapani) v3: - don't list brw_defines.h in src/intel/Makefile.sources (Jason) - rebase on top of the oa patches [Emil Velikov: commit message, various small fixes througout] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Add support for vertex shadersJason Ekstrand2016-11-161-0/+31
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Remove NIR support for uniformsJason Ekstrand2016-11-161-23/+1
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Make the number of samples an explicit parameterJason Ekstrand2016-11-161-0/+2
| | | | | | | | | Previously, we always inferred it from params->dst which meant that references to params->dst were scattered all throughout the state upload code. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Use wm_prog_data instead of hand-rolling our ownJason Ekstrand2016-11-021-21/+7
| | | | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012 Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Cc: "13.0" <[email protected]>
* intel/blorp: Rework our usage of ralloc when compiling shadersJason Ekstrand2016-10-271-8/+2
| | | | | | | | | | | | | | | | Previously, we were creating the shader with a NULL ralloc context and then trusting in blorp_compile_fs to clean it up. The only problem was that blorp_compile_fs didn't clean up its context properly so we were leaking. When I went to fix that, I realized that it couldn't because it has to return the shader binary which is allocated off of that context and used by the caller. The solution is to make blorp_compile_fs take a ralloc context, allocate the nir_shaders directly off that context, and clean it all up in whatever function creates the shader and calls blorp_compile_fs. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Cc: "12.0, 13.0" <[email protected]>
* intel/blorp: Rename compile_nir_shader to compile_fsJason Ekstrand2016-10-271-5/+5
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: rewrite brw_setup_vue_interpolation()Timothy Arceri2016-10-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here brw_setup_vue_interpolation() is rewritten not to use the InterpQualifier array in gl_fragment_program which will allow us to remove it. This change also makes the code which is only used by gen4/5 more self contained as it now has its own gen5_fragment_program struct rather than storing the map in brw_context. This means the interpolation map will only get processed once and will get stored in the in memory cache rather than being processed everytime the fs changes. Also by calling this from the fs compile code rather than from the upload code and using the interpolation assigned there we can get rid of the BRW_NEW_INTERPOLATION_MAP flag. It might not seem ideal to add a gen5_fragment_program struct however by the end of this series we will have gotten rid of all the brw_{shader_stage}_program structs and replaced them with a generic brw_program struct so there will only be two program structs which is better than what we have now. V2: Don't remove BRW_NEW_INTERPOLATION_MAP from dirty_bit_map until the following patch to fix build error. V3 - Suggestions by Jason: - name struct gen4_fragment_program rather than gen5_fragment_program - don't use enum with memset() - create interp mode set helper and simplify logic to call it - add assert when calling function to show prog will never be NULL for gen4/5 i.e. no Vulkan Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/i965/anv/radv/gallium: make shader info a pointerTimothy Arceri2016-10-261-1/+1
| | | | | | | | | | When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Add a flag to make blorp not re-emit dept/stencil buffersJason Ekstrand2016-10-141-1/+3
| | | | | | | | | | | In Vulkan, we want to be able to use blorp to perform clears inside of a render pass. If blorp stomps the depth/stencil buffers packets then we'll have to re-emit them. This gets tricky when secondary command buffers get involved. Instead, we'll simply guarantee that the depth and stencil buffers we pass to blorp (if any) match those already set in the hardware. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Add an "enabled" bit to surface_infoJason Ekstrand2016-10-141-0/+2
| | | | | | | | This gives a slightly smarter way to check whether or not a particular surface exists than looking at the address. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* nir: Add a flag to lower_io to force "sample" interpolationJason Ekstrand2016-09-151-1/+1
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Work in terms of logical array layersJason Ekstrand2016-09-121-12/+2
| | | | | | | | | | | | | | | | | | When Ivy Bridge introduced array multisampling, someone made the decision to do lots of stuff throughout the driver in terms of physical array layers rather than logical array layers. In ISL, we use logical array layers most of the time and it really makes no sense to use physical array layers in the blorp API. Every time someone passes physical array layers into blorp for an array multisampled surface, they're always divisible by the number of samples and we divide right away. Eventually, I'd like to rework most of the GL driver internals to use logical array layers but that's going to be a big project and will probably happen as part of the ISL conversion. For now, we'll do the conversion in brw_blorp and let blorp just use the logical layers. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add an isl_swizzle structure and use it for isl_view swizzlesJason Ekstrand2016-09-121-6/+1
| | | | | | | | | This should be more compact than the enum isl_channel_select[4] that we were using before. It's also very convenient because we already had such a structure in the Vulkan driver we just needed to pull it over. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Handle the 512 layers restriction on Sandy BridgeJason Ekstrand2016-09-121-0/+6
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Allow multiple layersTopi Pohjolainen2016-09-121-3/+6
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Move blorp into src/intel/blorpJason Ekstrand2016-08-291-0/+292
At this point, blorp is completely driver agnostic and can be safely moved into its own folder. Soon, we hope to start using it for doing blits in the Vulkan driver. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>